071103 / Random Shots




And one showing extreme motion blur (which is done in-engine with geometry stretching).

Optimization :: Transform Feedback, Vertex Shaders, and Pack/Unpack

One of the primary paths I am trying to optimize logically expands a "particle" into a "motion card" for rendering. Because I am now trying to render this motion card to all 6 faces of a cubemap, I have to generate a few VBOs which will each get read 6 times per frame. A similar pipeline is to be used for the physics/CFD as well. With 6 passes over the same data, I am trying to keep the VBOs interleaved instead of separate, for the sake of memory bandwidth and vertex attribute cache performance.
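As a rough illustration of what "interleaved" means here, the sketch below lays out one buffer where all attributes of a point sit next to each other, and computes the stride and byte offsets that would be handed to something like glVertexAttribPointer. The attribute names and sizes (a 3-float position plus 2 extra floats) are hypothetical and only chosen for the example; they are not the actual layout used in the engine.

```python
import struct

FLOAT_SIZE = struct.calcsize("<f")  # 4 bytes per FP32 component

# Hypothetical per-point attributes (assumption, not from the engine):
# a 3-component position and 2 extra scalar values.
ATTRIBUTES = [("position", 3), ("extra", 2)]

def interleaved_layout(attributes):
    """Return (stride_in_bytes, {name: byte_offset}) for one interleaved point."""
    offsets = {}
    cursor = 0
    for name, count in attributes:
        offsets[name] = cursor
        cursor += count * FLOAT_SIZE
    return cursor, offsets

def interleave(points):
    """Pack (position, extra) tuples into one contiguous little-endian buffer."""
    buf = bytearray()
    for position, extra in points:
        buf += struct.pack("<3f", *position)
        buf += struct.pack("<2f", *extra)
    return bytes(buf)

stride, offsets = interleaved_layout(ATTRIBUTES)

# Three points, i.e. one motion card triangle.
vbo = interleave([
    ((0.0, 0.0, 0.0), (1.0, 2.0)),
    ((1.0, 0.0, 0.0), (3.0, 4.0)),
    ((0.0, 1.0, 0.0), (5.0, 6.0)),
])

# Read the "extra" attribute of the second point back out of the buffer.
second_extra = struct.unpack_from("<2f", vbo, 1 * stride + offsets["extra"])
```

With separate VBOs, each attribute would live in its own buffer with its own stride, so fetching one point touches several memory regions instead of one.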

Transform feedback is limited to a maximum of 16 FP32 components of output per vertex. I am using transform feedback as data expansion, writing out 3 points per input vertex (or particle), so I can only output 5 values per point (3x5=15, 16 max). That means I need multiple vertex passes to generate extra VBOs for the other per-point outputs. Now, with SM4.0, the assembly-level fragment pack/unpack opcodes work in vertex shaders as well. However, it appears that I would have to convert all my GLSL to assembly to use this functionality, and I am currently trying to find out whether there is a way to do it in either GLSL or Cg. What I want to do is pack multiple values into a single FP32 value, then unpack them in the vertex shader before I render my motion card triangles into the 6 cubemap faces.
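To make the budget and the packing idea concrete, here is a small CPU-side sketch. It assumes the packing in question is two half-precision floats per 32-bit word, similar in spirit to the PK2H/UP2H style of assembly opcodes; that choice, and the function names, are illustrative assumptions rather than what the engine actually does. The 32-bit word is kept as an integer bit pattern here; on the GPU those same bits would travel through the FP32 output slot.

```python
import struct

MAX_FEEDBACK_COMPONENTS = 16  # per-vertex output limit quoted above
POINTS_PER_PARTICLE = 3       # points written per input particle

def components_per_point(max_components, points):
    """How many FP32 slots each expanded point gets from the shared budget."""
    return max_components // points

def pack_2half(a, b):
    """Pack two values as IEEE half floats into one 32-bit word (a in the low 16 bits)."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]

def unpack_2half(word):
    """Recover the two half-float values from a 32-bit word."""
    return struct.unpack("<2e", struct.pack("<I", word))

slots = components_per_point(MAX_FEEDBACK_COMPONENTS, POINTS_PER_PARTICLE)

# With two half floats per slot, each point could carry twice as many
# (lower precision) values in the same budget.
packed_values_per_point = slots * 2

word = pack_2half(0.5, -2.0)
a, b = unpack_2half(word)
```

Here `slots` comes out to 5 and `packed_values_per_point` to 10. The trade-off is precision: a half float has only a 10-bit mantissa, so this only suits attributes that tolerate it.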

If I have to move from GLSL to Cg or assembly, it is better to do it now, early in development, than later...