081230 / GL3 Textures


GL3 Textures

Base textures support mipmapping, filtering, and multiple wrap modes, and are accessible only via normalized floating point coordinates [0..1]. ARB_texture_non_power_of_two adds support for non-power-of-two (NPOT) sized regular textures. Some hardware might not store NPOT sized textures as memory-efficiently as POT sized textures.

ARB_texture_rectangle supplies the GL_TEXTURE_RECTANGLE_ARB texture target. This provides non-normalized texel addressing [0..width]x[0..height], but at the cost of mipmapping and the repeating wrap modes (only the clamp modes remain). I believe that pixel addressing in this case is hardware accelerated and doesn't require a hidden multiply to get to normalized coordinates. Otherwise there would be no point to this type of texture, given that GLSL 1.30 supplies textureSize() to query texture dimensions from a shader (so you could manually emulate this functionality using standard NPOT textures). Not sure if textureSize() is provided via standard constant buffer fetches or special hardware instructions. Keep in mind you still have to address rectangle textures using a floating point coordinate. So if the shader computes this coordinate using integer math, I'm expecting a non-free ALU op for the int-to-float conversion.
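To make the "manually emulate" idea concrete, here is a minimal CPU-side sketch in Python (not shader code) of the arithmetic involved. The function names are made up for illustration, and in a real shader the width and height would come from textureSize(). It only shows the coordinate math, and says nothing about what the hardware actually does.

```python
def rect_to_normalized(x, y, width, height):
    """Map a rectangle-style coordinate in [0..width]x[0..height]
    to the normalized [0..1] coordinate a regular texture expects."""
    # One divide (or multiply by a reciprocal) per axis: this is the
    # "hidden multiply" the text speculates the hardware can avoid.
    return (x / width, y / height)


def texel_center(ix, iy):
    """Unnormalized coordinate of the center of integer texel (ix, iy).
    The float() calls stand in for the int-to-float ALU op."""
    return (float(ix) + 0.5, float(iy) + 0.5)


# Center of texel (3, 1) in a hypothetical 640x480 texture.
u, v = rect_to_normalized(*texel_center(3, 1), 640, 480)
```

The half-texel offset matters: integer texel (0, 0) has its center at (0.5, 0.5) in unnormalized space, not at (0, 0).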

The textureOffset() function enables an extra limited-range, non-normalized integer texel offset to be added to the coordinate before wrapping during sampling. This offset must be specified with a constant expression (it is compiled into the shader). The range of this offset is GL_MIN_PROGRAM_TEXEL_OFFSET to GL_MAX_PROGRAM_TEXEL_OFFSET, which I believe is [-8,7] on my GPU. This offset cannot be applied to the layer coordinate of texture arrays, and is not supported for cubemaps. The importance of this offset is that it enables one to easily sample extra texels for filtering, without manually doing address calculations. I'm guessing the limited range is due to the texel offset getting passed in as a bitfield in the physical texture instruction. So one still needs to do manual addressing if using non-integer texture offsets for the free bilinear filtering trick (i.e. sampling precisely in between texels to control the weighted average).
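For reference, the bilinear trick mentioned above boils down to a little arithmetic, sketched here in 1D as plain Python rather than shader code. The helper names are hypothetical, and sample_linear_1d is only a CPU stand-in for a hardware linear-filtered fetch with clamping. It assumes both weights have the same sign and a nonzero sum.

```python
import math


def sample_linear_1d(texels, x):
    """Emulate a clamped 1D linear-filtered fetch at unnormalized
    coordinate x (texel i has its center at i + 0.5)."""
    p = x - 0.5
    i0 = math.floor(p)
    frac = p - i0
    last = len(texels) - 1
    a = texels[min(max(i0, 0), last)]
    b = texels[min(max(i0 + 1, 0), last)]
    return a * (1.0 - frac) + b * frac


def merged_tap(i, w0, w1):
    """Replace two taps (texel i with weight w0, texel i+1 with weight w1)
    by one linear fetch. Returns (coordinate, weight)."""
    total = w0 + w1
    # Shift from the center of texel i toward texel i+1 by w1's share.
    return (i + 0.5 + w1 / total, total)


texels = [1.0, 4.0, 2.0, 8.0]
coord, weight = merged_tap(1, 0.25, 0.75)
one_fetch = weight * sample_linear_1d(texels, coord)
two_fetches = 0.25 * texels[1] + 0.75 * texels[2]
```

The fractional part w1 / (w0 + w1) is exactly the kind of non-integer offset that textureOffset() cannot express, which is why the coordinate has to be computed by hand.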

GLSL 1.30 adds texture arrays (but not cubemap arrays), plus texture lookup functions that mix setting LOD bias, explicit LOD, gradients, and offset with doing a projective lookup. Integer textures are also supported, but without texture filtering. Fixed point 8-bit and 16-bit textures still support filtering.

GL 3.0 / GLSL 1.30 adds support for texelFetch() and texelFetchOffset() for 1D, 2D, and 3D textures, and for 1D and 2D texture arrays. Texel fetch takes an integer coordinate. These functions also support specifying the LOD (mip) level. Clearly there would be no reason for texelFetchOffset() if the address calculation for this function was not hardware accelerated. So I assume texelFetch() is just as fast as a regular texture fetch on regular textures. Texel fetches do NOT support filtering, wrap, or LOD clamp, and out of range addressing returns undefined results. I'm also assuming that these texel fetches are cached along with other normal texture fetches, due to the address assumption above, but have not done the testing to verify this.
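As a rough illustration of what integer addressing with an explicit LOD means, here is a small Python sketch (CPU-side arithmetic only, with hypothetical function names). It computes the size of a mip level and the normalized coordinate that a regular unfiltered fetch would need to hit the same texel. Since out of range fetches are undefined in GL, the sketch raises an error instead.

```python
def mip_size(base_size, lod):
    """Size of mip level `lod` along one axis: halve per level,
    round down, never below 1."""
    return max(1, base_size >> lod)


def texel_fetch_to_normalized(ix, iy, base_width, base_height, lod=0):
    """Normalized coordinate that lands on the center of integer texel
    (ix, iy) at mip level `lod`. Raises where GL would be undefined."""
    w = mip_size(base_width, lod)
    h = mip_size(base_height, lod)
    if not (0 <= ix < w and 0 <= iy < h):
        raise IndexError("texel coordinate out of range for this mip level")
    return ((ix + 0.5) / w, (iy + 0.5) / h)
```

For example, texel (0, 0) of mip level 1 of a 4x4 texture sits at normalized (0.25, 0.25), because that level is only 2x2 texels.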

EXT_texture_buffer_object provides a new type of "texture", which is effectively a 1D texel array of up to 2^27 texels accessible by texelFetch() only. Data is supplied by attaching a buffer object with glTexBufferEXT(), so texture buffers can fetch data from framebuffer readbacks, transform feedback and more. Probably would be a good idea to check if texture buffers are cached or uncached, and if some formats require hidden ALU cycles for format conversion. I have a suspicion that texture buffers are implemented via direct global memory reads, but haven't done the proper testing to see for sure!
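Since a texture buffer is only a flat 1D array, any 2D layout has to be flattened by hand before the fetch. A minimal Python sketch of that index math follows. The names are hypothetical, and the 2^27 limit is simply the number quoted above; real code would query GL_MAX_TEXTURE_BUFFER_SIZE_EXT rather than hard-code it.

```python
MAX_TEXELS = 2 ** 27  # figure quoted in the text, an assumption here


def texbuf_index(x, y, width, height, max_texels=MAX_TEXELS):
    """Flatten a 2D texel coordinate into the 1D index a texture
    buffer fetch would take (row-major layout)."""
    if width * height > max_texels:
        raise ValueError("grid does not fit in one texture buffer")
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError("texel coordinate out of range")
    return y * width + x


def texbuf_coord(index, width):
    """Inverse mapping: 1D index back to (x, y)."""
    return (index % width, index // width)
```

With this layout a 16384x8192 grid is exactly 2^27 texels and still fits, while 16384x16384 does not.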

Tessellation of Subdivision Surfaces in DirectX 11

Been looking a little more at tessellation (presentation embedded below) for DX11, and thinking about how this is going to affect the traditional game rendering engine (not Atom).

Advantages
- 8x savings in memory for subdivision surfaces over a regular triangle mesh.
- View dependent level of detail.
- Smaller control mesh for better amortization of expensive vertex work.

Disadvantages
- Each topology combination requires a separate rendering pass (23 in example)!
- Store 4 texture coordinates per vertex to avoid cracks (but small number of verts).
- 3D displacements?

Something about the requirement for separate rendering passes per topology combination raises a red flag for me (more draw calls). Tessellation is a solution for data amplification, but does not solve the problem of data reduction, which IMO is the greater problem (or perhaps the problem I am more interested in). What I mean by reduction is the joining of objects and features, with fewer draw calls, to support detailed rendering at huge view distances.

However, I haven't yet started to think about using the tessellator in ways it wasn't designed for (i.e. not triangles or patches). It could very well prove quite useful to think of data amplification as happening from a single root node (so reduction is not needed).