Archive for August, 2009
I thought I’d have a go at implementing some path tracing in CUDA. Let’s start simple: a classical path tracer with explicit direct lighting. Lots of hacks:
- No BVH yet, every ray tests the 30 triangles of the Cornell Box
- Every surface is Lambertian (so cosine-weighted hemisphere sampling for spawning rays)
- Hardcoded for a single area light (which the camera cannot see)
- Uses a Möller-Trumbore ray-triangle intersection test copy-pasted from CPU code
- Random number generation got moved to a texture read (with the texture data updated CPU-side) to avoid absurd register counts
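As a rough illustration of the pieces in that list, here is a minimal CPU-side C++ sketch of a brute-force closest-hit loop and a cosine-weighted hemisphere sample. It is not the code from the tracer itself: it assumes the “Moller” test is the usual Möller-Trumbore ray-triangle test, and the names `Vec3`, `Triangle`, `intersect`, `closest_hit` and `sample_cosine_hemisphere` are made up for the example. The sample is returned in a local frame where the surface normal is +z, so rotating it into world space is left out.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 v0, v1, v2; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Möller-Trumbore ray/triangle test. On a hit in front of the origin,
// writes the ray parameter t (hit point = origin + t * dir) and returns true.
bool intersect(Vec3 origin, Vec3 dir, const Triangle& tri, float& t) {
    const float eps = 1e-6f;
    Vec3 e1 = sub(tri.v1, tri.v0);
    Vec3 e2 = sub(tri.v2, tri.v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;       // ray parallel to the triangle plane
    float inv = 1.0f / det;
    Vec3 s = sub(origin, tri.v0);
    float u = dot(s, p) * inv;                    // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;                  // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    float hit = dot(e2, q) * inv;
    if (hit <= eps) return false;                 // triangle is behind the origin
    t = hit;
    return true;
}

// No BVH: test every triangle and keep the nearest hit.
// Returns the triangle index, or -1 when the ray misses everything.
int closest_hit(Vec3 origin, Vec3 dir, const Triangle* tris, int count, float& t_near) {
    int best = -1;
    t_near = std::numeric_limits<float>::infinity();
    for (int i = 0; i < count; ++i) {
        float t = 0.0f;
        if (intersect(origin, dir, tris[i], t) && t < t_near) {
            t_near = t;
            best = i;
        }
    }
    return best;
}

// Cosine-weighted direction around +z from two uniform numbers in [0, 1).
// The pdf is cos(theta) / pi, which cancels the Lambertian cosine term.
Vec3 sample_cosine_hemisphere(float u1, float u2) {
    assert(u1 >= 0.0f && u1 < 1.0f && u2 >= 0.0f && u2 < 1.0f);
    float r = std::sqrt(u1);                      // radius on the unit disk
    float phi = 6.2831853f * u2;                  // angle on the unit disk
    return {r * std::cos(phi), r * std::sin(phi), std::sqrt(1.0f - u1)};
}
```

With only 30 triangles the linear loop is cheap enough that an acceleration structure can wait; on the GPU the two uniform numbers would come from whatever random source the kernel uses (here, the texture read mentioned above).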
I’m extremely excited (in a programming way, of course) about the results of “Understanding the Efficiency of Ray Traversal on GPUs” and the related work by NVIDIA on ray traversal.
There’s an interesting paradigm shift here, from a strongly geometric grid model to one where persistent threads run small kernels (or actually large kernels, due to the way CUDA code is currently linked) and grab their own jobs asynchronously. What makes this shift interesting is that PS3 developers on Cell have been writing SPU job systems this way for years. I admit that the underlying hardware is radically different (massive hardware threading and wide SIMD vs. no hardware threading and more conventional SIMD), but the same simple primitive still applies: a resident kernel using atomic increment to grab work from a shared job list. I have no idea where this programming model is going to end up, but it certainly looks like the two worlds are converging.
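The job-grabbing primitive is small enough to sketch. What follows is a CPU-side C++ analogue, not GPU code: `std::atomic::fetch_add` stands in for the atomic increment a resident CUDA kernel would do on a counter in global memory (e.g. with `atomicAdd`), and `JobList`, `persistent_worker` and `run_pool` are hypothetical names invented for the example. The “job” is a placeholder for real work such as tracing one ray.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Shared job list: a fixed number of jobs plus one counter every worker bumps.
struct JobList {
    std::atomic<unsigned> next{0};  // index of the next unclaimed job
    unsigned count = 0;             // total number of jobs
};

// Resident worker: keeps grabbing jobs until the list is drained.
// fetch_add returns a unique index to each caller, so no job runs twice.
// Returns how many jobs this worker processed.
unsigned persistent_worker(JobList& jobs, std::vector<int>& results) {
    unsigned done = 0;
    for (;;) {
        unsigned i = jobs.next.fetch_add(1, std::memory_order_relaxed);
        if (i >= jobs.count) break;            // nothing left to grab
        results[i] = static_cast<int>(i) * 2;  // placeholder job, stands in for real work
        ++done;
    }
    return done;
}

// Start a fixed pool of workers once and let them pull work until it runs out,
// instead of launching one thread per job.
void run_pool(JobList& jobs, std::vector<int>& results, unsigned workers) {
    assert(results.size() >= jobs.count);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w)
        pool.emplace_back([&jobs, &results] { persistent_worker(jobs, results); });
    for (auto& t : pool) t.join();
}
```

The point of the design is that the number of workers is chosen to fill the machine rather than to match the number of jobs, so a worker that finishes early simply grabs the next index instead of sitting idle.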
(Atomic increment actually needs only CUDA compute capability 1.1, so even your 1-year-old laptop with an NVIDIA mobile chipset can probably run this sort of code. Of course it’s nicer with the warp vote primitives on 1.3 hardware, but you can emulate these through shared memory, so no need to go bargain hunting for a GTX 260 just yet.)