# Simon's Graphics Blog

Work log for ideas and hobby projects.

## Sampling Sun And Sky

In this post I will briefly cover how I implemented sampling of external light sources in a path tracing framework, concluding with an observation about sampling multiple external light sources that are non-zero over very different solid angles. I’m going to assume the reader is familiar with path tracing in the Veach framework.

My definition of an external light source, which I’ve also seen called an “infinite” light source since it is considered to be infinitely far away (and infinitely bright as a result), is as follows:

• Radiance always originates from outside of the scene bounds
• Radiance is a function of world space direction only (not sample position)

A simple example would be a cube map considered to be always centered at the sample point.
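To make the definition concrete, here is a minimal Python sketch of such a light: the radiance lookup takes only a world-space direction, never a position. The gradient-sky numbers below are placeholders of my own invention, not the Preetham model or any real cube map.

```python
import math

def external_light_radiance(direction):
    """Radiance from a toy external (sky) light.

    Depends only on the world-space direction, never on the sample
    position: that is the defining property of an external light.
    """
    dx, dy, dz = direction
    # Normalise so the lookup is direction-only (scale-invariant).
    inv_len = 1.0 / math.sqrt(dx * dx + dy * dy + dz * dz)
    up = dz * inv_len  # cosine of the angle to the zenith
    # Toy horizon-to-zenith gradient with made-up colours.
    t = 0.5 * (up + 1.0)
    return (0.6 * (1.0 - t) + 0.2 * t,   # R
            0.7 * (1.0 - t) + 0.4 * t,   # G
            0.9)                         # B
```

Any two directions along the same ray return identical radiance, which is exactly the "always centered at the sample point" behaviour of the cube map example above.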

Written by Simon Brown

March 30th, 2012 at 10:12 pm

## SketchUp Cities

Ray Tracey’s latest blog post has Brigade 2 renders of a nice-looking walled city scene created using Google SketchUp. The model came from this gem of a collection by “LordGood” (who evidently is a big Assassin’s Creed fan) hosted on Google 3D Warehouse.

Currently I only have a Blender exporter, and sadly the SketchUp-COLLADA-Blender path was producing garbage, but even the free version of SketchUp allows custom Ruby plugins. After a bit of hunting around I found this OBJ exporter Ruby plugin, which worked very well, and now I have much nicer test meshes than my bad Blender programmer art.

The above images are from my usually-being-refactored path tracer with Preetham (et al.) sun/sky. It doesn’t render as quickly as Brigade 2 (also I only have a lowly GTX 460 to render on), and yes, it’s all diffuse and I haven’t exported any of the textures, and there are no atmospheric terms or depth of field or shading normals or remote-controlled Stanford bunny, but it’s nice to have some decent public domain data to use.

I’m slowly working on a Bidirectional Instant Radiosity post (hopefully using this scene) but it’ll have to wait until work is less mental.

Written by Simon Brown

March 19th, 2012 at 10:11 pm

## Hybrid Bidirectional Path Tracing

I’d like to share my results from converting a CPU-only bidirectional path tracer into a CPU/GPU hybrid (CPU used for shading and sampling, GPU used for ray intersections). These results are a bit old… I posted them a while ago as a thread on ompf. I found out later that this thread had been cited in Combinatorial Bidirectional Path-Tracing for Efficient Hybrid CPU/GPU Rendering, so let me summarise it here.
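For context, the shape of such a hybrid is roughly the following. This is a Python sketch with names of my own choosing, not the original code: the batched intersection function stands in for the GPU kernel, and the toy shading loop stands in for the CPU side.

```python
import math

def intersect_batch(rays, sphere_center, sphere_radius):
    """Stand-in for the GPU step: intersect a whole batch of rays at once.

    In the real hybrid this batch would be shipped to the GPU; here we
    just loop. Ray directions are assumed normalised."""
    hits = []
    cx, cy, cz = sphere_center
    for (ox, oy, oz), (dx, dy, dz) in rays:
        # Ray/sphere quadratic: t^2 + b t + c = 0 (a = 1 for unit directions).
        lx, ly, lz = ox - cx, oy - cy, oz - cz
        b = 2.0 * (dx * lx + dy * ly + dz * lz)
        c = lx * lx + ly * ly + lz * lz - sphere_radius * sphere_radius
        disc = b * b - 4.0 * c
        if disc < 0.0:
            hits.append(None)  # miss
        else:
            t = (-b - math.sqrt(disc)) / 2.0
            hits.append(t if t > 0.0 else None)
    return hits

def render_iteration(rays):
    """CPU side: sampling and shading stay here; only intersection is batched."""
    hits = intersect_batch(rays, (0.0, 0.0, 5.0), 1.0)
    return [0.0 if t is None else 1.0 / (t * t) for t in hits]  # toy shading
```

The performance question the thread explores is essentially how large these batches must be before the GPU round-trip pays for itself.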

Written by Simon Brown

March 21st, 2011 at 9:14 pm

## Now You’re Lighting With Portals

I hate dome lights. You always waste a ton of rays that are occluded by geometry, and the situation gets even worse when lighting indoor scenes with exterior dome lights!

So why not help your renderer out by placing portals that, when hit, teleport to the dome light? Then, instead of sampling the whole skydome, we just sample the portals, and avoid sending rays where we know they will be occluded.

As an example, here’s the Sponza scene using an exterior (uniform) dome light, rendered using unidirectional path tracing with multiple importance sampling:

Dome Sampling (32spp)

Lots of rays never manage to find the open roof, so we get plenty of noise. Now let’s replace the dome light with a portal that covers the open roof, then allow that to be sampled instead:

Portal Sampling (32spp)

Noise is greatly reduced, for exactly the same number of rays.

The sampling algorithm is simple enough to implement in your GPU path tracer of choice: sample the portal and use the usual conversion between pdf wrt area (the portal) and pdf wrt solid angle (the dome):

$P_\sigma = \frac{P_A \|\mathbf{v}\|^2}{\cos\theta} = \frac{P_A \|\mathbf{v}\|^3}{\mathbf{v} \cdot \mathbf{n}}$

where $\mathbf{v}$ is the vector from the shaded point to the sampled portal point, $\mathbf{n}$ is the portal normal, and $\theta$ is the angle between them.
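A minimal sketch of that sampling step, assuming a uniform, axis-aligned rectangular portal (Python with hypothetical names; a real implementation would live inside the GPU path tracer):

```python
import math

def sample_portal(target, portal_min, portal_max, portal_z, normal, u, v):
    """Uniformly sample an axis-aligned rectangular portal at height
    portal_z using two uniform numbers u, v in [0, 1). Returns the
    direction towards the sampled point and the pdf converted from
    area measure to solid angle measure at `target`."""
    (x0, y0), (x1, y1) = portal_min, portal_max
    px = x0 + u * (x1 - x0)
    py = y0 + v * (y1 - y0)
    pdf_area = 1.0 / ((x1 - x0) * (y1 - y0))  # uniform over the portal

    vx, vy, vz = px - target[0], py - target[1], portal_z - target[2]
    dist = math.sqrt(vx * vx + vy * vy + vz * vz)
    cos_theta = abs(vx * normal[0] + vy * normal[1] + vz * normal[2]) / dist
    # pdf wrt area -> pdf wrt solid angle: P_sigma = P_A * |v|^2 / cos(theta)
    pdf_solid_angle = pdf_area * dist * dist / cos_theta
    return (vx / dist, vy / dist, vz / dist), pdf_solid_angle
```

The returned solid-angle pdf is what plugs into the multiple importance sampling weights alongside the BSDF sampling pdf.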

Written by Simon Brown

January 31st, 2011 at 10:55 pm

## Two-Way Path Tracing

This post is about a path tracing technique that sits between unidirectional path tracing and bidirectional path tracing.

For want of a better name, let’s call this two-way path tracing. It’s defined as follows:

• Trace eye rays, handle light source intersections and sample light sources explicitly
• Trace light rays, handle sensor intersections and sample sensors explicitly
• When computing weights for multiple importance sampling, take both tracing methods into account

So you can think of this technique as either:

1. Unidirectional path tracing in both directions at once
2. Bidirectional path tracing, but we only connect sub-paths if one of the sub-paths has one vertex

So why is this interesting? Because:

• Like unidirectional path tracing, you only need to track a fixed amount of state, regardless of maximum path length. This is potentially nice for GPU implementations, where you usually want to avoid memory traffic and keep a large number of paths in flight.
• You can efficiently apply multiple importance sampling between forward and reverse paths, so you can get reduced variance compared to unidirectional path tracing for some types of scenes (e.g. caustics).

In this post I’d like to cover how to multiple importance sample between forward and reverse paths, and show some test images.
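The weighting machinery itself is Veach’s balance (or power) heuristic; here is a minimal, scheme-agnostic sketch:

```python
def balance_heuristic(pdf_this, pdfs_all):
    """Balance heuristic MIS weight for a sample drawn with pdf_this,
    given the pdfs with which every competing strategy could have
    produced the same path."""
    return pdf_this / sum(pdfs_all)

def power_heuristic(pdf_this, pdfs_all, beta=2.0):
    """Power heuristic: same idea, but gives more weight to the
    dominant strategy, which usually lowers variance further."""
    return pdf_this ** beta / sum(p ** beta for p in pdfs_all)
```

For two-way path tracing the competing strategies for a given path are the forward (eye) and reverse (light) ways of generating it, so `pdfs_all` has exactly two entries.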

Written by Simon Brown

January 3rd, 2011 at 2:17 pm

## Adventures in CUDA Path Tracing: Part 2

This is really just a teaser post for my next update. I’ve not done much on traversal yet (hence the world of spheres), but I’ve made some progress on shading. Here’s a screenshot of a pure CUDA renderer left for 20 seconds or so to get a nice smooth result:

CUDA Path Tracing

This scene contains a few BSDFs:

• Lambertian (the cyan, magenta and “wall” spheres)
• Perfect specular (the mirror sphere)
• Fresnel dielectric (the glass sphere)
• Blinn microfacet (the “floor” sphere)

Everything uses importance sampling to reduce variance, which lets caustics converge quite quickly even on glossy surfaces like the one shown here. I’m preparing a post to go into more details of the rendering algorithm, which is a type of path tracing that I think works quite well on the GPU…
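As one example of the kind of importance sampling involved, the Lambertian case is usually handled with cosine-weighted hemisphere sampling. This is a sketch of the standard technique in Python, not my CUDA code; directions are in a local frame with the surface normal along +z.

```python
import math

def sample_lambertian(u1, u2):
    """Cosine-weighted hemisphere sample from two uniforms in [0, 1).

    Returns (direction, pdf) with pdf = cos(theta) / pi, which exactly
    cancels the cosine term in the rendering equation for a Lambertian
    BRDF, so every sample contributes with constant weight."""
    r = math.sqrt(u1)                 # radius on the unit disc
    phi = 2.0 * math.pi * u2
    x = r * math.cos(phi)
    y = r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))  # cos(theta); projects disc to hemisphere
    return (x, y, z), z / math.pi
```

The specular, dielectric, and microfacet BSDFs each get their own analogous sampling routine.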

Written by Simon Brown

October 24th, 2010 at 6:21 pm

## CUDA Tips

I’ve been doing a bit more GPU programming recently; here are some things I found when writing CUDA programs. This all refers to the CUDA compiler in the recent 3.2 RC, and is based on my experience with GTX 275 hardware. In particular, this advice may need to be tweaked for Fermi-architecture GPUs, since I have yet to experiment with one.

Written by Simon Brown

October 24th, 2010 at 2:20 pm

## Using OptiX

OptiX version 2.0 was released recently, so I gave it a go by plugging it into an existing multi-core path tracer. This path tracer can submit tens of thousands of ray queries as a batch, so it should be a good match for OptiX and the GPU.

I liked:

• Ease of use. Wow this thing makes GPU ray tracing easy: I wrote a few tiny CUDA functions, the runtime reported nice errors for my bugs, I fixed the bugs and it worked as expected!
• Net performance win. It improved the performance of the path tracer, but not by much (see below).

I disliked:

• Everything is synchronous. All OptiX calls seem to block for completion, so I couldn’t find a way to pipeline memory transfers with GPU work in a single OptiX context. Since my use case involved heavy interop between CPU and GPU, this was a big performance loss.
• No CUDA interop. There seems to be no support for using CUDA allocations in OptiX kernels. In particular, you can’t use page-locked host memory to remove all those redundant (blocking) copies completely.

In conclusion, I have mixed feelings about OptiX. I think it’s a great tool for hobby projects or small demos, but I’d need async calls and much improved CUDA interop before using it for anything larger.

Written by Simon Brown

August 11th, 2010 at 12:41 pm

## CUDA Mersenne Twister

I needed a random number generator for a CUDA project, and had relatively few requirements:

• It must have a small shared memory footprint
• It must be suitable for Monte Carlo methods (i.e. have long period and minimal correlation)
• It must allow warps to execute independently when generating random numbers

There seem to be two main approaches to RNG in CUDA:

1. Each thread has its own local history, operates independently. This can be seen in the Mersenne Twister sample in the CUDA SDK (which has a very short history of 19 values). This usually requires an expensive offline process to seed each thread appropriately to avoid correlation. I can’t spare the registers or local memory for this approach.
2. Have a single generator per thread block, parallelise the update between all threads and synchronise using __syncthreads. This is the approach in the recent MTGP CUDA sample. I can’t use this approach because I am allowing each warp in the block to process jobs independently (using persistent threads) – calls to __syncthreads to synchronise every thread in the block are not possible.

What I ended up with is basically a modified version of MTGP (the second approach above), but with each warp able to grab random numbers independently from the shared MT state. This had the nice side-effect of reducing the shared memory footprint to the same as the equivalent CPU MT implementation.
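For reference, here is a plain CPU MT19937 in Python (the textbook algorithm, not my CUDA kernel). The point to notice is that the twist of element i reads only elements i, i+1, and i+397, and the tempering is purely per-element: that structure is what lets MTGP update the state in parallel across threads, and with a little care lets independent warps consume outputs from the shared state.

```python
class MT19937:
    """Textbook 32-bit Mersenne Twister (period 2^19937 - 1)."""
    N, M = 624, 397
    MATRIX_A = 0x9908B0DF
    UPPER, LOWER = 0x80000000, 0x7FFFFFFF

    def __init__(self, seed=5489):
        self.mt = [0] * self.N
        self.mt[0] = seed & 0xFFFFFFFF
        for i in range(1, self.N):
            self.mt[i] = (1812433253 * (self.mt[i - 1] ^ (self.mt[i - 1] >> 30)) + i) & 0xFFFFFFFF
        self.index = self.N

    def _twist(self):
        # Each element depends only on positions i, i+1 and i+M:
        # this is the data-parallel step that MTGP spreads across threads.
        for i in range(self.N):
            y = (self.mt[i] & self.UPPER) | (self.mt[(i + 1) % self.N] & self.LOWER)
            self.mt[i] = self.mt[(i + self.M) % self.N] ^ (y >> 1) ^ (self.MATRIX_A if y & 1 else 0)
        self.index = 0

    def next_u32(self):
        if self.index >= self.N:
            self._twist()
        y = self.mt[self.index]
        self.index += 1
        # Tempering: purely per-element, so warps can do it independently.
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF
```

In the warp-independent variant the serial `self.index` cursor is replaced by per-warp offsets into the shared state, which is where the modification to MTGP comes in.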

Written by Simon Brown

December 13th, 2009 at 11:09 pm