Convergence
I’m extremely excited (in a programming way, of course) about the results of “Understanding the Efficiency of Ray Traversal on GPUs”, and the related work by NVIDIA on ray traversal.
There’s this interesting paradigm shift from a strongly geometric grid model to one where we have persistent threads running small kernels (or actually large kernels due to the way CUDA code is currently linked) and grabbing their own jobs asynchronously. The interesting thing about this shift is that this is the way PS3 developers on Cell have been writing SPU job systems for years. Now, I admit that the underlying hardware is radically different (massive hardware threading and wide SIMD vs no hardware threading and more conventional SIMD), but the same simple primitives of a resident kernel using atomic increment to grab from a shared job list still apply. I have no idea where this programming model is going to end up, but it certainly looks like the two worlds are converging.
(Atomic increment actually only requires CUDA compute capability 1.1, so even your one-year-old laptop with an NVIDIA mobile chipset can probably run this sort of code. Of course it’s nicer with the 1.3 voting primitives, but you can emulate these through shared memory, so no need to go bargain hunting for a GTX 260 just yet.)
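As a rough illustration of the “resident worker grabs jobs with an atomic increment” idea, here is a minimal CPU-side sketch in standard C++. It is not CUDA or SPU code, and the names (`run_jobs`, `next_job`) and the toy job (squaring an integer) are made up for the example; the only point is the shape of the loop.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical job list: each job squares one element of `data` in place.
// Every worker stays resident and keeps claiming the next unclaimed job
// index with an atomic increment until the list is exhausted.
void run_jobs(std::vector<int>& data, unsigned worker_count)
{
    assert(worker_count > 0);

    std::atomic<std::size_t> next_job{0};      // shared job cursor
    const std::size_t job_count = data.size();

    auto worker = [&]() {
        for (;;) {
            // fetch_add returns the previous value, so each index is
            // handed to exactly one worker.
            const std::size_t job = next_job.fetch_add(1, std::memory_order_relaxed);
            if (job >= job_count)
                break;                          // nothing left to grab
            data[job] *= data[job];             // the "kernel" body
        }
    };

    std::vector<std::thread> workers;
    workers.reserve(worker_count);
    for (unsigned i = 0; i < worker_count; ++i)
        workers.emplace_back(worker);
    for (auto& t : workers)
        t.join();                               // join makes the results visible
}
```

Calling `run_jobs(v, 4)` on a vector holding 1 to 5 leaves it holding the squares, regardless of which worker happened to claim which index. Uneven job costs balance themselves out because a worker that finishes early simply grabs again.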
Metropolis Light Transport
Here is a collection of papers/links on the topic of Metropolis Light Transport (MLT). The core principle of detailed balance that underpins the Metropolis-Hastings algorithm is extremely neat, and its application to light transport (in particular using Veach’s path integral formulation) is very aesthetically pleasing. This post doesn’t really go anywhere; it just provides links for further reading.
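For anyone who hasn’t met detailed balance before, here is a small sketch of what it means, using a toy discrete state space rather than anything to do with light paths. The function names and the choice of a uniform symmetric proposal are assumptions made for the example. It builds the Metropolis transition matrix P for a target distribution pi and measures how far pi[i] * P[i][j] is from pi[j] * P[j][i].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Metropolis transition matrix for a discrete target `pi` (all entries > 0).
// Proposal (an assumption for this sketch): pick any *other* state uniformly,
// which is symmetric, so the acceptance probability is min(1, pi[j] / pi[i]).
Matrix metropolis_matrix(const std::vector<double>& pi)
{
    const std::size_t n = pi.size();
    assert(n >= 2);

    Matrix P(n, std::vector<double>(n, 0.0));
    const double propose = 1.0 / static_cast<double>(n - 1);
    for (std::size_t i = 0; i < n; ++i) {
        double stay = 1.0;                      // probability mass left on state i
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j)
                continue;
            const double accept = std::min(1.0, pi[j] / pi[i]);
            P[i][j] = propose * accept;
            stay -= P[i][j];
        }
        P[i][i] = stay;                         // rejected proposals stay put
    }
    return P;
}

// Largest violation of detailed balance: |pi[i] P[i][j] - pi[j] P[j][i]|.
double detailed_balance_error(const std::vector<double>& pi, const Matrix& P)
{
    double worst = 0.0;
    for (std::size_t i = 0; i < pi.size(); ++i)
        for (std::size_t j = 0; j < pi.size(); ++j)
            worst = std::max(worst, std::fabs(pi[i] * P[i][j] - pi[j] * P[j][i]));
    return worst;
}
```

With this proposal, pi[i] * P[i][j] works out to min(pi[i], pi[j]) / (n - 1), which is symmetric in i and j, so the error is zero up to floating-point rounding. That symmetry is what makes pi the stationary distribution of the chain.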
Multiple Importance
At work I wrote a global illumination system from scratch. It used classical ray tracing for the direct lighting, and photon mapping with final gather for the indirect term. I use the past tense since we’ve now switched over to using lightcuts as the main renderer, which, due to the work of an awesome colleague, is giving us better results (and faster).
To complete the set, I thought I’d have a go at implementing a bidirectional path tracer, a “full Veach”, if you will…
OpenCL on the CPU
So the old news is that the OpenCL specification has been done in record time and endorsed by all the major GPU manufacturers.
This is many kinds of awesome, but I’m wondering if any particular vendor is going to concentrate on a CL_DEVICE_TYPE_CPU implementation. I think a CPU implementation of OpenCL is important for two reasons:
- Debugging. Have you ever tried to debug a large CUDA kernel? This is my number 1 reason for a CPU implementation, as we can generate some nice debug info and use our favourite debugger.
- Wider Adoption. Not everyone has access to a machine with a 1 million thread GPU from the future. However, pretty much everyone has multiple SIMD cores, even in one year old laptops. If low/mid performance can be achieved by using SIMD, software fibers, and multiple physical cores, then a developer can write extremely scalable code with minimal requirements for a baseline spec.
Wikipedia states that LLVM are doing the initial implementation of OpenCL, but has no citation. Perhaps I’ve missed some announcement or other, but if I get to read about full-featured CPU OpenCL support for a popular compiler (e.g. gcc, msvc) then I will be very happy!
Spam
If you’ve not received any spam that looks like it’s from this domain, then please ignore this post.
If you have, then you have my sympathy, but please note I had nothing to do with it.
The spam in question has the subject “Your internet access is going to get suspended” and claims to attach a zip of the recipient’s illegal activities. In reality it contains some sort of win32 trojan. I’m trying to get the full message header of the offending emails, but my guess is that either I’ve just been unlucky and picked as the reply-to, or someone broke the password on my mailbox on this domain. Anyhow, I’ve changed all relevant passwords, and I’m in the process of finding out if anything was broken into.
Apologies for a post that has nothing to do with the usual material, but getting 200+ emails from corporate anti-virus software prompted me to try to disassociate this domain from the spam.
Site Blogged
My homegrown php code is ageing badly, so it’s time to ditch the lot and join the WordPress collective.
Hopefully the site survived the transition:
- The squish library has its own page, but this just links to the Google Code project where this is now hosted.
- All other old articles and code should have been converted into posts.
- Found a thoroughly awesome WordPress redirection plugin, so all the old links should still work. Please post a comment if your incoming link didn’t work.
- I have stopped messing around with the site now; the permalink structure is final!
Perhaps now I can post more than one item per year!
DXT Compression Techniques
This article presents an explanation of two techniques that can be used to perform DXT colour compression. They were designed during the development of an open source DXT compression library called squish.
Spherical Harmonic Basis Functions
Spherical harmonic basis functions can be defined in various different ways depending on your derivation or normalisation requirements. This page defines the real-valued set I use, along with algorithms and code snippets for irradiance estimation and run-time evaluation.
A Lua Syntax Highlighter
Lua is a very compact, fast scripting language with great platform support. Editing Lua scripts is nice and easy in most editors (such as gvim), but Visual Studio doesn’t highlight the syntax by default. So here’s a plug-in for Visual Studio .NET 2003 that highlights Lua 5.0 syntax and provides auto-completion for keywords and identifiers as you type.
Gamma-Correct Rendering
With consumer-level hardware now capable of rendering high dynamic range image data, the days of the 8-bit sRGB framebuffer are numbered. Programmers of next-generation graphics devices are able to model lighting systems to high accuracy, then tone-map these values into a displayable range for conventional 8-bit sRGB equipment, such as PC monitors.
The graphics pipeline from source art to final output is complicated, and requires the programmer to work in several different colour spaces along the way. In this article I’ll give a brief overview of colour spaces, and then detail a commonly overlooked area in the texture pipeline where gamma is important.
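To make the colour-space point concrete, here is a small sketch of the standard sRGB transfer functions, which convert between sRGB-encoded values and linear light. The function names are made up for the example, and this is the textbook piecewise curve rather than necessarily what the full article uses.

```cpp
#include <cassert>
#include <cmath>

// Decode an sRGB-encoded channel value in [0, 1] to linear light.
// Lighting maths should be done on the linear value.
double srgb_to_linear(double c)
{
    assert(c >= 0.0 && c <= 1.0);
    return c <= 0.04045 ? c / 12.92
                        : std::pow((c + 0.055) / 1.055, 2.4);
}

// Encode a linear channel value in [0, 1] back to sRGB for display.
double linear_to_srgb(double c)
{
    assert(c >= 0.0 && c <= 1.0);
    return c <= 0.0031308 ? c * 12.92
                          : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
}
```

The thing to notice is how non-linear the curve is: an sRGB value of 0.5 decodes to a linear value of roughly 0.21, so averaging or filtering sRGB texels as if they were linear gives visibly darker results than doing the same operation in linear space.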