Friday, January 30, 2009

Rasterization vs. Ray Tracing and the Size of your Data

We all learn about rasterization and ray tracing in introductory computer graphics classes if we are computer graphicists. But, at a fundamental level, what is the difference? Teaching computer graphics has made me think about these things again. As an exercise, challenge yourself to write pseudocode for each algorithm in exactly 4 lines. Then keep reading.

I asked a number of graphics PhD students to do this tonight, and they dove straight into details about projecting triangles and computing viewing transformations, etc. But there is a fundamental difference at a much higher level that interacts with the way we make data to affect the computational complexity of these algorithms. My 4-liner versions of the algorithms look like this:

Rasterization:
for each shape:
    for each pixel:
        if the shape is visible in the pixel:
            compute the color and draw it in the pixel

Ray tracing:
for each pixel:
    for each shape:
        if the shape is visible in the pixel:
            compute the color and draw it in the pixel

These look almost exactly the same. We're really just swapping the order in which we iterate over data. But renderers, in practice, have to process a _lot_ of data. Way more than we can hold in memory, in the case of film quality scenes. And performance is tied to data locality, since data locality lets us avoid having to move stuff in and out of memory a lot, which makes things really slow. That constant on the memory transfer is big.
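
If it helps to see the same thing in real code, here is a minimal C++ sketch of both orderings over hypothetical Shape and Image types; visible, shade, and setPixel are placeholders I'm assuming for illustration, standing in for all the details the PhD students jumped to.

#include <vector>

// Placeholder types and helpers, assumed only for this sketch.
struct Shape { /* geometry data */ };
struct Color { float r, g, b; };
struct Image { int width, height; std::vector<Color> pixels; };

bool  visible(const Shape& shape, int x, int y) { return false; }        // stub: is the shape visible in this pixel?
Color shade(const Shape& shape, int x, int y)   { return { 1, 1, 1 }; }  // stub: compute the color
void  setPixel(Image& image, int x, int y, Color c) { image.pixels[y * image.width + x] = c; }

// Rasterization: shapes on the outside, pixels on the inside.
void rasterize(const std::vector<Shape>& shapes, Image& image) {
    for (const Shape& shape : shapes)
        for (int y = 0; y < image.height; ++y)
            for (int x = 0; x < image.width; ++x)
                if (visible(shape, x, y))
                    setPixel(image, x, y, shade(shape, x, y));
}

// Ray tracing: pixels on the outside, shapes on the inside.
void raytrace(const std::vector<Shape>& shapes, Image& image) {
    for (int y = 0; y < image.height; ++y)
        for (int x = 0; x < image.width; ++x)
            for (const Shape& shape : shapes)
                if (visible(shape, x, y))
                    setPixel(image, x, y, shade(shape, x, y));
}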

In either algorithm, we have to make an image, so we bite the bullet there and keep the image data around. It's also of fixed size and usually fits in memory, so it's not much of an issue. Shape data grows without bound, however. My experience in film production is that you should be careless about the size of your shape data until it becomes a real problem for other people, because artist time is really really really expensive and processor time is just expensive (until you hit your memory limit, at which point it becomes a constant multiplier on the next artist down the pipeline). So we push the size of shape data over the limit of what fits in memory on a regular basis.

Rasterizers deal with this by only processing each shape once, throwing the current shape data away at the end of the inner loop and ignoring global appearance interdependencies. Shapes are expensive to read in (we're accessing disks here), but it's O(n) in terms of how many times we read in shape data.
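
Structurally, a streaming rasterizer looks something like the sketch below, reusing the Shape and Image types from the earlier sketch; loadShape and rasterizeShape are assumed helpers, not any particular renderer's API. Each shape is read from disk exactly once, for O(n) reads total, and its data is discarded before the next shape is loaded.

Shape loadShape(int i);                                   // expensive: hits the disk
void  rasterizeShape(const Shape& shape, Image& image);   // the inner loop over pixels

void renderStreaming(int shapeCount, Image& image) {
    for (int i = 0; i < shapeCount; ++i) {
        Shape shape = loadShape(i);      // one read per shape, ever
        rasterizeShape(shape, image);    // all pixels touched by this shape
    }                                    // shape falls out of scope: its data is gone
}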

Ray tracers usually have to read all the shape data multiple times, since a lot of it has to get tossed out of memory _for each pixel_; many shapes will push memory limits all on their own. And then ray tracers figure they've taken the performance hit already, so why not try to solve some of the global appearance interdependencies while they're at it. And now we have brute force solutions to the rendering equation, if we have good BRDFs. O(n^recursion depth), if my big O is right (my math is bad on Fridays). Yikes. So we do things like bounding recursion and probabilistically killing recursive calls to keep this under control. Even if we completely ignore global appearance interdependencies, we still have O(kn), where k is our number of pixels.
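
The two usual controls look roughly like this; Scene, Ray, uniformRandom, and intersectAndShadeOneBounce are placeholders I'm assuming for the sketch (Color is as in the first sketch), not any particular renderer's API. The hard depth cap bounds the recursion, and the Russian roulette step kills rays at random while reweighting the survivors so the estimate stays unbiased.

struct Scene;                           // scene data, possibly larger than memory
struct Ray;                             // origin plus direction

float uniformRandom();                  // uniform in [0, 1), assumed helper
Color scaled(const Color& c, float k) { return { c.r * k, c.g * k, c.b * k }; }

// Assumed helper: intersects the scene, shades the hit, and calls back into
// traceRay for the next bounce at the given depth.
Color intersectAndShadeOneBounce(const Scene& scene, const Ray& ray, int depth);

Color traceRay(const Scene& scene, const Ray& ray, int depth) {
    const int   maxDepth      = 8;      // hard bound on recursion depth
    const int   rouletteStart = 3;      // start killing rays after this many bounces
    const float surviveProb   = 0.5f;   // probability a ray survives the roulette

    if (depth >= maxDepth)
        return { 0, 0, 0 };             // bounded recursion

    if (depth >= rouletteStart && uniformRandom() > surviveProb)
        return { 0, 0, 0 };             // probabilistically killed recursive call

    Color c = intersectAndShadeOneBounce(scene, ray, depth + 1);
    if (depth >= rouletteStart)
        c = scaled(c, 1.0f / surviveProb);  // reweight survivors to stay unbiased
    return c;
}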

And when n is big, as it is for high-quality algorithms, that can mean days or weeks instead of minutes.

Thursday, January 29, 2009

Are points and vectors (and normals) different?

I'm teaching a computer graphics course right now. This is an expanded version of an answer to a question posed by both a student and a very smart TA:

Are points and vectors different? We teach coordinate free geometry in our graphics course, so we say yes. The idea doesn't always sit well with people with a strong math background, as opposed to a geometry background.

It really depends on your point of view of the world. In graphics, we need to be careful about how we transform things, to make sure we transform them correctly.

From the mathematical vector space point of view, all points can be represented as vectors, with the implicit assumption that they have an origin at zero. So yes, points are vectors contained in a mathematical vector space.

When we transform geometry, we need to make a distinction between locations in space (points) and directional offsets with magnitude in space ("vectors"). It sort of overloads the terminology a bit, but the direction with magnitude does not change as we translate around, while the location does--this is why we make the distinction. This is all part of the coordinate free geometry point of view. It says that we establish points and vectors as separate entities and make them behave differently with transformations. Then we define the basic operations between them to keep the difference straight. So a location plus an offset in some direction with some magnitude gives you a new location. Normal vectors give you yet a third way of transforming things.
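
One common way to realize this mechanically (not the only one) is homogeneous coordinates: points carry w = 1 and vectors carry w = 0, so the translation part of an affine transform moves points but leaves directions alone. A minimal sketch, with the Matrix4 layout and names being my own choices for illustration:

// Minimal 4x4 matrix and 3-component tuple, just for this sketch.
struct Matrix4 { float m[4][4]; };   // row-major: m[row][col], affine (no projection)
struct Tuple3  { float x, y, z; };

// Transform with an implicit w = 1: translation applies. This is how a
// point (a location) should move.
Tuple3 transformPoint(const Matrix4& M, const Tuple3& p) {
    return { M.m[0][0]*p.x + M.m[0][1]*p.y + M.m[0][2]*p.z + M.m[0][3],
             M.m[1][0]*p.x + M.m[1][1]*p.y + M.m[1][2]*p.z + M.m[1][3],
             M.m[2][0]*p.x + M.m[2][1]*p.y + M.m[2][2]*p.z + M.m[2][3] };
}

// Transform with an implicit w = 0: translation is ignored. This is how a
// vector (a direction with magnitude) should move.
Tuple3 transformVector(const Matrix4& M, const Tuple3& v) {
    return { M.m[0][0]*v.x + M.m[0][1]*v.y + M.m[0][2]*v.z,
             M.m[1][0]*v.x + M.m[1][1]*v.y + M.m[1][2]*v.z,
             M.m[2][0]*v.x + M.m[2][1]*v.y + M.m[2][2]*v.z };
}

// Normals are the third case: to stay perpendicular to the transformed
// surface they get multiplied by the inverse transpose of the upper 3x3
// block (not shown here).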

When you write code, you have two choices: make the distinction or don't. If you don't make the distinction, you have one 3-component data type usually called a "vector." And then, whenever you transform stuff, you have to make sure you transform it the correct way, so your code has to keep track of what geometric thing you have stored in that vector. So you have to do some kind of type checking and make sure you get it right whenever you write new code that deals with transformations. Something like

if v is a vector:
    v' = transformAsVector(v)
else if v is a point:
    v' = transformAsPoint(v)
else if v is a normal:
    v' = transformAsNormal(v)

Keeping track of what v is gets way more tedious when you are adding point-vectors to vector-vectors, and so on.

If you make different data types for points, vectors, and normals, you can write one transform operation that does this checking internally or through C++ operator overloading, and you never have to think about it again in your code. And if you implement operations like p + v = p correctly, you don't have to keep track of what geometric thing you have. Geometric algorithms do a lot of point-vector algebra, and keeping track of what's what inside your geometric code is tedious and bug-prone. And you get one line of code to transform something:
v' = transform(v)
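
For concreteness, here is a rough C++ sketch of the typed approach; the type and operator names are my own, not any particular library's, and the transform overloads would be implemented along the w = 1 / w = 0 / inverse-transpose lines sketched above.

struct Matrix4 { float m[4][4]; };  // row-major affine transform, as in the earlier sketch
struct Point   { float x, y, z; };  // location: translation applies
struct Vector  { float x, y, z; };  // direction with magnitude: translation ignored
struct Normal  { float x, y, z; };  // stays perpendicular via the inverse transpose

// location + offset = new location (the p + v = p rule from the text)
Point operator+(const Point& p, const Vector& v) {
    return { p.x + v.x, p.y + v.y, p.z + v.z };
}

// location - location = the offset between them
Vector operator-(const Point& a, const Point& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

// One transform per geometric type; overload resolution does the bookkeeping,
// so calling code is always just transform(M, thing).
Point  transform(const Matrix4& M, const Point& p);   // w = 1: uses the translation column
Vector transform(const Matrix4& M, const Vector& v);  // w = 0: drops the translation column
Normal transform(const Matrix4& M, const Normal& n);  // multiplies by the inverse transpose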

We teach the CFG point of view because you need to know the difference to do geometry correctly, it can make your geometric derivations easier to think through, and it can make your code significantly shorter, easier to read, and a lot less bug-prone.

There's also another level to CFG that recommends encoding the space of the coordinates with the point and compensating for spatial difference inside, for example, the point + vector operation. This is rarely implemented (I have only seen it done in the paper below), since most geometric algorithms deal with all their data already in the same space.
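
The idea can be sketched roughly as follows; Frame, toFrame, and the rest of the interface here are my own guesses for illustration, not the paper's actual ADT.

// Each point or vector remembers which coordinate frame its numbers live in,
// and operations reconcile the frames internally.
struct Frame;   // some description of a coordinate system (assumed)

struct FramedVector { float x, y, z; const Frame* frame; };
struct FramedPoint  { float x, y, z; const Frame* frame; };

// Re-express a vector's coordinates in another frame (assumed helper).
FramedVector toFrame(const FramedVector& v, const Frame* target);

// point + vector compensates for the spatial difference before adding.
FramedPoint operator+(const FramedPoint& p, const FramedVector& v) {
    FramedVector w = toFrame(v, p.frame);
    return { p.x + w.x, p.y + w.y, p.z + w.z, p.frame };
}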

In my code, I break one CFG rule. I allow point + point addition. Adding two locations is geometrically meaningless, but I have some data filtering code that I want to be type-independent. The filtering, when you think about it, is really doing barycentric combinations of points. But then we're back to the question of "should the filtering code do type checking to call a barycentric combination operation when it encounters points instead of scalars or vectors?" Then the code gets hard to read, and it requires me to hunt down whatever I named the barycentric combination operation when I wrote it two years ago. So points can be added to points, and we still get to write a simple smoothing filter as:
s[i] = 0.5 * s[i] + 0.25 * s[i - 1] + 0.25 * s[i + 1]
(s is a point sample).
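
This is roughly what the type-independence buys: one templated filter that works on floats, vectors, or points, as long as scalar * T and T + T are defined. The names here are mine, a sketch rather than my production code.

#include <cstddef>
#include <vector>

// One smoothing filter for any sample type T with scalar * T and T + T:
// floats, vectors, or (once point + point is allowed) points.
template <typename T>
std::vector<T> smooth(const std::vector<T>& s) {
    std::vector<T> out = s;              // endpoints are left untouched
    for (std::size_t i = 1; i + 1 < s.size(); ++i)
        out[i] = 0.5f * s[i] + 0.25f * s[i - 1] + 0.25f * s[i + 1];
    return out;
}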


About CFG: Stephen Mann, Nathan Litke, and Tony DeRose, "A Coordinate Free Geometry ADT," 1997.

Friday, January 02, 2009

consumer 3d

I have been pleasantly griping to a number of people lately about the lack of 3d animation software that is approachable to non-professionals. What's interesting is the variety of responses I get back. Academic researchers always point to their favorite paper that has a sketch based interface, but these usually only solve part of the problem (by nature, as they are research projects). Engineers working on 3d software seem to be of two generations: the older group thinks people who want to use it will learn it and that's that. The younger group of engineers knows better, since they had to learn what the older group built, and they know it is not at all easy to learn. Surprisingly, animators seem to argue against the need for it, supporting the notion that you need professional training to be an animator.

Think of Apple's iLife software. Pretty much anyone can make something cool with it without any training and without the manual. If you want to go professional, you can buy Final Cut Pro or Logic or Aperture or whatever the fancy version of Photoshop is called. But it won't make you a professional. You still need training to learn the professional skills. Why not for 3d animation? As it stands, you need training just to learn how to look around.

It's a tough problem to make something like this since you have to safeguard people from making bad animation, but the variety of opinions on the idea is interesting. These aren't all-encompassing claims about the groups of people mentioned and certainly do not apply to everyone, but they are trends.