In the previous article we learned about spaces and how to position and orient objects in world space by applying transformation matrices to them. We also learned about camera space, which is simply another coordinate system within world space.
Recall that during rendering, a vertex is first mapped from world space into camera space and then projected onto the 2D view plane using a projection matrix (roughly speaking). In this post, we deal with the question of how to set up camera space by positioning and orienting the camera, and how to derive a matrix from it that maps a vertex from world space into camera space. In OpenGL, this matrix (called the "view matrix") plays a big role and has to be set up in practically every OpenGL program.
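The coordinate system of the camera, as discussed in the previous section, is spanned by three orthonormal vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$. The position of the camera is defined by its focal point (or eye), named $\mathbf{c}$, in world space. Note that $\mathbf{c}$ is also the origin of the camera coordinate system. By convention the camera looks in the direction $-\mathbf{w}$, where $\mathbf{w}$ is most often calculated by subtracting a "look-at" point $\mathbf{l}$ from the focal point and normalizing the result
$$\mathbf{w} = \frac{\mathbf{c} - \mathbf{l}}{\lVert \mathbf{c} - \mathbf{l} \rVert}$$
Now we compute $\mathbf{u}$ and $\mathbf{v}$ with the help of an "up-vector" $\mathbf{t} = (0, 0, 1)$ that roughly points upwards (it must not be parallel to $\mathbf{w}$). With the help of the cross product we can now compute the vectors
$$\mathbf{u} = \frac{\mathbf{t} \times \mathbf{w}}{\lVert \mathbf{t} \times \mathbf{w} \rVert}, \qquad \mathbf{v} = \mathbf{w} \times \mathbf{u}$$
that, together with $\mathbf{w}$, span our new orthonormal camera coordinate system.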
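The construction of the three camera axes can be sketched in a few lines. The following is a minimal, hedged example that uses plain `double[]` arrays instead of the vector class of this article; the class name `CameraBasis` and all method names are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; vectors are plain double[3] arrays. */
final class CameraBasis {
    static double[] sub(double[] a, double[] b) {
        return new double[] {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
    }

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }

    static double[] normalize(double[] a) {
        double len = Math.sqrt(dot(a, a));
        if (len == 0.0) {
            throw new IllegalArgumentException("cannot normalize a zero-length vector");
        }
        return new double[] {a[0] / len, a[1] / len, a[2] / len};
    }

    /** Returns the axes {u, v, w} of a camera at eye that looks towards lookAt. */
    static double[][] basis(double[] eye, double[] lookAt, double[] up) {
        double[] w = normalize(sub(eye, lookAt)); // points away from the look-at point
        double[] u = normalize(cross(up, w));     // throws if up is parallel to w
        double[] v = cross(w, u);                 // unit length because w and u are orthonormal
        return new double[][] {u, v, w};
    }

    public static void main(String[] args) {
        double[][] axes = basis(new double[] {5, -5, 8},
                                new double[] {3, 4, 0},
                                new double[] {0, 0, 1});
        for (double[] axis : axes) {
            System.out.println(Arrays.toString(axis));
        }
    }
}
```

The only case that needs care is an up-vector parallel to the viewing direction: the cross product is then the zero vector, which this sketch reports as an error instead of silently producing invalid axes.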
Now that we have set up camera space, we need to construct a matrix that maps from world space into camera space. More concretely, to map a given vertex $\mathbf{p}$ from world space to camera space, we apply the following two steps:
- translate $\mathbf{p}$ with respect to the camera position, and then
- map the translated point into the coordinate system $(\mathbf{u}, \mathbf{v}, \mathbf{w})$.
These two steps will later be combined into a single matrix, which is then called the camera transformation.
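The translation part is fairly easy. With our given camera position $\mathbf{c}$, we use a translation matrix $T$ to move $\mathbf{p}$ relative to the camera position
$$T = \begin{pmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 1 & -c_z \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{p}' = T\,\mathbf{p} = \mathbf{p} - \mathbf{c}$$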
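As a small illustration of the translation step, the sketch below builds the translation matrix for a given eye position and applies it to a point in homogeneous coordinates. The class `TranslateToCamera` and its row-major `m[row][col]` layout are assumptions made for this example, not the matrix class used in this article.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; matrices are row-major double[4][4]. */
final class TranslateToCamera {
    /** Builds the matrix that moves the eye position c to the origin. */
    static double[][] translation(double[] c) {
        return new double[][] {
            {1, 0, 0, -c[0]},
            {0, 1, 0, -c[1]},
            {0, 0, 1, -c[2]},
            {0, 0, 0, 1}
        };
    }

    /** Multiplies a 4x4 matrix with a homogeneous point (x, y, z, 1). */
    static double[] apply(double[][] m, double[] p) {
        double[] r = new double[4];
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                r[row] += m[row][col] * p[col];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        double[][] t = translation(new double[] {5, -5, 8});
        double[] moved = apply(t, new double[] {3, 4, 0, 1});
        System.out.println(Arrays.toString(moved)); // prints [-2.0, 9.0, -8.0, 1.0]
    }
}
```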
Now for the mapping part, we have two options: either we set up a rotation matrix that rotates the vertex into place in camera space, which would require determining the angles between the coordinate axes of the two systems so that we can rotate the point by them, or we make use of a wonderful trick that is applicable when dealing with orthonormal coordinate systems.
We are going to do the latter.
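Let us quickly review the definition of the dot product, which says that for two given vectors $\mathbf{a}$, $\mathbf{b}$, where $\mathbf{a}, \mathbf{b} \in \mathbb{R}^3$, the dot product yields
$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z = \lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert \cos\theta$$
where $\theta$ is the angle between the two vectors. Geometrically, $\mathbf{a} \cdot \mathbf{b}$ is the signed length of the orthogonal projection of $\mathbf{a}$ onto $\mathbf{b}$, multiplied by the length of $\mathbf{b}$ (and vice versa). If $\mathbf{b}$ has unit length, the dot product is simply the coordinate of $\mathbf{a}$ along $\mathbf{b}$. This may be a little confusing, which is why I elaborate on the dot product in more depth in the math appendix.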
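A tiny numeric example may make the projection property more tangible. The class `DotProjection` below is a hypothetical helper written only for this illustration.

```java
/** Hypothetical helper for illustration; vectors are plain double[3] arrays. */
final class DotProjection {
    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    static double length(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    public static void main(String[] args) {
        double[] a = {3, 4, 0};
        double[] xAxis = {1, 0, 0}; // unit length

        // For a unit vector, the dot product is the coordinate of a along that axis.
        System.out.println(dot(a, xAxis)); // prints 3.0

        // cos(theta) follows from the definition: a.b = |a| |b| cos(theta)
        double cosTheta = dot(a, xAxis) / (length(a) * length(xAxis));
        System.out.println(cosTheta); // prints 0.6
    }
}
```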
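Now, $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are orthogonal to each other, meaning that
$$\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{w} = \mathbf{v} \cdot \mathbf{w} = 0$$
holds true, and each of them has unit length, so that they form a so-called orthonormal basis in world space. This allows us to set up a mapping matrix $R$ that "rotates" a point from world space into camera space by simply computing the dot product between the point and each coordinate axis vector
$$R = \begin{pmatrix} u_x & u_y & u_z & 0 \\ v_x & v_y & v_z & 0 \\ w_x & w_y & w_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad R\,\mathbf{p}' = \begin{pmatrix} \mathbf{u} \cdot \mathbf{p}' \\ \mathbf{v} \cdot \mathbf{p}' \\ \mathbf{w} \cdot \mathbf{p}' \\ 1 \end{pmatrix}$$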
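The dot-product trick can be tried out with a deliberately simple camera orientation. In the sketch below the camera axes are hard-coded as a 90 degree rotation about the world z axis; the class `RotateIntoCamera` and its method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; vectors are plain double[3] arrays. */
final class RotateIntoCamera {
    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    /** Coordinates of q with respect to the orthonormal axes u, v, w. */
    static double[] toCameraAxes(double[] q, double[] u, double[] v, double[] w) {
        return new double[] {dot(u, q), dot(v, q), dot(w, q)};
    }

    public static void main(String[] args) {
        // Camera axes expressed in world space (right-handed: u x v = w).
        double[] u = {0, 1, 0};
        double[] v = {-1, 0, 0};
        double[] w = {0, 0, 1};

        // A point that has already been translated relative to the eye.
        double[] q = {2, 3, 4};

        System.out.println(Arrays.toString(toCameraAxes(q, u, v, w))); // prints [3.0, -2.0, 4.0]
    }
}
```

No angle ever had to be computed: each camera coordinate is just the projection of the point onto one axis.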
Now that we have both transformations ready, we can multiply them into one matrix $V = R\,T$ (the translation is applied first, then the rotation). This is called the view matrix.
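Multiplying the two matrices out shows what the combined matrix looks like. The following is a worked expansion, writing the camera axes as $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ and the eye position as $\mathbf{c}$ (the names used in the code of this article), with the rotation applied after the translation:

$$
V = R\,T =
\begin{pmatrix} u_x & u_y & u_z & 0 \\ v_x & v_y & v_z & 0 \\ w_x & w_y & w_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 1 & -c_z \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} u_x & u_y & u_z & -\mathbf{u} \cdot \mathbf{c} \\ v_x & v_y & v_z & -\mathbf{v} \cdot \mathbf{c} \\ w_x & w_y & w_z & -\mathbf{w} \cdot \mathbf{c} \\ 0 & 0 & 0 & 1 \end{pmatrix}
$$

The upper-left 3x3 block holds the camera axes as rows, and the last column holds the negated eye position expressed in camera coordinates.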
The result of applying the view matrix to $\mathbf{p}$ in world space is a new set of coordinates $(x', y', z')$ in camera space. It is important to understand that these coordinates are just scaling coefficients for the axes $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$, which themselves lie in world space. The linear combination $x'\mathbf{u} + y'\mathbf{v} + z'\mathbf{w}$, measured from the camera origin $\mathbf{c}$, describes the position of the point in camera space. The point itself has not been "moved"; its position is now merely described relative to the origin and orientation of camera space.
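That the point is only described differently, not moved, can be checked by converting to camera coordinates and back again. The sketch below uses a hard-coded, easy-to-follow camera; the class `CameraCoordinates` and its methods are hypothetical names chosen for this example.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; vectors are plain double[3] arrays. */
final class CameraCoordinates {
    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    /** World to camera: translate by -c, then project onto the camera axes. */
    static double[] toCamera(double[] p, double[] c, double[] u, double[] v, double[] w) {
        double[] d = {p[0] - c[0], p[1] - c[1], p[2] - c[2]};
        return new double[] {dot(u, d), dot(v, d), dot(w, d)};
    }

    /** Camera to world: c + x'u + y'v + z'w. */
    static double[] toWorld(double[] k, double[] c, double[] u, double[] v, double[] w) {
        double[] p = new double[3];
        for (int i = 0; i < 3; i++) {
            p[i] = c[i] + k[0] * u[i] + k[1] * v[i] + k[2] * w[i];
        }
        return p;
    }

    public static void main(String[] args) {
        double[] c = {1, 2, 3};  // eye position
        double[] u = {0, 1, 0};  // camera axes in world space
        double[] v = {-1, 0, 0};
        double[] w = {0, 0, 1};
        double[] p = {4, 6, 8};  // a point in world space

        double[] k = toCamera(p, c, u, v, w);
        System.out.println(Arrays.toString(k));                      // prints [4.0, -3.0, 5.0]
        System.out.println(Arrays.toString(toWorld(k, c, u, v, w))); // prints [4.0, 6.0, 8.0]
    }
}
```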
Let us apply the above example in code. I make use of my own vector and matrix implementation, which is very similar to the countless other implementations that can be found on the web.
Note that I use homogeneous coordinates right from the beginning.
```java
// the position of the camera, called 'eye'
Vector3f c = new Vector3f(5, -5, 8);

Vector3f u = new Vector3f();
Vector3f v = new Vector3f();
Vector3f w = new Vector3f();

// compute the "negative" look direction by subtracting
// the look-at point (3,4,0) from c
w.subAndAssign(c, new Vector3f(3, 4, 0)); // w = c - (3,4,0)
w.normalize();

// compute cross products
u.crossAndAssign(new Vector3f(0, 0, 1), w); // side = (0,0,1) x w
u.normalize();

v.crossAndAssign(w, u); // up = w x u (= side x look, since look = -w)
v.normalize();

Matrix4f rotation = new Matrix4f(); // identity
rotation.setIdentity();

// note the format: set(COLUMN, ROW, value)
// it may be different for your matrix implementation
rotation.set(0, 0, u.x);
rotation.set(1, 0, u.y);
rotation.set(2, 0, u.z);

rotation.set(0, 1, v.x);
rotation.set(1, 1, v.y);
rotation.set(2, 1, v.z);

rotation.set(0, 2, w.x);
rotation.set(1, 2, w.y);
rotation.set(2, 2, w.z);

Matrix4f translation = new Matrix4f(); // identity
translation.set(3, 0, -c.x);
translation.set(3, 1, -c.y);
translation.set(3, 2, -c.z);

// view matrix
Matrix4f view = new Matrix4f();
view.multAndAssign(rotation, translation); // view = rotation * translation

// print matrix on console
view.print();
```
At the end of the code snippet we print the matrix to the console. This is the output:
```
 0.9761871    0.21693046   0.0         -3.7962832
-0.1421731    0.6397789    0.75529456  -2.1325965
 0.16384639  -0.73730874   0.65538555  -9.748859
 0.0          0.0          0.0          1.0
```
If you are building your own matrix class, I recommend incorporating a lookAt method that sets up the view matrix. I also recommend creating a camera object that caches the $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ vectors and provides general helper methods for positioning, pitch, yaw, roll and anything else you may need (e.g. a matrix stack).
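Such a lookAt method could look roughly like the following sketch. It is not part of the matrix class used in this article: the class `LookAt`, its signature and the row-major `double[4][4]` layout are assumptions made for illustration. It writes the already multiplied view matrix directly, so no separate rotation and translation matrices are needed.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; adapt it to your own matrix class. */
final class LookAt {
    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }

    static double[] normalize(double[] a) {
        double len = Math.sqrt(dot(a, a));
        if (len == 0.0) {
            throw new IllegalArgumentException("cannot normalize a zero-length vector");
        }
        return new double[] {a[0] / len, a[1] / len, a[2] / len};
    }

    /** Builds a row-major view matrix m[row][col] for a camera at eye looking at target. */
    static double[][] lookAt(double[] eye, double[] target, double[] up) {
        double[] w = normalize(new double[] {
            eye[0] - target[0], eye[1] - target[1], eye[2] - target[2]});
        double[] u = normalize(cross(up, w)); // throws if up is parallel to w
        double[] v = cross(w, u);

        // Rows are the camera axes; the last column is the rotated, negated eye position.
        return new double[][] {
            {u[0], u[1], u[2], -dot(u, eye)},
            {v[0], v[1], v[2], -dot(v, eye)},
            {w[0], w[1], w[2], -dot(w, eye)},
            {0, 0, 0, 1}
        };
    }

    public static void main(String[] args) {
        double[][] view = lookAt(new double[] {5, -5, 8},
                                 new double[] {3, 4, 0},
                                 new double[] {0, 0, 1});
        for (double[] row : view) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Called with the eye, look-at point and up-vector from the example above, this should reproduce the printed matrix up to floating-point precision (the sketch uses `double`, so more digits are shown).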