How a pinhole camera works (part 1)
Reading time: 18 mins.What Will You Learn in this Lesson?
In the previous lesson, we learned about some key concepts involved in the process of generating images, however, we didn't speak specifically about cameras. 3D rendering is not only about producing realistic images by the mean of perspective projection. It is also about being able to deliver images similar to that of real-world cameras. Why? Because when CG images are combined with live-action footage, images delivered by the renderer need to match images delivered by the camera with which that footage was produced. In this lesson, we will develop a camera model that allows us to simulate results produced by real cameras (we will use with real-world parameters to set the camera). To do so, we will first start to review how film and photographic cameras work.
More specifically, we will show in this lesson how to implement a camera model similar to that used in Maya and most (if not all) 3D applications (such as Houdini, 3DS Max, Blender, etc.). We will show the effect each control that you can find on a camera has on the final image and how to simulate these controls in CG. This lesson will answer all questions you may have about CG cameras such as what the film aperture parameter does and how the focal length parameter relates to the angle of view parameter.
While the optical laws involved in the process of generating images with a real-world camera are simple, they can be hard to reproduce in CG, not because they are complex but because they are essentially and potentially expensive to simulate. Hopefully, though you don't need very complex cameras to produce images. It's quite the opposite. You can take photographs with a very simple imaging device called a pinhole camera which is just a box with a small hole on one side and photographic film lying on the other. Images produced by pinhole cameras are much easier to reproduce (and less costly) than those produced with more sophisticated cameras, and for this reason, the pinhole camera is the model used by most (if not all) 3D applications and video games. Let's start to review how these cameras work in the real world and build a mathematical model from there.
It is best to understand the pinhole camera model which is the most commonly used camera model in CG, before getting to the topic of the perspective projection matrix that reuses concepts we will be studying in this lesson such as the camera angle of view, the clipping planes, etc.
Camera Obscura: How is an Image Formed?
Most algorithms we use in computer graphics simulate how things work in the real world. This is particularly true of virtual cameras which are fundamental to the process of creating a computer graphics image. The creation of an image in a real camera is pretty simple to reproduce with a computer. It mainly relies on simulating the way light travels in space and interacts with objects including camera lenses. The light-matter interaction process is highly complex but the laws of optics are relatively simple and can easily be simulated in a computer program. There are two main parts to the principle of photography:
-
The process by which an image is stored on film or in a file.
-
The process by which this image is created in the camera.
In computer graphics, we don't need a physical support to store an image thus simulating the photochemical processes used in traditional film photography won't be necessary (unless like the Maxwell renderer, you want to provide a realistic camera model but this is not necessary to get a basic model working).


Now let's talk about the second part of the photography process: how images are formed in the camera. The basic principle of the image creation process is very simple and shown in the reproduction of this illustration published in the early 20th century (Figure 1). In the setup from Figure 1, the first surface (in red) blocks light from reaching the second surface (in green). First, however, make a small hole (a pinhole). Light rays can then pass through the first surface at one point and, by doing so, form an (inverted) image of the candle on the other side (if you follow the path of the rays from the candle to the surface onto which the image of the candle is projected, you can see how the image is geometrically constructed). In reality, the image of the candle will be very hard to see because the amount of light emitted by the candle passing through point B is very small compared to the overall amount of light emitted by the candle itself (only a fraction of the light rays emitted by the flame or reflected off of the candle will pass through the hole).
A camera obscura (which in Latin means dark room) works on the same principle. It is a lightproof box or room with a black interior (to prevent light reflections) and a tiny hole in the center on one end (Figure 2). Light passing through the hole forms an inverted image of the external scene on the opposite side of the box. This simple device led to the development of photographic cameras. You can perfectly convert your room into a camera obscura, as shown in this video from National Geographic (all rights reserved).
To perceive the projected image on the wall your eyes first need to adjust to the darkness of the room, and to capture the effect on a camera, long exposure times are needed (from a few seconds to half a minute). To turn your camera obscura into a pinhole camera all you need to do is put a piece of film on the face opposite the pinhole. If you wait long enough (and keep the camera perfectly still), light will modify the chemicals on the film and a latent image will form over time. The principle for a digital camera is the same but the film is replaced by a sensor that converts light into electrical charges.
How Does Real Camera Work?
In a real camera, images are created when light falls on a surface that is sensitive to light (note that this is also true for the eye). For a film camera, this is the surface of the film and for a digital camera, this is the surface of a sensor (or CCD). Some of these concepts have been explained in the lesson Introduction to Ray-Tracing, but we will explain them again here briefly.

In the real world, light comes from various light sources (the most important one being the sun). When light hits an object, it can either be absorbed or reflected into the scene. This phenomenon is explained in detail in the lesson devoted to light-matter interaction which you can find in the section Mathematics and Physics for Computer Graphics. When you take a picture, some of that reflected light (in the form of packets of photons) travels in the direction of the camera and passes through the pinhole to form a sharp image on the film or digital camera sensor. We have illustrated this process in Figure 3.
Many documents on how photographic film works can be found on the internet. Let's just mention that a film that is exposed to light doesn't generally directly create a visible image. It produces what we call a latent image (invisible to the eye) and we need to process the film with some chemicals in a darkroom to make it visible. If you remove the back door of a disposable camera and replace it with a translucent plastic sheet, you should be able to see the inverted image that is normally projected onto the film (as shown in the images below).

Pinhole Cameras
The simplest type of camera we can find in the real world is the pinhole camera. It is a simple lightproof box with a very small hole in the front which is also called an aperture and some light-sensitive film paper laid inside the box on the side facing this pinhole. When you want to take a picture, you simply open the aperture to expose the film to light (to prevent light from entering the box, you keep a piece of opaque tape on the pinhole which you remove to take the photograph and put back afterward).

The principle of a pinhole camera is simple. Objects from the scene reflect light in all directions. The size of the aperture is so small that among the many rays that are reflected off at P, a point on the surface of an object in the scene, only one of these rays enter the camera (in reality it's never exactly one ray, but more a bundle of light rays or photons composing a very narrow beam of light). In Figure 3, we can see how one single light ray among the many reflected at P passes through the aperture. In Figure 4, we have colored six of these rays to track their path to the film plane more easily; notice one more time by following these rays how they form an image of the object rotated by 180 degrees. In geometry, the pinhole is also called the center of projection; all rays entering the camera converge to this point and diverge from it on the other side.
To summarize: light striking an object is reflected in random directions in the scene, but only one of these rays (or, more exactly, a bundle of these rays traveling along the same direction) enters the camera and strikes the film in one single point. To each point in the scene corresponds a single point on the film.
In the above explanation, we used the concept of point to describe what's happening locally at the surface of an object (and what's happening locally at the surface of the film); however, keep in mind that the surface of objects is continuous (at least at the macroscopic level) therefore the image of these objects on the surface of the film also appears as continuous.
What we call a point for simplification, is a small area on the surface of an object or a small area on the surface of the film. It would be best to describe the process involved as an exchange of light energy between surfaces (the emitting surface of the object and the receiving surface or the film in our example), but for simplification, we will just treat these small surfaces as points for now.


The size of the aperture matters. To get a fairly sharp image each point (or small area) on the surface of an object needs to be represented as one single point (another small area) on the film. As mentioned before, what passes through the hole is never exactly one ray but more a small set of rays contained within a cone of directions. The angle of this cone (or more precisely its angular diameter) depends on the size of the hole as shown in Figure 6.


The smaller the pinhole, the smaller the cone and the sharper the image. However, a smaller pinhole requires a longer exposure time because as the hole becomes smaller, the amount of light passing through the hole and striking the film surface decreases. It takes a certain amount of light for an image to form on the surface of a photographic paper; thus, the less light it receives, the longer the exposure time. It won't be a problem for a CG camera, but for real pinhole cameras, a longer exposure time increases the risk of producing a blurred image if the camera is not perfectly still or if objects from the scene move. As a general rule, the shorter the exposure time, the better. There is a limit, though, to the size of the pinhole. When it gets very small (when the hole size is about the same as the light's wavelength), light rays are diffracted, which is not good either. For a shoe-box-sized pinhole camera, a pinhole of about 2 mm in diameter should produce optimum results (a good compromise between image focus and exposure time). Note that when the aperture is too large (Figure 5 bottom), a single point on the image, if you keep using the concept of point or discrete lines to represent light rays (for example, point A or B in Figure 5), appears multiple times on the image. A more accurate way of visualizing what's happening in that particular case is to imagine the footprints of the cones overlapping each over on the film (Figure 6 bottom). As the size of the pinhole increases, the cones become larger, and the amount of overlap increases. The fact that a point appears multiple times in the image (in the form of the cone's footprint or spot becoming larger on the film, which you can see as the color of the object at the light ray's origin being spread out on the surface of the film over a larger region rather than appearing as a singular point as it theoretically should) is what causes an image to be blurred (or out of focus). This effect is much more visible in photography when you take a picture of very small and bright objects on a dark background, such as fairy lights at night (Figure 8). Because they are small and generally spaced away from each other, the disks they generate on the picture (when the camera hole is too large) are visible. In photography, these disks (which are not always perfectly circular but explaining why is outside the scope of this lesson) are called circles of confusion or disks of confusion, blur circles, blur spots, etc. (Figure 8).
To better understand the image formation process, we created two short animations showing light rays from two disks passing through the camera's pinhole. In the first animation (Figure 9), the pinhole is small, and the image of the disks is sharp because each point on the object corresponds to a single point on the film.

The second animation (Figure 10) shows what happens when the pinhole is too large. In this particular case, each point on the object corresponds to multiple points on the film. The result is a blurred image of the disks.

In conclusion, to produce a sharp image we need to make the aperture of the pinhole camera as small as possible to ensure that only a narrow beam of photons coming from one single direction enters the camera and hits the film or sensor in one single point (or a surface as small as possible). The ideal pinhole camera has an aperture so small that only a single light ray enters the camera for each point in the scene. Such a camera can't be built in the real world though for reasons we already explained (when the hole gets too small, light rays are diffracted) but it can in the virtual world of computers (in which light rays are not affected by diffraction). Note that a renderer using an ideal pinhole camera to produce images of 3D scenes outputs perfectly sharp images.

In photography, the term depth of field (or DOF) defines the distance between the nearest and the farthest object from the scene that appears "reasonably" sharp in the image. Pinhole cameras have an infinite depth of field (but lens cameras have a finite DOF). In other words, the sharpness of an object does not depend on its distance from the camera. Computer graphics images are most of the time produced using an ideal pinhole camera model, and similarly to real-world pinhole cameras, they have an infinite depth of field; all objects from the scene visible through the camera are rendered perfectly sharp. Computer-generated images have sometimes been criticized for being very clean and sharp; the use of this camera model has certainly a lot to do with it. Depth of field however can be simulated quite easily and a lesson from this section is devoted to this topic alone.
Very little light can pass through the aperture when the pinhole is very small, and long exposure times are required. It is a limitation if you wish to produce sharp images of moving objects or in low-light conditions. Of course, the bigger the aperture, the more light enters the camera; however, as explained before, this also produces blurred images. The solution is to place a lens in front of the aperture to focus the rays back into one point on the film plane, as shown in the adjacent figure. This lesson is only an introduction to pinhole cameras rather than a thorough explanation of how cameras work and the role of lenses in photography. More information on this topic can be found in the lesson from this section devoted to the topic of depth of field. However, as a note, and if you try to make the relation between how a pinhole camera and a modern camera works, it is important to know that lenses are used to make the aperture as large as possible, allowing more light to get in the camera and therefore reducing exposure times. The role of the lens is to cancel the blurry look of the image we would get if we were using a pinhole camera with a large aperture by refocusing light rays reflected off of the surface of objects to single points on the film. By combining the two, a large aperture and a lens, we get the best of both systems, shorter exposure times, and sharp images (however, the use of lenses introduces depth of field, but as we mentioned before, this won't be studied or explained in this lesson). The great thing about pinhole cameras, though, is that they don't require lenses and are, therefore, very simple to build and are also very simple to simulate in computer graphics.
