Graphics pipeline
The computer graphics pipeline, also known as the rendering pipeline or graphics pipeline, is a framework within computer graphics that outlines the procedures necessary for transforming a three-dimensional scene into a two-dimensional representation on a screen. Once a 3D model is generated, the graphics pipeline converts the model into a visually perceivable format on the computer display. Because it depends on the specific software, hardware configuration, and desired display attributes, a universally applicable graphics pipeline does not exist. Nevertheless, graphics application programming interfaces (APIs) such as Direct3D, OpenGL, and Vulkan were developed to standardize common procedures and oversee the graphics pipeline of a given hardware accelerator. These APIs provide an abstraction layer over the underlying hardware, relieving programmers of the need to write code that explicitly targets the graphics hardware accelerators of vendors such as AMD, Intel, and Nvidia.
The model of the graphics pipeline is usually used in real-time rendering. Often, most of the pipeline steps are implemented in hardware, which allows for special optimizations. The term "pipeline" is used in a sense similar to the instruction pipeline in processors: the individual steps of the pipeline run in parallel as long as any given step has what it needs.
Concept
The 3D pipeline usually refers to the most common form of 3D rendering, 3D polygon rendering, as distinct from Raytracing and Raycasting. In Raycasting, a ray originates at the point where the camera resides, and if that ray hits a surface, the color and lighting of the point on the surface where the ray hit are calculated. In 3D polygon rendering the reverse happens: the area that is in view of the camera is calculated, and then rays are created from every part of every surface in view of the camera and traced back to the camera.

Structure
A graphics pipeline can be divided into three main parts: Application, Geometry, and Rasterization.

Application
The application step is executed by the software on the main processor. During the application step, changes are made to the scene as required, for example, by user interaction using input devices or during an animation. The new scene with all its primitives, usually triangles, lines, and points, is then passed on to the next step in the pipeline.

Examples of tasks that are typically done in the application step are collision detection, animation, morphing, and acceleration techniques using spatial subdivision schemes such as Quadtrees or Octrees. These are also used to reduce the amount of main memory required at a given time. The "world" of a modern computer game is far larger than what could fit into memory at once.
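As an illustration of such a spatial subdivision scheme, here is a minimal quadtree sketch in Python; the class name, the node capacity of 4, and the point-based interface are illustrative assumptions, not part of any particular engine. An octree works the same way in three dimensions, with eight children per node.

```python
# Minimal quadtree sketch: a square region that splits into four
# sub-quadrants once it stores more points than its capacity allows.
class Quadtree:
    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size = x, y, size   # region: [x, x+size) x [y, y+size)
        self.capacity = capacity                 # split threshold (assumed value)
        self.points = []
        self.children = None                     # four sub-quadrants after a split

    def insert(self, px, py):
        if not (self.x <= px < self.x + self.size and
                self.y <= py < self.y + self.size):
            return False                         # point lies outside this node
        if self.children is None:
            self.points.append((px, py))
            if len(self.points) > self.capacity:
                self._split()
            return True
        return any(child.insert(px, py) for child in self.children)

    def _split(self):
        h = self.size / 2
        self.children = [Quadtree(self.x,     self.y,     h, self.capacity),
                         Quadtree(self.x + h, self.y,     h, self.capacity),
                         Quadtree(self.x,     self.y + h, h, self.capacity),
                         Quadtree(self.x + h, self.y + h, h, self.capacity)]
        for p in self.points:                    # redistribute stored points
            for child in self.children:
                if child.insert(*p):
                    break
        self.points = []
```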
Geometry
The geometry step, which is responsible for the majority of the operations with polygons and their vertices, can be divided into the following five tasks. How these tasks are organized as actual parallel pipeline steps depends on the particular implementation.

Definitions
A vertex is a point in the world. Many points are joined to form surfaces. In special cases, point clouds are drawn directly, but this is still the exception.

A triangle is the most common geometric primitive of computer graphics. It is defined by its three vertices and a normal vector; the normal vector indicates the front face of the triangle and is perpendicular to the surface. The triangle may be provided with a color or with a texture. Triangles are preferred over rectangles because any three points in 3D space always span a flat triangle, whereas four points in 3D space may not all lie in a plane.
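The relationship between a triangle's three vertices and its normal vector can be sketched in a few lines of Python; the use of numpy and a counter-clockwise winding convention for the front face are assumptions here.

```python
import numpy as np

def triangle_normal(a, b, c):
    """Unit normal of triangle (a, b, c), via the cross product of two edges."""
    n = np.cross(b - a, c - a)       # perpendicular to the triangle's plane
    return n / np.linalg.norm(n)     # normalize to unit length

a = np.array([0.0, 0.0, 0.0])
b = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 1.0, 0.0])
print(triangle_normal(a, b, c))      # [0. 0. 1.] -- front face points along +Z
```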
The World Coordinate System
The world coordinate system is the coordinate system in which the virtual world is created. This should meet a few conditions for the following mathematics to be easily applicable:
- It must be a rectangular Cartesian coordinate system in which all axes are equally scaled.
- Whether a right-handed or a left-handed coordinate system is to be used may be determined by the graphics library.
In order to place a model of an aircraft in the world, we first need three rotation matrices, namely one for each of the three aircraft axes (vertical axis, transverse axis, longitudinal axis):

$$R_x(\alpha) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha & 0 \\ 0 & -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta & 0 \\ 0 & 1 & 0 & 0 \\ \sin\beta & 0 & \cos\beta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad R_z(\gamma) = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 & 0 \\ -\sin\gamma & \cos\gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
We also use a translation matrix that moves the aircraft to the desired point in our world:

$$T(t_x, t_y, t_z) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ t_x & t_y & t_z & 1 \end{pmatrix}$$

(These matrices are written for the row-vector convention discussed below; the column-vector versions are their transposes.)
Now we could calculate the position of the vertices of the aircraft in world coordinates by multiplying each point successively by these four matrices. Since the multiplication of a matrix with a vector is quite expensive, one usually takes another path and first multiplies the four matrices together. The multiplication of two matrices is even more expensive but must be executed only once for the whole object. The multiplications $(((v \cdot R_x) \cdot R_y) \cdot R_z) \cdot T$ and $v \cdot (((R_x \cdot R_y) \cdot R_z) \cdot T)$ are equivalent. Thereafter, the resulting matrix could be applied to the vertices. In practice, however, the multiplication with the vertices is still not applied; instead, the camera matrices are determined first.
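This cost argument can be sketched in a few lines of numpy, using the row-vector convention of this article; the helper names and the example values are assumptions.

```python
import numpy as np

def rot_x(a):
    """Rotation about the X-axis, row-vector convention."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0, 0],
                     [0, c, s, 0],
                     [0,-s, c, 0],
                     [0, 0, 0, 1]], dtype=float)

def translate(tx, ty, tz):
    """Translation matrix, row-vector convention (offsets in the last row)."""
    m = np.eye(4)
    m[3, :3] = (tx, ty, tz)
    return m

vertices = np.array([[1.0, 0.0, 0.0, 1.0],
                     [0.0, 2.0, 0.0, 1.0]])   # homogeneous row vectors

# Multiply the matrices together once per object ...
world = rot_x(np.pi / 2) @ translate(10, 0, 0)
# ... then apply the single combined matrix to every vertex.
print(np.round(vertices @ world, 3))          # [[11. 0. 0. 1.] [10. 0. 2. 1.]]
```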
The order in which the matrices are applied is important because matrix multiplication is not commutative. This also applies to the three rotations, as an example demonstrates: the point $(1, 0, 0)$ lies on the X-axis; if one rotates it first by 90° around the X-axis and then around the Y-axis, it ends up on the Z-axis (the rotation around the X-axis has no effect on a point lying on that axis). If, on the other hand, one rotates around the Y-axis first and then around the X-axis, the resulting point is located on the Y-axis. The sequence itself is arbitrary as long as it is always the same. The sequence with x, then y, then z (roll, pitch, heading) is often the most intuitive because the rotation causes the compass direction to coincide with the direction of the "nose".
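The example can be checked numerically. A sketch using the row-vector rotation matrices above; the sign of the resulting axis depends on the chosen rotation direction and is not the point here.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, s], [0, -s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, -s], [0, 1, 0], [s, 0, c]])

p = np.array([1.0, 0.0, 0.0])      # a point on the X-axis
quarter = np.pi / 2                # 90 degrees

print(np.round(p @ rot_x(quarter) @ rot_y(quarter)))  # [ 0.  0. -1.] -> Z-axis
print(np.round(p @ rot_y(quarter) @ rot_x(quarter)))  # [ 0.  1. -0.] -> Y-axis
```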
There are also two conventions to define these matrices, depending on whether you want to work with column vectors or row vectors. Different graphics libraries have different preferences here. OpenGL prefers column vectors, DirectX row vectors. The decision determines from which side the point vectors are to be multiplied by the transformation matrices.
For column vectors, the multiplication is performed from the right, i.e., $v_{\text{out}} = M \cdot v_{\text{in}}$, where $v_{\text{out}}$ and $v_{\text{in}}$ are 4×1 column vectors. The concatenation of the matrices is also done from right to left, i.e., for example, $M = T \cdot R$ when first rotating and then shifting.
In the case of row vectors, this works exactly the other way around. The multiplication now takes place from the left, as $v_{\text{out}} = v_{\text{in}} \cdot M$ with 1×4 row vectors, and the concatenation is $M = R \cdot T$ when we also first rotate and then move. The matrices shown above are valid for the second case, while those for column vectors are transposed. The rule $(A \cdot B)^T = B^T \cdot A^T$ applies, which for multiplication with vectors means that you can switch the multiplication order by transposing the matrix.
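A short numpy check of this transposition rule; the translation values are arbitrary.

```python
import numpy as np

M = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [4, 5, 6, 1]], dtype=float)   # a translation in row-vector form

v_row = np.array([1.0, 2.0, 3.0, 1.0])      # 1x4 row vector
v_col = v_row.reshape(4, 1)                 # 4x1 column vector

print(v_row @ M)                 # [5. 7. 9. 1.] -- row-vector convention
print((M.T @ v_col).ravel())     # same point via M^T, column-vector convention
```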
In matrix chaining, each transformation defines a new coordinate system, allowing for flexible extensions. For instance, an aircraft's propeller, modeled separately, can be attached to the aircraft nose by a translation that only shifts from the propeller coordinate system to the aircraft coordinate system. To render the aircraft, its transformation matrix is first computed to transform its points; the matrix for the propeller points is then obtained by multiplying the propeller's matrix by the aircraft's matrix. The matrix calculated in this way is also called the world matrix. It must be determined for each object in the world before rendering. The application can introduce changes here, for example, changing the position of the aircraft according to the speed after each frame.
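A minimal sketch of this matrix chaining for the propeller example, assuming numpy, the row-vector convention, and arbitrary example offsets.

```python
import numpy as np

def translate(tx, ty, tz):
    m = np.eye(4)
    m[3, :3] = (tx, ty, tz)
    return m

aircraft_world   = translate(100.0, 0.0, 50.0)   # place the aircraft in the world
prop_to_aircraft = translate(0.0, 0.0, 5.0)      # shift propeller to the nose

# The propeller's world matrix chains both transformations.
propeller_world = prop_to_aircraft @ aircraft_world

p = np.array([0.0, 0.0, 0.0, 1.0])               # propeller origin in model space
print(p @ propeller_world)                        # [100.   0.  55.   1.]
```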
Camera Transformation
In addition to the objects, the scene also defines a virtual camera or viewer that indicates the position and direction of view relative to which the scene is rendered. The scene is transformed so that the camera is at the origin looking along the Z-axis. The resulting coordinate system is called the camera coordinate system, and the transformation is called the camera transformation or View Transformation.
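One common way to build this camera transformation, sketched here under the row-vector convention, is to invert the matrix that places the camera in the world; the camera placement used below is an arbitrary assumption.

```python
import numpy as np

def translate(tx, ty, tz):
    m = np.eye(4)
    m[3, :3] = (tx, ty, tz)
    return m

camera_placement = translate(0.0, 5.0, -10.0)   # where the camera sits in the world
view = np.linalg.inv(camera_placement)          # camera (view) transformation

p = np.array([0.0, 5.0, -10.0, 1.0])            # a point at the camera position ...
print(p @ view)                                 # ... maps to the origin [0. 0. 0. 1.]
```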
Projection

The 3D projection step transforms the view volume into a cube with the corner point coordinates $(-1, -1, 0)$ and $(1, 1, 1)$; occasionally other target volumes are also used. This step is called projection, even though it transforms a volume into another volume, since the resulting Z coordinates are not stored in the image but are only used in Z-buffering in the later rasterization step. In a perspective illustration, a central projection is used. To limit the number of displayed objects, two additional clipping planes are used; the visual volume is therefore a truncated pyramid (frustum). The parallel or orthogonal projection is used, for example, for technical representations because it has the advantage that all parallels in the object space are also parallel in the image space, and surfaces and volumes have the same size regardless of the distance from the viewer. Maps use, for example, an orthogonal projection, but oblique images of a landscape cannot be used in this way; although they can technically be rendered, they seem so distorted that we cannot make any use of them.

The formula for calculating a perspective mapping matrix is:

$$P = \begin{pmatrix} w & 0 & 0 & 0 \\ 0 & h & 0 & 0 \\ 0 & 0 & \frac{\text{far}}{\text{far}-\text{near}} & 1 \\ 0 & 0 & \frac{-\text{near}\cdot\text{far}}{\text{far}-\text{near}} & 0 \end{pmatrix}$$

with $h = \cot\frac{\text{fieldOfView}}{2}$ (the aperture angle of the camera), $w = \frac{h}{\text{aspectRatio}}$ (the aspect ratio of the target image), near = the smallest distance to be visible, and far = the longest distance to be visible.

The reasons why the smallest and the greatest distance have to be given here are, on the one hand, that the scene is divided by this distance in order to achieve its scaling (objects further away appear smaller in a perspective image), and on the other hand, to scale the Z values to the range 0..1 for filling the Z-buffer. This buffer often has only a resolution of 16 bits, which is why the near and far values should be chosen carefully. A too-large difference between the near and the far value leads to so-called Z-fighting because of the low resolution of the Z-buffer. It can also be seen from the formula that the near value cannot be 0, because this point is the focus point of the projection. There is no picture at this point.
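The mapping of the near and far planes to the range 0..1 can be verified numerically; a sketch assuming numpy and the row-vector matrix given above, with arbitrary example values.

```python
import numpy as np

def perspective(fov, aspect, near, far):
    """Perspective matrix as given above, row-vector convention, z -> [0, 1]."""
    h = 1.0 / np.tan(fov / 2)          # cot(fieldOfView / 2)
    w = h / aspect
    return np.array([[w, 0, 0,                          0],
                     [0, h, 0,                          0],
                     [0, 0, far / (far - near),         1],
                     [0, 0, -near * far / (far - near), 0]])

P = perspective(np.pi / 2, 16 / 9, near=1.0, far=100.0)

for z in (1.0, 100.0):
    clip = np.array([0.0, 0.0, z, 1.0]) @ P
    print(clip[2] / clip[3])   # 0.0 at the near plane, 1.0 (up to rounding) at far
```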
For the sake of completeness, the formula for parallel (orthographic) projection:

$$P = \begin{pmatrix} \frac{2}{w} & 0 & 0 & 0 \\ 0 & \frac{2}{h} & 0 & 0 \\ 0 & 0 & \frac{1}{\text{far}-\text{near}} & 0 \\ 0 & 0 & \frac{-\text{near}}{\text{far}-\text{near}} & 1 \end{pmatrix}$$

with $w$ and $h$ the width and height of the view volume.
For reasons of efficiency, the camera and projection matrix are usually combined into a transformation matrix so that the camera coordinate system is omitted. The resulting matrix is usually the same for a single image, while the world matrix looks different for each object. In practice, therefore, view and projection are pre-calculated so that only the world matrix has to be adapted during the display. However, more complex transformations such as vertex blending are possible. Freely programmable geometry shaders that modify the geometry can also be executed.
In the actual rendering step, the product world matrix · camera matrix · projection matrix is calculated and then finally applied to every single point. Thus, the points of all objects are transferred directly to the screen coordinate system.
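Putting the pieces together, here is a minimal end-to-end sketch (numpy, row-vector convention) that carries one vertex from object coordinates to pixel coordinates; the 800×600 viewport and all example values are assumptions.

```python
import numpy as np

def translate(tx, ty, tz):
    m = np.eye(4)
    m[3, :3] = (tx, ty, tz)
    return m

def perspective(fov, aspect, near, far):
    h = 1.0 / np.tan(fov / 2)
    w = h / aspect
    return np.array([[w, 0, 0,                          0],
                     [0, h, 0,                          0],
                     [0, 0, far / (far - near),         1],
                     [0, 0, -near * far / (far - near), 0]])

world = translate(0.0, 0.0, 5.0)                 # object placed 5 units ahead
view = np.eye(4)                                 # camera already at the origin
proj = perspective(np.pi / 2, 800 / 600, 0.1, 100.0)

wvp = world @ view @ proj                        # combined once per object

clip = np.array([1.0, 1.0, 0.0, 1.0]) @ wvp     # transform a single vertex
ndc = clip[:3] / clip[3]                         # perspective division
x = (ndc[0] * 0.5 + 0.5) * 800                  # normalized -> pixel coordinates
y = (1.0 - (ndc[1] * 0.5 + 0.5)) * 600          # flip Y: screen origin is top-left
print(x, y, ndc[2])                              # ~460.0 240.0 and depth ~0.98
```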