RSX Reality Synthesizer
The Reality Synthesizer is a proprietary graphics processing unit developed jointly by Nvidia and Sony for the PlayStation 3 video game console. Based on Nvidia's GeForce 7 series, specifically the 7800 GTX, the RSX utilizes a hybrid design incorporating elements of the G70 and G71 architecture. It features separate vertex and pixel shader pipelines and supports advanced graphics rendering features such as high dynamic range, anti-aliasing, and S3 texture compression, with a theoretical floating-point performance of 192 GFLOPS.
The RSX includes 256 MB of GDDR3 SDRAM, clocked at 650 MHz with an effective transmission rate of 1300 MT/s. It can also access up to 224 MB of the console’s XDR DRAM main memory through the Cell Broadband Engine, the PlayStation 3's CPU, allowing for a combined maximum of 480 MB of usable memory.
While the RSX handles the majority of graphics processing tasks, the Cell processor assists with graphics-related computations, offering a complementary role in rendering workloads.
Specifications
Unless otherwise noted, the following specifications are derived from Sony’s press materials released at the E3 2005 conference, slides presented at the same event, and a Sony presentation at the 2006 Game Developers Conference.
- Floating-point performance: 192 GFLOPS
- Core clock: 500 MHz
- Manufacturing process: 90 nm, 65 nm, 40 nm, and 28 nm
- Transistor count: Over 300 million
- Architecture: Based on NV47
- Endianness: Little-endian
- Texture units: 24 texture filtering, 8 vertex texture addressing
- Texel fillrate: 12.0 gigatexels/sec
- Texture sampling: 32 unfiltered samples/clock
- Render output units: 8
- Peak pixel fillrate: 4.0 gigapixels/sec
- Z-buffering rate: 8.0 gigasamples/sec
- Dot product operations: Up to 51 billion/sec
- Pixel precision: 128-bit, supporting high-dynamic-range rendering
- Interfaces: Cell FlexIO
- API support: PSGL
- Texture compression: Support for S3TC
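The fill-rate figures in the list follow from the unit counts and the core clock. The sketch below reproduces that arithmetic. The factor of two Z samples per render output unit per clock is an assumption inferred from the listed numbers, not something the sources above state, and the helper name `per_second` is illustrative.

```python
# Figures taken from the specification list above.
CORE_CLOCK_HZ = 500e6        # 500 MHz core clock
TEXTURE_FILTER_UNITS = 24    # texture filtering units
RENDER_OUTPUT_UNITS = 8      # render output units (ROPs)
Z_SAMPLES_PER_ROP = 2        # ASSUMPTION: inferred from 8.0 Gsamples/sec over 8 ROPs


def per_second(units_per_clock, clock_hz=CORE_CLOCK_HZ):
    """Peak rate = work items per clock multiplied by clocks per second."""
    return units_per_clock * clock_hz


texel_rate = per_second(TEXTURE_FILTER_UNITS)
pixel_rate = per_second(RENDER_OUTPUT_UNITS)
z_rate = per_second(RENDER_OUTPUT_UNITS * Z_SAMPLES_PER_ROP)

print(f"Texel fillrate: {texel_rate / 1e9:.1f} gigatexels/sec")    # 12.0
print(f"Pixel fillrate: {pixel_rate / 1e9:.1f} gigapixels/sec")    # 4.0
print(f"Z-buffering rate: {z_rate / 1e9:.1f} gigasamples/sec")     # 8.0
```

These are theoretical peaks; they say nothing about sustained throughput in a real workload.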
Additional features
- Texture filtering: Bilinear, trilinear, anisotropic, quincunx
- Antialiasing: Quincunx, up to 4× MSAA, SSAA, Alpha to Coverage, Alphakill
Memory architecture
- GDDR3 SDRAM memory:
  - 256 MB at 650 MHz
  - 128-bit interface
  - 20.8 GB/s bandwidth
  - Structure: 2 partitions
    - Bus width: 64-bit per partition
    - Banks: 8 per partition
    - Pages per bank: 4,096
    - Row address: 12-bit
    - Column address: 9-bit
    - Minimum access granularity: 8 bytes
- Access to CPU Rambus XDR DRAM via Cell FlexIO bus interface
  - 56-bit serial (out of 64-bit)
  - Bandwidth: 20 GB/s read, 15 GB/s write
- 576 KB texture cache
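The GDDR3 figures are mutually consistent, which a short calculation can confirm. The sketch below assumes that each column address selects one 8-byte word (the stated minimum access granularity) and that the 650 MHz clock transfers data twice per cycle, giving the 1300 MT/s quoted earlier. The variable names are illustrative.

```python
# Figures taken from the memory architecture list above.
PARTITIONS = 2
BANKS_PER_PARTITION = 8
ROW_ADDRESS_BITS = 12          # 2**12 = 4,096 pages per bank
COLUMN_ADDRESS_BITS = 9        # 2**9 = 512 columns per page
ACCESS_GRANULARITY_BYTES = 8   # ASSUMPTION: one column = one 8-byte word
BUS_WIDTH_BITS = 128
TRANSFER_RATE_MT_S = 1300      # 650 MHz clock, two transfers per cycle

pages_per_bank = 2 ** ROW_ADDRESS_BITS
columns_per_page = 2 ** COLUMN_ADDRESS_BITS

# Capacity: every addressable column in every page, bank, and partition.
capacity_bytes = (PARTITIONS * BANKS_PER_PARTITION * pages_per_bank
                  * columns_per_page * ACCESS_GRANULARITY_BYTES)
capacity_mb = capacity_bytes // 2 ** 20

# Peak bandwidth: bytes per transfer multiplied by transfers per second.
bandwidth_mb_s = (BUS_WIDTH_BITS // 8) * TRANSFER_RATE_MT_S
bandwidth_gb_s = bandwidth_mb_s / 1000

print(f"Capacity: {capacity_mb} MB")                   # 256 MB
print(f"Peak bandwidth: {bandwidth_gb_s:.1f} GB/s")    # 20.8 GB/s
```

Under these assumptions the address geometry yields exactly 256 MB, and the 128-bit bus at 1300 MT/s yields the listed 20.8 GB/s.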
RSX memory map
Although the RSX has 256 MB of GDDR3 RAM, not all of it is usable. The last 4 MB is reserved for keeping track of the RSX internal state and issued commands. This 4 MB of GPU Data contains RAMIN, RAMHT, RAMFC, DMA Objects, Graphic Objects, and the Graphic Context. The following table breaks down the address ranges within the RSX's 256 MB.
| Address Range | Size | Comment |
| 0000000-FBFFFFF | 252 MB | Framebuffer |
| FC00000-FFFFFFF | 4 MB | GPU Data |
| FF80000-FFFFFFF | 512 KB | RAMIN: Instance Memory |
| FF90000-FF93FFF | 16 KB | RAMHT: Hash Table |
| FFA0000-FFA0FFF | 4 KB | RAMFC: FIFO Context |
| FFC0000-FFCFFFF | 64 KB | DMA Objects |
| FFD0000-FFDFFFF | 64 KB | Graphic Objects |
| FFE0000-FFFFFFF | 128 KB | GRAPH: Graphic Context |
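The address ranges in the table can be encoded as data and queried. The following is a minimal sketch, not part of any Sony or Nvidia tooling; the function names `regions_containing` and `region_size` are hypothetical. It treats the table's addresses as hexadecimal offsets into local memory and reports every listed range that contains a given offset, since the smaller structures sit inside RAMIN, which in turn sits inside GPU Data.

```python
# (start, end, name) tuples copied from the memory map table above.
# Addresses are hexadecimal offsets into the 256 MB of local memory.
MEMORY_MAP = [
    (0x0000000, 0xFBFFFFF, "Framebuffer"),
    (0xFC00000, 0xFFFFFFF, "GPU Data"),
    (0xFF80000, 0xFFFFFFF, "RAMIN: Instance Memory"),
    (0xFF90000, 0xFF93FFF, "RAMHT: Hash Table"),
    (0xFFA0000, 0xFFA0FFF, "RAMFC: FIFO Context"),
    (0xFFC0000, 0xFFCFFFF, "DMA Objects"),
    (0xFFD0000, 0xFFDFFFF, "Graphic Objects"),
    (0xFFE0000, 0xFFFFFFF, "GRAPH: Graphic Context"),
]


def region_size(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1


def regions_containing(address):
    """Return the names of all listed ranges that contain the address."""
    if not 0 <= address <= 0xFFFFFFF:
        raise ValueError("address lies outside the 256 MB local memory")
    return [name for start, end, name in MEMORY_MAP if start <= address <= end]


print(regions_containing(0x0100000))
print(regions_containing(0xFF90010))
for start, end, name in MEMORY_MAP:
    print(f"{name}: {region_size(start, end) // 1024} KB")
```

Computing the sizes from the ranges is a quick way to check the table: every range's size matches the value in its Size column.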
Besides local GDDR3 memory, the RSX can also access main XDR memory, limited to one of the following ranges:
- 0 MB – 256 MB
- 0 MB – 512 MB
Speed, bandwidth and latency
System bandwidth:
- Cell to/from 256 MB XDR: 25.6 GB/s
- Cell to RSX: 20 GB/s
- Cell from RSX: 15 GB/s
- RSX to/from 256 MB GDDR3: 20.8 GB/s
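To put these figures in perspective, the sketch below estimates the best-case time to move a buffer across each link. The 1280×720, 32-bit buffer is a hypothetical example, the helper `transfer_time_ms` is illustrative, and 1 GB is taken as 10^9 bytes. Real transfers would be slower because of latency and contention, which this ignores.

```python
# Peak link bandwidths in GB/s, copied from the list above.
BANDWIDTH_GB_S = {
    "Cell to/from XDR": 25.6,
    "Cell to RSX": 20.0,
    "Cell from RSX": 15.0,
    "RSX to/from GDDR3": 20.8,
}


def transfer_time_ms(size_bytes, link):
    """Best-case transfer time in milliseconds at the link's peak rate.

    ASSUMPTION: 1 GB = 10**9 bytes; latency and contention are ignored.
    """
    bytes_per_second = BANDWIDTH_GB_S[link] * 1e9
    return size_bytes / bytes_per_second * 1e3


# Hypothetical example: one 1280x720 buffer at 4 bytes per pixel.
FRAME_BYTES = 1280 * 720 * 4

for link in BANDWIDTH_GB_S:
    print(f"{link}: {transfer_time_ms(FRAME_BYTES, link):.3f} ms")
```

At these peak rates every link moves such a buffer in a fraction of a millisecond, so the listed bandwidths alone do not make any one path a bottleneck.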
Speed table
Because the Cell reads from the 256 MB GDDR3 memory very slowly, it is more efficient for the Cell to work in XDR and then have the RSX pull data from XDR and write it to GDDR3 for output to the HDMI display. This is why extra texture lookup instructions were included in the RSX to allow loading data from XDR memory.
RSX libraries
The RSX is dedicated to 3D graphics, and developers can use different API libraries to access its features. The easiest way is to use the high-level PSGL, which is essentially OpenGL ES with a programmable pipeline added. However, PSGL is unpopular because of its performance overhead on a relatively weak console CPU.
At a lower level, developers can use LibGCM, an API that builds RSX command buffers directly. This is done by setting up commands and DMA Objects and issuing them to the RSX via DMA calls.
Differences with the G70 architecture
The RSX 'Reality Synthesizer' is based on the G70 architecture, but features a few changes to the core. The biggest difference between the two chips is the way the memory bandwidth works. The G70 only supports rendering to local memory, while the RSX is able to render to both system and local memory. Since rendering from system memory has a much higher latency compared to rendering from local memory, the chip's architecture had to be modified to avoid a performance penalty. This was achieved by enlarging the chip size to accommodate larger buffers and caches in order to keep the graphics pipeline full. The result was that the RSX only has 60% of the local memory bandwidth of the G70, making it necessary for developers to use the system memory in order to achieve performance targets.
| Difference | RSX | Nvidia 7800 GTX |
| GDDR3 memory bus | 128-bit | 256-bit |
| ROPs | 8 | 16 |
| Post Transform and Lighting Cache | 63 max vertices | 45 max vertices |
| Total Texture Cache Per Quad of Pixel Pipes | 96 KB | 48 KB |
| CPU interface | FlexIO | PCI Express x16 |
| Technology | 28 nm/40 nm/65 nm/90 nm | 110 nm |
Other RSX features/differences include:
- More shader instructions
- Extra texture lookup logic
- Fast vector normalize
Press releases
Sony staff were quoted in PlayStation Magazine saying that the "RSX shares a lot of inner workings with Nvidia 7800 which is based on G70 architecture." Since the G70 is capable of carrying out 136 shader operations per clock cycle, the RSX was expected to feature the same number of parallel pixel and vertex shader pipelines as the G70, which contains 24 pixel and 8 vertex pipelines.
Nvidia CEO Jensen Huang stated during Sony's pre-show press conference at E3 2005 that the RSX is twice as powerful as the GeForce 6800 Ultra.
Bumpgate
The RSX GPU in early models of the PlayStation 3 was initially fabricated using a 90 nm process and was affected by reliability issues related to its packaging and thermal behavior, a problem commonly referred to as "Bumpgate". The high operating temperatures of the chip could weaken the solder joints in the ball grid array (BGA) connecting the die to the interposer, leading to degraded performance or complete hardware failure over time.
Several factors contributed to these failures:
- Mismatched coefficients of thermal expansion between materials in the die and interposer caused differential expansion during thermal cycles, placing mechanical stress on the BGA.
- Uneven heat distribution across the die, due to variable transistor densities and localized workloads, led to thermal hotspots. These caused parts of the die to expand at different rates, increasing mechanical fatigue in certain regions of the BGA.
- Electromigration within the solder joints led to the formation of voids, further weakening the BGA connections.
- The RSX was packaged using a flip-chip process. The underfill material used in early versions had a relatively low glass transition temperature (Tg), which could be exceeded during prolonged gameplay. Once the Tg was surpassed, the underfill lost structural integrity and provided less mechanical support to the solder joints, increasing the risk of failure.