Where are the bottlenecks in graphics?
On a 400 MHz P-II, about 60-80 Mpix/sec is required for a CPU-limited WinBench3D98 score at 800x600 resolution
For a typical application, delivering 1M triangles/second, 100M 32-bit pixels/second, and 2 textures/pixel requires:
- 1M triangles * 3 vertices/triangle * 32 bytes/vertex = ~100 Mbytes (96 Mbytes exactly)
& triangle data crosses the bus 3-5 times, so requires 300-500 Mbytes/second
(we've exhausted system memory bandwidth now - AGP 2X is maxed out!)
- 100M pixels * 8 bytes/pixel (32-bit RGBA, 32-bit Z/stencil) = 800 Mbytes
& pixel data crosses the bus ~1.5 times (RMW), so requires 1.2 Gbytes/second
- 2 textures/pixel * 4 texels/texture * 2 bytes/texel * 100M pixels = 1.6 Gbytes
& texture cache creates 4X reuse efficiency, so requires 400 Mbytes/second
Currently, transferring triangle vertex data to the graphics hardware is the bottleneck: not transform and lighting, not fill rate, not texture rate
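The arithmetic in the list above can be checked with a short script. This is only a sketch: the function name `bandwidth_estimates` and its parameter names are made up for illustration, the defaults are the figures quoted above, and decimal megabytes (1 Mbyte = 10^6 bytes) are assumed. Note that the raw vertex traffic works out to 96 Mbytes/second, which the list rounds to 100, so the 300-500 Mbytes/second range comes out as 288-480 here.

```python
MB = 1e6  # assumption: decimal megabytes, as the rounded slide figures suggest


def bandwidth_estimates(
    triangles_per_sec=1e6,
    pixels_per_sec=100e6,
    bytes_per_vertex=32,
    vertex_bus_crossings=(3, 5),  # triangle data crosses the bus 3-5 times
    bytes_per_pixel=8,            # 32-bit RGBA + 32-bit Z/stencil
    pixel_bus_crossings=1.5,      # read-modify-write
    textures_per_pixel=2,
    texels_per_texture=4,
    bytes_per_texel=2,
    texture_cache_reuse=4,        # cache gives 4X reuse
):
    """Return the estimated bandwidth requirements in Mbytes/second."""
    # Vertex traffic: 3 vertices per triangle, before and after bus crossings.
    vertex_raw = triangles_per_sec * 3 * bytes_per_vertex
    vertex_low = vertex_raw * vertex_bus_crossings[0]
    vertex_high = vertex_raw * vertex_bus_crossings[1]

    # Frame buffer traffic: color plus depth/stencil, scaled by bus crossings.
    pixel_raw = pixels_per_sec * bytes_per_pixel
    pixel_total = pixel_raw * pixel_bus_crossings

    # Texture traffic: texel fetches per pixel, reduced by cache reuse.
    texture_raw = (
        textures_per_pixel * texels_per_texture * bytes_per_texel * pixels_per_sec
    )
    texture_total = texture_raw / texture_cache_reuse

    return {
        "vertex_raw": vertex_raw / MB,
        "vertex_low": vertex_low / MB,
        "vertex_high": vertex_high / MB,
        "pixel_raw": pixel_raw / MB,
        "pixel_total": pixel_total / MB,
        "texture_raw": texture_raw / MB,
        "texture_total": texture_total / MB,
    }


if __name__ == "__main__":
    for name, mbytes in bandwidth_estimates().items():
        print(f"{name}: {mbytes:.0f} Mbytes/second")
```

Changing a default (for example, a smaller `bytes_per_vertex`) shows how sensitive the vertex traffic is to vertex size and to the number of bus crossings.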