090121 / Killzone 2 Bean Spilling

Just to make sure my personal bias is clear: I've never been tempted to pre-order a game before, yet I'm actively waiting to pick up my pre-ordered KZ2. I'd pay my $60 just to visually dissect the tech, regardless of whether the game is good. From what I've seen thus far (which is little more than internet vids), KZ2 sets the bar for game technology.

Killzone 2 Tech

View the 40 minute behind the scenes making of KZ2 video at Game Kings! Refresh with their previous deferred rendering presentation. I tossed out a guess on the Beyond3D Forums that their post processing would take between 15-30% of the GPU time depending on frame rate, and this video interview seems to confirm that moving GPU work (post processing, perhaps more) to the SPUs has resulted in an amazing 20 to 40% performance gain. Apparently tech was being polished late in development.

The "4xMSAA trick" low resolution particle artifacts are clear on this set of August images (which appear to be direct framebuffer grabs). Look at the first muzzle flash image at 720P. Then check out the rocket image (2x2 pixel blocks). It also seems as if the small high detail particles are blended directly (perhaps into the G-Buffer) at full resolution. R2 does something similar with high detail particles at full resolution.

Post August, I don't see any KZ2 shots with the 4xMSAA artifacts. Either those shots were doctored for public consumption, or they really did change the particle rendering. I see evidence in at least one shot that they might now be drawing to a downsampled buffer and later merging back. View this shot at 720P. Look where the explosion is behind the sandbags; it looks more like the Battlefield Bad Company particle upsizing. Perhaps the change was to downsized particle rendering which adapts to overdraw by changing resolution.

How About Some Numbers

Notes below are my best guess of the debug info gathered from many frames of the above linked video.
It took a bunch of frames to make out the numbers, so don't expect sums to add up properly.

CPU TIME
--------
Unknown .......... 1.24%
SPU Sync ......... 0.06%
AI Manager ....... 0.47%
Game Logic ....... 9.52%
Script ........... 0.80%
Physics .......... 1.57%
Representation ... 10.46%
Draw ............. 20.18%
HUD .............. 2.19%
Sound ............ 0.65%
Profile HUD ...... 25.17%
GPU Sync ......... 37.99%
----------
Total Time ....... 36.85%

SPU TIME
----------------------------
AI.Cover ................... ........ 0.00%
AI.LineOfFire .............. ........ 0.00%
Anim.EdgeAnim .............. 33 ..... 2.01%
Anim.Skinning .............. 152 .... 30.68%
Gfx.DecalUpdate ............ 9 ...... 0.78%
Gfx.LightProbes ............ 396 .... 9.00%
Gfx.PB.DeferredSchedule .... 1 ...... 0.60%
Gfx.PB.Forward ............. 2 ...... 1.69%
Gfx.PB.Geometry ............ 1 ...... 18.67%
Gfx.PB.Lights .............. 1 ...... 0.66%
Gfx.PB.ShadowMap ........... 1 ...... 4.20%
Gfx.Particles.ManagerJob ... 1 ...... 3.14%
Gfx.Particles.UpdateJob .... 130 .... 12.33%
Gfx.Particles.VertexJob .... 70 ..... 20.64%
Gfx.Post.BloomCapture ...... 12 ..... 2.80%
Gfx.Post.BloomIntegrate .... 8 ...... 1.52%
Gfx.Post.DepthOfField ...... 64 ..... 12.12%
Gfx.Post.DepthToFuzzy ...... 8 ...... 0.67%
Gfx.Post.Downsample ........ 29 ..... 0.61%
Gfx.Post.GrainWeight ....... 1 ...... 0.51%
Gfx.Post.HBlur ............. 45 ..... 3.02%
Gfx.Post.ILR ............... 1 ...... 0.63%
Gfx.Post.Modulate .......... 27 ..... 1.3?%
Gfx.Post.MotionBlur ........ 46 ..... 11.31%
Gfx.Post.Unlock? ........... 1 ...... 0.01%
Gfx.Post.Upsample .......... 108 .... 9.47%
Gfx.Post.VBlur ............. 46 ..... 3.73%
Gfx.Post.Vg??lle ........... 1 ...... 1.18%
Gfx.Post.Zero .............. 16 ..... 0.64%
Gfx.Scene.Portals .......... 3 ...... 30.72%
Mesh.Decompression ......... ........ 0.00%
Physics.Collide ............ 4 ...... 2.48%
Physics.Integrate .......... 4 ...... 2.11%
Physics.KdTree ............. 8 ...... 20.50%
Physics.Raycast ............ ........ 0.00%
Snd.MP3.Stereo ............. 2 ...... 2.60%
Snd.MP3.Surround ........... 2 ...... 7.51%
Snd.?Synth ................. 35 ..... 3.23%
Snd.Reverb ................. 14 ..... 4.02%
----------------------------
Total Time ................. 1232 ... 227.46%

GRAPHICS
--------
FPS ................. 30
GPU Stall by CPU .... 0.123 ?s
CPU stall by GPU .... 12.231 ?s

GPU TIME
--------------------------
Unknown ....... 0.2?/ ... 3.43%
Geometry ....... 1.8?/ ... 43.37%
Lighting ....... 1.7?/ ... 14.??%
Effects ........ 8.5?/ ... 8.4?%
Post process ............. 18.31%
--------------------------
Total Time ............... 81.??%
GPU Stall ................ 0.??%

PRIMS / TRI
-----------
Totals ..... 1431/ ... 344,634
Prime? ..... 0/ ...... 0
Geometry ... 619/ .... 161,231
Shadow ..... 683/ .... 170,???
Effects .... 121/ .... 14,3??

MEMORY STATS
------------------------
Pushbuffer ???? ........ 0.15 MB
Pushbuffer High ........ 0.15 MB
VRAM Free .............. 23.43 MB
Host Free .............. 80.?? MB
Heap Free .............. 134.?? MB
Render Mem ???? ........ 0
Render Mem Used ........ 12.00 MB
Render Mem Watermark ... 12.00 MB

MAIN RAM ....... 101.00 MB
----------------
Physics ........ 5.30 MB
Collision ...... 3.72 MB
Sound .......... 16.25 MB
Mesh ........... 21.20 MB
Graphics ....... 6.53 MB
Animation ...... 34.45 MB
Texture ........ 0.56 MB
Shader ......... 1.46 MB
AI Data ........ 2.75 MB
Various ........ 3.32 MB
Waste .......... 5.27 MB
----------------
Total .......... 97.17 MB
Main RAM ??? ... 97 / 101

VIDEO RAM .. 190.04 MB
------------
Mesh ....... 15.99 MB
Texture .... 156.87 MB
Waste ...... 1.20 MB
Total ...... 174.08 MB

Interesting

Just a few bits which can be gathered from this. SPUs are getting used for all sorts of stuff: AI, animation, skinning, push buffer, post processing, visibility, physics, and audio. SPUs are doing motion blur, depth of field, and more.
The SPU list suggests that they have portal based visibility as well as a KDTree for physics and raycasts (very interesting). The primitive stats suggest that shadows take around 50% of triangles, that the number of triangles per frame is relatively low (thanks to SPU culling?), and that perhaps particles (and decals?) are upwards of 10-20K triangles. The streamed in texture budget looks to be about 160-180MB. Lighting appears to be 1/3 as expensive as G-Buffer creation.

Atom ©2009-2007 Timothy Farrar
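As a rough sanity check on the ratios quoted in the Interesting notes, here is a minimal Python sketch that recomputes them from the transcribed overlay numbers. The variable names and the `share` helper are made up for illustration, and every digit shown as "?" in the overlay is treated as an unknown and carried as a (low, high) bound rather than guessed. Reading the "Geometry" GPU row as G-Buffer creation is an assumption on my part.

```python
# Figures transcribed from the debug overlay listing. Digits that were
# illegible ("?") are kept as (low, high) bounds instead of being guessed.
TOTAL_TRIS = 344_634
SHADOW_TRIS = (170_000, 170_999)    # overlay reads "170,???"
EFFECTS_TRIS = (14_300, 14_399)     # overlay reads "14,3??"

GEOMETRY_GPU_PCT = 43.37            # assumed to correspond to G-Buffer creation
LIGHTING_GPU_PCT = (14.00, 14.99)   # overlay reads "14.??%"
POST_GPU_PCT = 18.31

VRAM_MESH_MB = 15.99
VRAM_TEXTURE_MB = 156.87
VRAM_WASTE_MB = 1.20


def share(bounds, total):
    """Return the (low, high) fraction of `total` for a bounded value."""
    low, high = bounds
    return low / total, high / total


# Shadows as a fraction of all triangles submitted in the frame.
shadow_share = share(SHADOW_TRIS, TOTAL_TRIS)

# Lighting cost relative to geometry (G-Buffer) cost on the GPU.
lighting_vs_geometry = share(LIGHTING_GPU_PCT, GEOMETRY_GPU_PCT)

# Sum of the listed video RAM categories, to compare with the overlay total.
vram_sum_mb = VRAM_MESH_MB + VRAM_TEXTURE_MB + VRAM_WASTE_MB

# Whether the post process share falls inside the 15-30% forum guess.
post_in_guess = 15.0 <= POST_GPU_PCT <= 30.0

print("shadow share of triangles: %.3f to %.3f" % shadow_share)
print("lighting / geometry GPU time: %.3f to %.3f" % lighting_vs_geometry)
print("video RAM categories sum: %.2f MB" % vram_sum_mb)
print("effects triangles: %d to %d" % EFFECTS_TRIS)
print("post process within 15-30%% guess: %s" % post_in_guess)
```

The bounds are there only because the overlay was hard to read; they show that the "around 50%" and "1/3" readings hold whichever digits were actually on screen.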