motorsep@Posted: Sun Nov 11, 2012 1:29 am :
My PC: AMD Phenom X3 2.2 GHz + 8 GB of dual-channel DDR2 + GeForce GTX 670 with 2 GB GDDR5.

I placed some barricades in a Doom 3 test box room, plus 12 zsecs with machine guns and 8 imps.

After recording a demo where I kill most of the enemies, I get ~50 fps in timedemo at 1280x720 windowed (~40 fps at 1920x1200 fullscreen). However, when I actually play, fps is ~18. What "eats up" the difference? Rendering alone doesn't seem that bad, even with so many enemies firing simultaneously and muzzle-flash dlights casting shadows.

Download the benchmark map: http://www.steel-storm.com/files/zsecs_benchmark_map.zip



simulation@Posted: Sun Nov 11, 2012 2:42 am :
20 AI scripts running isn't going to be great. Try setting com_speeds to 1 and look at where the time is being spent. My guess is that it will be in the game code rather than the renderer.



motorsep@Posted: Sun Nov 11, 2012 2:47 am :
Ok, I did that and I have no idea what I am looking at :/ What do all those numbers spamming the console mean?

Here are conDumps with com_speeds 1 when

playing timeDemo: http://www.steel-storm.com/files/com_sp ... medemo.txt

playing the game: http://www.steel-storm.com/files/com_sp ... meplay.txt



The Happy Friar@Posted: Sun Nov 11, 2012 4:01 am :
info about com_speeds: http://www.iddevnet.com/doom3/

About 50% of the frame time is processing (game) code, the other 50% is the graphics.

Increasing the brushcount helps out with the fps some too. There are so many muzzle flashes & projectile lights, though, that it doesn't help much; most of the surfaces are always lit once they start attacking.
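For reference, com_speeds 1 prints one line per frame, roughly along these lines (going from memory of the GPL source, so exact fields and widths may differ between builds):

Code:
frame:1234 all: 54 gfr: 36 rf:  8 bk: 10

all is the total frame time in milliseconds, gfr is the game frame (game code: scripts, AI, physics), rf is the renderer front-end (CPU-side visibility and interaction setup), and bk is the renderer back-end (issuing the actual OpenGL calls). If gfr is the big number, the frame is gamecode-bound rather than render-bound.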



motorsep@Posted: Sun Nov 11, 2012 4:25 am :
But you can see that rendering takes way less time than the game code does.

I finished Darksiders 2 recently; fps would drop to 25-30 on the final levels with no enemies in sight. So if only the game code ate up less CPU time (or were threaded), skeletal animation ran on the GPU, and rendering were deferred, it would be perfect :)



The Happy Friar@Posted: Sun Nov 11, 2012 5:07 am :
Eliminating the scripts & putting everything in the .exe like the BFG Edition did would help too.

If you put the skeletal stuff on the GPU you're cutting off a large % of users. The latest Steam data says 40% of all Steam users have a DX9 card. The other 60% is split between DX10 & 11 on various platforms. While 40% isn't as many as the DX10/11 users combined, the people with top-end hardware generally want a game that will tax their system, and they aren't the whole market.



motorsep@Posted: Sun Nov 11, 2012 5:16 am :
Putting the game logic into the binary doesn't solve the core issue - everything still runs on a single CPU core. It will work faster, but not significantly faster.

As for skeletal animation on the GPU, according to http://http.developer.nvidia.com/GPUGem ... _ch04.html it works on a GeForce 5900, which is _old_. So I don't see how it would affect the majority of users.
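For context, "skeletal on GPU" basically means the CPU uploads one matrix per joint and a small vertex shader blends them per vertex, so the skinning math leaves the CPU entirely. A rough sketch (my own example, not idTech 4 code; assumes a GL 2.0-ish context with the usual extension loading, and sticks to legacy GLSL features that SM2/SM3-era cards can generally handle):

Code:
#include <GL/glew.h>   // or whichever extension loader the engine already uses

// GLSL vertex shader kept in a C string; legacy built-ins on purpose so old cards cope
static const char *skinningVS =
    "uniform mat4 jointMats[64];\n"
    "attribute vec4 jointWeights;   // up to 4 influences per vertex\n"
    "attribute vec4 jointIndices;   // joint indices, passed as floats\n"
    "void main() {\n"
    "    mat4 skin = jointWeights.x * jointMats[ int( jointIndices.x ) ]\n"
    "              + jointWeights.y * jointMats[ int( jointIndices.y ) ]\n"
    "              + jointWeights.z * jointMats[ int( jointIndices.z ) ]\n"
    "              + jointWeights.w * jointMats[ int( jointIndices.w ) ];\n"
    "    gl_Position = gl_ModelViewProjectionMatrix * ( skin * gl_Vertex );\n"
    "}\n";

// Per model, per frame: upload the animated joint matrices once and draw;
// no per-vertex skinning work on the CPU any more.
void UploadJointMatrices( GLint jointMatsLocation, const float *matrices, int numJoints )
{
    glUniformMatrix4fv( jointMatsLocation, numJoints, GL_FALSE, matrices );
}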

Could you please link that Steam data about 40% of users being on DX9 hardware?

EDIT: There ya go:

Steam Hardware & Software Survey: October 2012

DirectX 11 GPUs 53.47% (+1.36%)
DirectX 10 GPUs 40.19% (-0.97%)
DirectX 9 Shader Model 2b and 3.0 GPUs 2.99% (-0.24%)
DirectX 9 Shader Model 2.0 GPUs 1.58% (-0.09%)

Not to mention Steam is the largest digital distributor and controls over 70% of the PC market (and growing) ;)

So ~93.7% of PC gamers are on DX10/11 GPUs.



The Happy Friar@Posted: Sun Nov 11, 2012 5:41 am :
I was wrong, it's ~50% of users: http://store.steampowered.com/hwsurvey?platform=pc



motorsep@Posted: Sun Nov 11, 2012 5:52 am :
The details: http://store.steampowered.com/hwsurvey/videocard/

Still, I don't see where you got 40% of people being on DX9 ;) ~5%, yeah, that would be more accurate.



Tron@Posted: Sun Nov 11, 2012 10:06 am :
motorsep wrote:
The details: http://store.steampowered.com/hwsurvey/videocard/

Still, I don't see where you got 40% of people being on DX9 ;) ~5%, yeah, that would be more accurate.


XP only supports DX9



reckless@Posted: Sun Nov 11, 2012 12:46 pm :
Quote:
XP only supports DX9


At least OpenGL does not have that problem :lol: But yeah, it's the main reason many are stuck with deprecated functions in D3D games: even if their gfx cards support the newer features, the underlying DirectX runtime does not, so they can't use them. I worked for a Danish IT department a little while back and it was quite scary how many users still had XP. Microsoft should be proud :P it's their longest-lasting OS hehe.



motorsep@Posted: Sun Nov 11, 2012 4:20 pm :
Tron wrote:
motorsep wrote:
The details: http://store.steampowered.com/hwsurvey/videocard/

Still, I don't see where you got 40% of people being on DX9 ;) ~5%, yeah, that would be more accurate.


XP only supports DX9


Yet idTech 4 uses OpenGL, which has nothing to do with which OS users run. So users can be on WinXP with DX11-class video cards and still use the latest OpenGL features, including tessellation, deferred rendering and skeletal animation on the GPU :)



Sikkpin@Posted: Mon Nov 12, 2012 6:17 pm :
The biggest performance problems I've found when dealing with a lot of enemies are not the scripts exactly, but the pathfinding/obstacle-avoidance routines and also, believe it or not, sounds. Simplifying things in these two areas will greatly increase performance with large numbers of AI.



motorsep@Posted: Mon Nov 12, 2012 6:23 pm :
Well, for one, 64-bit dhewm3 has no SIMD optimization :/ I ran 32-bit dhewm3 and fps was slightly higher. It also felt fine when playing, and com_speeds 1 shows much better results (the 64-bit build feels like slo-mo with that number of enemies). So I guess the first task is to add SIMD optimization for the 64-bit build, then thread it (which seems to be a royal pita).

Fun fact - after building the 64-bit build on Windows with MSVC 2010 Express, dhewm3 reports that it uses generic SIMD instructions. The same code compiled with gcc on Linux reports MMX/SSE and SSE2. Go figure.
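The 32-bit SSE paths in the original source are largely MSVC-style inline assembly, and 64-bit MSVC doesn't accept inline asm at all, which would explain why the Windows 64-bit build falls back to the generic path. A minimal sketch of the alternative, using compiler intrinsics that build the same way under 64-bit MSVC and gcc (my own example with made-up names, not dhewm3's actual idSIMD code):

Code:
#include <xmmintrin.h>   // SSE intrinsics, available in both MSVC and gcc

// dst[i] += constant * src[i] -- the kind of tight loop idSIMDProcessor accelerates
void MulAdd_SSE( float *dst, const float constant, const float *src, int count )
{
    const __m128 c = _mm_set1_ps( constant );
    int i = 0;
    for ( ; i + 4 <= count; i += 4 ) {          // 4 floats per iteration
        __m128 d = _mm_loadu_ps( dst + i );
        __m128 s = _mm_loadu_ps( src + i );
        _mm_storeu_ps( dst + i, _mm_add_ps( d, _mm_mul_ps( s, c ) ) );
    }
    for ( ; i < count; i++ ) {                  // scalar tail
        dst[i] += constant * src[i];
    }
}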



reckless@Posted: Mon Nov 12, 2012 6:43 pm :
64-bit optimization is pretty hard stuff, especially on code that was not built with a 64-bit OS in mind from the start.
Also, a lot of the integral types like ints, longs etc. can have different sizes on 64-bit (and differ between Windows and Linux), which can fubar pretty much anything if not corrected.
Luckily most modern compilers come with pointer-sized integer types that get the correct width for whatever arch you build for (uintptr_t in place of unsigned int, intptr_t in place of int, etc.).
It's still a heck of a load of work though :)
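A tiny example of the kind of thing that bites (my own illustration, not code from the engine):

Code:
#include <cstdint>
#include <cstdio>

int main() {
    int value = 42;
    int *p = &value;

    // int handle = (int)p;         // fine on 32-bit, truncates the pointer on 64-bit
    intptr_t handle = (intptr_t)p;  // intptr_t is guaranteed to be pointer-sized

    printf( "sizeof(int)=%d  sizeof(void*)=%d  sizeof(intptr_t)=%d\n",
            (int)sizeof( int ), (int)sizeof( void * ), (int)sizeof( intptr_t ) );
    printf( "value via round-tripped pointer: %d\n", *(int *)handle );
    return 0;
}

On Win64 that prints 4 / 8 / 8, which is exactly why every place the old code stuffed a pointer into an int (or assumed long was pointer-sized) has to be hunted down.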



motorsep@Posted: Mon Nov 12, 2012 6:49 pm :
@reckless: I am sure you got my PM ;) Could you please reply to it by PM?

dhewm3 was ported to 64-bit platforms properly. Not 100%, perhaps, since there is a collision bug that only happens in the 64-bit build. However, all the types that needed to be converted were converted. I guess it was rather painful to write SIMD assembly code for 64-bit platforms, so the dhewm3 devs decided to leave it for someone else :)



reckless@Posted: Mon Nov 12, 2012 7:41 pm :
My guess also ;)

I'm not at home atm (helping a friend move), but I read your PM and at the moment my hands are pretty much full :(
I'll see if I can help when I've got my current stuff worked out.


