Source SDK

Spirrwell Nov 6, 2017 @ 7:12pm
Significant Performance Increase
Hello there!

So I did some performance profiling for a mod I work on and found that some of the biggest performance drains were from stricmp() calls. I'm focusing particularly on the ones used for animation sequence lookup. If you have a multiplayer mod, or a mod with a lot of animations, fixing this could save a lot of performance. I'm talking about potential gains of 60+ FPS.

This isn't going to be a full-blown tutorial on implementation, but it should give you a good idea of how it could be improved, and I would love to know if it could be taken even further.

So if you look at animation.cpp https://github.com/ValveSoftware/source-sdk-2013/blob/master/mp/src/game/shared/animation.cpp

In LookupSequence() starting at line 486, you'll notice this for loop:

    for (int i = 0; i < pstudiohdr->GetNumSeq(); i++)
    {
        mstudioseqdesc_t &seqdesc = pstudiohdr->pSeqdesc( i );
        if (stricmp( seqdesc.pszLabel(), label ) == 0)
            return i;
    }

stricmp() is really expensive. If you have a multiplayer mod with a lot of players with longer animation names it becomes incredibly expensive.

It's basically doing multiple stricmp() calls every frame per animated model requesting a sequence lookup, and the for loop condition also calls GetNumSeq() again on every iteration.

Then if that fails it goes onto LookupActivity, which does the same thing but for activities.

I came up with a solution that's a bit C++11ish which may not work well for you if your mod is multi-platform due to compiler nonsense, but that can always be worked around.

In the CStudioHdr class in studio.h, I added a std::unordered_map< std::string, int > called m_SequenceMap.

At the end of CStudioHdr::Init() in studio.cpp I added this:

    int numSeq = GetNumSeq();
    for (int i = 0; i < numSeq; i++)
    {
        mstudioseqdesc_t &seqdesc = pSeqdesc( i );
        m_SequenceMap[ std::string( seqdesc.pszLabel() ) ] = i + 1;
    }

NOTE: If you use $includemodel for animations, you may need to go a bit further than this to deal with ResetVModel() calls

The reason for the +1 there is because std::unordered_map will auto-init values that don't exist to 0, so we can use that to check if the sequence is valid.
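To make the +1 trick concrete, here is a minimal standalone sketch. It is not SDK code: the vector of labels stands in for the model's sequence descriptors, and BuildSequenceMap and LookupWithSentinel are hypothetical names used only for illustration.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the map added to CStudioHdr.
using SequenceMap = std::unordered_map<std::string, int>;

// Build the name -> (index + 1) map, mirroring the loop added to Init().
SequenceMap BuildSequenceMap(const std::vector<std::string> &labels)
{
    SequenceMap map;
    map.reserve(labels.size());
    for (int i = 0; i < static_cast<int>(labels.size()); ++i)
        map[labels[i]] = i + 1; // 0 is reserved to mean "not found"
    return map;
}

// operator[] value-initializes a missing key to 0, so 0 signals a miss.
// As a side effect it also inserts that missing key into the map.
int LookupWithSentinel(SequenceMap &map, const std::string &label)
{
    int stored = map[label];
    return stored != 0 ? stored - 1 : -1; // -1 here means "no such sequence"
}
```

One thing to keep in mind with this approach: every failed lookup through operator[] adds a new entry with value 0, so the map grows a little each time an unknown name is requested.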

I added a function called GetSequenceMapIndex() to CStudioHdr which just does

return m_SequenceMap[ label ];
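If you would rather not have misses insert entries into the map, a find()-based lookup is one alternative. This is only a sketch under the assumption that the map stores index + 1 as described above; FindSequenceIndex is a hypothetical free function, not something in the SDK.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the zero-based sequence index, or -1 when
// the label is absent. find() never inserts, so the map can be const and
// failed lookups do not grow it.
int FindSequenceIndex(const std::unordered_map<std::string, int> &sequenceMap,
                      const std::string &label)
{
    auto it = sequenceMap.find(label);
    if (it == sequenceMap.end())
        return -1;
    return it->second - 1; // stored values are index + 1, as in the post
}
```

With find() the +1 offset is no longer strictly needed, since end() already tells you the key is missing; it is kept here only so the sketch matches the map layout described above.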

Then the new LookupSequence function would look something like this:

    int LookupSequence_New( CStudioHdr *pstudiohdr, std::string label )
    {
        if (! pstudiohdr)
            return 0;

        if (!pstudiohdr->SequencesAvailable())
            return 0;

        //
        // Look up by sequence name.
        //
        int index = pstudiohdr->GetSequenceMapIndex( label );
        if ( index != 0 )
            return index - 1;

        //
        // Not found, look up by activity name.
        //
        int nActivity = LookupActivity( pstudiohdr, label.c_str() );
        if (nActivity != ACT_INVALID )
        {
            return SelectWeightedSequence( pstudiohdr, nActivity );
        }

        return ACT_INVALID;
    }

And there. This was done VERY quick and dirty. I didn't put a lot of effort into it. You could make this a lot cleaner, and you could do the same thing for LookupActivity(). With a server filled with 24 players, this saved me easily 60+ FPS which is kind of insane.

You should change it up however you like, make it cleaner and whatnot.

I just had to share as it kind of blew my mind.
Last edited by Spirrwell; Nov 6, 2017 @ 7:14pm
Misuune Nov 7, 2017 @ 5:22am 
I wish this was for singleplayer too
DDDDD Nov 7, 2017 @ 7:28am 
Excellent work. Would you be able to show us differences in scene budget time taken by Client_Animation with +showbudget in console for a given number of NPCs / players? Raw FPS improvement is a poor indicator as we don't know the baseline.

I did some performance testing a couple years ago and in NMRiH the particular pain points I noticed were in Client_Animation, Shadow_Rendering (gets very expensive on high density displacements), and prop static rendering (likely largely due to high cost of draw calls). I never really pored through the game source code, however.

EDIT: Also, pass your string by const ref instead of copying it into the function.
Last edited by DDDDD; Nov 7, 2017 @ 8:19am
Spirrwell Nov 7, 2017 @ 12:19pm 
Originally posted by |NMRiH Dev| Deadhand:
EDIT: Also, pass your string by const ref instead of copying it into the function.

I know XD I just did this to test it out, didn't really put much effort into it. I also noticed I ran into a few case sensitivity issues due to the way our animations were named in the QC files, so you could use std::transform to change the case to lower in that Init() function.
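A rough sketch of what that lowercasing could look like, taking the string by const reference as suggested. ToLowerKey is a hypothetical helper name; the idea is that the same function would have to be applied both when filling the map in Init() and when looking a label up, otherwise the keys will not match.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lowercase copy of the label so that map
// keys compare case-insensitively, roughly like stricmp() did for ASCII.
std::string ToLowerKey(const std::string &label)
{
    std::string key(label);
    // Go through unsigned char so std::tolower is never passed a negative value.
    std::transform(key.begin(), key.end(), key.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return key;
}
```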

I'll see what I can do with showbudget and whatnot.
Spirrwell Nov 7, 2017 @ 6:35pm 
Alright, so I wanted to use plain Source SDK 2013 for this, but I found that the changes that I made there made basically no difference. Why? Well it seems that in plain Source SDK 2013, sequence lookup calls don't really happen for player models. That and the animations are rather buggy in stock SDK 2013.

The reason why it had such an impact on the mod I work on is because we have custom player anim state code that calculates what animations we should use based on some things that happen during games.

Regardless though, that's not uncommon, and it still highlights that the performance of the lookup functions is rather bad. Props use it as well.

https://imgur.com/a/Pd3mn

That shows the budget difference. You can see the ConVar I change in the lower left. Depending on the map, the number of players (I had filled the server with bots), and the prop animations, the difference can be as low as about 15 FPS or as high as over 100 FPS. And you can see the client animation budget was cut by 2/3. That's a lot.
AniCator Nov 22, 2017 @ 4:00am 
You generally only need to get the index for a sequence once so you usually don't end up having to call LookupSequence all that often. The fetched index is saved to a member variable or stored in some other way which keeps you from having to call it every frame.
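As a rough illustration of that caching pattern (a sketch, not SDK code): CachedSequence is a hypothetical wrapper, and the lookup callback stands in for whatever function actually resolves the name, such as LookupSequence().

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical wrapper: resolves a sequence name once and reuses the index.
class CachedSequence
{
public:
    CachedSequence(std::string label,
                   std::function<int(const std::string &)> lookup)
        : m_label(std::move(label)), m_lookup(std::move(lookup)) {}

    // Runs the (possibly expensive) name lookup only on the first call.
    int Get()
    {
        if (!m_resolved)
        {
            m_index = m_lookup(m_label);
            m_resolved = true;
        }
        return m_index;
    }

    // Call when the model changes, since sequence indices are per model.
    void Invalidate() { m_resolved = false; }

private:
    std::string m_label;
    std::function<int(const std::string &)> m_lookup;
    int m_index = -1;
    bool m_resolved = false;
};
```

The cached index is only valid for the model it was resolved against, which is why the sketch includes an explicit Invalidate().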

Of course it is better to use a map/dictionary to do the lookups. You may want to consider using Source's own library functions though instead of the STL ones. CUtlDict should do the job.

In some other versions of the code pstudiohdr already supports a dictionary lookup and has its own LookupSequence function that does so.

This change won't significantly increase performance for most games/mods though.
Spirrwell Nov 22, 2017 @ 6:01am 
Yeah, I know using the STL clashes with the Valve style. As I explained in my previous post, it has to do with how our mod specifically handles animations, which probably isn't the best way.

But anyway, we have an implementation now that uses CUtlStringMap because our other active programmer is averse to using STL as well. Though it feels like it doesn't have the same performance impact as the STL version did.

Me, I care less and less for the Valve C style the more I work in Source.

Source is an unholy mess that I love to hate, but also can't seem to stop working with. XD