Optimizing AI in v0.0.32


In my last release I optimized the traffic AI and shaved about 1 ms of game thread time on my dev machine. The stat counters I added to my AI services and spline classes made it clear that traffic spline avoidance and the "FindNearestSplineState" calls were the leading contributors to CPU usage. Unreal Insights profiling confirmed that the AI traffic service consumed over 10x as much CPU as the next most expensive AI function. So far I have not been able to get the stat counters to show up in Unreal Insights, but I can show them via a function key in the development build.

I first optimized the "FindNearestSplineState" function by removing spline component calls whose results were thrown away later in the processing, and by comparing squared distances instead of actual distances. This yielded a barely noticeable benefit. Most of the calls to that function came from calculating the turn start and turn end locations for traffic making a left or right turn. The AI needs to switch over from the straight lane spline to the turn spline, complete the turn along the turn spline, and then switch back to the straight spline in that lane.
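To illustrate the squared-distance part, here is a minimal standalone sketch. It is not the game's actual code: `Vec3`, `DistSquared` and `FindNearestSampleIndex` are hypothetical names, and `Vec3` stands in for Unreal's vector type so the example compiles on its own. The idea is that if you only need to know which point is nearest, comparing squared distances gives the same answer as comparing distances, without a square root per candidate.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for an engine vector type, so the sketch is self-contained.
struct Vec3 { double X; double Y; double Z; };

// Squared Euclidean distance. No sqrt is needed when the value is only compared.
inline double DistSquared(const Vec3& A, const Vec3& B)
{
    const double DX = A.X - B.X;
    const double DY = A.Y - B.Y;
    const double DZ = A.Z - B.Z;
    return DX * DX + DY * DY + DZ * DZ;
}

// Returns the index of the sample closest to Query, or -1 if Samples is empty.
inline int FindNearestSampleIndex(const std::vector<Vec3>& Samples, const Vec3& Query)
{
    int BestIndex = -1;
    double BestDistSq = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < Samples.size(); ++i)
    {
        const double DistSq = DistSquared(Samples[i], Query);
        if (DistSq < BestDistSq)
        {
            BestDistSq = DistSq;
            BestIndex = static_cast<int>(i);
        }
    }
    return BestIndex;
}
```

As noted above, this kind of change saves very little on its own when the expensive part is the spline query itself.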

It then dawned on me that I could easily cache these results! The calculation did not depend on anything dynamic in the game. It used the same inputs every time, aside from one small calculation that could be derived from the input and intermediate results. So instead of calculating it literally millions of times during gameplay, I now build a TMap on BeginPlay of the spline actor class I created, mapping each spline component to its start and end state. Then I just look up those pre-calculated results each time they are requested.
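The caching idea can be sketched in plain C++ roughly like this. It is an illustration under assumptions, not the code from the game: `SplineId`, `TurnStates` and `TurnStateCache` are hypothetical names, and `std::unordered_map` stands in for the TMap keyed by spline component. `Precompute` plays the role of the work done once in BeginPlay, and `Find` is what the AI would call every time afterwards.

```cpp
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical key type; in the game the key is the spline component itself.
using SplineId = int;

// Hypothetical result type: where the turn starts and ends along the spline.
struct TurnStates
{
    double StartKey;
    double EndKey;
};

class TurnStateCache
{
public:
    // Runs the expensive calculation exactly once per spline (e.g. at startup).
    void Precompute(const std::vector<SplineId>& Splines,
                    const std::function<TurnStates(SplineId)>& ComputeExpensive)
    {
        Cache.clear();
        Cache.reserve(Splines.size());
        for (SplineId Id : Splines)
        {
            Cache.emplace(Id, ComputeExpensive(Id));
        }
    }

    // Cheap lookup used during gameplay. Returns nullptr for an unknown spline.
    const TurnStates* Find(SplineId Id) const
    {
        const auto It = Cache.find(Id);
        return It != Cache.end() ? &It->second : nullptr;
    }

private:
    std::unordered_map<SplineId, TurnStates> Cache;
};
```

This only works because the inputs never change during play. If splines could move or be edited at runtime, the cache would need to be invalidated and rebuilt.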

Unreal Insights is great, but sometimes I want to see function-level flame graphs and call trees. I was able to gather even more diagnostic data by using the "Diagnostic Tools" in Visual Studio 2022. It turns out you can set two breakpoints and profile the CPU in between them. This YouTube video showed me how. The caveat is that you need to run under the debugger to capture the performance trace, and you want to profile with as many optimizations turned on as possible. So one slight adjustment I needed to make for UE 5 was to run the solution with the "Development" configuration instead of "Debug", so that optimizations are turned on but debug symbols are still available. This is the same as the "Development" packaged game from the C++ perspective.

The trouble is that a breakpoint placed in the IDE is hard to hit in an optimized build, because the optimizer aggressively rearranges and inlines code. I found that the Microsoft compiler provides the __debugbreak() intrinsic, which you can add to your code to mark out a "profiling zone". It does not get optimized away, and it did trigger the breakpoint when running the game. I placed one call where the game starts and another where it ends, and was able to capture hotspots from the "Startup Thread" (which is the game thread) along with the call hierarchy. This is another useful tool in your toolbox for zeroing in on performance hotspots, with no instrumentation needed other than the temporary breakpoint calls.
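Here is a rough sketch of how the two temporary break calls could be wrapped so they are easy to switch off again. The macro and function names (`ENABLE_PROFILE_BREAKS`, `PROFILE_ZONE_BREAK`, `OnGameStarted`, `OnGameEnded`) are made up for this example. The only real API used is the MSVC intrinsic, which is spelled `__debugbreak` in lowercase. With the flag left at 0 the calls compile to nothing, so the code is safe to leave in place between profiling sessions.

```cpp
// Set to 1 (e.g. via a compiler define) only while capturing a profile.
#ifndef ENABLE_PROFILE_BREAKS
#define ENABLE_PROFILE_BREAKS 0
#endif

#if ENABLE_PROFILE_BREAKS && defined(_MSC_VER)
    #include <intrin.h>
    // Breaks into the attached debugger, even in an optimized build.
    #define PROFILE_ZONE_BREAK() __debugbreak()
#else
    // Compiles to nothing when profiling breaks are disabled or unsupported.
    #define PROFILE_ZONE_BREAK() ((void)0)
#endif

// Reports whether the break calls are active in this build.
inline bool ProfileBreaksEnabled()
{
    return ENABLE_PROFILE_BREAKS != 0;
}

// Hypothetical hook called when gameplay starts: first break, start of the zone.
inline void OnGameStarted()
{
    PROFILE_ZONE_BREAK();
}

// Hypothetical hook called when gameplay ends: second break, end of the zone.
inline void OnGameEnded()
{
    PROFILE_ZONE_BREAK();
}
```

With the debugger attached, the CPU usage recorded between the first break and the second one is the profiling zone.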

The function that was identified was USplineComponent::FindInputKeyClosestToWorldLocation, which agrees with the coarser-grained performance data I gathered earlier from Unreal Insights. Now I can also see how my own functions call it, so I can optimize further as needed.

Files

PoliceChaseSimulator-0.0.32
External
Aug 17, 2023