Dwarf Fortress

Dwarf Fortress

Fletch Jun 25, 2022 @ 8:24am
Multi-threading and embarrassingly parallelizable problems
There are certain aspects of DF that are "embarrassingly parallelizable" (that's a technical term, look it up). Things like searching for an embark location based on your filters (river, temperate climate, soil, etc) are very trivial to parallelize -- the calculations for one location has absolutely ZERO EFFECT on the calculations for another location (hence why it is called "embarrassingly parallelizable"). On my 6-CPU rig, you should have 6 threads (1 per CPU) each searching 1/6th the locations, and have the entire "search for embark location" finish in 1/6th the time (minus a tiny bit of overhead). As it is today, only one CPU is used (fully maxed out) and the other five CPUs sit idle -- because the entire game is single-threaded.

I've been playing DF since 2008, and the topic of multi-threading / parallelization has been brought more times than can be counted. With the free game, it is hard to complain about DF's lack of multi-threading support -- you get what you pay for. However, with a paid steam release, I hope the community revolts and forces these way long overdue performance fixes to the base game's engine.

My advice to the developer is to start by just parallelizing the loops that are TRIVIAL to parallelize (the ones that are called ""embarrassingly parallelizable"), like searching for the embark location example I already brought up. Learn these programming concepts from the 1960's and just apply them to the trivial loops first.

A couple of man-hours of work (to parallelize the trivial loops) will save man-YEARS (decades) of your customer's time waiting on the poor performing game engine while they are playing.

FWIW: I've been programming since the 1980's -- so no need for the snarky comments that parallelizing a game is hard and that it would require rewriting the entire game from scratch. Those assertions are absolutely false. Parallelizing trivial loops is something that a high-school intern can do.

Rant over. *Steps off soapbox.*
Originally posted by VoidEngineer:
Originally posted by Fletch:
Zach: "Who would've known that it is possible to multi-thread ... all this time"
Fletch: * raises hand *

Thank you, Tarn, for taking the dive into multithreading -- and to stop listening to all the naysayers that kept saying that it would not be possible. Parallelizing trivial loops is trivial to do. And many thanks to Putnam for driving optimizations and improvements throughout the codebase!

https://www.youtube.com/watch?v=kanUBppoQpQ&t=675s

Yeah I was listening to this when I started reading this thread. Ignore the critics. You always have a few people show up to take a crap on you no matter what you say in a public forum.

There are trivial pathways to implementing multithreading into the game.

Regardless of whether more systems gets MT'd, it looks like Putnam has a good handle on how to use profilers on the code and find those FPS-crushing bottlenecks.
< >
Showing 1-15 of 52 comments
Hanuman Jun 25, 2022 @ 9:14am 
My programming experience ended with Turbo Pascal in the 1990s, that having been said, I was totally assuming that this updated Steam version would have multi-threading, I guess I can always run the game and cook steaks on my overheating laptop when my population goes over 200....
Last edited by Hanuman; Jun 25, 2022 @ 9:14am
Necrosian Jun 25, 2022 @ 11:28am 
Originally posted by Username_449:
My programming experience ended with Turbo Pascal in the 1990s, that having been said, I was totally assuming that this updated Steam version would have multi-threading, I guess I can always run the game and cook steaks on my overheating laptop when my population goes over 200....

From what i remember multi threading would require a total rewrite of the games code.
mwirkk Jun 25, 2022 @ 11:40am 
That's my understanding as well. It's not like they went back and rewrote it all from scratch.
Fletch Jun 25, 2022 @ 12:03pm 
Originally posted by Necrosian:
Originally posted by Username_449:
My programming experience ended with Turbo Pascal in the 1990s, that having been said, I was totally assuming that this updated Steam version would have multi-threading, I guess I can always run the game and cook steaks on my overheating laptop when my population goes over 200....

From what i remember multi threading would require a total rewrite of the games code.

Yeah, that's the excuse that has been given for over the past 10+ year. I'm saying that reason is bunk. I've already given an example for a section that is trivially (very easily) parallelizable. Another part of the game, and probably the part that is most impactful is the AI pathfinding, which is the cause of the FPS death spiral and rage-quitting the game.

Two dorfs should each calculate their paths from their unique origins to their unique destinations in parallel (in separate threads). Dorf-A's calculations to his destination has zero effect on Dorf-B's calculations to his destination. Instead, this game calculates all creatures AI paths sequentially (in order, one at a time) in a big time-wasting 'for' loop. This means every dwarf, goblin, clown, and kitten on the map each take turns calculating their unique paths from their origin to their destination, on every in-game 'tick'.

Then, every creature takes one step.

Then, all those pathing calculations are repeated again for the next frame. One. Creature. At. A. Time. Sequentially.

It would not require a total rewrite of the code, that's why the title of this post is that these are "Embarrassingly Parallelizable" easy problems to fix. 80% of my CPU is sitting idle while this game just grinds to a halt. I should easily be able to have 600+ dwarfs in a fortress with my current CPU (not 100). I want all my CPU's maxed out, not just 1 core.

Now, doing what "They Are Billions" devs did would require a lot more dev-effort to offload all that pathfinding on to the GPU (with its thousands of cores). I'm just asking for the really easy stuff from the DF developer.
Teemo Jun 25, 2022 @ 6:18pm 
Originally posted by Fletch:
Originally posted by Necrosian:

From what i remember multi threading would require a total rewrite of the games code.

Yeah, that's the excuse that has been given for over the past 10+ year. I'm saying that reason is bunk. I've already given an example for a section that is trivially (very easily) parallelizable. Another part of the game, and probably the part that is most impactful is the AI pathfinding, which is the cause of the FPS death spiral and rage-quitting the game.

Two dorfs should each calculate their paths from their unique origins to their unique destinations in parallel (in separate threads). Dorf-A's calculations to his destination has zero effect on Dorf-B's calculations to his destination. Instead, this game calculates all creatures AI paths sequentially (in order, one at a time) in a big time-wasting 'for' loop. This means every dwarf, goblin, clown, and kitten on the map each take turns calculating their unique paths from their origin to their destination, on every in-game 'tick'.

Then, every creature takes one step.

Then, all those pathing calculations are repeated again for the next frame. One. Creature. At. A. Time. Sequentially.

It would not require a total rewrite of the code, that's why the title of this post is that these are "Embarrassingly Parallelizable" easy problems to fix. 80% of my CPU is sitting idle while this game just grinds to a halt. I should easily be able to have 600+ dwarfs in a fortress with my current CPU (not 100). I want all my CPU's maxed out, not just 1 core.

Now, doing what "They Are Billions" devs did would require a lot more dev-effort to offload all that pathfinding on to the GPU (with its thousands of cores). I'm just asking for the really easy stuff from the DF developer.

This is not an entirely correct assumption that dorf A doesn't care about dorf B because lots of things interact with each other, so they very much do care what other people are doing when taking just 1 step. However, I think your general assessment that certain aspects are easier to multi-thread is correct and could be done. Also, I really hope they aren't calculating their path every single tick because that would be a huge waste. They should be calculating the overall ideal path once and then just checking every step to make sure the next few steps are good, then maybe doing another full look once every 10 or 20 steps to see if there was a drawbridge raised/wall built etc.
Fletch Jun 26, 2022 @ 6:06pm 
Originally posted by Teemo:

This is not an entirely correct assumption that dorf A doesn't care about dorf B because lots of things interact with each other, so they very much do care what other people are doing when taking just 1 step. However, I think your general assessment that certain aspects are easier to multi-thread is correct and could be done. Also, I really hope they aren't calculating their path every single tick because that would be a huge waste. They should be calculating the overall ideal path once and then just checking every step to make sure the next few steps are good, then maybe doing another full look once every 10 or 20 steps to see if there was a drawbridge raised/wall built etc.

The pathing calculations for Dorf A (wanting to go get a beer), and Dorf B (needing to fetch a stick of wood to make a bed) are entirely independent things -- they can be calculated in parallel in different threads. How does Dorf-A's pathing calculations have any affect whatsoever on Dorf-B's pathing calculations?

Think about how the real world works: do I need to wait (block) for you to figure out the best path for you to get to your store, before I can start figuring out my path to go sit down and type this at my computer?

The real world is massively parallel, and two dorf's needing to figure out their paths for their individual origin/destination should be done in parallel too. I also agree that once the paths are calculated, they should not need to recalculate those paths unless something physically changes in the world: like building a wall blocking a hallway that is on a dorf's pre-calculated path. Recalculating all paths after a new construction is fine to do, since it is rare (new things aren't built on every tick, 99.9% of ticks the "physical" world affecting paths does not change and the game should be optimized for that).
Last edited by Fletch; Jun 26, 2022 @ 6:07pm
Teemo Jun 26, 2022 @ 7:05pm 
Originally posted by Fletch:
Originally posted by Teemo:

This is not an entirely correct assumption that dorf A doesn't care about dorf B because lots of things interact with each other, so they very much do care what other people are doing when taking just 1 step. However, I think your general assessment that certain aspects are easier to multi-thread is correct and could be done. Also, I really hope they aren't calculating their path every single tick because that would be a huge waste. They should be calculating the overall ideal path once and then just checking every step to make sure the next few steps are good, then maybe doing another full look once every 10 or 20 steps to see if there was a drawbridge raised/wall built etc.

The pathing calculations for Dorf A (wanting to go get a beer), and Dorf B (needing to fetch a stick of wood to make a bed) are entirely independent things -- they can be calculated in parallel in different threads. How does Dorf-A's pathing calculations have any affect whatsoever on Dorf-B's pathing calculations?

Think about how the real world works: do I need to wait (block) for you to figure out the best path for you to get to your store, before I can start figuring out my path to go sit down and type this at my computer?

The real world is massively parallel, and two dorf's needing to figure out their paths for their individual origin/destination should be done in parallel too. I also agree that once the paths are calculated, they should not need to recalculate those paths unless something physically changes in the world: like building a wall blocking a hallway that is on a dorf's pre-calculated path. Recalculating all paths after a new construction is fine to do, since it is rare (new things aren't built on every tick, 99.9% of ticks the "physical" world affecting paths does not change and the game should be optimized for that).

If Dorf A is next to Dorf B and they both want to go to the same square, then it matters.
If Dorf A and Dorf B are both in a squad together as part of a raid where the AI is using a "stick together algorithm" then they need to know if they are staying close enough
If Dorf A is fighting Dorf B they both need to know where the other one is
If Dorf A and Dorf B both want to get the same wheelbarrow to get the different ores they want to pick up, they need to know about it (especially if it's the last one)

This list is longer than this. There are lots of reasons that they do need to talk to each other in the game. You're right that most of the time you can get use out of multi-threading, it's just not as simple as you laid it out because there are a lot of interactions in this game and that would need to be figured out. Something simple like starting with dividing the units up by Z-level might be a start, or coming up with a way to block the packs of units out and giving each cpu it's own section. It could be done, it's just a bit more complicated than you make it out to be.

Also, 99.9% of ticks being fine but only 0.1% breaking your game means you are crashing constantly. That's a troublesome-game breaking level error every 10 seconds at 100 FPS.
Hanuman Jun 27, 2022 @ 5:26pm 
Meanwhile dorf C's laptop will overheat, he will enter a fell mood (<dorf> looses a roaring laughter, fell and terrible!), commandeer the butcher shop and kill a ghost.
Skinny Pete Jun 27, 2022 @ 6:19pm 
Originally posted by Fletch:
FWIW: I've been programming since the 1980's -- so no need for the snarky comments that parallelizing a game is hard and that it would require rewriting the entire game from scratch. Those assertions are absolutely false. Parallelizing trivial loops is something that a high-school intern can do.

Rant over. *Steps off soapbox.*
It takes me something like 20 seconds to search an entire large world, maybe 40 if I get really nuts. That's longer than it takes me to think about what I actually want on the embark. Exactly how much effort is it worth to fix something that has this little to do with the vast majority of play?

How much should they rewrite to save 10 seconds of an entire game that goes on for months at times, when there are so many other problems that outright kill a game?
Originally posted by Fletch:
Recalculating all paths after a new construction is fine to do, since it is rare (new things aren't built on every tick, 99.9% of ticks the "physical" world affecting paths does not change and the game should be optimized for that).
You got a source for that 99.9% number? Seems pretty pulled out of nowhere. And if you have flowing liquids, especially waterfalls, yes it very well can change every tick.

It might be worth it for them to hire someone to do that at some point the game is relatively stable but now? Just rewrite the entire engine in the next couple months?
Last edited by Skinny Pete; Jun 27, 2022 @ 6:23pm
MadAsgardian Jun 27, 2022 @ 8:55pm 
Two hours to fix a game that’s been under development for 15+ years… I think you may be slightly underestimating what a mess it’s likely to be under the hood.
Last edited by MadAsgardian; Jun 27, 2022 @ 8:57pm
Hanuman Jun 28, 2022 @ 5:13am 
Originally posted by MadAsgardian:
Two hours to fix a game that’s been under development for 15+ years… I think you may be slightly underestimating what a mess it’s likely to be under the hood.

We are mere mortals, it is not our place to question these things anyways imo.
Flaming Badger Jul 2, 2022 @ 7:39am 
I'm all for some parallelisation happening, but I'm not in a rush and I don't get why the community is so obsessed with it.

There's a cost/benefit analysis that needs to be done on optimisation. How much of a speedup would you actually get from parallelising everything possible? Like, great, you've sped up pathfinding by a factor of 4, but what fraction of each tick is spent pathfinding? A third? So you've actually only sped the game up by less than 50%. That's not really enough to save you from FPS death on a big fort.

There is likely a lot more low hanging fruit than parallelisation out there that will get you more optimisation for your buck.

Given the spaghetti code built up over 15 years I also think you're wildly optimistic about how much time it would take to implement, not to mention bug fix and maintain.

The philosophy with regards to optimisation over the years has generally been "when it's needed" and "features first, optimisation later" because if you optimise too early you'll likely break the optimisation again later.

Don't get me wrong, optimisation is important and I expect to see that philosophy of putting it off change for the Steam release. I'm just not worried whether or not parallelisation is a part of the optimisations we see.
Last edited by Flaming Badger; Jul 5, 2022 @ 5:05am
Tronchi Jul 4, 2022 @ 2:50pm 
and as a player, what measures can be taken in the game to avoid death by FPS in large fortresses?
Teemo Jul 4, 2022 @ 6:17pm 
Originally posted by $Tron:
and as a player, what measures can be taken in the game to avoid death by FPS in large fortresses?

Here are some simple ones:

1) Wall off/lock off areas you don't need, to include almost always walling off the caverns (until you get good at managing FPS and the dangers there).

2) Keep embarks to 2x2 or 3x3 and possibly lower the dwarf count. I personally do 3x3/120.

3) Don't engrave everything, just the things that really need it to get to quality levels you need.

4) Although it's technically an exploit, use quantum stockpiles to cut down on how many stacks of items there are. This will also get double the savings due to the pathfinding improvements as well.

5) Use the path settings under d->o to make your most traveled areas high to speed pathfinding calcs.

6) Use a dump next to magma to destroy everything you don't need to cut down on the total items in your game
Sabin Stargem Jul 5, 2022 @ 12:12pm 
Apparently there is a boost if you use the 5800x3D CPU. This is because there is extra cache to work with, which reduces a bottleneck on the game's performance. Downside is that it will cost you plenty of dough, and bigger worlds will lose that improvement.

Anandtech - Ryzen 5800x3D review, slides include Dwarf Fortress performance.
https://www.anandtech.com/show/17337/the-amd-ryzen-7-5800x3d-review-96-mb-of-l3-3d-v-cache-designed-for-gamers/6
Last edited by Sabin Stargem; Jul 5, 2022 @ 12:16pm
< >
Showing 1-15 of 52 comments
Per page: 1530 50

Date Posted: Jun 25, 2022 @ 8:24am
Posts: 52