Satisfactory is mostly a single-threaded process, and since you have an AMD CPU and it's mainly a single process, if you watch carefully you will see that your two best cores always run faster and that the Satisfactory thread is bouncing between those cores.
Perfectly normal.
EDIT:
Multicore for gaming is extremely overrated, as most games still don't use more than 4 cores. Part of this is because companies don't program for more cores, and the other part is the simple fact that anything outside of number crunching and video rendering almost never needs more than 4 cores.
Thanks for the info!
You're welcome, and should you take a look at my screenshots, many of them have an overlay that shows my 5800X behaving the same way.
https://steamcommunity.com/sharedfiles/filedetails/?id=2898373748
All cores are used
If you're using software to force the game to run on all cores, you're undermining AMD's Zen scheduling algorithms along with the Windows task scheduler.
Forcing all your cores to work on something that doesn't require all cores results in your cores running at or near max speed, causing unnecessary heat.
To me - and this is just personal opinion - if the devs have the same understanding of parallelization as Snutt, with his completely wrong "1 baby takes 9 months - but 9 mothers can't do it in 1", it would explain a lot, because the analogy itself is wrong.
The idea behind parallelization is not to speed up one process but to execute several processes at the same time - or, to correct Snutt once again, the correct analogy would be: "1 mother takes 9x 9 months to make 9 babies - but 9 mothers can do it in just 1x 9 months".
Or in terms of the game: although big factories have some spots where lines combine, much of the game does in fact run in parallel, independent of each other - which could be calculated by multiple CPU cores at the same time, leaving only the connected parts as one big tree. But the devs seem to have made the mistake of trying to calculate everything in one big sequential tree.
Since what I've said is obvious to anyone who has written multithreaded code, I'm going to surmise that you aren't a programmer who has written such code. Is my assumption right?
This gave me a headache, and I'm not a programmer, but even I can tell you have no idea what you're talking about.
The idea of parallelization is in fact either to speed up a single task by splitting it across several CPU cores, or to run several tasks at the same time, as you mentioned.
In real life, it doesn't matter how many mothers you have, because it's physically impossible to speed up the process of having a baby - which makes the point make perfect sense.
Programming software has limits and conditions that need to be followed: not all tasks can or should be split, because doing so can actually degrade the performance of other parts of the program, or cause more headaches than the performance gain is worth.
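To make the first point concrete, here is a minimal generic C++ sketch (not from the game - all names are invented) of splitting a single task across several cores:

```cpp
// Minimal sketch: speeding up one task (summing an array) by splitting it
// across threads. All names here are invented for illustration.
#include <numeric>
#include <thread>
#include <vector>

long long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = data.size() / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = (t + 1 == num_threads) ? data.size() : begin + chunk;
        workers.emplace_back([&partial, &data, t, begin, end] {
            // Each thread sums only its own slice and writes only its own
            // result slot, so no locks are needed.
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) w.join();
    // Combine the per-thread results in a final single-threaded step.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```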
World simulation games (like Satisfactory) are one of the best targets for parallel processing within the larger game industry. It requires forethought when designing the information set so that multiple processes can act on it simultaneously while maintaining efficiency. It sounds like past decisions for this game have likely made the introduction of multiple threads a much larger lift than it should be.
Most developers struggle with designing, coding, and testing multi-threaded applications because they simply don't understand the underlying mechanisms that make the technique possible. I can't tell you how often I find an integer or a discrete structure of atomic elements wrapped with a global locking object. The proper way to share atomic elements between threads is to leverage the processor's built-in atomic memory operations, but most developers don't even know they exist.
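For illustration, a minimal C++ sketch of the two approaches: a plain integer behind a global lock versus the processor's atomic operations exposed through std::atomic (names invented):

```cpp
// Sketch: a plain int guarded by a global lock (what the post criticizes)
// versus the processor's atomic operations exposed through std::atomic.
#include <atomic>
#include <mutex>

std::mutex g_lock;          // global locking object around one integer
int g_counter_locked = 0;

void increment_locked() {
    std::lock_guard<std::mutex> guard(g_lock);  // every thread serializes here
    ++g_counter_locked;
}

std::atomic<int> g_counter_atomic{0};

void increment_atomic() {
    // Typically compiles to a single atomic CPU instruction (e.g. LOCK XADD
    // on x86), with no mutex and no kernel involvement.
    g_counter_atomic.fetch_add(1, std::memory_order_relaxed);
}
```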
Take the splitter building today. It has a single internal buffer space for items. If each destination conveyor belt were processed on an independent thread, you would need to coordinate access to that buffer across all three threads. However, if you create separate buffers for 1 input and 3 outputs, you can then migrate the buffer-to-buffer movement of objects into its own discrete task. The conveyor tasks could then operate independently of each other, as they each have a dedicated buffer to access, without a single lock.
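A rough sketch of that buffer layout; the types and names are hypothetical, not taken from Satisfactory's actual code:

```cpp
// Hypothetical layout with 1 input buffer and 3 output buffers; the types
// and names are invented for illustration, not taken from the game.
#include <array>
#include <optional>

struct ItemBuffer {
    std::optional<int> slot;  // one item id, or empty
};

struct Splitter {
    ItemBuffer input;                   // filled by the incoming belt's task
    std::array<ItemBuffer, 3> outputs;  // each drained by its own belt's task

    // Runs as its own discrete task, scheduled after the input task and
    // before the output belt tasks, so nothing else touches these buffers
    // while it executes - no lock required.
    void distribute() {
        for (auto& out : outputs) {
            if (input.slot && !out.slot) {
                out.slot = input.slot;
                input.slot.reset();
                break;  // round-robin bookkeeping omitted for brevity
            }
        }
    }
};
```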
If your information set is designed for parallel processing, you can eliminate most of the possible thread contention from ever being required. It's this initial design effort that is often lacking, which causes knock-on effects that increase the complexity and subsequently the overall fragility of multi-threaded systems.
Most games aren't designed devoid of any starting context - consider Satisfactory's choice of Unreal Engine 4. Having your devs reinvent the wheel from scratch means they are writing generic engine code rather than focusing on the game-specific code.
Simple does not mean easy. Hitting a chisel with a hammer is simple.
Carving Michelangelo's David wasn't easy - it required a great deal of skill.
Developers struggle with writing multithreaded code because parallelized code is extremely hard to deal with, and not because they fail to understand the basics of atomic operations. Atomic operations are simple, but not easy to use correctly.
It is not uncommon for global interpreter locks to exist in interpreted languages like Python and Ruby. It is almost universal that higher-level object-oriented languages create a mutex lock per instance of each object.
Your assumption only works on platforms with total store order, and is not sufficient even then because any CPU which implements speculative execution can change the order of operations. Platforms which implement partial store order need membars, fences, or equivalent operations to ensure consistency even for what the processor considers to be an atomic operation.
https://dev.to/kprotty/understanding-atomics-and-memory-ordering-2mom
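For a concrete taste of the ordering problem described above, here is the standard release/acquire pairing in C++; this is a generic textbook pattern, not game code:

```cpp
// Generic release/acquire example: without the explicit ordering, a weakly
// ordered CPU (or the compiler) could make `ready` visible before `data`.
#include <atomic>

int data = 0;
std::atomic<bool> ready{false};

void producer() {
    data = 42;
    // Release: all writes above become visible before `ready` reads true.
    ready.store(true, std::memory_order_release);
}

void consumer() {
    // Acquire: pairs with the release store, ordering the read of `data`.
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    int value = data;  // guaranteed to observe 42
    (void)value;
}
```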
Using fine-grained locking often hurts performance rather than helping.
Your proposal would change a single lock for the splitter into 4 locks: one for the input buffer and 3 for the buffers to the output belts. Each simulation tick would always need to acquire two locks, namely the input and the chosen output lock. You are most likely going to see better performance using a single lock rather than trying to create fine-grained locks because of the need to acquire two locks every time.
You can improve performance by using thread local storage for data structures when possible, not by refactoring code into fine-grained locks regardless of the usage patterns.
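A minimal sketch of the thread-local suggestion in C++ (illustrative names only):

```cpp
// Sketch of per-thread scratch storage (illustrative names). Each worker
// accumulates into its own buffer, so the hot path takes no locks at all.
#include <vector>

thread_local std::vector<int> tls_scratch;

void process_chunk(const std::vector<int>& items) {
    tls_scratch.clear();
    for (int item : items) {
        tls_scratch.push_back(item * 2);  // stand-in for real per-item work
    }
    // Per-thread results would be merged in a later single-threaded step.
}
```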
Separate miners or constructors are examples of things which can safely run independently, and could be evaluated using parallel algorithms because they do not interact. By contrast, a splitter needs locking because the distribution of incoming items to the output belts is an interaction which requires thread safety.
As I admitted, I'm not a programmer. I tried it back in my late teens/early 20s as a hobby. What a headache...
Out of curiosity and stupidity: why is a constructor considered not to interact, compared to a splitter, when it consumes and produces materials through its inputs and outputs? Is that not considered interacting?
Again, you could create a pair of locks for the constructors' input and output buffers.
But you'd tend to have to obtain both of them whenever the constructor builds a new item, so using multiple fine-grained locks would very likely not help.
The simulation should be separate and independent of the render model that informs the underlying game engine of model updates. The render model reflects changes made during simulation but is not the same thing (hopefully).
Most of these locks are now lazily created by the VM at runtime, the first time they are requested, to avoid having all those locks just lying around in memory unused. Object locks should be the last resort, once all other options have been explored and failed.
Interesting article. The speculative-execution example used two data items: one that is accessed atomically and the other which is accessed raw from multiple threads. The author is trying to illustrate that some sections of code operate on many different data items to perform their work, and all data access must be taken into account individually and as a group when dealing with multiple threads.
It's my fault for not clearly describing the scenario. The tasks to move items from the input buffers to the output buffers of splitter buildings would execute and complete prior to executing the task set for pulling items from the output buffers onto the destination conveyor belts. The task sets as a group are executed in sequence with each other so that multiple tasks accessing those buffers are never concurrently running. There would be 0 locks in this scenario beyond the task scheduling locks.
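As a sketch of that sequencing, assuming hypothetical task sets and a stand-in run_all_and_wait helper (a real version would hand the tasks to worker threads, as in the pool sketch further down):

```cpp
// Sketch of the two-phase sequencing: the second task set starts only after
// the first has fully completed. run_all_and_wait is a stand-in that runs
// tasks sequentially; a real one would distribute them to worker threads.
#include <functional>
#include <vector>

using Task = std::function<void()>;

void run_all_and_wait(const std::vector<Task>& tasks) {
    for (const auto& t : tasks) t();
}

void splitter_tick(const std::vector<Task>& move_inputs_to_outputs,
                   const std::vector<Task>& pull_outputs_onto_belts) {
    // Phase 1: each splitter moves items from its input buffer to its own
    // output buffers. No two tasks share a buffer, so no locks are needed.
    run_all_and_wait(move_inputs_to_outputs);
    // Phase 2: runs only after phase 1 is complete, so the belt tasks read
    // the output buffers without ever racing the phase-1 writers.
    run_all_and_wait(pull_outputs_onto_belts);
}
```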
This is my point exactly. All buildings (including conveyor segments), not just miners, can be calculated independently of each other with some minor modification to how the internal buffers are allocated and correct sequencing of their related task sets. All locks can be removed from the information set itself, the only locks being those used to manage task scheduling.
The Unreal engine is not just a render system, and Satisfactory uses Unreal's UObjects as the base class of the Satisfactory buildings and so forth.
In point of fact, the Python GIL was designed to facilitate reference counting memory management, rather than per-instance locks and refcounts.
It is a lot more common for enterprise software developers to write heavily threaded code in Java or the like (Scala? Rust?) using per-instance object locks than it is for folks to roll their own fine-grained locks in native C or C++.
How do you enforce this ordering of operations?
Kernel programmers might use continuations, but those are heavy-weight compared to mutexes or RWLocks.
If you implement a global lock over the task of processing items moving on belts, then you've prevented multiple threads from doing work in parallel.
Even for such a simple case, it is not easy to determine how to divide the work up onto many threads without using per-object locks.
Basically you still have a primary game loop thread that is responsible for executing the same task sequencing it already does; it just doesn't end up doing all the actual task work as well. For each section of the game loop that iterates over similar objects - such as moving items on conveyor belts forward one space - the work is divided into discrete tasks. These tasks are then allocated to a number of pre-existing worker threads that were created based on the available system cores.
The main loop thread also keeps a number of tasks to execute itself, so it's not just waiting around for the other tasks to complete. When it completes its work, it checks whether the remaining tasks are also complete, or waits for their completion. Once all work associated with that section of the game loop is finished, it moves on to the next section using the same map-reduce style of processing.
Since each task is discrete and no portion of the underlying information set is shared between tasks, they can be completed without locking. The only locking is the worker threads contending with each other to pick up available tasks off the task queue placed there by the main loop thread. The main loop thread ensures that the entire task set is complete before continuing on to the next. This is how you avoid collisions without locks.
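A rough C++ sketch of such a scheduler: a task queue drained by pre-created worker threads, with the main thread helping with the work and then waiting for the phase to finish. Everything here (PhaseRunner and friends) is invented for illustration, not the game's actual code:

```cpp
// Sketch of a phase-based task scheduler: the only lock is around the task
// queue itself, exactly as described above. All names are invented.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class PhaseRunner {
public:
    explicit PhaseRunner(unsigned cores) {
        for (unsigned i = 0; i < cores; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }
    ~PhaseRunner() {
        { std::lock_guard<std::mutex> g(m_); done_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Publish one task set, help drain it, and return once every task ran.
    void run_all_and_wait(std::vector<std::function<void()>> tasks) {
        {
            std::lock_guard<std::mutex> g(m_);
            pending_ = tasks.size();
            for (auto& t : tasks) queue_.push(std::move(t));
        }
        cv_.notify_all();
        drain();  // the main thread works too instead of idling
        std::unique_lock<std::mutex> lk(m_);
        idle_cv_.wait(lk, [this] { return pending_ == 0; });
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
                if (done_) return;
                task = std::move(queue_.front());
                queue_.pop();
            }
            task();        // runs without holding any lock
            finish_one();
        }
    }
    void drain() {
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> g(m_);
                if (queue_.empty()) return;
                task = std::move(queue_.front());
                queue_.pop();
            }
            task();
            finish_one();
        }
    }
    void finish_one() {
        std::lock_guard<std::mutex> g(m_);
        if (--pending_ == 0) idle_cv_.notify_all();
    }

    std::mutex m_;
    std::condition_variable cv_, idle_cv_;
    std::queue<std::function<void()>> queue_;
    std::vector<std::thread> workers_;
    std::size_t pending_ = 0;
    bool done_ = false;
};
```

The tasks themselves never take a lock; the only contention is the brief queue access when a thread picks up its next task, matching the description above.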
The main benefit is that you don't have to completely redesign the entire game. You are doing the exact same thing you used to do on a single thread but allowing the work to proceed in parallel where possible. You typically only need to make minor changes to the information set (like dividing buffers into discrete elements) to be compatible with the approach.