But that's the thing: you can't balance it out if you don't know the actual task load you will have to handle. In this game, every step out in the world gives you a different environment, including the load the eco-simulation puts on the CPU. If we had a fixed, looping eco-simulation with always the same number of assets and objects to handle, they would have quite a bit of optimisation headroom to work with, but that "stable ground" is not a given, which is why everything has to be dynamic and built around the "worst case". Timings are a factor too, but also nothing you can control if you don't know what the total process load will be.
Global load tells you nothing. You can be at 10% global load and still be CPU bottlenecked, because the one core/thread the game/app runs on is already hitting 100% of its capacity. It does not matter if the other nine cores are free to do something; if that one core/thread is fully loaded, there is nothing to "fix" that CPU bottleneck.
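To make that concrete, here is a minimal sketch (assuming Linux and its /proc/stat interface, nothing game-specific) that samples per-core load next to the overall average, which is exactly the difference between the "global load" figure and what the bottlenecked core is actually doing:

```cpp
// Minimal sketch (Linux only): per-core load vs. the global average.
// Samples /proc/stat twice and prints utilisation per core, showing how
// one core at ~100% can disappear inside a low "global load" number.
#include <chrono>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

struct CpuTimes { unsigned long long busy = 0, total = 0; };

static std::vector<CpuTimes> sample() {
    std::vector<CpuTimes> out;
    std::ifstream stat("/proc/stat");
    std::string line;
    while (std::getline(stat, line)) {
        if (line.rfind("cpu", 0) != 0) break;          // only the cpu* lines
        std::istringstream ss(line);
        std::string label;
        unsigned long long v, idle = 0, total = 0;
        ss >> label;
        for (int field = 0; ss >> v; ++field) {
            total += v;
            if (field == 3 || field == 4) idle += v;   // idle + iowait
        }
        out.push_back({total - idle, total});
    }
    return out;
}

int main() {
    auto a = sample();
    std::this_thread::sleep_for(std::chrono::seconds(1));
    auto b = sample();
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        double busy  = double(b[i].busy  - a[i].busy);
        double total = double(b[i].total - a[i].total);
        double pct   = total > 0 ? 100.0 * busy / total : 0.0;
        std::cout << (i == 0 ? std::string("all cores")
                             : "core " + std::to_string(i - 1))
                  << ": " << pct << "%\n";
    }
}
```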
Have a look at this: https://www.youtube.com/watch?v=-uricga09EA
There you can see the thread/core-specific load. Pay attention to thread 2 in this benchmark: it sits at 70-78% load the whole time. That is the main thread the game runs on. Why is it not at 100%? Because it waits for data from the other support threads. As you can see in the top post of this thread, by overclocking the CPU in total clock speed and PBO, I managed to lift the bottleneck so that the GPU is always 100% loaded. Result: 20 fps more with no settings change. Now some may conclude that PBO on the "main thread" is the reason for it, but no, it's the support threads finishing their work faster that keep the main thread from throttling itself.
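For anyone who wants to see the shape of that wait pattern, here is a small made-up C++ sketch (not the game's actual code, workloads are invented): a main thread fans work out to support threads, blocks on their results and so never shows 100% load, yet it still sets the frame time.

```cpp
// Sketch of a "main thread" that each frame hands work to support threads
// and then blocks waiting for their results. The main thread never shows
// 100% load in a per-thread monitor, yet it still limits the frame rate.
#include <chrono>
#include <future>
#include <iostream>
#include <thread>
#include <vector>

// Stand-in for one support thread's job (AI, physics, eco-simulation, ...).
static int support_work(int id) {
    std::this_thread::sleep_for(std::chrono::milliseconds(4 + id));  // fake work
    return id;
}

int main() {
    using clock = std::chrono::steady_clock;
    for (int frame = 0; frame < 5; ++frame) {
        auto start = clock::now();

        // Fan the work out to support threads.
        std::vector<std::future<int>> jobs;
        for (int i = 0; i < 4; ++i)
            jobs.push_back(std::async(std::launch::async, support_work, i));

        // The main thread idles here (low measured load) until the slowest
        // support thread is done -- that wait is the real frame-time cost.
        for (auto& j : jobs) j.get();

        // ...then it does its own serial part of the frame.
        std::this_thread::sleep_for(std::chrono::milliseconds(3));

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                      clock::now() - start).count();
        std::cout << "frame " << frame << ": " << ms << " ms\n";
    }
}
```

Speeding up the support threads shortens the wait, which is why the main thread gains fps without ever having been at 100% itself.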
Give it a try: review your own benchmark with RivaTuner and work out your own per-thread load. You will see that the global load figure has been worthless information for a very long time.
But it simply makes no sense for me personally ^^
I could do this right now, but I'd have to go down to 1080p to even find that bottleneck, so it's simply not worth it because that's not a resolution I would ever play at on my setup ^^
So basically everything I said is purely theoretical, just to express the idea that the game obviously isn't handling threads very well, which is clearly a problem on weaker CPUs.
It doesn't really matter if we are talking about main or side threads. At least one of them should be at high load to signify a bottleneck at all.
If we don't see any high load at all, then the thread management, most likely the communication between threads, is simply not efficient.
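A tiny synthetic sketch of that failure mode (purely illustrative, not the game's code): eight threads that funnel everything through one shared lock finish no faster than a single thread would, while no core ever shows high load.

```cpp
// Sketch of the "no core is busy, yet it's still slow" case: threads spend
// most of their time waiting on a shared lock instead of computing, so
// per-core load stays low while wall-clock time balloons.
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    std::mutex shared_state;          // everything funnels through one lock
    long long counter = 0;
    auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> workers;
    for (int t = 0; t < 8; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < 200; ++i) {
                std::lock_guard<std::mutex> lock(shared_state);
                // Holding the lock while doing the "work" serialises all
                // threads: seven of the eight are asleep at any moment.
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();

    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                  std::chrono::steady_clock::now() - start).count();
    std::cout << counter << " units of work in " << ms << " ms\n";
}
```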
Wirth's law is not a way forward. These details are optimizable, and they should be optimized; that's how you move forward. The less you tax modern hardware, the more you can add to it. You don't put square wheels on your car to push your motor's limits.
It's really important to understand that the newest CPUs are not overclock friendly, either. They already have mechanisms in them to clock themselves based on their temperature.
This was meant more as an explanation of why things are the way they are, and a warning not to fall for misleading readings in the metrics. You do you.
I remember from the Metal Gear Solid V development (Fox Engine) that they integrated a dynamic thread handler that adapts to the core/thread count of the system, so systems with weaker CPUs get loaded differently than their many-core counterparts. But that was with legacy hardware support in mind. This engine is clearly designed for a PS5-type 8-core+ system, targeting future hardware rather than legacy.
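For what it's worth, the core of such a scheme is small; here is a rough sketch (my own simplification, not Fox Engine code) of sizing a worker pool from the detected hardware thread count:

```cpp
// Rough sketch of a core-count-aware worker pool: detect how many hardware
// threads exist, keep one for the main/render thread, and split the same
// job list coarser or finer depending on the machine.
#include <algorithm>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    unsigned hw = std::thread::hardware_concurrency();   // 0 if unknown
    if (hw == 0) hw = 4;                                  // conservative fallback
    unsigned workers = std::max(1u, hw - 1);              // leave one core free

    std::cout << "detected " << hw << " hardware threads, using "
              << workers << " workers\n";

    // On a 4-thread CPU each worker gets a big chunk of the job list;
    // on a 16-thread CPU the same list is split much finer.
    const unsigned jobs = 120;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([=] {
            unsigned begin = jobs * w / workers;
            unsigned end   = jobs * (w + 1) / workers;
            // ...process jobs [begin, end) here...
            (void)begin; (void)end;
        });
    }
    for (auto& t : pool) t.join();
}
```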
Yes and no. They can already auto-overclock by themselves with a well-supported chipset and motherboard, but there is also quite a bit of headroom depending on core quality (golden cores) and chiplet stability.
That's not my problem, and even that's not actually the wisest idea long term. They're aiming for AMD, which indeed the PS5 is using. I don't know how long you've been in this, and we've had discussions before, so I know you're a little smarter than the average bear on this board, so I'll give you a bit of a history lesson. AMD has a history of focusing on individual instruction optimizations, like figuring out how to make things like the DIV instruction faster (easily one of the slowest). Intel is a bit more adventurous, being the big fish in the small pond.
Intel noticed that when compilers compile files, functions tend to get thrown into their own segments, which are usually 512 bytes. This is the case even if you have a small function of only 10 bytes in size (which is surprisingly common, which is a whole other discussion). Intel realized that this resulted in a lot of cache misses. This was around the time we moved from 32-bit to 64-bit as the standard. I remember there was a lot of talk at the time of Intel maybe driving AMD out of business, and AMD kind of disappeared. It was still there, but no one took it seriously. Apparently, this "cache is king" rule you cite was made at about that time. For some reason, and I don't know why, it was thrown out. AMD had to have been traumatized by this, because they made a comeback with the Zen generation, and recently they've thrown their hat into 3D V-Cache technology, which effectively makes dramatic improvements in cache size. Intel has no answer to this; in fact, I went to see what they changed, and Intel's website is doing the nVidia thing: focusing on AI technologies.
What this means is that the future is going to be in cache-optimized code if Intel is going to stay competitive. The concept I offered before, high-attention and low-attention modes for everything, would increase cache usage and be a reasonable middle ground. On the flip side, throwing a thread at everything increases cache utilization, which only looks good with a massive L3 cache. But here's the wild bit: PS5s aren't using that. It's brand-new, AMD-exclusive tech. Excessive threading like this is going to bottleneck the RAM long term.
How much headroom are we talking? After hearing they're using temperature to control that, I stopped reading any further into it. That's some bad mojo to risk, but if you've done it, I'll listen.
Yeah, there is a lot of truth in that, but for a very long time AMD's problem was also that it could not get down to a 10nm process where Intel had already been for years, hopelessly playing catch-up with Intel's efficiency. Only when AMD stopped its own production and had TSMC manufacture its new chips at 7nm, and now down to 4nm, did it not only get competitive with Intel but also overtake them for now. The more cycles you can do through architecture and a smaller chiplet/core build, the more processing speed you have.
You see, I am a software developer myself (VB.NET, C++, C#, ASPX, Xamarin, MAUI, and a few less-demanded others; not games, with one exception, but business client desktop/mobile and server software). About 95% of all tasks or operations one needs to process do not require the amount of X3D cache that these specific CPUs offer. You only benefit from it if you have masses of data to process that cannot be split up (massive loaded assets).
What I want to say here: of all the background threads this game has, maybe 5 of them can make use of the L3 cache of the X3D CPUs. The rest depend on processing speed and are bound by it. It is also clearly shown that an X3D CPU, combined with the same hardware as its non-X3D counterpart, only has an fps advantage of 1-3%.
So future apps will also only be "cache optimised" if the task/process itself requires it. Most future development will still be done the "legacy" way, around main-thread and multithreading loads, and this will not change as long as this massive L3 cache does not become a standard.
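That "only big, unsplittable data benefits" point is easy to demonstrate. A sketch like the one below (sizes are arbitrary examples, not tuned to any specific CPU) times random accesses over growing working sets; the cost only jumps once the set no longer fits in the L3, which is the only case where the extra X3D cache buys you anything:

```cpp
// Sketch of when a big L3 actually matters: random accesses over a working
// set that fits in cache stay cheap; once the set outgrows the L3, every
// access starts going to RAM.
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

static volatile std::uint32_t sink;  // keeps the compiler from dropping the loop

static double ns_per_access(std::size_t bytes) {
    const std::size_t n = bytes / sizeof(std::uint32_t);
    std::vector<std::uint32_t> next(n);
    std::iota(next.begin(), next.end(), 0u);
    std::shuffle(next.begin(), next.end(), std::mt19937{42});  // random jump order

    std::uint32_t idx = 0;
    const std::size_t hops = 5'000'000;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < hops; ++i) idx = next[idx];    // dependent loads
    const auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                        std::chrono::steady_clock::now() - start).count();
    sink = idx;
    return double(ns) / double(hops);
}

int main() {
    for (std::size_t mib : {4, 16, 64, 256})
        std::cout << mib << " MiB working set: "
                  << ns_per_access(mib * 1024 * 1024) << " ns per access\n";
}
```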
Well, depending on the "silicon lottery" of the core/chiplet quality you get, it can be anything from 3-25% overclocking headroom. But I must stress that the CPU can no longer stand on its own like in the old days of AM3 or AM4. You now need a proper motherboard and compatible RAM to sustain these frequencies.
Here are some examples of the OC potential, one air-cooled (like mine) and one "shock frosted":
https://www.youtube.com/watch?v=3hNgEgmLJ8Q
https://www.youtube.com/watch?v=pYWtP4tZe30
It's not for everyone, and almost everybody should be happy with what they get out of the box as long as a proper cooling solution is in place, but in cases like these I wanted to show (see the OP) what can be achieved by just 1 or 2 future generations.
I'm a "cowboy coder" having learned C++ back in 2003 or 4, x86 assembly around 2007, and the usual stack of VB.net, C#, etc, but I'm a nobody in a garbage bag factory. And my experience says, indeed you're right, just not for games. In fact, the reviews of the X3D cores actually tell you to get the 7800X3D for better gaming performance than the 7950X3D, because of the need for fewer threads and better cache usage (the chache isn't as easily accessible for some of the cores, i forget why off hand but for gaming performance there is a barely noticeable drop in performance as a result). The reason i didn't take the advice was because I knew i was going to be compiling a lot of code and projects, and i wanted to do more than just games efficiently.
I'm curious how you came to that conclusion.
Which has surprised me as well, and tells me you might be right about what I quoted above, but I'd like to know why. This is far from the norm for gaming.
On the contrary, the V-cache makes cache optimization less of a concern; cache optimization of code will only continue to be a problem as long as V-cache is not standard. But given that I know exactly what is at fault here, and what the solution is (and it increases compile times by a non-trivial amount), you aren't wrong that the cache optimization isn't likely to happen.
If you're curious, the solution in C++ (since that's the language we share that I know best) is to break the usual convention of only including .h files and include the .cpp files as well, instead of having the makefile tell the compiler to produce a bunch of intermediate files. This gives the compiler all the code at once, so it can optimize away functions that would otherwise be exported and given their own pages. And we both know that'll never happen.
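For illustration, a minimal sketch of what that looks like (file names are made up), usually called a unity or jumbo build:

```cpp
// unity.cpp -- sketch of the "include the .cpp files" approach.
// Instead of compiling each translation unit separately and linking the
// intermediate objects, one file pulls in all the sources so the compiler
// sees everything at once and can inline or fold small functions away.
#include "renderer.cpp"
#include "eco_simulation.cpp"
#include "thread_pool.cpp"
#include "main.cpp"

// Built with a single invocation instead of per-file object files, e.g.:
//   g++ -O2 unity.cpp -o game
// Link-time optimisation (g++ -O2 -flto on the normal build) gets part of
// the same effect without restructuring anything, at the cost of link time.
```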
That's fair. Frankly, CPUs are expensive right now, and I sure as hell don't want to risk things like liquid cooling for marginal gains. 25% does sound nice, but if it's a lottery, I'm not melting my equipment over it (all my previous computers had overclocking locked, so it might be far safer than I imagine, but I've not done it, and I'm not doing it with a CPU that costs over half a paycheck and that I'm still making payments on). It is interesting, though; I appreciate the work.
Getting a lower-end GPU should not be something you do intentionally. I don't see the point in grabbing an old GPU. I guess in theory you could get some marginal gains in transfer rates or something, but that high-end GPU is still going to come out on top. It's just that in a CPU-bottleneck situation the GPU doesn't matter as much, so you can afford to get an older one if you're upgrading from something even older.
EDIT: To be clear, though, GPUs and their upgrades are inflated. You won't get much more out of an RX 7800 XT than you would out of an RX 6800 XT, for example (and it's even worse with nVidia). It's one of the ways I kept my build way cheaper. Don't go buying anything older than that, please.
So Windows updates (LOL.. jk) or driver updates COULD POTENTIALLY bring big CPU uplifts, but that's just me waffling.
Core scheduling in both Windows and Linux (which is what I'm using) isn't that bad. What would be nice is the ability to manually assign threads to specific cores; then we could start having "fast cores" and "slow cores" that better manage certain kinds of loads. Right now it feels random (it's not), and that makes it hard for everyone except the game developer.
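You can already do this by hand on Linux (per process with taskset, per thread as below); here is a minimal sketch with arbitrary example core numbers, which is roughly what a "fast core / slow core" split would boil down to:

```cpp
// Minimal Linux-only sketch of pinning threads to specific cores by hand.
// Build with: g++ -pthread pin.cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <sched.h>
#include <iostream>
#include <thread>

// Pin the calling thread to one logical core.
static void pin_current_thread_to(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        std::cerr << "failed to pin thread to core " << core << "\n";
}

int main() {
    std::thread heavy([] {
        pin_current_thread_to(2);   // e.g. a "fast core" for the hot path
        // ...latency-sensitive work would go here...
    });
    std::thread background([] {
        pin_current_thread_to(6);   // e.g. a "slow core" for housekeeping
        // ...background work would go here...
    });
    heavy.join();
    background.join();
    std::cout << "both workers pinned and finished\n";
}
```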
As much as I like to trash on Windows, I doubt people like me on Linux are getting our benefits simply from being on Linux. There's a lot of bloat on Windows, but the pipeline from application to driver is more or less the same.
I have tested my hardware on both Windows and Linux and can say that when a game is GPU limited I usually don't see much of an improvement. However, games that are CPU limited in some capacity almost always perform better on Linux.
This was comparing Windows 11 to CachyOS with the eevdf-lto kernel.
I see. What kind of improvement are we talking about? Have you found the specific source? My GF is building her computer right now because MH Wilds was a wake-up call for her (I know, very bad time, but I said now or a year from now, because it's going to get worse before it gets better, and she is in a hurry).