This topic has been locked
8========~D 20 JAN 2023 at 15:33
RTX 4090 is not for Native 4K gaming with the latest game engine
With a huge price tag of around $3,000 (or roughly $2,400 after tax if you're lucky enough to find one at $2,000), the card can only barely touch 4K 60 FPS on the latest game engine, Unreal Engine 5.1. So is it safe to assume all GPUs are designed to run games at 1440p, and that we should all be OK paying $3,000 just to be satisfied at 1080p and 1440p? It was not too long ago that a GTX 1080 Ti could run games at 4K 60. Yes, that was with games from its own generation, but shouldn't a GPU released now be able to run all games at at least 4K 120 FPS on the latest engine, if it's the top-end GPU of the current generation?

Sure, on older-generation engines it does reach those frame rates, but what about future games using the latest engine? Are we expected to pay $3,000 every year for the next GPU?

Here is the benchmark, and keep in mind I am only referring to NATIVE 4K, not DLSS: https://www.youtube.com/watch?v=dr7LpP7Dm4E


Update, January 27, 2023: the newly released game Forspoken runs at 43 FPS at native 4K. It hasn't even been a year since the 4090's release, and this is its performance: https://www.youtube.com/watch?v=U0u9l4Wkh9s

Update, February 7, 2023

The newly released Hogwarts Legacy doesn't even hit 60 FPS: https://www.youtube.com/watch?v=5dKUpcMckBg

Let's make sure to blame every single game developer from every company and pretend the RTX 4090 is actually a 4K card, and that all the studios from all companies are doing everything wrong. It's DEFINITELY NOT NVIDIA pretending and publicly lying about the RTX 4090 being a true "native 4K" GPU.

Let's all keep pretending.
Last edited by 8========~D; 7 FEB 2023 at 15:16
Showing 406-420 of 616 comments
Komarimaru 29 JAN 2023 at 4:25
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
The video shows over double the performance in a few games, around 70% on average, with the lowest being over a 50% increase versus a 3090. The largest generational increase in performance ever.

And seriously, you think when they say Creators they're talking about compute and scientific workloads?! LOLOLOL

Creators... content creators... streaming, videos. The whole AV1 encoder... My god, are you even dumber than your posts?

I'll ask a final time.

How is 1:1 FP16 not gimped?
Better question: explain to me how it is gimped, since you obviously have no idea what it does.
[Jokeeece]Nightmare 29 JAN 2023 at 4:26
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

I'll ask a final time.

How is 1:1 FP16 not gimped?
Better question: explain to me how it is gimped, since you obviously have no idea what it does.

So you're dodging the question then?
Komarimaru 29 JAN 2023 at 4:40
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
Better question: explain to me how it is gimped, since you obviously have no idea what it does.

So you're dodging the question then?
You're the one who claims it is gimped, not me. The raw throughput alone makes up for it: even though it isn't 2:1, the card is powerful enough that it's faster at calculating than the 7900 XTX, which, I may add, sits at 122.8 TFLOPS with its 2:1.

Let that sink in... It's faster, yet has lower theoretical performance... Hmm, why is that, though? Could it be because most workloads run mixed precision, since FP16 is lower accuracy and limited by being half precision, with a maximum representable value of 65504? And while super fast, mixed is generally used more. Why, though?

Well, let's see... Maybe because FP16 suffers from poor weight updates, gradient underflow, and, god forbid, activation underflow/loss. How do we fix it? By using FP32 for the sensitive calculations to prevent over/underflow: FP32 handles the scaling factor so the weight updates stay precise, then the result is cast back to FP16.

So, now that you've been educated, tell me how it's gimped. Even "gimped", as you claim, it's outperforming the Titan V, RTX Titan, and 3090 Ti in those areas, barely losing to the H100 due to memory limitations.

I really don't see how this is gimped, but by all means, show me how its FP16 performance gimps gaming, since it's sold as a gaming and content creator GPU.

And again...
https://youtu.be/K8_QPx-IN-o
It's a beast at what they advertise it to do.
Last edited by Komarimaru; 29 JAN 2023 at 4:41
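
A minimal sketch of the mixed-precision recipe described above, assuming PyTorch and a CUDA GPU: FP16 for the fast math, an FP32 optimizer step plus a dynamic loss-scaling factor to guard against gradient underflow. The model, sizes, and data here are hypothetical, purely for illustration.

    import torch
    import torch.nn as nn

    # FP16's largest representable value is 65504; anything bigger overflows to inf.
    print(torch.tensor(65504.0, dtype=torch.float16))  # tensor(65504., dtype=torch.float16)
    print(torch.tensor(70000.0, dtype=torch.float16))  # tensor(inf, dtype=torch.float16)

    model = nn.Linear(1024, 1024).cuda()          # weights stay in FP32 (the precise copy)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    scaler = torch.cuda.amp.GradScaler()          # dynamic loss scaling against gradient underflow

    x = torch.randn(64, 1024, device="cuda")
    target = torch.randn(64, 1024, device="cuda")

    for _ in range(10):
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():           # forward pass runs matmuls in FP16
            loss = nn.functional.mse_loss(model(x), target)
        scaler.scale(loss).backward()             # scale the loss so FP16 gradients don't underflow
        scaler.step(optimizer)                    # unscale gradients, then FP32 weight update
        scaler.update()                           # adjust the scale factor for the next step
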
[Jokeeece]Nightmare 29 JAN 2023 at 5:04
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

So you're dodging the question then?
You're the one who claims it is gimped, not me. The raw throughput alone makes up for it: even though it isn't 2:1, the card is powerful enough that it's faster at calculating than the 7900 XTX, which, I may add, sits at 122.8 TFLOPS with its 2:1.

Let that sink in... It's faster, yet has lower theoretical performance... Hmm, why is that, though? Could it be because most workloads run mixed precision, since FP16 is lower accuracy and limited by being half precision, with a maximum representable value of 65504? And while super fast, mixed is generally used more. Why, though?

Well, let's see... Maybe because FP16 suffers from poor weight updates, gradient underflow, and, god forbid, activation underflow/loss. How do we fix it? By using FP32 for the sensitive calculations to prevent over/underflow: FP32 handles the scaling factor so the weight updates stay precise, then the result is cast back to FP16.

So, now that you've been educated, tell me how it's gimped. Even "gimped", as you claim, it's outperforming the Titan V, RTX Titan, and 3090 Ti in those areas, barely losing to the H100 due to memory limitations.

I really don't see how this is gimped, but by all means, show me how its FP16 performance gimps gaming.

By "gimped", I mean it's a driver limitation forcing a 1:1 ratio, so they literally limit it.

So for AI workloads the FP16 is actually severely limiting, and you're not as much better off using FP32 as you'd think, because with FP16 gimped, it also limits the FP32 accumulation in Tensor Core (TC) operations.

FP16 isn't really the greatest for mixed precision training, but for raw number crunching it's actually okay. But for image processing and neural networks, FP16 is great.
Last edited by [Jokeeece]Nightmare; 29 JAN 2023 at 5:05
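
The FP16-versus-FP32 ratio being argued about here can be checked empirically. A rough sketch, assuming PyTorch and a CUDA GPU: time large matrix multiplications in both precisions and compare achieved TFLOPS. The matrix size and iteration count are arbitrary, and the measured ratio will vary with GPU, drivers, and library versions.

    import time
    import torch

    def matmul_tflops(dtype, n=8192, iters=20):
        # Measure achieved TFLOPS for repeated n x n matmuls at the given precision.
        a = torch.randn(n, n, device="cuda", dtype=dtype)
        b = torch.randn(n, n, device="cuda", dtype=dtype)
        for _ in range(3):                              # warm-up
            a @ b
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            a @ b
        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
        return (2 * n**3 * iters) / elapsed / 1e12      # ~2*n^3 FLOPs per matmul

    fp32 = matmul_tflops(torch.float32)
    fp16 = matmul_tflops(torch.float16)
    print(f"FP32: {fp32:.1f} TFLOPS   FP16: {fp16:.1f} TFLOPS   ratio: {fp16 / fp32:.2f}x")
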
Komarimaru 29 JAN 2023 at 5:07
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
You're the one who claims it is gimped, not me. The raw throughput alone makes up for it: even though it isn't 2:1, the card is powerful enough that it's faster at calculating than the 7900 XTX, which, I may add, sits at 122.8 TFLOPS with its 2:1.

Let that sink in... It's faster, yet has lower theoretical performance... Hmm, why is that, though? Could it be because most workloads run mixed precision, since FP16 is lower accuracy and limited by being half precision, with a maximum representable value of 65504? And while super fast, mixed is generally used more. Why, though?

Well, let's see... Maybe because FP16 suffers from poor weight updates, gradient underflow, and, god forbid, activation underflow/loss. How do we fix it? By using FP32 for the sensitive calculations to prevent over/underflow: FP32 handles the scaling factor so the weight updates stay precise, then the result is cast back to FP16.

So, now that you've been educated, tell me how it's gimped. Even "gimped", as you claim, it's outperforming the Titan V, RTX Titan, and 3090 Ti in those areas, barely losing to the H100 due to memory limitations.

I really don't see how this is gimped, but by all means, show me how its FP16 performance gimps gaming.

By "gimped", I mean it's a driver limitation forcing a 1:1 ratio, so they literally limit it.

So for AI workloads the FP16 is actually severely limiting, and you're not as much better off using FP32 as you'd think, because with FP16 gimped, it also limits the FP32 accumulation in Tensor Core (TC) operations.

FP16 isn't really the greatest for mixed precision training, but for raw number crunching it's actually okay. But for image processing and neural networks, FP16 is great.
And yet you claim it's gimped but can't prove why. If it's so gimped, why does it outdo the 7900 XTX, with its "ungimped" calculation performance, even when not running mixed precision?
[Jokeeece]Nightmare 29 JAN 2023 at 5:08
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

By "gimped", I mean it's a driver limitation forcing a 1:1 ratio, so they literally limit it.

So for AI workloads the FP16 is actually severely limiting, and you're not as much better off using FP32 as you'd think, because with FP16 gimped, it also limits the FP32 accumulation in Tensor Core (TC) operations.

FP16 isn't really the greatest for mixed precision training, but for raw number crunching it's actually okay. But for image processing and neural networks, FP16 is great.
And yet you claim it's gimped but can't prove why. If it's so gimped, why does it outdo the 7900 XTX, with its "ungimped" calculation performance, even when not running mixed precision?

I literally said how it's gimped: they force a 1:1 ratio in the drivers.
Komarimaru 29 JAN 2023 at 5:10
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
And yet you claim it's gimped but can't prove why. If it's so gimped, why does it outdo the 7900 XTX, with its "ungimped" calculation performance, even when not running mixed precision?

I literally said how it's gimped: they force a 1:1 ratio in the drivers.
https://lambdalabs.com/blog/nvidia-rtx-4090-vs-rtx-3090-deep-learning-benchmark
And yet...
[Jokeeece]Nightmare 29 JAN 2023 at 5:12
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

I literally said how it's gimped: they force a 1:1 ratio in the drivers.
https://lambdalabs.com/blog/nvidia-rtx-4090-vs-rtx-3090-deep-learning-benchmark
And yet...

Wow, shocker: FP and TF are not the same thing.

This just in, the sky is blue.
Last edited by [Jokeeece]Nightmare; 29 JAN 2023 at 5:13
Komarimaru 29 JAN 2023 at 5:13
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
https://lambdalabs.com/blog/nvidia-rtx-4090-vs-rtx-3090-deep-learning-benchmark
And yet...

Wow, shocker: FP and TF are not the same thing.
https://github.com/lambdal/deeplearning-benchmark/blob/22.09-py3/pytorch/pytorch-train-throughput-fp16.csv
https://github.com/lambdal/deeplearning-benchmark/blob/22.09-py3/pytorch/pytorch-train-throughput-fp32.csv

Wanna keep going?

You're showing you don't know things again. And the card was never advertised for deep learning, which just shows how dumb you are.

And well, look at that... it's still the most affordable for how well it performs, even at something it's not meant to do. SHOCKER!
Last edited by Komarimaru; 29 JAN 2023 at 5:16
[Jokeeece]Nightmare 29 JAN 2023 at 5:19
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

Wow, shocker: FP and TF are not the same thing.
https://github.com/lambdal/deeplearning-benchmark/blob/22.09-py3/pytorch/pytorch-train-throughput-fp16.csv
https://github.com/lambdal/deeplearning-benchmark/blob/22.09-py3/pytorch/pytorch-train-throughput-fp32.csv

Wanna keep going?

You're showing you don't know things again.

Wanna keep going? Notice how on Tacotron and WaveGlow they're practically identical?

That's the TC operation limit that the driver is forcing.

But hey, if you have core sparsity it does 4:2 on regular tasks. Go ♥♥♥♥♥♥♥ figure.
Komarimaru 29 JAN 2023 at 5:23
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
https://github.com/lambdal/deeplearning-benchmark/blob/22.09-py3/pytorch/pytorch-train-throughput-fp16.csv
https://github.com/lambdal/deeplearning-benchmark/blob/22.09-py3/pytorch/pytorch-train-throughput-fp32.csv

Wanna keep going?

You're showing you don't know things again.

Wanna keep going? Notice how on Tacotron and WaveGlow they're practically identical?

That's the TC operation limit that the driver is forcing.

But hey, if you have core sparsity it does 4:2 on regular tasks. Go ♥♥♥♥♥♥♥ figure.
OK, and? It's nearly identical between FP16 and FP32 for all the cards shown there... or did you not notice that?
[Jokeeece]Nightmare 29 JAN 2023 at 5:28
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

Wanna keep going? Notice how on Tacotron and WaveGlow they're practically identical?

That's the TC operation limit that the driver is forcing.

But hey, if you have core sparsity it does 4:2 on regular tasks. Go ♥♥♥♥♥♥♥ figure.
OK, and? It's nearly identical between FP16 and FP32 for all the cards shown there... or did you not notice that?

That's literally been my whole point this entire time. Even on the Hopper GPUs, Nvidia decided 1:1 FP16 on non-sparse matrices was a good idea.

Guess how often you have a sparse matrix in real-world cases. If you guessed virtually never, you'd be correct.
Last edited by [Jokeeece]Nightmare; 29 JAN 2023 at 5:29
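
On the sparsity point: the doubled throughput quoted for sparse workloads relies on Nvidia's 2:4 structured sparsity (at least two zeros in every group of four consecutive weights), which ordinary dense weights essentially never satisfy without explicit pruning. A toy check, assuming PyTorch and a hypothetical dense weight matrix:

    import torch

    # A typical dense weight matrix; real-world weights are almost never exactly zero.
    w = torch.randn(4096, 4096)

    groups = w.reshape(-1, 4)                         # groups of 4 consecutive values
    zeros_per_group = (groups == 0).sum(dim=1)
    eligible = (zeros_per_group >= 2).float().mean()  # groups already satisfying 2:4 sparsity

    print(f"exact zeros overall: {(w == 0).float().mean().item():.4%}")
    print(f"groups meeting 2:4 sparsity as-is: {eligible.item():.4%}")  # effectively 0%
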
[Jokeeece]Nightmare 29 JAN 2023 at 5:31
I need to mention that on the DGX solutions that Nvidia "offers", this isn't the case.
Komarimaru 29 JAN 2023 at 5:37
Originally posted by JokeeeceNightmare:
Originally posted by Komarimaru:
OK, and? It's nearly identical between FP16 and FP32 for all the cards shown there... or did you not notice that?

That's literally been my whole point this entire time. Even on the Hopper GPUs, Nvidia decided 1:1 FP16 on non-sparse matrices was a good idea.

Guess how often you have a sparse matrix in real-world cases. If you guessed virtually never, you'd be correct.
And yet, it's the same result for a Quadro RTX 8000, which has the 2:1. You really don't know what you're talking about.

I think we can all see now that the 4090 is a fine card for what it offers and is designed for.

People in this thread are just broke as a joke, as it were, and can't afford high-end hardware as it's released. Instead they try to claim said hardware is crap and performs poorly, even when consistently proven wrong.

May I recommend that those people buy used, or wait for the next generation to release and then upgrade once the current gen is no longer so current; that way you'll save money.

Time to let the thread die and the two crybabies wallow in their own misery.
[Jokeeece]Nightmare 29 JAN 2023 at 5:47
Originally posted by Komarimaru:
Originally posted by JokeeeceNightmare:

That's literally been my whole point this entire time. Even on the Hopper GPUs, Nvidia decided 1:1 FP16 on non-sparse matrices was a good idea.

Guess how often you have a sparse matrix in real-world cases. If you guessed virtually never, you'd be correct.
And yet, it's the same result for a Quadro RTX 8000, which has the 2:1. You really don't know what you're talking about.

I think we can all see now that the 4090 is a fine card for what it offers and is designed for.

People in this thread are just broke as a joke, as it were, and can't afford high-end hardware as it's released. Instead they try to claim said hardware is crap and performs poorly, even when consistently proven wrong.

May I recommend that those people buy used, or wait for the next generation to release and then upgrade once the current gen is no longer so current; that way you'll save money.

Time to let the thread die and the two crybabies wallow in their own misery.

The true 2:1 explains why it more than doubles on a lot of the tests.

And then Tacotron and WaveGlow were designed to run on the V100 DGX, which conveniently wasn't on that list. (No, I'm not actually being sarcastic: why wasn't it on the list?)

And then we finally agree on something. I'd call the 4090 a "fine" card at best, but not a GREAT card.
Last edited by [Jokeeece]Nightmare; 29 JAN 2023 at 5:48