I would not find it hard to believe AMD's latest stuff is too buggy to be worth dealing with, which is one of the main reasons I went with Nvidia (even though I actually feel all their GPUs across the board have high rates of problematic behavior). But I haven't really said that, and don't want to say it, because I remember you giving them the benefit of the doubt and saying that those stories were just anecdotal, and I still don't want to risk increasing tension because I'm not trying to berate them. Just agreeing with their history.
But I will say this in defense of AMD. I do feel standardization of hardware is a thing, and that only so many hardware configurations can really be focused on and optimized. That being said, I wouldn't blame them if they didn't catch a system with four RAM slots occupied, as many drives as you have, that screen resolution, and your GPU plus the 5800X3D in the mix, and fix something about it that caused a 7800 XT to bug out.
I don't think your issue is typical of a bug caused by things as simple as an odd screen res, or just having more than the typical number of RAM slots occupied, or anything simple like that, but I do think it is possible.
This is just things I'm thinking right now.
I also mentioned this is a relatively young GPU from a less popular brand, and those results all turned up from a short time span on one web site alone.
Contrast that to the RTX 4070 Ti which has been out longer and likely has more users (could be wrong on that second part?). Now also remove the number of incidents where it actually is some system issue, unlike all these reports that basically describe ruling out every variable but the video card and are left without resolution.
How do they compare then?
We don't know.
My point is I'm not trying to make a factual statement about the frequency of issues with the 7800 XT so much as I'm saying "I've personally tried almost everything else, evidence points to the GPU, and now there are also a lot of other deja vu stories". It's just... pretty suggestive of what is likely the cause of my issue, is all I'm saying.
You suggested it yourself, right? How long should I be expected to "continue to let myself to live among the issues" troubleshooting before I just rid myself of what introduced those issues and go with something else?
I indeed gave them the benefit of the doubt because that's how I felt at the time; I feel everything deserves a chance. I haven't had any issues with ATI or AMD GPUs in the past, but it's been a while since I used one in my primary PC. And I think even if drivers are involved here, the 7800 XT might be a bit more problematic than usual, even for AMD's reputation. But that certainly could be coming from a place of being someone dealing with it. Maybe it gets resolved in time. Maybe the GPU or drivers are not even the issue and it's something else for me (I'm really doubting this more and more though).
Either way, I'm certainly suffering an issue that really seems to be the GPU, and it's pretty eye opening to see there are others with stories that almost mirror mine.
Those don't seem to be a common variable. Others with fewer DIMMs, less storage, and typical aspect ratios are also having the issue. The 7800 XT is the constant. Everything else varies. (Edit: Actually I just noticed there could be one other constant as this seems to most commonly be happening on Ryzen platforms too... that would be really strange if an AMD GPU is free of issues on an Intel platform, or an nVidia GPU is free of issues on an AMD platform, but AMD paired with itself isn't?)
Of course I can't speak from a place of fact; I don't have the numbers of these GPUs sold versus people having issues so I can't say. But I see no reason those things in particular (the DIMM count, number of storage drives, or screen resolution) have any bearing here.
Particularly, did you find even one story saying "I RMA'd the card, got it fixed/swapped, now everything is dandy"? Or even one from someone who actually swapped to nvidia, rather than only writing about wishing to?
The engineering way is to look at working solutions, not the problems -- reports are easy to submit, especially on the internet and only create a crowd, not progress.
Also, media amplifies like crazy. I recall some fancy card with a vapor chamber issue that was supposed to kill the provider for good. It was all around, with videos and everything. A few weeks later the world was still around -- also some real numbers rolled in on the frequency, which was in the lottery-win range. Some random batch was not properly filled. Those got replaced and no one has heard about it since.
Some bad hw happens no matter the brand. Driver issues are common, but the common driver-related manifestations usually get fixed in a few months' time. Maybe gaining new ones.
Back in time (~2010) I was strictly "nvidia only" for the driver reasons -- not so much the drivers themselves, but it was crystal clear that game studios tested exclusively on nvidia cards and everything else was up to luck. But that very soon changed: nvidia went full ♥♥♥♥♥♥♥ mode around when the 10xx series came out. And the field got pretty even. The legends persist in people's heads, but I don't see any reality behind them. I definitely would have stayed with nvidia if I thought there was any remaining edge.
The RX 7800 drivers are expected to improve, being so fresh -- if the black screen is pure software, which is in doubt, it may go away. But you can't trigger a machine check exception from a GPU driver in any way. Some rowhammer-like attack might cause some bit errors and wild crashing, but not a cache hierarchy error, and not consistently. Well, certainly not counting the indirect effect from just using the engines in the GPU for work and so consuming power.
And bad interactions just happen. I doubt samsung makes worse ram than hx, yet early ryzens had lots of issues with the former (while the intel didn't care). And some edge still lingers. When we push everything to the edge it is probably expected -- and falling back is not necessarily a solution.
I'm not new to dealing with PCs or having to troubleshoot them at times, and I'm certainly not new to dealing with lesser issues. My keyboard has a broken "Tab" key, so that key and the "|" were switched since I never use the latter. The LEDs for the WASD and arrow keys started intermittently going out, and now the key that controls the LED brightness has also stopped working, so when they do rarely come on, they are blindingly bright, as they are stuck at the highest brightness level where I use either the dimmest or second dimmest for the rest. My speakers sound like they are raising the dead if I adjust the volume while they are on. They are also getting harder to turn on and keep on (the first time turning them on might need a few attempts, or they go off by themselves). My display has its quirks too. On and on. This is sort of why it upset me when you accused me of creating my own issues by not dealing with this. I "settle" probably more than many people here do. And this is certainly a show stopping issue, not a minor one I can just deal with (certainly not longer term at least). So I definitely "did my dues" to try and work through this first.
I just hope that doesn't come back to make things worse for me in the end, because if I had returned it while there was a chance, then if this actually is an issue with the graphics cards or drivers, then I wouldn't be at the mercy of hoping they improve just to get a properly functioning card. And reading people who went into the RX 6000 series say they stuck with it for two years and are still suffering is... not reassuring on that front.
But hindsight is 20-20, as they say. You won't know unless you try, and sometimes you win the bad experience lottery, even with otherwise good parts.
In any case, by referencing these other issues, I'm not so much as trying to proclaim "there's an objective issue with these and this proves it" so much as I'm saying "I have an issue, the behavior and the things I've ruled out on my own already heavily suggest one particular thing, and these other happenings sure seem to match my issue well and support the same". I hope you can understand the difference.
It's still early so I haven't seen anyone mention a successful RMA and it being a solution to this problem yet, no.
But then again, that wouldn't solve it if the issue was the drivers anyway. That would solve it only if that specific GPU sample had a hardware fault. So that not being reported as a fix doesn't absolve the drivers as being a possible cause.
One person apparently got turned away by Sapphire and directed to take their issues to the AMD forums because their issue sounds like driver issues to Sapphire instead of a hardware issue.
I did see others saying they didn't have the issue on a prior GPU (ranging from AMD to nVidia), then had it on the 7800 XT they switched to, and either went back to the prior GPU or just switched/were intending to switch to something else entirely. And yes, some of those said that did indeed stop the issue. This also matches my experience, where my GTX 1060 never showed the issue before the change, and has yet to afterwards (though I still need to confirm the issue hasn't followed me back to the GTX 1060, which is what I'm going to be in the process of trying to rule out now, but my guess is that it won't).
How certain are you of this?
From what I know, machine check exceptions do indeed seem to usually be a hardware issue as opposed to a software one, which is why I was doubtful most of my software troubleshooting would get me anywhere (but I did it anyway to formally rule it out and not have to question it). But while that's usually the norm, I'm finding it's mentioned that drivers can sometimes be a cause of them?
The Watchdog logs are giving me these.
VIDEO_ENGINE_TIMEOUT_DETECTED (141)
VIDEO_TDR_TIMEOUT_DETECTED (117)
VIDEO_MINIPORT_BLACK_SCREEN_LIVEDUMP (1b8)
Those seem to point to the GPU and/or the drivers? Ergo, maybe a fault with the GPU or drivers is why the machine check exception/fatal hardware exception is getting thrown?
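For anyone who wants to look these up: the value in parentheses after each name is the bug check code in hexadecimal, which is how Microsoft's bug check reference lists them (e.g. "Bug Check 0x141"). Below is a minimal Python sketch that turns the three Watchdog entries above into searchable `0x` codes. The helper names `parse_watchdog_entry` and `format_code` are made up for illustration, and the `NAME (code)` input format is only assumed from how the entries were pasted above, not taken from any tool's documented output.

```python
import re

# The three entries exactly as pasted from the Watchdog logs above.
ENTRIES = [
    "VIDEO_ENGINE_TIMEOUT_DETECTED (141)",
    "VIDEO_TDR_TIMEOUT_DETECTED (117)",
    "VIDEO_MINIPORT_BLACK_SCREEN_LIVEDUMP (1b8)",
]

# Assumed layout: an upper-case symbolic name, then a hex code in parentheses.
_ENTRY_RE = re.compile(r"^\s*([A-Z0-9_]+)\s*\(([0-9A-Fa-f]+)\)\s*$")


def parse_watchdog_entry(line):
    """Hypothetical helper: split 'NAME (hex)' into (name, integer code)."""
    match = _ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not in 'NAME (hex)' form: {line!r}")
    name, hex_code = match.groups()
    return name, int(hex_code, 16)


def format_code(code):
    """Render the code the way the bug check reference titles it."""
    return f"0x{code:X}"


if __name__ == "__main__":
    for entry in ENTRIES:
        name, code = parse_watchdog_entry(entry)
        print(f"{format_code(code)}  {name}")
```

This only normalizes the codes for searching. It says nothing about whether the card, the driver, or something else in the system is at fault.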
But since I was asked about it elsewhere and didn't want to post about my issues in someone else's thread, here's an inconclusive (perhaps key phrase) update.
After some more troubleshooting (and making another thread on another forum), I did end up at the eventual step of sending the RX 7800 XT out for RMA. I got it back a few days ago. The return from RMA was labeled a "replacement" and has a new serial number so I presume that means Sapphire found something wrong with the one I sent in, but they didn't state that nor what it might have been.
I haven't yet had enough time with it to conclusively see if it resolved the Black screen to restart issue (less than a week, and it's a busier holiday time), and I feel like I'd need up to a solid month or two, maybe even three, to fully be assured of that particular behavior being gone anyway, but I've seen some other concerning things in the little time I have had to use it.
1. I'm now having some TDR/driver crashes where I wasn't having any before. Thus far these are limited to one game, and it's one that a lot of its players back in the day said had crashing issues on AMD GPU hardware/drivers. That was years and years ago, but it's not a good sign.
2. I'm also experiencing reduced performance, at least in Minecraft. It's basically lower performance across the board and has a lot more stuttering and hitching. I noticed utilization is also strangely staying around ~80% (plus or minus some) almost all of the time, regardless of the scene (this seems somewhat odd...), and it wasn't like that before. It seldom drops below this, and if it does, it's not often nor by much.
I could greatly elaborate on the details with Minecraft but I won't. I don't need to. The key thing is it's just basically much worse performance now, with an oddly "nearly locked" utilization level. It's either the newest 23.12.1 drivers with Minecraft (I never tried these before as they released after I sent it out for RMA), or there's something different wrong with this individual RX 7800 XT, because those were the only two changes relative to before.
It's at the point where even if the Black screen to restart issues are gone (which is still not verified), I'm concerned and upset between the TDR/driver crash issues appearing and the massively lower performance in a major game that I play. It makes me want to just consider taking a massive loss on it by selling it second hand, and taking a further loss by going with whatever highway robbery nVidia ends up charging for their cheapest upcoming RTX 40 series Super variant that has 16 GB VRAM. I was originally considering buying a new PSU before giving up on the graphics card, but that was if the replacement 7800 XT behaved exactly as the first. With these changes in behavior with the second 7800 XT, I'm lost.
If there's one word to describe this experience, it's "exhausting". That's it. It's just tiring. I'm four months in, I'm ~$700 in, and there's even been some "lost" data, and I'm still in a position where something isn't measuring up or working properly and I'm looking at more wasted money and time ahead? In my nearly twenty or so years at this, I've never had as bad of an experience as this one. Never. I'm not sure if I'm one of the ones Radeon just doesn't work for for whatever reason, or if there's something else really deep going on with my PC somewhere that only showed up with the graphics card change. Either way it's gotten frustrating long ago.
No more compromises.
But thank you for the acknowledgement I've put enough time into this! I didn't want to return it right away because I felt like that might be overreacting to what could be a smaller issue, but... nobody can say I didn't put the proper time and effort into it after all this.
Unfortunately, I might have to sell it second hand on a loss and go back to nVidia. And oh joy, those power cables on the RTX 40 series! Out of the frying pan and into the fire...
Someone please put me out of my misery now. And some people have the gall to say the graphics card market has never been better for consumers...
No offense taken, so don't worry. If anything, I've been questioning my ability to diagnose this the further it goes on because like I said, I've never dealt with anything like this before.
I don't think that's worthwhile at this point though. Partly because it's going to come down to "swap parts until it gets working right" either way, and the rest of the stuff is known to work when the 7800 XT/AMD's drivers are removed from the equation. I've already put a lot of money into this, like ~$700 counting the graphics card, tax, and shipping for the RMA. Then there's the possibility I'm looking at the better part of a grand (!) if I have to go back to nVidia to rectify this. Alternatively, I could try buying another PSU if I wanted to jump to something other than the graphics card, but if that fails, I'm then that much more in the hole. Either way, jumping to either of those makes more sense than paying someone the better half of what a PSU would cost to try what I could try myself.
The strange utilization (being "stuck near 80%") in Minecraft is gone, hopefully for good. I noticed yesterday that this "stuck utilization" appeared to be what Afterburner was showing, but Minecraft itself showed a more fluctuating utilization. So maybe it was just some driver or Afterburner thing. Note that Minecraft's utilization itself is known to be wrong more often than not.
What I did was reinstall the drivers. I wanted to try to go back to 23.11.1, but the installer I downloaded was for 23.12.1 I guess. Oh well, I went with the latest again but it worked.
The whole reason I reinstalled them now? I saw this on startup...
https://i.imgur.com/3hlMxqC.png
https://www.amd.com/en/support/kb/faq/gpu-sacerkl
I have no idea what that means.
During driver install, my keyboard/mouse completely cut out (keyboard LEDs went off and my mouse, which is connected to the keyboard passthrough, also stopped working) and the display lost signal for about half a minute, then it all came back. That never happened before, but hopefully there's nothing more to it.
Upon restart, the wallpaper was missing and the right click context menu took some seconds to show up instead of being instant, and this "fixed itself" when I brought the personalization menu up which made the wallpaper show up.
So... drivers reinstalled, had some weird behavior, utilization is now fixed.
Still don't know if Minecraft's performance is back to normal or not (Edit: Performance still seems lower based on an informal "recording performance test"), and the previous TDR issues need to be checked. Still need to see if the Black screen to restart issues are gone as well. Will update as I know more, and happy holidays everyone. Hope they're going well.
Do you always have afterburner running? Like all the time? Is it on your startup and it was present through all these issues or it was for the most part closed and you only run it rarely?
It means that something updated your drivers and reverted them to an older version of the driver, or to a version that wasn't compatible with Adrenalin (basically a driver installed without your permission that is not the one you installed). Some software like motherboard software, Windows Update, benchmarking software, or other tools might update your driver automatically.
Make sure you disable driver downloads from Windows Update. That is important on AMD because if you don't, Windows will sometimes replace the driver with the one in the Windows Update branch -- that one can be a very old basic driver, or a testing / debug driver if you are on a Windows Insider build.
It's a very annoying bug and issue to pin down sometimes because even motherboard software or software like HP OMEN GAMING HUB (even if you have it for your keyboard) might modify your GPU drivers without telling you.
For Windows:
https://www.makeuseof.com/windows-stop-automatic-driver-updates/
In some cases the above may not work. If so, you can post here
https://answers.microsoft.com/en-us
and have the problem looked into further.
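As a rough illustration of the Windows Update part: as far as I know, the Group Policy setting "Do not include drivers with Windows Updates" is stored as the registry value `ExcludeWUDriversInQualityUpdate` under the Windows Update policy key. The Python sketch below only prints the text of a `.reg` file for review and does not touch the registry itself. Treat the key and value name as assumptions to verify against Microsoft's current documentation before importing anything; the function name `build_reg_text` is made up for this example.

```python
# Assumed policy location and value name behind the Group Policy setting
# "Do not include drivers with Windows Updates". Verify before use.
POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate"
VALUE_NAME = "ExcludeWUDriversInQualityUpdate"


def build_reg_text(exclude_drivers=True):
    """Hypothetical helper: return .reg file text that sets the policy value.

    exclude_drivers=True writes 1 (ask Windows Update to skip drivers),
    exclude_drivers=False writes 0.
    """
    value = 1 if exclude_drivers else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{POLICY_KEY}]",
        f'"{VALUE_NAME}"=dword:{value:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the text so it can be read before saving it as a .reg file.
    print(build_reg_text())
```

Importing the resulting file needs administrator rights, and calling `build_reg_text(False)` produces the text for setting the value back to 0. This is a sketch rather than a guaranteed fix, since the behavior may differ between Windows editions and builds.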
But what is strange is this:
Never heard of the wallpaper going missing or seen that happen as the driver should never touch it, at least I have never seen any report of it. There may be other software that is interfering with the driver download that resulted in that behavior (modified start with basic driver and no personalization services -- then load with GPU driver + UI services on trigger).
But if it only happened once, perhaps it is nothing to worry about.
Also, the stuck at ~80% utilization returned right after I posted that. I notice it only occurs in full screen mode and not windowed.
I guess I'm not too concerned with it. It's the TDR issues, the maybe-gone-maybe-still-there Black screen of death issue, and the seemingly reduced performance issues with Minecraft (23.12.1 lowered it maybe?) that I'm more concerned with.
Thanks for the information. If it returns, I'll look further into it. Everything seemed to work fine, and I initially installed the drivers with a DDU run beforehand (and I had it set to disable Windows installing the drivers). Adrenalin itself opened and worked fine and didn't complain about a version mismatch.
Yeah, it was strange to me too as I never saw it, but if it was a one off then oh well. I was mostly just noting it.
When the keyboard first went out I actually thought it was the Black screen restarting, because my display did lose signal and start restarting on repeat. This has been another issue I have had with it. The wallpaper was fine until I restarted, and then it was Black (the "background color" I have chosen) with a delayed context menu response, and then when I chose to go into "personalize" it went as it should.
I've restarted the PC twice since then and it hasn't done it.
Small stuff like this is normally a non-issue for me if it happens once so it's not a big deal, but with everything else going on it has me second guessing everything. Like my thoughts race on "what could cause this" when stuff happens now. My thought was the graphics drivers initializing, which happens through the PCI Express stuff on the CPU, caused a CPU-side instability and it brought the USB down for a moment? Like I said I am basically second guessing my whole PC ever since I tried upgrading my freaking graphics card. Because the issue came with the GPU but changing platform side stuff (basically XMP on versus off) impacted it. It was unstable either way but worse with XMP off. With the old graphics card it's stable with XMP on or off.
Seriously makes me want to cry, throw the whole PC out, and start over if these issues remain. This stuff is getting old.
*sigh* Sorry for ranting, just going to take it a day at a time now. If the Black screen reboot issues don't return, that's a big improvement right there and signifies the first 7800 XT just had something wrong. Then I can go from there.
Time will tell if I end up encountering the Black screen to restart issues, and I also need to figure out what the TDR crashes in the other game were (also 23.12.1?), but Minecraft is a big thing I do so that alone was a pleasant improvement. Hopefully whatever changed things in 23.12.1 for me doesn't continue to be the case in future drivers because I hate being locked to an older one (but I ended up stuck with older drivers for many years with nVidia after I got my GTX 1060, so... it's whatever?).
If the Black screen to restart issues return, I will go back to 23.12.1 and see if that resolves them (and if not it's new PSU time). If it does resolve them, then I'll finally have a conclusion to this whole ordeal. So, time to let time do its thing and see if the issue is still there or if it's gone.