Rare crash w/ screen going black along with buzzing noise, then reboots

All Discussions > Steam Forums > Hardware and Operating Systems > Topic Details

Hoppled Jan 20, 2024 @ 8:28am

This is probably my 3rd time now having this crash and it worries me since most of my components are only a year old, except the PSU which is about 4 years old.

The crash seems to happen on average once every two weeks. I play Fortnite often and it has happened every single time on that game, to my knowledge.

No weird under or overclocks going on. Only XMP enabled for my RAM. No real noticeable settings or lack of updates that would be an immediate culprit.

My temps are fine. Only thing in the Event Viewer logs I can find says critical error Event 41, Kernel-Power: "The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."

Specs:

CPU: Ryzen 5 5600X

GPU: RX 6600 XT

Mobo: MSI MPG B550 Gaming Plus

RAM: G.Skill Ripjaws 16GBx2 DDR4

PSU: EVGA 600W GD+

OS: Win 10

Showing 1-15 of 24 comments

Illusion of Progress Jan 20, 2024 @ 12:59pm

Event ID 41 isn't very useful for diagnosing an issue on its own, because it means just what it says: Windows restarted and wasn't expecting to. It's a symptom of the issue, not the cause.

See if any other logs were made in Event Viewer around that time (at the crash, or just after the following restart). In particular, I expect you might see Event ID 18. When a PC catches a machine check exception, logging it is one of the things it does, and on AMD platforms it gets logged under Event ID 18 (no idea what Intel logs it under) if it's a CPU core that catches the exception (either a CPU core or RAM can catch one).

Also, do these two directories have files in them that correspond with the time of any of the issues?

Windows/LiveKernelReports/WHEA

Windows/LiveKernelReports/WATCHDOG

If there are .dmp files present, you can use WinDbg to open and analyze them.
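Checking those folders by hand works fine, but the same check can be sketched in a few lines of Python. This is only an illustration: the function name `dumps_near`, the one-hour window, and the example crash time are made up, and the base path assumes a default Windows install on C:.

```python
from datetime import datetime, timedelta
from pathlib import Path


def dumps_near(crash_time, folders, window=timedelta(hours=1)):
    """Return .dmp files whose modification time is within `window` of crash_time."""
    matches = []
    for folder in folders:
        folder = Path(folder)
        if not folder.is_dir():
            continue  # the folder may simply not exist on this machine
        for dump in sorted(folder.glob("*.dmp")):
            modified = datetime.fromtimestamp(dump.stat().st_mtime)
            if abs(modified - crash_time) <= window:
                matches.append(dump)
    return matches


if __name__ == "__main__":
    # Assumed default install location; adjust if Windows lives elsewhere.
    base = Path(r"C:\Windows\LiveKernelReports")
    crash = datetime(2024, 1, 1, 12, 0)  # hypothetical crash time, use your own
    for path in dumps_near(crash, [base / "WHEA", base / "WATCHDOG"]):
        print(path)
```

Reading those folders may need an elevated prompt, and a dump written at the next boot can carry a slightly later timestamp than the crash itself, so a generous window is safer.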

I had a similar issue after upgrading my video card, and I went through a process of ruling out other problems with my PC even though it started right after the video card was changed. In the end, an RMA on the video card has (thus far, anyway) resolved it. That's not to say this is your problem. If a machine check condition is being caught, there's a wide range of things it could be. Usually it's faulting hardware (either innately bad hardware or a power issue); a driver can be the cause, but that's rare.

Hoppled Jan 20, 2024 @ 2:16pm

Thanks for the reply. I actually have a .dmp file in the WHEA folder from last night's crash. Should I post what WinDbg says here?

EDIT: Sharing the debug below and I also ran the analysis.

Last edited by Hoppled; Jan 20, 2024 @ 2:25pm

Hoppled Jan 20, 2024 @ 2:25pm

************* Preparing the environment for Debugger Extensions Gallery repositories **************
ExtensionRepository : Implicit
UseExperimentalFeatureForNugetShare : true
AllowNugetExeUpdate : true
AllowNugetMSCredentialProviderInstall : true
AllowParallelInitializationOfLocalRepositories : true

-- Configuring repositories
----> Repository : LocalInstalled, Enabled: true
----> Repository : UserExtensions, Enabled: true

>>>>>>>>>>>>> Preparing the environment for Debugger Extensions Gallery repositories completed, duration 0.000 seconds

************* Waiting for Debugger Extensions Gallery to Initialize **************

>>>>>>>>>>>>> Waiting for Debugger Extensions Gallery to Initialize completed, duration 0.015 seconds
----> Repository : UserExtensions, Enabled: true, Packages count: 0
----> Repository : LocalInstalled, Enabled: true, Packages count: 36

Microsoft (R) Windows Debugger Version 10.0.25921.1001 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Windows\LiveKernelReports\WHEA\WHEA-20240120-0017.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

************* Path validation summary **************
Response Time (ms) Location
Deferred srv*
Symbol search path is: srv*
Executable search path is:
Windows 10 Kernel Version 19045 MP (12 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Kernel base = 0xfffff801`65400000 PsLoadedModuleList = 0xfffff801`6602a790
Debug session time: Sat Jan 20 00:17:24.274 2024 (UTC - 5:00)
System Uptime: 0 days 0:00:04.868
Loading Kernel Symbols
...............................................................
................................................................
...........
Loading User Symbols
PEB is paged out (Peb.Ldr = 00000082`8d9c2018). Type ".hh dbgerr001" for details
Mini Kernel Dump does not contain unloaded driver list
For analysis of this file, run !analyze -v
nt!LkmdTelCreateReport+0x13e:
fffff801`65d856c6 488b03 mov rax,qword ptr [rbx] ds:002b:ffffa901`632a1f70=????????????????
6: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
nt!_WHEA_ERROR_RECORD structure that describes the error condition. Try !errrec Address of the nt!_WHEA_ERROR_RECORD structure to get more details.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: ffffa90163411840, Address of the nt!_WHEA_ERROR_RECORD structure.
Arg3: 00000000baa00000, High order 32-bits of the MCi_STATUS value.
Arg4: 000000000002010b, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------

*************************************************************************
*** ***
*** ***
*** Either you specified an unqualified symbol, or your debugger ***
*** doesn't have full symbol information. Unqualified symbol ***
*** resolution is turned off by default. Please either specify a ***
*** fully qualified symbol module!symbolname, or enable resolution ***
*** of unqualified symbols by typing ".symopt- 100". Note that ***
*** enabling unqualified symbol resolution with network symbol ***
*** server shares in the symbol path may cause the debugger to ***
*** appear to hang for long periods of time when an incorrect ***
*** symbol name is typed or the network symbol server is down. ***
*** ***
*** For some commands to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: hal!_WHEA_PROCESSOR_GENERIC_ERROR_SECTION ***
*** ***
*************************************************************************

KEY_VALUES_STRING: 1

Key : Analysis.CPU.mSec
Value: 2546

Key : Analysis.Elapsed.mSec
Value: 2542

Key : Analysis.IO.Other.Mb
Value: 0

Key : Analysis.IO.Read.Mb
Value: 0

Key : Analysis.IO.Write.Mb
Value: 0

Key : Analysis.Init.CPU.mSec
Value: 264

Key : Analysis.Init.Elapsed.mSec
Value: 4876

Key : Analysis.Memory.CommitPeak.Mb
Value: 82

Key : Bugcheck.Code.LegacyAPI
Value: 0x124

Key : Dump.Attributes.AsUlong
Value: 18

Key : Dump.Attributes.KernelGeneratedTriageDump
Value: 1

Key : Failure.Bucket
Value: LKD_0x124_0_AuthenticAMD_PROCESSOR__UNKNOWN_IMAGE_AuthenticAMD.sys

Key : Failure.Hash
Value: {f59f17e7-f24e-04f5-3f16-e9425b2acba5}

BUGCHECK_CODE: 124

BUGCHECK_P1: 0

BUGCHECK_P2: ffffa90163411840

BUGCHECK_P3: baa00000

BUGCHECK_P4: 2010b

FILE_IN_CAB: WHEA-20240120-0017.dmp

DUMP_FILE_ATTRIBUTES: 0x18
Kernel Generated Triage Dump
Live Generated Dump

PROCESS_NAME: smss.exe

STACK_TEXT:
ffffbe8a`4992f150 fffff801`65d6095f : ffffa901`63411820 00000000`00000000 ffffa901`63411840 00000000`00000022 : nt!LkmdTelCreateReport+0x13e
ffffbe8a`4992f690 fffff801`65d60856 : ffffa901`63411820 fffff801`00000000 00000082`00000000 00000082`8daff9d0 : nt!WheapReportLiveDump+0x7b
ffffbe8a`4992f6d0 fffff801`65bd3e7d : 00000000`00000001 ffffbe8a`4992fb40 00000082`8daff9d0 00000000`0000020c : nt!WheapReportDeferredLiveDumps+0x7a
ffffbe8a`4992f700 fffff801`65a883f7 : 00000000`00000000 ffffa901`62431030 00000000`00000103 00000000`00000000 : nt!WheaCrashDumpInitializationComplete+0x59
ffffbe8a`4992f730 fffff801`65811238 : ffffa901`630e0000 ffffa901`6243ed80 ffffbe8a`4992fb40 ffffa901`00000000 : nt!NtSetSystemInformation+0x1f7
ffffbe8a`4992fac0 00007ffe`8e2f0554 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x28
00000082`8daff978 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffe`8e2f0554

MODULE_NAME: AuthenticAMD

IMAGE_NAME: AuthenticAMD.sys

STACK_COMMAND: .cxr; .ecxr ; kb

FAILURE_BUCKET_ID: LKD_0x124_0_AuthenticAMD_PROCESSOR__UNKNOWN_IMAGE_AuthenticAMD.sys

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {f59f17e7-f24e-04f5-3f16-e9425b2acba5}

Followup: MachineOwner
---------
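As a rough illustration of what Arg3 and Arg4 above encode, the two halves can be joined into the 64-bit MCi_STATUS value and its flag bits read off. This is a minimal sketch, not a full decoder: the function name is made up, only the architectural x86 machine-check flags (bits 57 to 63) and the low 16-bit MCA error code are handled, and all vendor-specific fields are ignored.

```python
# Architectural flag bits in the upper part of an x86 MCi_STATUS register.
FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MCi_MISC holds extra info)",
    58: "ADDRV (MCi_ADDR holds an address)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(high32, low32):
    """Join the two 32-bit halves and pick out the architectural fields."""
    status = (high32 << 32) | low32
    return {
        "status": status,
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    # Arg3 (high half) and Arg4 (low half) from the bugcheck arguments.
    result = decode_mci_status(0xBAA00000, 0x0002010B)
    print(hex(result["status"]))
    for flag in result["flags"]:
        print(" ", flag)
    print("MCA error code:", hex(result["mca_error_code"]))
```

For these values the set flags are VAL, UC, EN, MISCV and PCC, with OVER and ADDRV clear: an uncorrected error with no address recorded. That fits a fatal hardware error but does not by itself say which part is at fault.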

emoticorpse Jan 20, 2024 @ 2:27pm

How long have you been playing Fortnite on this new hardware config? If it's been a long time and this issue JUST STARTED even though you've been playing Fortnite for a while, I'd assume it's a hardware issue. Going off that, the first thing I'd do is swap out that PSU, because given that error it's the most likely hardware culprit I can see.

Personally I'd even do a format and fresh Windows reinstall before that, but if you'd rather skip that and jump to hardware troubleshooting, I'd do the PSU first.

Hoppled Jan 20, 2024 @ 2:31pm

I only got back into Fortnite a few months ago. If it is a hardware issue, I'd hope it's the PSU, as it's the only component I haven't replaced in the past 4+ years. Motherboard, boot drive, etc. are all relatively new.

[☥] - CJ - Jan 20, 2024 @ 2:58pm

Sounds like the display driver is crashing and is unable to recover

When's the last time you updated the GPU driver?

Hoppled Jan 20, 2024 @ 3:05pm

I have the latest driver according to the AMD overlay. And could it be that AMD drivers are just ♥♥♥♥♥ or something?

_I_ Jan 20, 2024 @ 3:23pm

can also be the chipset/mobo drivers

get them here
https://www.msi.com/Motherboard/MPG-B550-GAMING-PLUS/support#driver
pick win10, chipset, audio, lan

Illusion of Progress Jan 20, 2024 @ 3:52pm

0x124 (truncated) is sort of what I expected from the WHEA log, if there was one, given the symptoms you described. It's a generalized error saying that the CPU encountered an issue (which does not necessarily mean the CPU caused it). If there are WATCHDOG logs as well, post those too, as they are (at least sometimes) more likely to narrow the cause down than the WHEA logs are.

Do you have Event ID 18 in the Event Viewer? Or any other "error" or "critical" level logs (besides Event ID 41 and possibly 6008, ignore those) from the time of the crash or around the time of the next Windows startup?
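For anyone who would rather pull these entries from a command line than scroll through Event Viewer, here is a hedged sketch that builds a `wevtutil` query for the System log: Event ID 18 plus anything at Critical (level 1) or Error (level 2). The helper name `build_query` and its defaults are made up for illustration; the command is only run when the script is on Windows.

```python
import subprocess
import sys


def build_query(event_ids=(18,), levels=(1, 2), max_events=20):
    """Build a wevtutil command listing the newest matching System log events."""
    terms = [f"EventID={i}" for i in event_ids]
    terms += [f"Level={lvl}" for lvl in levels]  # 1 = Critical, 2 = Error
    xpath = "*[System[(" + " or ".join(terms) + ")]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # newest first
        "/f:text",           # plain-text output
    ]


if __name__ == "__main__":
    cmd = build_query()
    print(" ".join(cmd))
    if sys.platform == "win32":
        done = subprocess.run(cmd, capture_output=True, text=True)
        print(done.stdout or done.stderr)
```

Event ID 41 and 6008 will show up in the output as well since they are logged at those levels; as noted above, they can be ignored.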

Hoppled Jan 20, 2024 @ 5:02pm

Yes, I have Event 18, WHEA-Logger at the time of the crash - Reported by component: Processor core.
I do not have a WATCHDOG folder from what I can see.

Here is a screenshot of all the events that happened around that time:

https://imgur.com/Oq7EK8r

The crash occurred at 12:17 am, so around all those times.

Illusion of Progress Jan 20, 2024 @ 7:37pm

Yeah, that's a machine check exception. There's a fault somewhere, most likely hardware. MCEs usually are. Drivers or other software aren't impossible causes, but they are the exception and not the norm.

Lack of a WATCHDOG folder and logs might make that tricky to track down.

In my case, I had the Event ID 18 logs and the 0x124 WHEA log, but they weren't explicit about the cause because those logs generally aren't. My WATCHDOG logs were pointing to the GPU, which was the new part added before the issue occurred, so that made it easier to narrow down.

If you have nothing else to go on but these symptoms, you need to apply a process to narrow things down.

Here's what you can do.

Verify your RAM is stable and good. Check it with MemTest86. I'd also recommend downloading OCCT, as it has a suite of stability tests that may come in useful (30 minute limit for the free version, but it's better than nothing).

https://www.ocbase.com/

You can try disabling XMP (if enabled) to see if that makes a difference. I'll warn you that this can muddy things, because it might appear to be the cause when it's not. I had no issues with my system, changed the graphics card, and then had issues, but disabling XMP made them worse (I would expect the opposite). That led me down a false path of thinking it was a platform/RAM stability issue when it wasn't. But you do need to verify RAM stability, and I'd start there.

You can run HWiNFO64 or GPU-Z in the background to log sensors while gaming. Once the issue occurs again, check the log files to see if any voltage drops right beforehand (though it's possible the logging won't catch it fast enough).

https://www.hwinfo.com/download/

https://www.techpowerup.com/download/techpowerup-gpu-z/
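Once a sensor log exists, scanning it for voltage dips can be automated. The sketch below assumes the log is a CSV with `Date` and `Time` columns and one column per sensor; the column name `+12V [V]`, the 5% tolerance, and the sample rows are assumptions made up for illustration, so check the header of your own log file for the real names.

```python
import csv
import io


def voltage_dips(csv_file, column, nominal, tolerance=0.05):
    """Yield (timestamp, volts) where `column` strays more than `tolerance` from nominal."""
    for row in csv.DictReader(csv_file):
        try:
            volts = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable reading
        if abs(volts - nominal) > nominal * tolerance:
            stamp = f"{row.get('Date', '')} {row.get('Time', '')}".strip()
            yield stamp, volts


if __name__ == "__main__":
    # Made-up sample data; a real log would be opened with open(path, newline="").
    sample = io.StringIO(
        "Date,Time,+12V [V]\n"
        "1.1.2024,12:00:00,12.096\n"
        "1.1.2024,12:00:02,12.048\n"
        "1.1.2024,12:00:04,11.232\n"
    )
    for when, volts in voltage_dips(sample, "+12V [V]", nominal=12.0):
        print(when, volts)
```

Software sensor readings are coarse and polled slowly, so a clean log does not clear the PSU; a visible dip just before the crash would only be a hint worth following up.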

Needless to say, make sure temperatures aren't critical, though I presumed you checked that as one of the first things.

Are any parts still under warranty? You might need to start ruling things out by swapping hardware. If you want to rule out software first, do that. Updating firmware and drivers, and perhaps reinstalling Windows, would be good options.

If I had to start making guesses on hardware, GPU and PSU would be where I'd look first, not necessarily in that order. Motherboard and RAM are possible. CPU I'd find unlikely, and same for storage, but it's all on the table until you can rule it out as good.

This is possibly not going to be easy, fun, or fast to figure out. If you have spares of any major components, that will make it easier and faster.

Hoppled Jan 20, 2024 @ 9:38pm

So machine check exceptions are usually a hardware issue? The problem is that this crash happens very randomly and not often at all. If I had to give a general rate, it'd be once every two weeks, so it'd be quite difficult to troubleshoot by swapping parts out. A month can go by without a crash, and then it can crash the next day.

I'm not sure. The PSU is the only component that isn't really brand new like everything else. I am one BIOS update behind, but I updated the BIOS when first installing the mobo less than a year ago.

It's just so hard to troubleshoot something like this when the crash rarely happens. It would suck if I had to get a new GPU or something, I have no parts I can swap in the meantime either.
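To put a rough number on why a rare crash is slow to troubleshoot: if crashes are assumed to arrive at random at a steady average rate (a Poisson model, which is an assumption and not something the logs establish), the chance of seeing no crash in t days purely by luck is exp(-t / mean interval). The sketch below, with a made-up function name, turns that into the number of crash-free days needed after a change before luck becomes an unlikely explanation.

```python
import math


def crash_free_days_needed(mean_days_between_crashes, confidence=0.95):
    """Days without a crash needed before luck alone is an unlikely explanation.

    Assumes crashes follow a Poisson process, so
    P(no crash in t days) = exp(-t / mean_days_between_crashes).
    """
    return -mean_days_between_crashes * math.log(1.0 - confidence)


if __name__ == "__main__":
    days = crash_free_days_needed(14)  # "once every two weeks"
    print(f"{days:.0f} crash-free days for 95% confidence")
```

With a crash every two weeks on average, that comes to about 42 days of normal use per change, which is why swapping one part at a time takes so long to confirm and why a stress test that can provoke the crash on demand is worth trying first.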

Hoppled Jan 21, 2024 @ 3:35am

I found a thread online with someone who sounds like they had the same exact issue with how the crash occurred.

Someone suggested it was their PSU, since it was low quality. They upgraded to a new GPU thinking that was the culprit, but it’s likely the GPU drawing more power causing the issue: https://forums.tomshardware.com/threads/changed-gpu-gaming-and-then-black-screen-with-buzzing-noise-then-immediatly-restarts.3812021/

_I_ Jan 21, 2024 @ 4:00am

evga 600w should be more than enough for the build
those are good psus

op on that thread was using cougar 750w, those are just overrated junk

Illusion of Progress Jan 21, 2024 @ 7:49am

Yeah, if a crash is rare, troubleshooting it is... "fun".

The only thing with following what you find others had suggested for an issue that seems similar is...

1. It presumes what was suggested solved the issue (sometimes it did; other times it was only suggested and the outcome is unknown).

2. More importantly, it presumes your issue is the same.

Not all BSODs are caused by the same thing. Not all machine check exceptions are caused by the same thing.

I also had a similar symptom, minus the audio buzzing. In my case the audio was fine until the restart, and the only anomaly was that it might "stop and go" after the screen went black if the restart took a while to occur. For me it was the graphics card, but it's not going to be the same for everyone.

That's not to talk you out of trying the PSU. On the contrary, the PSU is one of the first couple of things I'd consider, so if that's where you'd prefer to start, start there. I'm just adding that there are multiple sources it could be coming from, so if thing A fails to resolve it, have a thing B ready to try next.

You could lower the clock speeds/power consumption of the GPU in Adrenalin and see if that lessens or gets rid of the issue (hard to tell when it's infrequent, I know). If it occurs less when the graphics card is limited to less power, that would support the idea that the PSU isn't coping.

On the other hand, you can do the opposite and force as much power consumption as possible by running a CPU test in OCCT or Prime95 to load the CPU and FurMark to load the GPU. If your system doesn't crash under that power draw, it's unlikely any game is making it pull more. But I'd also run each test on its own to see whether the crash is more likely with one than the other.

Testing RAM should be done too. Run MemTest86 overnight if need be.

Last edited by Illusion of Progress; Jan 21, 2024 @ 7:50am
