Steam for Linux

Marlock Nov 28, 2018 @ 5:57pm
Anyone else having Ext4 corruption while using Linux Kernel 4.19.x?
I've been hit by this bug badly enough that I had to run fsck from a live USB to recover my main OS.

Of course that had to happen now that I've started using UKUU to have fun with more recent mainline Linux Kernels instead of waiting for Ubuntu Kernel 4.15.x updates...

I thought it would be important to give everyone else here a heads-up, because of all the tempting GPU improvements the new kernels brought for gaming. Unfortunately it's probably better to avoid 4.19.x for now (maybe forever...).

I'm currently on 4.18.x, which is said not to have the issue... let's hope they are right.
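For anyone unsure which series they are actually booted into, here is a minimal Python sketch that flags a 4.19.x kernel. The function names `kernel_series` and `is_4_19_series` are made up for illustration, and the check only looks at the series number; it does not know which 4.19.x point releases, if any, carry a fix.

```python
import platform
import re
from typing import Tuple


def kernel_series(release: str) -> Tuple[int, int]:
    """Parse (major, minor) from a release string like '4.19.5-041905-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_4_19_series(release: str) -> bool:
    """True when the release belongs to the 4.19.x series discussed in this thread."""
    return kernel_series(release) == (4, 19)


if __name__ == "__main__":
    # platform.release() returns the same string that `uname -r` prints.
    running = platform.release()
    if is_4_19_series(running):
        print(f"{running}: this is a 4.19.x kernel, the series with the ext4 reports")
    else:
        print(f"{running}: not a 4.19.x kernel")
```

The series is compared as a pair of integers rather than as text, so a string such as `4.190.1` is not mistaken for 4.19.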

https://www.phoronix.com/scan.php?page=news_item&px=EXT4-Linux-4.19-Corruption

More details:
https://lkml.org/lkml/2018/11/27/1058
https://bugzilla.kernel.org/show_bug.cgi?id=201685
Aoi Blue Nov 28, 2018 @ 7:18pm 
It's only on 4.19.

4.19 is the "new stable" release, which contains lots of new patches. I'm not sure what configuration options are causing corruption and to what degree, but it's a concern.

Hopefully experts will be able to narrow down exactly what filesystem configurations cause corruption and how.

You can always download 4.18-DRM-Next, which is the 4.18 core kernel with the DRM-Next GPU interface branch.

Still, the kernel dev team is known to get on top of these issues pretty fast. The initial patch will likely have performance problems for disk IO, though.
Last edited by Aoi Blue; Nov 28, 2018 @ 7:19pm
ack0329 Nov 28, 2018 @ 7:44pm 
For the heck of it, I just installed 4.19.5 and applied my patch for the NVIDIA GTX 1080M, and all seems fine.

WooHoo
Last edited by ack0329; Nov 28, 2018 @ 7:44pm
thetargos Nov 28, 2018 @ 8:01pm 
I've been running 4.19 for some time (now on 4.19.5) and all is OK. Could this be a specific patch? Have you seen if others have had this or if it has been reported to the LKML?

Edit:

Seems serious enough, according to this Phoronix report [www.phoronix.com]
Last edited by thetargos; Nov 28, 2018 @ 8:03pm
Marlock Nov 30, 2018 @ 3:34am 
@Rogue

If I understand correctly, mainline kernel builds offered on the Ubuntu Kernel channel are labeled "for testing purposes only" in the context of using them on Ubuntu...

Yet they are labeled stable on kernel.org and supposedly should be fine to use (non-beta non-RC kernels obviously).

I scoured the Ubuntu website a while ago and found it quite laconic in explaining the typical differences, and the eventual advantages and risks, between Ubuntu kernels and kernel.org kernel builds for x86.

Obviously they make some config changes when compiling the kernel, but beyond that I didn't find much info. These config changes have been criticized in the past for adding overhead, lag, and hurting performance (mainly because more debug routines are toggled on than by default), while being complimented for somewhat broader hardware support (which is arguable, given their large delay in adopting newer kernel branches)... but those are third-party comments without much reference, so they could all be wrong impressions.


I actually had more system instability (lags and software crashes) on Ubuntu Kernel 4.15 than after moving to mainline 4.18, but got nailed by system freezes and data corruption when 4.19 came by.

Now I'm back to 4.18, which seems to be fine for now, and will adhere to a "prior-to-latest stable release" policy for kernel updates... let's see how that works.
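The "prior-to-latest stable release" policy can be sketched in a few lines of Python. This is only an illustration: the function name `prior_to_latest_series` and the sample version list are hypothetical, and the input is assumed to contain stable (non-RC) version strings that you have collected yourself.

```python
import re
from typing import Iterable, Optional, Tuple


def series_of(version: str) -> Tuple[int, int]:
    """Parse (major, minor) from a version string like '4.18.20'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def prior_to_latest_series(versions: Iterable[str]) -> Optional[Tuple[int, int]]:
    """Return the second-newest series among the given versions, or None."""
    # A set removes duplicate series; sorting tuples of ints orders them numerically.
    series = sorted({series_of(v) for v in versions})
    return series[-2] if len(series) >= 2 else None


if __name__ == "__main__":
    # Hypothetical list of stable kernels available to install.
    available = ["4.15.18", "4.18.20", "4.19.5", "4.19.6"]
    print(prior_to_latest_series(available))
```

With the sample list, the newest series is 4.19, so the policy picks 4.18.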
web1bastler Dec 9, 2018 @ 3:36pm 
Just got my first ext4 corruption on 4.19.6-041906-generic. I noticed it when audio stopped playing and autocomplete stopped working in bash. dmesg told me that there were some inode errors and that the filesystem had gone read-only. A reboot then dropped me to the initramfs shell, where I had to run fsck manually to fix the orphaned inodes. What a stupid error.
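The "fs went readonly" symptom described above can be spotted without waiting for audio to die. The following is a rough Python sketch, assuming a Linux system where `/proc/mounts` is available; the function name `readonly_ext4_mounts` is made up for illustration. It only reports ext4 filesystems currently mounted read-only, which can also be a deliberate setting, so treat a hit as a prompt to read dmesg rather than as proof of corruption.

```python
import os
from typing import List


def readonly_ext4_mounts(mounts_text: str) -> List[str]:
    """Return mount points of ext4 filesystems whose options include 'ro'."""
    hits = []
    for line in mounts_text.splitlines():
        # Each line: device, mount point, filesystem type, options, dump, pass.
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fs_type, options = fields[:4]
        # Split the options so 'errors=remount-ro' is not mistaken for 'ro'.
        if fs_type == "ext4" and "ro" in options.split(","):
            hits.append(mount_point)
    return hits


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts", encoding="utf-8") as handle:
        for mount_point in readonly_ext4_mounts(handle.read()):
            print(f"ext4 filesystem at {mount_point} is mounted read-only")
```

The parsing is kept in a function that takes plain text, so it can be tried on a saved copy of `/proc/mounts` from another machine.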
Aoi Blue Dec 9, 2018 @ 5:00pm 
Originally posted by Dat Owl:
https://www.phoronix.com/scan.php?page=news_item&px=EXT4-Linux-4.19-Corruption

Then this..

https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.19-4.20-BLK-MQ-Fix


Pending a back port.
So basically, it was an interaction between a bug in the IO multi-queue code and a unique behavior in the EXT4 filesystem write code that hit the bug hard.

Other filesystems did not assert multi-queue block IO in a manner that would manifest the bug, so the bug went unnoticed with them and the generic filesystem microbenchmark tests.
Aoi Blue Dec 11, 2018 @ 11:05am 
This sort of thing is exactly why most non-edge distros wait until the initial bugs are worked out of new kernel series before adopting them.

A new kernel series brings in a lot of major patches. This always risks bugs, and many of them often aren't found until after the first few releases.

Many distro producers prefer to backport the patches they feel are safe, or at least not too risky for their purposes, instead of going with the whole new kernel, for this reason. The most common patches are isolated device support patches.

Block device handling patches tend to be treated with caution, as they have a very bad history of this sort of thing. One patch to the old IDE-UDMA stack once manifested in a similar manner on systems using ReiserFS. That one took down one of my experimental servers. Fortunately, I never kept anything of too much value on it, as it was a RAID server designed for high speed over reliability. I mostly had media and video games on it. (It was lightning fast. I hit the limits of my network card on it, which wasn't too fast by today's standards, as this was in the days of 100BaseT, but was still fast, especially for the time.)
Last edited by Aoi Blue; Dec 11, 2018 @ 11:10am

Date Posted: Nov 28, 2018 @ 5:57pm
Posts: 9