Kernel OOPs triggered by big writes to ext4 FS
#1
Hi,

I've spent more than two months(!) trying to find out what causes Linux Kernel OOPs that seem to be triggered by big writes to mounted ext4 filesystems. My quest began while trying to boot from SPI and run Armbian with rootfs on eMMC. I tried two different eMMC modules and all the Rockpro64 Linux images that I could find. I burnt a lot of midnight oil trawling the Internet for solutions, but found none...

Most of the Linux images that I've tried (Armbian, Debian, Ayufan) work fine when booting from SPI using the latest u-boot with rootfs on SD, but crash badly either when installing them to eMMC from SD, or when running them from eMMC and, e.g., installing a big package like OMV.

Watching "htop" during installation of a 'large' package, I can see that the OOPs seem to occur when the filesystem buffers hit a high-water mark and this could be mitigated by mounting ext4 filesystems in 'sync' mode. I've done a lot of testing and now I have a stable OMV6 running under an Armbian image I compiled myself from source, and ext4 filesystems mounted in 'sync' mode: I only have occasional OOPs during heavy i/o to an ext4 filesystem mounted on an "md" raid consisting of two 1TB SATA disks connected to a 2-port Adaptec PCI-e controller.

Using 'sync' mode degrades performance, and could wear out SSDs, so it's only useful as a temporary work-around. The real problem is how to prevent the ext4 FS buffers from overflowing available memory. I suspect, but have no proof, that this is caused by the Linux 'optimistic' memory allocator oversubscribing memory and getting caught out when actual usage exceeds the available physical memory.
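For anyone who wants to reproduce the observation above, here is a minimal sketch. It assumes the Dirty and Writeback counters in /proc/meminfo are a reasonable proxy for the buffer growth seen in htop; the function name and the mount point /srv/data are made up for illustration and are not from the original setup.

```shell
#!/bin/sh
# Print the Dirty and Writeback counters (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; the optional argument only exists so the
# function can be tried against a saved copy.
dirty_writeback_kb() {
    awk '/^(Dirty|Writeback):/ { print $1, $2 }' "${1:-/proc/meminfo}"
}

# Sample once per second while the large write runs in another terminal:
#   while sleep 1; do dirty_writeback_kb; done
#
# Temporary work-around: remount an ext4 filesystem in sync mode (as root).
# /srv/data is a hypothetical mount point; substitute your own.
#   mount -o remount,sync /srv/data
```

If the Dirty figure climbs steadily until the Oops, that would support the high-water-mark theory; if it stays low, the cause is probably elsewhere.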

I don't have a deep enough knowledge of the Linux kernel to investigate the problem, but maybe someone else reading this message does. What I do know is that the problem is NOT a device driver issue, because direct device access (e.g. using "dd") does not trigger an OOPs. It is also not a hardware fault on my particular Rockpro64 board or eMMC module, because FreeBSD installs and runs fine on SD or eMMC on the same system.

I'd be interested to read other people's experiences of using the Rockpro64 with OMV6 under Armbian.

Bye,

  Tony Travis.


#2
I've managed to stop the Oops happening during heavy i/o without mounting filesystems in "sync" mode, by setting the Linux kernel "overcommit_memory" mode to never overcommit:
Code:
echo 2 > /proc/sys/vm/overcommit_memory
Also, to apply the setting on every boot:
Code:
echo "vm.overcommit_memory=2" >> /etc/sysctl.conf
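A short sketch for checking that the setting took effect, and for estimating how much memory the kernel will then allow to be committed. In mode 2 the limit is roughly swap plus RAM times vm.overcommit_ratio / 100 (the ratio defaults to 50). The function names are made up for illustration, and the formula is a simplification that ignores huge pages and vm.overcommit_kbytes.

```shell
#!/bin/sh
# Show the current overcommit policy (0 = heuristic, 1 = always, 2 = never)
# together with the kernel's own commit accounting.
show_overcommit() {
    sysctl vm.overcommit_memory vm.overcommit_ratio
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
}

# Approximate CommitLimit in kB for mode 2:
#   swap + ram * ratio / 100
# Simplified: ignores huge pages and vm.overcommit_kbytes.
commit_limit_kb() {
    ram_kb=$1
    swap_kb=$2
    ratio=$3
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# Example: 4000000 kB of RAM, no swap, default ratio of 50
#   commit_limit_kb 4000000 0 50
```

With no swap and the default ratio, only about half of RAM can be committed, so if programs start failing with out-of-memory errors after switching to mode 2, raising vm.overcommit_ratio or adding swap may be worth trying. Running "sysctl -p" as root loads /etc/sysctl.conf without a reboot.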
I now have OMV6 working under Armbian 21.08.1 Bullseye with Linux 5.15.25-rockchip64, with u-boot in SPI, root on eMMC and dual 1TB SATA disks on an Adaptec AAR-1220SA controller. It's been a bit of an uphill struggle to get a working system. Most of the images I've tried have failed in the same way on ext4, Btrfs or ZFS, so I think the overcommit occurs in the filesystem-independent kernel VFS. I want to run OMV6 under Armbian Bullseye, but tested my Rockpro64 under FreeBSD to see if the problem only occurs under Linux. The system was quite stable under FreeBSD with root on eMMC, but my Adaptec controller is not supported there, so I couldn't test it with the SATA disks. What I find very puzzling is that I've not had the same Oops under heavy i/o on my Pinebook Pro, which is very similar hardware to the Rockpro64.