Constant freezes or kernel panics
#1
Bug 
Hello all!
I ordered my RockPro64 4GB model in December 2019 for use as a home data server.
For this I also bought the PCIe SATA adapter from the store and connected two 8TB drives in a BTRFS RAID 1 to it. For the OS I installed the most recent Debian-based desktop Armbian at the time on a microSD and would work on it either locally or via SSH.

This worked very well until around this summer, when I suddenly could no longer reach my server over SSH. When I came home to check on it, I found it completely frozen and had to use the reset button to reboot it. From then on these freezes became common, and I started troubleshooting:
  • First I suspected flash death and replaced the microSD with an Armbian-recommended 32GB SanDisk Extreme Pro A1.
  • Then I installed the latest terminal-only Ubuntu-based Armbian.
  • Then I ran memtester to check the RAM, which reported no errors.
  • Lastly I disconnected my RAID and plugged in a large USB stick to run tests on that instead.
The freezes still occur regularly.
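
A repeatable way to generate the kind of I/O load used in tests like these is to write a large file and read it back while comparing checksums. The following is only a minimal sketch: the function name `write_and_verify`, the file name and the sizes are illustrative assumptions, and the read-back may be served partly from the page cache rather than from the device.

```python
import hashlib
import os
import tempfile


def write_and_verify(path, total_bytes, block_size=1 << 20):
    """Write random data to `path`, then read it back and compare SHA-256 digests."""
    written = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(block_size, remaining))
            f.write(block)
            written.update(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before reading back

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            read_back.update(block)
    return written.digest() == read_back.digest()


if __name__ == "__main__":
    # Small demo in a temporary directory; point the path at the disk under
    # test and raise total_bytes to generate real load.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_and_verify(os.path.join(tmp, "stress.bin"), 8 * 1024 * 1024))
```

Running this in a loop with a file of several gigabytes on the device under test should approximate a sustained write/read workload; whether that is enough to trigger a freeze on a given board is not something the sketch can guarantee.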

My observations on them:
  • They do not occur on boot.
  • They do not occur even after days of uptime if it's only idle.
  • They occur on various I/O intense workloads such as scrubbing the BTRFS filesystem (which I also did as part of the troubleshooting) or checking the files of a large torrent.
  • They corrupted my data on the RAID (likely from the repeated interrupted writes); a backup is available.
  • If a display is plugged in at the time of the freeze it stays active and shows the last frame. If I plug it in afterwards it does not find a signal.
  • Alt + SysRq + B worked exactly once; on all other attempts no keyboard input seemed to register. So the reset button is almost always necessary for recovery.
  • HDDs audibly go into idle and spin down seconds after a freeze.
  • The same tasks that lead to a freeze work without a problem on other machines.
  • I'm still a Linux beginner and don't know much about logs, but I couldn't find anything relevant in them.
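
For searching logs, one possible starting point is a small filter that picks out lines which usually indicate serious kernel trouble. This is a sketch only: the pattern list and the name `find_kernel_errors` are illustrative assumptions and not an exhaustive catalogue.

```python
import re

# Patterns that commonly mark serious kernel trouble; this list is an
# illustrative assumption, not an exhaustive catalogue.
PATTERNS = re.compile(
    r"kernel panic|oops|stack overflow|unable to handle|call trace|i/o error",
    re.IGNORECASE,
)


def find_kernel_errors(lines):
    """Return (line_number, text) pairs for lines that match PATTERNS."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERNS.search(line)
    ]


if __name__ == "__main__":
    import sys

    for number, text in find_kernel_errors(sys.stdin):
        print(f"{number}: {text}")
```

Saved under a hypothetical name such as scan_log.py, it could be fed the kernel messages of the previous boot with `journalctl -k -b -1 | python3 scan_log.py`. That only works if the journal is stored persistently, and on systems that keep logs in RAM the messages from just before a hard freeze may never reach disk.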
An SSH connection will usually end with a short error message ("...broken pipe..." I think) except for one time, when it read:
Code:
Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.436064] Insufficient stack space to handle exception!

Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.436076] ESR: 0x96000047 -- DABT (current EL)

Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.441468] FAR: 0xffff800011af7fa0

Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.443967] Task stack:     [0xffff800011bc0000..0xffff800011bc4000]

Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.446804] IRQ stack:      [0xffff800011af8000..0xffff800011afc000]

Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.449632] Overflow stack: [0xffff0000f77592b0..0xffff0000f775a2b0]

Message from syslogd@localhost at Dec  4 07:31:20 ...
kernel:[42746.515506] Kernel panic - not syncing: kernel stack overflow
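
The numbers in this panic can be decoded a little further. The sketch below splits the ESR value into its architectural fields (exception class in bits [31:26], write flag in bit 6, fault status code in bits [5:0]) and compares the fault address with the stack ranges printed above. The constant and function names are illustrative, and the interpretation that follows is an inference from the numbers, not something the log states.

```python
ESR = 0x96000047
FAR = 0xFFFF800011AF7FA0
IRQ_STACK = (0xFFFF800011AF8000, 0xFFFF800011AFC000)
TASK_STACK = (0xFFFF800011BC0000, 0xFFFF800011BC4000)


def decode_esr(esr):
    """Split an ESR_ELx value into the fields relevant to a data abort."""
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,   # instruction length bit
        "wnr": (esr >> 6) & 0x1,   # 1 = the faulting access was a write
        "dfsc": esr & 0x3F,        # data fault status code, bits [5:0]
    }


def bytes_below(addr, stack):
    """How far `addr` lies below the low end of `stack` (negative if not below)."""
    return stack[0] - addr


fields = decode_esr(ESR)
print({name: hex(value) for name, value in fields.items()})
print("bytes below IRQ stack:", bytes_below(FAR, IRQ_STACK))
print("IRQ stack size:", IRQ_STACK[1] - IRQ_STACK[0])
```

The exception class 0x25 is a data abort taken without a change in exception level, which matches the "DABT (current EL)" text. The fault status code 0x07 is a level 3 translation fault, and the write flag is set. The fault address lies 96 bytes below the low end of the 16 KiB IRQ stack, which would be consistent with the IRQ stack, and not the task stack, being the one that overflowed.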

I found some other posts on here and other forums that might be about the same problem but none seem to provide an explanation or solution.
So now I wonder if anyone has an idea what leads to these freezes and if something can be done about them.
Even knowing that this is down to an unfixable hardware error would help since I have some alternative single-board computers in mind.
#2
Can't really help but are you running a recent kernel (5.9 or 5.10) and using a recent version of u-boot?
#3
(12-21-2020, 04:39 AM)diizzy Wrote: Can't really help but are you running a recent kernel (5.9 or 5.10) and using a recent version of u-boot?
Yes, the kernel is 5.9, and since it was a fresh install with the most recent image offered for download, I'd say everything else is recent too.
#4
(12-21-2020, 08:34 AM)data_kraken Wrote:
Yes, the kernel is 5.9, and since it was a fresh install with the most recent image offered for download, I'd say everything else is recent too.
Is there a maximum capacity for the adapter? 8 TB is a big disk.
#5
(12-21-2020, 01:43 PM)LMM Wrote:
Is there a maximum capacity for the adapter? 8 TB is a big disk.
Never seen one mentioned anywhere. And 8TB really isn't all that much nowadays when you can buy 20TB drives. But in any case the setup worked at first and the problem existed even without the drives so I assume their size isn't the problem.

Also some more details on the OS:
[Image: screenfetch.png]
#6
(12-22-2020, 08:09 AM)data_kraken Wrote:
Never seen one mentioned anywhere. And 8TB really isn't all that much nowadays when you can buy 20TB drives. But in any case the setup worked at first and the problem existed even without the drives so I assume their size isn't the problem.

Also some more details on the OS:
[Image: screenfetch.png]
What about your power supply?
Why not use the ayufan image?
#7
(12-22-2020, 02:02 PM)LMM Wrote: What about your power supply?
Why not use the ayufan image?
Well, I use the one that came with it.
And I don't think switching to an outdated, private distro would help much with stability issues.
#8
(12-22-2020, 06:56 PM)data_kraken Wrote:
Well, I use the one that came with it.
And I don't think switching to an outdated, private distro would help much with stability issues.
A single SATA disk may draw 5 W to 10 W.
The newest ayufan image (not the latest release) is 0.10.12, and its kernel can be updated to mainline 5.9.
You may want to ping @Bullet64; he is really knowledgeable about this kind of issue.
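
To put the per-disk figure into a budget, the arithmetic can be sketched as follows. Only the 5 W to 10 W range comes from this thread; the supply and board wattages in the example call are hypothetical placeholders, not measured or specified values, and the function names are made up for illustration.

```python
def disk_power_range(disk_count, per_disk_min_w=5.0, per_disk_max_w=10.0):
    """Total draw range in watts for `disk_count` disks at 5 W to 10 W each."""
    return disk_count * per_disk_min_w, disk_count * per_disk_max_w


def headroom(supply_w, board_w, disk_count):
    """Watts left over in the worst case; negative means the budget is exceeded."""
    _, worst_case = disk_power_range(disk_count)
    return supply_w - board_w - worst_case


low, high = disk_power_range(2)
print(f"two disks: {low:.0f} W to {high:.0f} W")
# supply_w and board_w below are hypothetical placeholders, not measured values.
print(f"headroom: {headroom(supply_w=36.0, board_w=10.0, disk_count=2):.0f} W")
```

Replacing the placeholders with the rating printed on the actual power supply and a measured board consumption would show whether the supply has enough margin for both disks.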

