NVMe-related crashes and instability, plus a solution
After installing an NVMe SSD in my Pinebook Pro I began to see Linux crashing periodically with output like the following:

Code:
[    7.153982] SError Interrupt on CPU2, code 0xbf000002 -- SError
[    7.153986] CPU: 2 PID: 169 Comm: udevd Not tainted 5.8.1-gnu #1
[    7.153988] Hardware name: PINE64 Pinebook Pro (DT)
[    7.153989] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--)
[    7.153991] pc : nvme_submit_cmd+0x11c/0x130
[    7.153992] lr : nvme_queue_rq+0x43c/0x6b8
[    7.153993] sp : ffff80001409b6f0
[    7.153995] x29: ffff80001409b6f0 x28: ffff0000f4716000
[    7.153998] x27: 0000000000000000 x26: 0000000000001000
[    7.154002] x25: 0000000000000001 x24: 0000000000001000
[    7.154004] x23: ffff0000eff62000 x22: 0000000000000000
[    7.154007] x21: 0000000000000001 x20: ffff0000f4536a40
[    7.154010] x19: ffff800010d1a000 x18: 0000000000000000
[    7.154014] x17: 0000000000000000 x16: 0000000000000000
[    7.154016] x15: 0000000000000000 x14: 0000000000000000
[    7.154019] x13: 0000000000000000 x12: ffff800010226c88
[    7.154022] x11: 0000000000000000 x10: 0000000000000000
[    7.154025] x9 : 0000000000000000 x8 : ffffffffffffffff
[    7.154028] x7 : 00000000e929d000 x6 : 00000000e929d000
[    7.154031] x5 : 0000000007ef7ac9 x4 : 0000000000000006
[    7.154034] x3 : 0000000000000000 x2 : 0000000780000007
[    7.154037] x1 : ffff0000f4536a48 x0 : 0000000000000000
[    7.154040] Kernel panic - not syncing: Asynchronous SError Interrupt
[    7.154042] CPU: 2 PID: 169 Comm: udevd Not tainted 5.8.1-gnu #1
[    7.154044] Hardware name: PINE64 Pinebook Pro (DT)
[    7.154044] Call trace:
[    7.154046]  dump_backtrace+0x0/0x1d8
[    7.154047]  show_stack+0x14/0x20
[    7.154048]  dump_stack+0xbc/0xf8
[    7.154049]  panic+0x150/0x348
[    7.154050]  add_taint+0x0/0xa8
[    7.154051]  arm64_serror_panic+0x74/0x80
[    7.154053]  do_serror+0x6c/0x168
[    7.154054]  el1_error+0x84/0x100
[    7.154055]  nvme_submit_cmd+0x11c/0x130
[    7.154056]  nvme_queue_rq+0x43c/0x6b8
[    7.154058]  __blk_mq_try_issue_directly+0x104/0x230
[    7.154059]  blk_mq_request_issue_directly+0x50/0x100
[    7.154061]  blk_mq_try_issue_list_directly+0x58/0xe8
[    7.154062]  blk_mq_sched_insert_requests+0xe0/0x150
[    7.154064]  blk_mq_flush_plug_list+0x11c/0x188
[    7.154065]  blk_flush_plug_list+0xd8/0x108
[    7.154066]  blk_finish_plug+0x30/0xa0
[    7.154067]  read_pages+0x154/0x290
[    7.154069]  page_cache_readahead_unbounded+0x160/0x220
[    7.154070]  __do_page_cache_readahead+0x34/0x48
[    7.154072]  force_page_cache_readahead+0xb4/0x108
[    7.154073]  page_cache_sync_readahead+0xe4/0xf0
[    7.154074]  generic_file_buffered_read+0x5d8/0xa28
[    7.154076]  generic_file_read_iter+0xd0/0x180
[    7.154077]  blkdev_read_iter+0x38/0x48
[    7.154079]  new_sync_read+0xec/0x188
[    7.154080]  vfs_read+0x1bc/0x1d0
[    7.154081]  ksys_read+0x68/0xf8
[    7.154082]  __arm64_sys_read+0x14/0x20
[    7.154083]  do_el0_svc+0x68/0xd0
[    7.154084]  el0_sync_handler+0x16c/0x2a0
[    7.154086]  el0_sync+0x140/0x180
[    7.154112] SMP: stopping secondary CPUs
[    7.154113] Kernel Offset: disabled
[    7.154114] CPU features: 0x200022,01006008
[    7.154116] Memory Limit: none

The crashes became more and more frequent until eventually the system would fail to boot most times. The exact backtrace varied, but it always referenced the NVMe driver and indicated an "asynchronous system error", pointing to an issue with the hardware itself.

After some research, I've found the solution is to remove this line from the Pinebook Pro device tree:

Code:
max-link-speed = <2>;

Since building a new kernel with this change I've yet to see a single crash from the NVMe driver and the system appears completely stable.
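For anyone who wants to script the edit rather than make it by hand, here is a minimal sketch of removing the property from device tree source text. The function name and the sample node are my own illustration and not taken from the kernel tree, so check which .dts file and node your kernel actually uses before applying anything like this.

```python
import re

# Matches a whole "max-link-speed = <2>;" line, including its indentation
# and trailing newline, so no blank line is left behind.
MAX_LINK_SPEED_LINE = re.compile(
    r"^[ \t]*max-link-speed\s*=\s*<\s*2\s*>\s*;[ \t]*\n?",
    re.MULTILINE,
)

def drop_max_link_speed(dts_text):
    """Return the device tree source with the gen 2 speed override removed.

    Hypothetical helper: it only edits text and knows nothing about
    device tree syntax beyond this one property line.
    """
    return MAX_LINK_SPEED_LINE.sub("", dts_text)

if __name__ == "__main__":
    # Illustrative node only; the real file layout is an assumption.
    sample = "&pcie0 {\n\tmax-link-speed = <2>;\n\tnum-lanes = <4>;\n};\n"
    print(drop_max_link_speed(sample))
```

The kernel still has to be rebuilt (or at least the device tree blob recompiled and installed) for the change to take effect.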

This change stops the Linux PCIe driver from trying to run the PCIe link faster than 2.5 GT/s, which is the default for RK3399-based devices and the maximum rate Rockchip themselves claim the SoC will support. It seems the RK3399 was originally designed to operate its PCIe bus at the higher "gen 2" speed (5 GT/s), but since the SoC's release the company has downgraded its specifications, I assume because variances in manufacturing left many parts unstable at that speed, as my Pinebook Pro demonstrates.
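To confirm which rate the link actually negotiated, one option is to read the `current_link_speed` attribute that Linux exposes for PCI devices in sysfs. The sketch below assumes that attribute is present on your kernel and that its text contains a figure followed by "GT/s". The exact wording varies between kernel versions, so the parser is deliberately loose, and the helper names are my own.

```python
import re
from pathlib import Path

def parse_gts(text):
    """Return the GT/s figure from a link-speed string, or None if absent."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*GT/s", text)
    return float(match.group(1)) if match else None

def link_speeds(root="/sys/bus/pci/devices"):
    """Map each PCI device address to its negotiated link speed in GT/s.

    Devices without a readable current_link_speed attribute are skipped.
    """
    speeds = {}
    base = Path(root)
    if not base.is_dir():
        return speeds
    for dev in sorted(base.iterdir()):
        try:
            speeds[dev.name] = parse_gts((dev / "current_link_speed").read_text())
        except OSError:
            continue  # attribute missing or unreadable for this device
    return speeds

if __name__ == "__main__":
    for address, gts in link_speeds().items():
        print(address, gts)
```

With the device tree line removed, the NVMe drive and the root port should both report 2.5 here; a value of 5.0 would suggest the override is still in effect.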

I suspect this may be the cause of many of the NVMe-related issues other forum members are experiencing, particularly when failures are intermittent or the drive is known to work in other machines.

In fact, between this and the 2.0 GHz CPU frequency (also unsupported by Rockchip) that is enabled in the kernels most people are using, I find it remarkable that most Pinebook Pros have been running out of spec by default. I have to think this has something to do with the uneven experiences people are reporting with the machine, as well as the general lack of reliability you sense when skimming the posts in this forum.

In any case, if your Pinebook Pro seems to be having trouble using an NVMe drive, try bringing it back within the manufacturer's specifications by removing the line above from the device tree (and reverting the 2.0 GHz patch, if you've been using it) and building a new kernel. You may find the problems you've been experiencing disappear completely.
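If you have a saved console log and want to check whether your crashes match the pattern I saw, a rough test is to look for an SError report together with an NVMe driver symbol in the register dump or call trace. This is only a heuristic sketch based on the output quoted above; the function name is my own, and a match does not prove the link speed is the cause.

```python
import re

# A kernel symbol from the NVMe driver followed by an offset,
# e.g. "nvme_submit_cmd+0x11c/0x130".
NVME_SYMBOL = re.compile(r"\bnvme_\w+\+0x")

def has_nvme_serror(log_text):
    """Return True if the log shows an SError alongside an NVMe driver symbol."""
    lines = log_text.splitlines()
    saw_serror = any("SError Interrupt" in line for line in lines)
    saw_nvme = any(NVME_SYMBOL.search(line) for line in lines)
    return saw_serror and saw_nvme

if __name__ == "__main__":
    log = (
        "[    7.153982] SError Interrupt on CPU2, code 0xbf000002 -- SError\n"
        "[    7.153991] pc : nvme_submit_cmd+0x11c/0x130\n"
    )
    print(has_nvme_serror(log))
```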
Messages In This Thread
NVMe-related crashes and instability, plus a solution - by simonsouth - 09-30-2020, 02:18 PM
