01-16-2021, 12:32 PM
Since my last failure, I booted a few times and the NVMe did not reappear. I opened up my PBP and re-seated the NVMe and the ribbon cable for the adapter. After booting up, the NVMe was there again. I attempted another overnight restic restore from my USB external drive to the NVMe. As expected, it again drew too much power, and the laptop was powered off in the morning. I looked at the journalctl logs, and while the NVMe errors didn't coincide with the shutdown itself, I found this error:
Code:
nvme0: failed to set APST feature (-19)
Which led me to these pages:
18.04 and 18.10 fail to boot nvme0: failed to set APST feature (-19)
EXT4-fs error after Ubuntu 17.04 upgrade
These suggest that APST (Autonomous Power State Transition) causes errors with NVMe drives in Linux, due to a known bug. I gave the bug no more than a cursory glance, but the workaround is to set the maximum exit latency, which effectively disables any idle power state whose exit latency exceeds that threshold:
[Solved] Can't start array after adding 2 NVME drives to the config
Fixing NVME SSD Problems On Linux
These show what to change in GRUB, but it appears the PBP uses U-Boot. After looking around a little, it looks like you can edit the /boot/extlinux/extlinux.conf file instead. Add the following to the end of the APPEND line:
Code:
nvme_core.default_ps_max_latency_us=5500
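As a rough sketch of that edit (not a tested PBP tool), the Python below appends a kernel parameter to the APPEND line of an extlinux.conf-style text. The function name add_kernel_param and the sample config are made up for illustration; the real file on a PBP will look different, and you would want a backup before writing anything back to /boot.

```python
def add_kernel_param(conf_text: str, param: str) -> str:
    """Append `param` to every APPEND line that does not already contain it."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        # extlinux keywords are case-insensitive, so compare in lower case.
        if stripped.lower().startswith("append ") and param not in stripped.split():
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample config, only here so the sketch runs on its own.
SAMPLE = """LABEL linux
    KERNEL /Image
    APPEND console=tty1 root=LABEL=ROOT rw rootwait
"""

PARAM = "nvme_core.default_ps_max_latency_us=0"
patched = add_kernel_param(SAMPLE, PARAM)
print(patched)
```

Running it a second time on its own output changes nothing, since the exact parameter is already present. It does not replace an existing entry with a different value (for example =5500), so that case would still need a manual edit.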
According to Solid state drive/NVMe, the max latency setting disables any states with exlat (exit latency, in microseconds) above that value. For the Intel 660p 2TB, "nvme id-ctrl" reports the following power states:
Code:
ps 0 : mp:5.50W operational enlat:0 exlat:0 rrt:0 rrl:0
ps 1 : mp:3.60W operational enlat:0 exlat:0 rrt:1 rrl:1
ps 2 : mp:2.60W operational enlat:0 exlat:0 rrt:2 rrl:2
ps 3 : mp:0.0300W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3
ps 4 : mp:0.0040W non-operational enlat:5000 exlat:9000 rrt:4 rrl:4
This means that setting a value of 5500 would disable PS 4 but leave PS 3 enabled. Setting a value of 0 effectively disables APST (checking nvme feature 0x0c confirms this). I set mine to 0 for now.
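To make the threshold reasoning concrete, here is a small sketch that parses the power state lines shown above and lists which non-operational states stay usable for a given value. It only applies the rule as described in this post (a state is dropped when its exlat is above the threshold, and 0 turns APST off); it is not the kernel's actual code, and the helper names are invented for illustration. It also assumes the one-line-per-state format shown here.

```python
import re

# Power state lines for the Intel 660p 2TB, copied from the post.
ID_CTRL_OUTPUT = """\
ps 0 : mp:5.50W operational enlat:0 exlat:0 rrt:0 rrl:0
ps 1 : mp:3.60W operational enlat:0 exlat:0 rrt:1 rrl:1
ps 2 : mp:2.60W operational enlat:0 exlat:0 rrt:2 rrl:2
ps 3 : mp:0.0300W non-operational enlat:5000 exlat:5000 rrt:3 rrl:3
ps 4 : mp:0.0040W non-operational enlat:5000 exlat:9000 rrt:4 rrl:4
"""

PS_LINE = re.compile(
    r"ps\s+(\d+)\s*:\s*mp:\S+\s+(non-operational|operational)"
    r"\s+enlat:(\d+)\s+exlat:(\d+)"
)


def parse_power_states(text):
    """Return one dict per power state line found in `text`."""
    states = []
    for m in PS_LINE.finditer(text):
        states.append({
            "ps": int(m.group(1)),
            "operational": m.group(2) == "operational",
            "enlat": int(m.group(3)),
            "exlat": int(m.group(4)),
        })
    return states


def allowed_idle_states(states, max_latency_us):
    """Idle (non-operational) states kept under the exlat rule from the post."""
    if max_latency_us == 0:
        # A value of 0 effectively disables APST, as observed in the post.
        return []
    return [
        s["ps"] for s in states
        if not s["operational"] and s["exlat"] <= max_latency_us
    ]


states = parse_power_states(ID_CTRL_OUTPUT)
for threshold in (0, 5500, 9000):
    print(threshold, allowed_idle_states(states, threshold))
```

For this drive it reports PS 3 only at 5500, both PS 3 and PS 4 at 9000, and nothing at 0, which matches the reading above.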
Overall, after researching this and configuring it, I tried restarting. I was able to confirm the above, but the NVMe controller still did not load (as was my previous issue). I still had the barrel power plugged in, so I tried disconnecting that and rebooting again; voila, the NVMe was there again. So it may be something with having power plugged in on boot, or insufficient power on boot (I was just running from the USB port of a US wall outlet), or it may be completely coincidental.
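For anyone wanting to repeat the "confirm the above" step after a reboot, one way (a sketch, not the only way) is to check that the parameter actually reached the running kernel by reading /proc/cmdline. The function names here are made up for illustration.

```python
from pathlib import Path

PARAM_NAME = "nvme_core.default_ps_max_latency_us"


def param_from_cmdline(cmdline: str, name: str = PARAM_NAME):
    """Return the value of `name` in a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def running_kernel_value(path: str = "/proc/cmdline"):
    """Look the parameter up in the running kernel's command line, if readable."""
    p = Path(path)
    if not p.exists():
        return None
    return param_from_cmdline(p.read_text())


# Prints the configured value, or None if the parameter was not passed at boot.
print(running_kernel_value())
```

If this prints None after editing extlinux.conf, the APPEND line change most likely did not take effect on that boot.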
I have not yet tried another overnight run. I suspect that this will stop the NVMe from disappearing during use, but there will still be the issue of drawing so much power that the battery can't charge quickly enough to keep up. This also doesn't definitively rule out a thermal issue. My return deadline is coming up, so I may just return it and get a portable USB drive, and revisit this in the future.
Hope this helps someone!