Rockpro64 not stable... crashes now and then.
#1
I was so happy to get the RockPro64 working, including WiFi, but I soon found that it is not stable at all. I was listening to internet radio with SMPlayer when the audio suddenly stopped as the RockPro64 crashed, and a couple of seconds later the screen went completely black. The only thing I could do was unplug the power supply and plug it back in to reboot.

At first I thought SMPlayer might be the cause, but after a fresh reboot I left the RockPro64 running for days to see what would happen. Sometimes it crashed again after a couple of days, and once it stayed up for 11 days without a problem before the screen went black again.

This is the kernel I am running:
Linux rockpro64 4.4.167-1213-rockchip-ayufan-g34ae07687fce #1 SMP Tue Jun 18 20:44:49 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

Has anyone else had this kind of problem?

In the future I would like to run small things 24/7 on the RockPro64 and need a stable board for that.

Thanks
#2
(11-18-2019, 01:23 PM)Pineapple Wrote: Has anyone else had this kind of problem?

Nope: I have been running my RockPro64 for nearly 18 months now as my daily driver. It runs 24/7, and while it hasn't been flawless, I figure the 2 or 3 hiccups I have had in that time were more likely related to power (my mains can be iffy) or Bluetooth (my keyboard and mouse are cheap and often play up!). (I don't have the Pine WiFi/Bluetooth module; my Bluetooth is a cheap USB dongle.)


Code:
$ uname -a
Linux rpro64.dukla.net 4.4.138-1100-rockchip-ayufan-g95cecee47f40 #1 SMP Sat Sep 29 15:43:04 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
  • ROCKPro64 v2.1 2GB, 16Gb eMMC for rootfs, SX8200Pro 512GB NVMe for /home, HDMI video & sound, Bluetooth keyboard & mouse. Arch (6.2 kernel, Openbox desktop) for general purpose daily PC.
  • PinePhone Pro Explorer Edition, daily driver, rk2aw & U-boot on SPI, Arch/SXMO & Arch/phosh on eMMC
  • PinePhone BraveHeart now v1.2b 3/32Gb, Tow-boot with Arch/SXMO on eMMC
#3
Thanks for your reply. Which distribution are you running? I might try the same one and see whether I get the same issues.

I can't imagine my power supply is the problem. It's a beefy Mean Well unit.
#4
I start with a basic ayufan release - bionic minimal rockpro64 arm64 from here.
#5
@Pineapple Do you use an eMMC module for your OS?
Sorry for any mistakes. English is not my native language

1. Quartz64 Model B, 4GB RAM

2. Quartz64 Model A, 4GB RAM

3. RockPro64 v2.1

https://linux-nerds.org/
#6
No, I don't have eMMC. Last week I also thought the SD card I'm using might be a bit wonky. I haven't touched the Pine much in the last few weeks and it has an uptime of 17 days so far. Maybe a software update I did solved the problem. I will let you guys know if it didn't.
#7
The Pine was doing fine for a while, but now it has more or less crashed again. I couldn't open new programs, so I tried to reboot, and the screen showed a long list of errors. The only thing that helped was pulling the power plug and plugging it back into the Pine, after which it booted without a problem. So it seems to be some kind of hardware problem?
#8
Hi,
I'm also experiencing issues on my new RockPro64.
To summarize, here is what I saw:
- The system is unstable and can crash under fast disk reads/writes. With a SATA disk on PCIe I get the oops in no time if I run Syncthing. Without the SATA disk, it still crashes if you run hexdump on the eMMC or SD card block device. It takes time, but it will crash.
- If you slow down the reads/writes, it takes (I think) longer to crash. For example, if I add lots of debug logging to the mmc driver, I have to wait longer.
- A possible (but ugly) workaround may be to slow down the driver, but I would prefer a cleaner, proper fix.
- When the error occurs on the SD card, a DMA read is started but we never get the DMA-complete interrupt. This results in a bus reset. If we are lucky, we get a few errors and a bus reset, and it starts working again. But it may also fail and result in an (a)synchronous external abort (meaning we are dereferencing an invalid address outside of the CPU, on the SPI bus).
- I saw on kernel.org that bugs in the Rockchip PCIe driver leading to external aborts were fixed, but they are not exactly like my oops (and we also get the oops on eMMC/SD card without PCIe). I tried the latest kernel, 5.5-rc2, anyway and I still have the same issue. On the other hand, there may still be undiscovered bugs.
- I reported a bug in Manjaro where some patches weren't applied. The fix will be in the next Manjaro release, but it still doesn't solve our issue.
- Same issue with Debian Buster, and with Manjaro running the Arch Linux kernel.
- I am also afraid that this will turn out to be a hardware bug. I don't know yet. If that is the case, maybe we can find an acceptable software workaround?

Here are some logs:

Let's start with a normal, successful transfer:
[ 1028.378665] dwmmc_rockchip fe320000.dwmmc: start command: ARGR=0x00000100 CMDR=0x20000157

[ 1028.379392] dwmmc_rockchip fe320000.dwmmc: sd sg_cpu: 0xffff800011ee5000 sg_dma: 0xebf6b000 sg_len: 32
[ 1028.380238] dwmmc_rockchip fe320000.dwmmc: start command: ARGR=0x00e67708 CMDR=0x20002352
[ 1028.389514] dwmmc_rockchip fe320000.dwmmc: DMA complete
[ 1028.389993] dwmmc_rockchip fe320000.dwmmc: list empty

And now a bad one, the first failure I get: we initiate the transfer and never get the interrupt for DMA completion.
So we get a CTO timeout with the unexpected state "data busy" (state 3), and we power off.
[ 1028.395158] dwmmc_rockchip fe320000.dwmmc: start command: ARGR=0x00000100 CMDR=0x20000157
[ 1028.395886] dwmmc_rockchip fe320000.dwmmc: sd sg_cpu: 0xffff800011ee5000 sg_dma: 0xebf6b000 sg_len: 32
[ 1028.396739] dwmmc_rockchip fe320000.dwmmc: start command: ARGR=0x00e67808 CMDR=0x20002352
[ 1028.737229] dwmmc_rockchip fe320000.dwmmc: start command: ARGR=0x00000000 CMDR=0x2000414c
[ 1028.772081] dwmmc_rockchip fe320000.dwmmc: Unexpected command timeout, state 3
[ 1029.092087] dwmmc_rockchip fe320000.dwmmc: data error, status 0x00000200
[ 1029.092690] dwmmc_rockchip fe320000.dwmmc: list empty

Once you get this error, the oops follows within 0 to 10 seconds.
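
A small helper can pull that failure signature out of a kernel log, which makes it easier to check whether a crash was preceded by the dwmmc timeout. This is only a sketch: the function name dwmmc_errors is made up for illustration, and the patterns are simply the two message strings from the failing transfer above.

```shell
#!/bin/sh
# Sketch: filter a kernel log (read from stdin) down to the two dwmmc
# failure messages quoted above. dwmmc_errors is a hypothetical helper
# name, not part of any existing tool.
dwmmc_errors() {
    grep -E 'dwmmc.*(Unexpected command timeout|data error, status)'
}

# Example usage on a live system (needs permission to read the kernel log):
#   dmesg | dwmmc_errors
```

If nothing matches, grep prints nothing and returns a non-zero exit status, so the helper can also be used in a shell condition.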

[ 1030.613586] SError Interrupt on CPU3, code 0xbf000000 -- SError
[ 1030.613589] CPU: 3 PID: 475 Comm: systemd-journal Tainted: G L 5.5.0-rc2-1-ARCH #7
[ 1030.613591] Hardware name: Pine64 RockPro64 (DT)
[ 1030.613592] pstate: 40000005 (nZcv daif -PAN -UAO)
[ 1030.613594] pc : allocate_slab+0x210/0x460
[ 1030.613595] lr : allocate_slab+0x1f8/0x460
[ 1030.613596] sp : ffff800011f139f0
[ 1030.613597] x29: ffff800011f139f0 x28: 0000000000000005
[ 1030.613601] x27: ffff000009c5c800 x26: 0000000000000010
[ 1030.613603] x25: 0000000000001000 x24: 0000000000000400
[ 1030.613606] x23: ffff000009c5c500 x22: ffff000009c5c000
[ 1030.613609] x21: fffffe0000071700 x20: 0000000000000001
[ 1030.613612] x19: ffff0000ea079c00 x18: 0000000000000000
[ 1030.613615] x17: 0000000000000000 x16: 0000000000000000
[ 1030.613617] x15: 0000000000000000 x14: 0000000000000000
[ 1030.613620] x13: 0000000000000000 x12: 0000000000000000
[ 1030.613623] x11: 0000000000000000 x10: 0000000000000000
[ 1030.613626] x9 : ffff8000102e8788 x8 : 00000000f7e00000
[ 1030.613629] x7 : ffff8000e5f2e000 x6 : ffff8000e5f2e000
[ 1030.613631] x5 : 000000000000507b x4 : 0000000000000000
[ 1030.613634] x3 : 0000000044042000 x2 : 0000000080010400
[ 1030.613637] x1 : 0000000000000000 x0 : 0000000000000010
[ 1030.613640] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 1030.613643] CPU: 3 PID: 475 Comm: systemd-journal Tainted: G L 5.5.0-rc2-1-ARCH #7
[ 1030.613644] Hardware name: Pine64 RockPro64 (DT)
[ 1030.613645] Call trace:
[ 1030.613646] dump_backtrace+0x0/0x1b0
[ 1030.613647] show_stack+0x1c/0x28
[ 1030.613648] dump_stack+0xac/0xd4
[ 1030.613650] panic+0x154/0x32c
[ 1030.613651] __stack_chk_fail+0x0/0x20
[ 1030.613652] arm64_serror_panic+0x84/0x90
[ 1030.613653] do_serror+0x88/0x140
[ 1030.613654] el1_error+0x84/0x100
[ 1030.613656] allocate_slab+0x210/0x460
[ 1030.613657] new_slab+0x5c/0xb8
[ 1030.613658] ___slab_alloc.constprop.0+0x308/0x500
[ 1030.613660] __slab_alloc.constprop.0+0x24/0x40
[ 1030.613661] kmem_cache_alloc+0x310/0x320
[ 1030.613662] __alloc_file+0x30/0xf8
[ 1030.613663] alloc_empty_file+0x64/0x108
[ 1030.613665] path_openat+0x4c/0x258
[ 1030.613666] do_filp_open+0x7c/0x100
[ 1030.613667] do_sys_open+0x170/0x220
[ 1030.613669] __arm64_sys_openat+0x28/0x30
[ 1030.613670] el0_svc_handler+0x84/0x190
[ 1030.613671] el0_sync_handler+0x138/0x258
[ 1030.613672] el0_sync+0x140/0x180
[ 1030.614142] SMP: stopping secondary CPUs
[ 1030.614144] Kernel Offset: disabled
[ 1030.614145] CPU features: 0x10002,20006008
[ 1030.614146] Memory Limit: none

To reproduce the bug with eMMC (and no PCIe):
- Connect the serial console and increase the log level (dmesg -n 8), so that you will see the oops.
- Over an SSH connection, run this command and wait: while true; do sudo hexdump /dev/mmcblk0; done
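
The reproduction loop can also be wrapped in a small script so that the number of passes is bounded and counted. This is a sketch under assumptions: stress_read and the READER variable are names invented here, and the script just repeats the hexdump read from the steps above with the output discarded, which probably reads faster than dumping to an SSH terminal.

```shell
#!/bin/sh
# Sketch: read a block device from start to end a fixed number of times.
# stress_read and READER are hypothetical names, not part of any tool.
stress_read() {
    dev=$1                      # device to read, e.g. /dev/mmcblk0
    passes=${2:-10}             # how many full reads to attempt
    reader=${READER:-hexdump}   # command used to read the device
    i=1
    while [ "$i" -le "$passes" ]; do
        # Discard the dump; only the read traffic matters here.
        "$reader" "$dev" > /dev/null || return 1
        echo "pass $i of $passes completed"
        i=$((i + 1))
    done
}

# Example usage (as root, with the serial console attached):
#   dmesg -n 8
#   stress_read /dev/mmcblk0 100
```

The last "pass N of M completed" line seen before the oops gives a rough idea of how much reading it took to trigger the failure.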
#9
I googled and found that other people do have this issue.
For example: https://github.com/ayufan-rock64/linux-build/issues/299

If some people don't have this issue, it would be interesting to know which revision of the board they have.
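
To make such reports comparable, something like the following could be posted alongside the board revision. It is a sketch only: board_report is a made-up name, and it assumes the kernel exposes the device tree model string at /proc/device-tree/model. That string may not distinguish hardware revisions, so the revision printed on the PCB is still worth quoting by hand.

```shell
#!/bin/sh
# Sketch: print the kernel release and the device tree model string.
# board_report is a hypothetical helper name.
board_report() {
    echo "kernel: $(uname -r)"
    if [ -r /proc/device-tree/model ]; then
        # The model string is NUL-terminated, so strip the trailing NUL.
        echo "model: $(tr -d '\0' < /proc/device-tree/model)"
    else
        echo "model: unknown (no /proc/device-tree/model)"
    fi
}

board_report
```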
#10
My RockPro64 NAS, with two 3.5" HDDs and booting from a USB3 SSD, is absolutely stable. I am using

Code:
Linux rockpro64 4.4.202-1237-rockchip-ayufan-gfd4492386213


I have had bad experiences with all Armbian builds.

