PINE64
How about a hard power switch? + crash report




How about a hard power switch? + crash report - Dendrocalamus64 - 05-14-2020

Debian + MATE MrFixit version 2.0
Kernel: Linux Debian-Desktop 4.4.213 #1 SMP Fri Feb 7 13:03:55 EST 2020 aarch64 GNU/Linux

While playing an mp3 with mplayer through the internal speakers, I had the system freeze so hard that I had to take the back cover off & push the reset switch to get it to reboot.
For the next incremental hardware tweak, how about adding a hard power switch in a more accessible location?

The last thing it wrote to the logs:
Code:
May 15 04:04:44 Debian-Desktop kernel: [22168.303153] BUG: spinlock wrong CPU on CPU#1, ksdioirqd/mmc2/519
May 15 04:04:44 Debian-Desktop kernel: [22168.375143]  lock: 0xffffffc0f12000d8, .magic: dead4ead, .owner: ksdioirqd/mmc2/519, .owner_cpu: 0
May 15 04:04:44 Debian-Desktop kernel: [22168.482500] CPU: 1 PID: 519 Comm: ksdioirqd/mmc2 Tainted: G           O    4.4.213 #1
May 15 04:04:44 Debian-Desktop kernel: [22168.576254] Hardware name: Pine64 Pinebook Pro (DT)
May 15 04:04:44 Debian-Desktop kernel: [22168.576258] Call trace:
May 15 04:04:44 Debian-Desktop kernel: [22168.576274] [<ffffff8008088284>] dump_backtrace+0x0/0x220
May 15 04:04:44 Debian-Desktop kernel: [22168.576280] [<ffffff80080884c8>] show_stack+0x24/0x30
May 15 04:04:44 Debian-Desktop kernel: [22168.576288] [<ffffff8008448278>] dump_stack+0xa4/0xcc
May 15 04:04:44 Debian-Desktop kernel: [22168.576294] [<ffffff80081026c8>] spin_dump+0x84/0xa4
May 15 04:04:44 Debian-Desktop kernel: [22168.576297] [<ffffff8008102718>] spin_bug+0x30/0x3c
May 15 04:04:44 Debian-Desktop kernel: [22168.576300] [<ffffff8008102a00>] do_raw_spin_unlock+0xac/0xd0
May 15 04:04:44 Debian-Desktop kernel: [22168.576309] [<ffffff8008f7c1c8>] _raw_spin_unlock_irqrestore+0x24/0x34
May 15 04:04:44 Debian-Desktop kernel: [22168.576315] [<ffffff800885f71c>] dhd_rx_frame+0x46c/0x5d4
May 15 04:04:44 Debian-Desktop kernel: [22168.576320] [<ffffff8008897658>] dhdsdio_readframes+0x130c/0x135c
May 15 04:04:44 Debian-Desktop kernel: [22168.576324] [<ffffff800889b93c>] dhdsdio_dpc+0x814/0xb3c
May 15 04:04:44 Debian-Desktop kernel: [22168.576328] [<ffffff800889be34>] dhdsdio_isr+0x180/0x1b8
May 15 04:04:44 Debian-Desktop kernel: [22168.576335] [<ffffff800888cf94>] IRQHandler+0x44/0x9c
May 15 04:04:44 Debian-Desktop kernel: [22168.576356] [<ffffff8008c348c4>] process_sdio_pending_irqs+0x140/0x16c
May 15 04:04:44 Debian-Desktop kernel: [22168.576360] [<ffffff8008c349e0>] sdio_irq_thread+0xa8/0x1c8
May 15 04:04:44 Debian-Desktop kernel: [22168.576366] [<ffffff80080d3ac4>] kthread+0xe0/0xf0
May 15 04:04:44 Debian-Desktop kernel: [22168.576371] [<ffffff8008082ef0>] ret_from_fork+0x10/0x20
May 15 04:26:50 Debian-Desktop kernel: [    0.000000] Booting Linux on physical CPU 0x0

IIRC it was actually still running after that. The next time I tried to play the mp3, it froze. There was soft static coming from the speakers.


And then when I tried to post this, the PBP rebooted. That didn't leave any message in the logs [edit] but there's a console-ramoops-0 and dmesg-ramoops-0 in /sys/fs/pstore that look like they're from the right time.

Code:
<1>[ 1352.655030] Unable to handle kernel NULL pointer dereference at virtual address 00000000
<1>[ 1352.655843] pgd = ffffff8009981000
<1>[ 1352.656197] [00000000] *pgd=00000000f7ffe003, *pud=00000000f7ffe003, *pmd=0000000000000000
<0>[ 1352.656987] Internal error: Oops: 96000005 [#1] SMP

with a full dump following.
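
A minimal sketch for keeping such records around, assuming they sit under /sys/fs/pstore as described above. The helper name save_pstore and the destination directory are hypothetical, and reading pstore normally requires root:

```shell
# Sketch: copy pstore crash records into a directory for later inspection.
# save_pstore is a hypothetical helper; the default source is the
# /sys/fs/pstore path mentioned in the post.
save_pstore() {
    dest="$1"
    src="${2:-/sys/fs/pstore}"
    mkdir -p "$dest" || return 1
    count=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue      # skip when the glob matched nothing
        cp "$f" "$dest"/ || return 1
        count=$((count + 1))
    done
    echo "$count"                    # number of records copied
}

# Example (as root): save_pstore "$HOME/pstore-$(date +%Y%m%d)"
```

Copying rather than moving leaves the originals in place, so the records can still be checked against the copies before anything is deleted.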


RE: How about a hard power switch? + crash report - manawyrm - 05-15-2020

That could be the sysrq-Issue. 

Try running:
Code:
zcat /proc/config.gz | grep -F SYSRQ
and check whether CONFIG_MAGIC_SYSRQ is enabled. 

As far as I'm aware that's a hardware issue in the audio/uart circuitry. 

Disabling SYSRQ via 

Code:
echo "0" >/proc/sys/kernel/sysrq
should fix the issue (until the next reboot).
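
To confirm that the setting took effect, the runtime value can be read back from the same file. A minimal sketch, where sysrq_status is a hypothetical helper name and the optional argument exists only so the function can be tried against a test file:

```shell
# Sketch: report the current sysrq setting.
# sysrq_status is a hypothetical helper; by default it reads the same
# /proc/sys/kernel/sysrq file that the echo command above writes to.
sysrq_status() {
    file="${1:-/proc/sys/kernel/sysrq}"
    value=$(cat "$file") || return 1
    case "$value" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $value (subset of functions enabled)" ;;
    esac
}

# Example: sysrq_status
```

On systems that read /etc/sysctl.d, a drop-in file containing "kernel.sysrq = 0" is the usual way to make the setting survive a reboot. Whether a given PBP image honours that directory is an assumption to check.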

Manjaro have disabled SYSRQs in our kernel for this reason.



Quote:I had the system freeze so hard that I had to take the back cover off & push the reset switch to get it to reboot.

You don't need to do that. Just hold the power button for 10 seconds.


RE: How about a hard power switch? + crash report - Der Geist der Maschine - 05-15-2020

(05-15-2020, 09:52 AM)manawyrm Wrote: That could be the sysrq-Issue. 

This is by no means obvious.
  • The first crash runs into a BUG() call which intentionally crashes (reboots?) the system.
  • The second crash dereferences a null pointer - without the backtrace it's not obvious if that was triggered by sysrq. That's unlikely as a userspace process would have needed to do that... but for what reason?



RE: How about a hard power switch? + crash report - manawyrm - 05-16-2020

Quote:This is by no means obvious.

True. 
I have encountered this same issue on my own PBP in the past and saw that there are enormous amounts of sysrq's happening when audio is played back at the highest volume levels. 
These have caused stuttering, lockups, hard reboots and other weird behaviour. 
That's not unexpected if you look at the list of available sysrqs: 
https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html#what-are-the-command-keys

For example, "c" would produce the same output as shown by Dendrocalamus64:
"Will perform a system crash by a NULL pointer dereference."


RE: How about a hard power switch? + crash report - Der Geist der Maschine - 05-16-2020

(05-16-2020, 03:33 PM)manawyrm Wrote: True. 
I have encountered this same issue on my own PBP in the past and saw that there are enormous amounts of sysrq's happening when audio is played back at the highest volume levels. 
These have caused stuttering, lockups, hard reboots and other weird behaviour. 
That's not unexpected if you look at the list of available sysrqs: 
https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html#what-are-the-command-keys

Sysrqs are triggered over procfs or keyboard combinations. There are no other obvious callers: https://elixir.bootlin.com/linux/v5.6.13/ident/handle_sysrq

How have you traced invocations of sysrqs?


(05-16-2020, 03:33 PM)manawyrm Wrote: For example, "c" would produce the same output as shown by Dendrocalamus64:
"Will perform a system crash by a NULL pointer dereference."

That's why I wrote "without the backtrace it's not obvious if that was triggered by sysrq". The backtrace would show sysrq() ... or not. When Dendrocalamus64 had the null pointer dereference, they were not using audio, though.


RE: How about a hard power switch? + crash report - Dendrocalamus64 - 05-17-2020

Neither crash occurs frequently; I've played the same mp3 many times since then and it hasn't recurred.
Yes, CONFIG_MAGIC_SYSRQ is enabled, but it's occasionally useful so I don't want to turn it off if I don't have to.

Yes, I know I should post the whole dump for the null pointer deref, but I was paranoid that it might contain some sort of PII, private keys, or whatever, improbable though it is, since this is my daily use computer.


Attachment: console-ramoops-0.txt (36.4 KB)


RE: How about a hard power switch? + crash report - manawyrm - 05-18-2020

(05-16-2020, 10:34 PM)Der Geist der Maschine Wrote: Sysrqs are triggered over procfs or keyboard combinations. [..]
How have you traced invocations of sysrqs?

SYSRQs can also be issued over the UART, and on the PBP the UART shares the headphone jack with the audio output.
There is significant (~3 Vpp) crosstalk between the audio and UART lines. This is a hardware issue.

Here's a video of my PBP playing audio at full volume (and already distorting a little bit):
https://www.youtube.com/watch?v=MPKHM1J_Uxk

This causes erroneous inputs on the UART /dev/ttyS2. Normally ttyS2 has a getty and the kernel will respond to SYSRQs on this terminal. 
These inputs will cause all sorts of behaviour, including hangs, crashes and stuttering because the kernel will stop everything it does to parse the SYSRQs.

If you want to reproduce this, set everything in alsamixer and pavucontrol to max volume, play a loud rock song and read /dev/ttyS2 with cat. Make sure to stop serial-getty@ttyS2 first (or the equivalent for your OS); otherwise the getty will fetch all your tty input before your cat can.
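
The reproduction steps above might be sketched as follows. dump_uart_bytes is a hypothetical helper name, and the use of od in place of a plain cat is an assumption made so that non-printable garbage bytes stay visible:

```shell
# Sketch: show raw bytes arriving on the UART as hex.
# dump_uart_bytes is a hypothetical helper; /dev/ttyS2 is the device
# named in the post. Stop the getty first (as root), for example:
#   systemctl stop serial-getty@ttyS2.service
dump_uart_bytes() {
    dev="${1:-/dev/ttyS2}"
    # -A n: no offset column; -t x1: one hex byte per value;
    # -v: do not collapse repeated lines
    od -A n -v -t x1 "$dev"
}

# Example: dump_uart_bytes            (interrupt with Ctrl-C)
```

With audio silent, nothing should appear; any bytes showing up while the song plays would be consistent with crosstalk reaching the UART input.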


RE: How about a hard power switch? + crash report - Der Geist der Maschine - 05-24-2020

(05-17-2020, 04:01 PM)Dendrocalamus64 Wrote: Yes, CONFIG_MAGIC_SYSRQ is enabled, but it's occasionally useful so I don't want to turn it off if I don't have to.

Dendrocalamus64, sysrq is not enabled by default on MrFixIT's kernel:

CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0

You did not run into garbage SYSRQs anyway. The cause of your kernel crashes is hard to debug. Perhaps install MrFixIT's latest kernel 4.4.222 or even better move to a distribution with a 5.X kernel. The 5.X kernels are rock-stable on the Pinebook Pro.

(05-15-2020, 09:52 AM)manawyrm Wrote: Manjaro have disabled SYSRQs in our kernel for this reason.

More recent kernels, such as Manjaro's 5.X kernel, have an additional option, CONFIG_MAGIC_SYSRQ_SERIAL. I'm surprised Manjaro has completely disabled SYSRQ rather than making use of CONFIG_MAGIC_SYSRQ_SERIAL=0.


(05-18-2020, 12:00 AM)manawyrm Wrote: SYSRQs can also be issued over the UART, and on the PBP the UART shares the headphone jack with the audio output.
There is significant (~3 Vpp) crosstalk between the audio and UART lines. This is a hardware issue.

This is highly interesting, but why is there crosstalk even when there is no headset plugged in? Shouldn't the audio line be "idle"?