05-01-2020, 07:37 AM
(04-30-2020, 06:49 PM)kuleszdl Wrote: Please find attached the kernel log with the crash when the PCIe card is inserted:
I am running the current 5.6 Mainline kernel from Debian unstable.
I found a report about similar issues on the Manjaro forums:
https://forum.manjaro.org/t/freezes-on-r...4/97978/85
I tried limiting the number of CPU cores as suggested there by appending
Code:maxcpus=1
to the kernel command line. However, this did not work either. I am getting basically the following error now:
Code:Internal error: synchronous external abort: 96000210 [#1] SMP
Any ideas?
This is exactly the error described above:
https://forum.pine64.org/showthread.php?...2#pid64622
Quote:It may be the hardware issue, but do note there is an issue with the rk3399 pcie controller that is currently unmitigated.
See the LKML thread here: https://lore.kernel.org/linux-pci/CAMdYz...gmail.com/
Also see this for additional information: https://lkml.org/lkml/2020/4/6/320
TLDR: We found the rk3399 throws either a synchronous error or a SError when a pcie device sends an unknown message.
The error type is determined by which cpu cluster handles the message.
We hijacked the arm64 error handling and processed it ourselves, and that corrects the issue, but it's not a good fix.
In the end, it was determined that significant changes to how arm64 handles pcie errors in the linux kernel need to happen.
There's a hack on the mailing list to disable SError handling (https://lkml.org/lkml/diff/2020/4/27/1041/1). With that applied, you can load the pcie module manually with:
Code:
taskset -c 4 modprobe pcie_rockchip_host
But this is nothing more than a hack. In the end, the pcie controller doesn't handle certain error sequences correctly, which is a hardware bug.
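If you want to script the pinned module load, here is a rough sketch of how it could look. Only CPU 4 and the module name pcie_rockchip_host come from the command above; the function names and the DRY_RUN variable are made up for illustration. It checks whether the chosen CPU can be used before calling modprobe, which should matter if you still boot with maxcpus=1, since CPU 4 would then presumably not be online. It only prints the command unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: wrap the pinned modprobe from the post above.
# CPU 4 and pcie_rockchip_host are taken from the post; the function
# names and the DRY_RUN variable are illustrative assumptions.

# Print the command that would be run for a given CPU and module.
pinned_modprobe_cmd() {
    printf 'taskset -c %s modprobe %s\n' "$1" "$2"
}

# Succeed if a process can be pinned to the given CPU right now.
# taskset fails when the CPU is offline or does not exist.
cpu_usable() {
    taskset -c "$1" true 2>/dev/null
}

# Load the module pinned to one CPU, or just print the command
# when DRY_RUN is unset or set to 1.
load_pinned() {
    cpu="$1"
    module="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        pinned_modprobe_cmd "$cpu" "$module"
        return 0
    fi
    if ! cpu_usable "$cpu"; then
        echo "CPU $cpu is not usable (check maxcpus= on the kernel command line)" >&2
        return 1
    fi
    # Needs root, like the original command.
    taskset -c "$cpu" modprobe "$module"
}

load_pinned "${CPU:-4}" "${MODULE:-pcie_rockchip_host}"
```

Run it once as is to see the command, then as root with DRY_RUN=0 to actually load the module. This does not change anything about the underlying problem, it is the same hack with a sanity check in front.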