I'm trying to set up the Pine64 NAS. Here's what I've done:
* Installed Armbian Focal 20.04 onto the eMMC module, using the Pine64 5A power adapter.
* System worked fine, installed software. No problems.
* Installed the SATA card from Pine64, no hard drives yet. lspci shows the card.
* Installed two Seagate IronWolf HDDs.
* Booted the system, and it sees /dev/sda and /dev/sdb.
* Installed mdadm and created a new array, /dev/md0, from /dev/sda and /dev/sdb.
* Formatted /dev/md0 with XFS.
* No problems yet.
* Tried to copy a large movie file to the /dev/md0 partition and got an I/O error. Unfortunately I didn't save the error, but my search history shows the line "md: super_written gets error=10"
* Installed the latest Armbian, Focal 20.08.9. Linux kernel:
uname -a
Linux rockpro64 5.8.13-rockchip64 #20.08.8 SMP PREEMPT Mon Oct 5 15:59:02 CEST 2020 aarch64 aarch64 aarch64 GNU/Linux
* I saw a few posts about the hard drives drawing too much power, so I purchased a 10A adapter from Amazon.
* Booted the RockPro64 with the new 10A adapter: Nothing. No lights, no fan.
* Booted the RockPro64 with the original Pine64 5A adapter: Nothing. No lights, no fan.
* Removed the SATA card.
* Booted the RockPro64 with the new 10A adapter: Boots fine, I can SSH in, all seems good.
* Powered off, connected SATA card with hard drives disconnected.
* Booted again with 10A adapter: Nothing, no lights, no fan.
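For reference, the mdadm and format steps above correspond roughly to the commands below. This is a minimal sketch, not a record of the exact commands used: the RAID-1 level is an assumption taken from the later update, and `raid1_plan` is a hypothetical helper. It only prints the commands, because running them wipes both disks.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands for a two-disk
# RAID-1 array formatted with XFS. Device names are passed as arguments.
raid1_plan() {
    disk_a=$1
    disk_b=$2
    echo "mdadm --create /dev/md0 --level=1 --raid-devices=2 $disk_a $disk_b"
    echo "mkfs.xfs /dev/md0"
    # Printed last: a way to watch the initial resync before trusting the array.
    echo "cat /proc/mdstat"
}

raid1_plan /dev/sda /dev/sdb
```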
Any thoughts? Did the SATA adapter somehow die? I've seen a few posts about using a Marvell SATA adapter, but I don't want to keep buying extra things as replacements for items that should have worked.
I'm confident that these items work fine:
* Hard drives
* RockPro64 board
* 10A adapter
I have my doubts about the SATA adapter.
How should I proceed? The only thing that looks like a version marking on the SATA card is: SU-SATA3-T2 Ver. 005.
Update: I bought a new SATA adapter, plugged it in, and everything seemed to work. However, trying to actually use the partition caused a panic.
I rebuilt the RAID-1 virtual drive and reformatted it as EXT4. Then, when trying to mount the new partition, I got this (md127 is, I think, the name the kernel assigned automatically after I deleted the mdadm file and removed the partition info from fstab):
[ 536.361958] md127: detected capacity change from 12000003358720 to 0
[ 536.361985] md: md127 stopped.
[ 598.569637] md/raid1:md0: not clean -- starting background reconstruction
[ 598.569642] md/raid1:md0: active with 2 out of 2 mirrors
[ 598.637543] md0: detected capacity change from 0 to 12000003358720
[ 598.701396] md: resync of RAID array md0
[ 691.037890] ata1.00: failed to read SCR 1 (Emask=0x40)
[ 691.037910] ata1.01: failed to read SCR 1 (Emask=0x40)
[ 691.037915] ata1.02: failed to read SCR 1 (Emask=0x40)
[ 691.087979] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[ 691.088796] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bpfilter bridge governor_performance bnep snd_soc_hdmi_codec hci_uart btqca btrtl btbcm btintel snd_soc_simple_card snd_soc_audio_graph_card hantro_vpu(C) pwm_fan rockchip_vdec(C) snd_soc_simple_card_utils bluetooth v4l2_h264 rockchip_rga videobuf2_dma_contig videobuf2_vmalloc v4l2_mem2mem videobuf2_dma_sg fusb302 videobuf2_memops tcpm typec snd_soc_es8316 videobuf2_v4l2 rc_cec dw_hdmi_cec panfrost snd_soc_rockchip_i2s gpu_sched dw_hdmi_i2s_audio videobuf2_common zstd snd_soc_core rfkill videodev snd_pcm_dmaengine snd_pcm mc snd_timer snd soundcore sg cpufreq_dt zram sch_fq_codel g_serial libcomposite ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid0 multipath
[ 691.088909] linear raid1 md_mod realtek rockchipdrm analogix_dp dw_hdmi dwmac_rk dw_mipi_dsi stmmac_platform drm_kms_helper stmmac cec mdio_xpcs rc_core drm drm_panel_orientation_quirks
[ 691.099456] CPU: 2 PID: 214 Comm: scsi_eh_0 Tainted: G C 5.8.16-rockchip64 #20.08.12
[ 691.100379] Hardware name: Pine64 RockPro64 v2.1 (DT)
[ 691.100901] pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
[ 691.101483] pc : ahci_scr_read+0x44/0x88
[ 691.101890] lr : sata_scr_read+0x70/0x90
[ 691.102297] sp : ffff8000128fbad0
[ 691.102640] x29: ffff8000128fbad0 x28: 0000000000000000
[ 691.103187] x27: ffff800011acb828 x26: ffff0000f5262ca8
[ 691.103733] x25: ffff0000f617ab80 x24: ffff0000f5aea040
[ 691.104282] x23: 0000000000000003 x22: 0000000000000000
[ 691.104829] x21: ffff0000f5aea3e0 x20: ffff0000f5aea040
[ 691.105375] x19: ffff0000f5aea440 x18: 0000000000000020
[ 691.105920] x17: 0000000000000000 x16: 0000000000000000
[ 691.106468] x15: ffff80001181e000 x14: ffff800011a06242
[ 691.107014] x13: 0000000000000000 x12: ffff800011a05000
[ 691.107561] x11: ffff80001181e000 x10: ffff800011a05888
[ 691.108106] x9 : 0000000000000000 x8 : 0000000000000004
[ 691.108654] x7 : ffff0000f5aea040 x6 : ffff8000128fbb74
[ 691.109200] x5 : 0000000000000001 x4 : ffff800010f3b098
[ 691.109747] x3 : 0000000000000100 x2 : ffff8000128fbb74
[ 691.110300] x1 : ffff800011cd5130 x0 : ffff800011cd5000
[ 691.110847] Call trace:
[ 691.111107] ahci_scr_read+0x44/0x88
[ 691.111479] ata_eh_link_autopsy+0x88/0xbb8
[ 691.111912] ata_eh_autopsy+0xec/0x100
[ 691.112303] sata_pmp_error_handler+0x4c/0x950
[ 691.112764] ahci_error_handler+0x40/0x88
[ 691.113179] ata_scsi_port_error_handler+0x238/0x5f8
[ 691.113690] ata_scsi_error+0x94/0xd8
[ 691.114071] scsi_error_handler+0xa0/0x388
[ 691.114500] kthread+0x118/0x150
[ 691.114837] ret_from_fork+0x10/0x34
[ 691.115214] Code: b8615881 340001c1 8b21c061 8b010001 (b9400021)
[ 691.115843] ---[ end trace 9880ab5860dacb97 ]--
I'll reboot, since that may be needed for the kernel to recognise the md127 -> md0 change. I also opted for EXT4 this time; maybe that's easier on the kernel. Who knows. I'm running out of options.
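On the md127 versus md0 naming: as far as I understand, an array that is not listed in mdadm.conf is often assembled under a fallback name such as md127. Below is a small sketch for checking which arrays the kernel currently has, assuming the usual /proc/mdstat layout. `md_arrays` is a made-up helper name, and the commented commands for recording the array are the usual Debian/Ubuntu approach, untested on this board.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays with their state and RAID level.
# Reads /proc/mdstat-style text on stdin, where array lines look like:
#   md0 : active raid1 sdb[1] sda[0]
# (for an inactive array the third printed field is a member device,
# not a RAID level)
md_arrays() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" { print $1, $3, $4 }'
}

# Typical use on the NAS itself:
#   md_arrays < /proc/mdstat
# To make the name stick across reboots (run as root, Debian/Ubuntu paths):
#   mdadm --detail --scan >> /etc/mdadm/mdadm.conf
#   update-initramfs -u
```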
I had the Pine64 NAS working for a week with the 2x HDDs as a striped RAID array. I then did a software update ("apt-get dist-upgrade") and got this at reboot:
[ 0.000000] Linux version 5.8.16-rockchip64 (root@xeon) (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 9.2.1 20191025, GNU ld (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 2.33.1.20191209) #20.08.14 SMP PREEMPT Tue Oct 20 22:37:51 CEST 2020
[ 0.000000] Machine model: Pine64 RockPro64 v2.1
[ 0.000000] earlycon: uart8250 at MMIO32 0x00000000ff1a0000 (options '')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 17.979376] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[ 18.003131] Modules linked in: snd_soc_hdmi_codec panfrost pwm_fan snd_soc_audio_graph_card snd_soc_simple_card gpu_sched snd_soc_simple_card_utils rc_cec dw_hdmi_i2s_audio fusb302 dw_hdmi_cec tcpm snd_soc_rockchip_i2s snd_soc_es8316 rockchip_vdec(C) hantro_vpu(C) typec snd_soc_core rockchip_rga v4l2_h264 hci_uart videobuf2_vmalloc videobuf2_dma_contig videobuf2_dma_sg snd_pcm_dmaengine v4l2_mem2mem btqca videobuf2_memops snd_pcm btrtl videobuf2_v4l2 btbcm btintel videobuf2_common bluetooth videodev snd_timer mc snd soundcore rfkill sg cpufreq_dt sch_fq_codel ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid0 multipath linear raid1 md_mod realtek rockchipdrm analogix_dp dw_hdmi dwmac_rk dw_mipi_dsi stmmac_platform drm_kms_helper stmmac cec mdio_xpcs rc_core drm drm_panel_orientation_quirks
[ 18.003239] CPU: 1 PID: 591 Comm: cached_setup_fo Tainted: G C 5.8.16-rockchip64 #20.08.14
[ 18.003241] Hardware name: Pine64 RockPro64 v2.1 (DT)
[ 18.003246] pstate: 00000085 (nzcv daIf -PAN -UAO BTYPE=--)
[ 18.003261] pc : ahci_single_level_irq_intr+0x24/0x90
[ 18.003278] lr : __handle_irq_event_percpu+0x60/0x2a8
[ 18.053720] sp : ffff800011aa3e40
[ 18.053722] x29: ffff800011aa3e40 x28: ffff8000114de018
[ 18.053726] x27: ffff0000f5a8e000 x26: ffff8000114de018
[ 18.053729] x25: ffff8000117fa1c0 x24: ffff800011aa3f04
[ 18.053733] x23: 0000000000000000 x22: 00000000000000e8
[ 18.053737] x21: ffff800011cd5008 x20: ffff0000f5a55200
[ 18.053740] x19: ffff0000f671c200 x18: 0000000000000000
[ 18.053748] x17: 0000000000000000 x16: 0000000000000000
[ 18.094273] x15: 0000000000000000 x14: 0000000000000a70
[ 18.094277] x13: 0000000000000a70 x12: 0000000000000018
[ 18.094280] x11: 0000000000000040 x10: ffff80001181e690
[ 18.094283] x9 : ffff80001181e688 x8 : ffff0000f68c8dd8
[ 18.094286] x7 : 0000000000000000 x6 : 0000000000000000
[ 18.094290] x5 : ffff0000f68c8db0 x4 : ffff0000f68c8f18
[ 18.094294] x3 : ffff8000114de018 x2 : ffff800010955938
[ 18.094297] x1 : ffff0000f598fc80 x0 : 00000000000000e8
[ 18.136707] Call trace:
[ 18.136714] ahci_single_level_irq_intr+0x24/0x90
[ 18.136717] __handle_irq_event_percpu+0x60/0x2a8
[ 18.136720] handle_irq_event_percpu+0x34/0x90
[ 18.136725] handle_irq_event+0x48/0xe8
[ 18.136730] handle_fasteoi_irq+0xcc/0x180
[ 18.136735] generic_handle_irq+0x30/0x48
[ 18.136744] __handle_domain_irq+0x94/0x108
[ 18.177317] gic_handle_irq+0x60/0x158
[ 18.177322] el1_irq+0xb8/0x180
[ 18.177329] __mod_lruvec_state+0x7c/0x128
[ 18.177334] page_remove_rmap+0xd0/0x518
[ 18.177343] unmap_page_range+0x4d4/0xbc0
[ 18.203146] unmap_single_vma+0x88/0x118
[ 18.203149] unmap_vmas+0x70/0xe8
[ 18.203153] exit_mmap+0xc8/0x180
[ 18.203157] mmput+0x84/0x158
[ 18.203166] begin_new_exec+0x2d0/0xab0
[ 18.226190] load_elf_binary+0x39c/0x16b8
[ 18.226193] __do_execve_file.isra.0+0x520/0x9f8
[ 18.226196] __arm64_sys_execve+0x44/0x58
[ 18.226200] el0_svc_common.constprop.0+0x70/0x188
[ 18.226202] do_el0_svc+0x24/0x90
[ 18.226207] el0_sync_handler+0x90/0x198
[ 18.226215] el0_sync+0x158/0x180
[ 18.260378] Code: a9025bf5 f9401021 f9400835 910022b5 (b94002b3)
[ 18.260391] ---[ end trace 3fd32674691543d2 ]---
[ 18.260399] Kernel panic - not syncing: Fatal exception in interrupt
[ 18.277167] SMP: stopping secondary CPUs
[ 18.279589] Kernel Offset: disabled
[ 18.279592] CPU features: 0x240022,2000600c
[ 18.279594] Memory Limit: none
[ 18.292913] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
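Both traces above are long, so here is a minimal sketch for cutting a saved log down to the lines that identify the fault. `oops_summary` is a hypothetical helper, and the patterns are assumptions based only on the log format pasted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines that name the fault, the task,
# and the faulting program counter / link register, then drop timestamps.
# Reads kernel log text on stdin.
oops_summary() {
    grep -E 'Internal error:|Kernel panic|Comm:| (pc|lr) : ' |
        sed -E 's/^\[ *[0-9]+\.[0-9]+\] //'
}

# Typical use:
#   dmesg | oops_summary
#   oops_summary < saved-dmesg.txt
```

In the two logs in this thread the faulting pc is ahci_scr_read in the first and ahci_single_level_irq_intr in the second, so both aborts happen inside AHCI driver functions.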
Any thoughts? I know I'm just yelling into the void here, but is there something I'm doing wrong?
Well, I would not rule out a kernel issue, since it happened after the upgrade. Are you able to test with the previous kernel as well?
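Before rolling back, it may help to list which kernel builds the pasted logs came from. A minimal sketch, assuming the `<release>-rockchip64 #<build>` pattern seen in the uname output and the "Tainted" lines of this thread; `kernel_builds` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct "<release> #<build>" pair found
# in log text on stdin, e.g. "5.8.16-rockchip64 #20.08.14".
kernel_builds() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+-rockchip64 #[0-9.]+' | LC_ALL=C sort -u
}

# Typical use:
#   uname -a | kernel_builds
#   kernel_builds < saved-dmesg.txt
# Installed kernel packages (to find an older one to test) can be listed with:
#   dpkg -l | grep linux-image
```

The logs posted so far contain 5.8.13-rockchip64 #20.08.8 (the uname output), 5.8.16-rockchip64 #20.08.12 (the first trace) and 5.8.16-rockchip64 #20.08.14 (the trace after the dist-upgrade), so those are the builds worth comparing.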