12-05-2018, 02:05 PM
I'm using the RP64 with the dual SATA board to run two drives in a RAID 1 configuration under OpenMediaVault 4.
The drives were seen as /dev/sda and /dev/sdb.
It ran fine for a while, but then I found the RAID showing as 'degraded', with only sda listed in the RAID config. When I looked at the Disks section, sdb had disappeared and sdc had appeared. lsblk showed sdc as having no partition.
Looking through syslog, I found this:
Code:
Dec 2 09:07:45 mediavault kernel: [ 1766.245638] ata2.00: irq_stat 0x08000000, interface fatal error
Dec 2 09:07:45 mediavault kernel: [ 1766.251514] ata2: SError: { Handshk }
Dec 2 09:07:45 mediavault kernel: [ 1766.257211] ata2.00: failed command: WRITE FPDMA QUEUED
Dec 2 09:07:45 mediavault kernel: [ 1766.263010] ata2.00: cmd 61/28:00:00:70:08/06:00:01:00:00/40 tag 0 ncq 806912 out
Dec 2 09:07:45 mediavault kernel: [ 1766.263010] res 40/00:b8:00:08:09/00:00:01:00:00/40 Emask 0x10 (ATA bus error)
Dec 2 09:07:45 mediavault kernel: [ 1766.274936] ata2.00: status: { DRDY }
Dec 2 09:07:45 mediavault kernel: [ 1766.280635] ata2.00: failed command: WRITE FPDMA QUEUED
Dec 2 09:07:45 mediavault kernel: [ 1766.286419] ata2.00: cmd 61/d8:08:28:76:08/01:00:01:00:00/40 tag 1 ncq 241664 out
Dec 2 09:07:45 mediavault kernel: [ 1766.286419] res 40/00:b8:00:08:09/00:00:01:00:00/40 Emask 0x10 (ATA bus error)
Followed by those last four lines repeated many times. Then:
Code:
Dec 2 09:07:45 mediavault kernel: [ 1766.819510] ata2.00: status: { DRDY }
Dec 2 09:07:45 mediavault kernel: [ 1766.823698] ata2: hard resetting link
Dec 2 09:07:55 mediavault kernel: [ 1776.824271] ata2: softreset failed (1st FIS failed)
Dec 2 09:07:55 mediavault kernel: [ 1776.828531] ata2: hard resetting link
Dec 2 09:08:05 mediavault kernel: [ 1786.829268] ata2: softreset failed (1st FIS failed)
Dec 2 09:08:05 mediavault kernel: [ 1786.833469] ata2: hard resetting link
Dec 2 09:08:40 mediavault kernel: [ 1821.834491] ata2: softreset failed (1st FIS failed)
Dec 2 09:08:40 mediavault kernel: [ 1821.838710] ata2: limiting SATA link speed to 3.0 Gbps
Dec 2 09:08:40 mediavault kernel: [ 1821.842808] ata2: hard resetting link
Dec 2 09:08:42 mediavault kernel: [ 1824.053694] ata2: SATA link down (SStatus 1 SControl 320)
Dec 2 09:08:42 mediavault kernel: [ 1824.058049] ata2: hard resetting link
Dec 2 09:08:45 mediavault kernel: [ 1826.265715] ata2: SATA link down (SStatus 1 SControl 320)
Dec 2 09:08:45 mediavault kernel: [ 1826.269883] ata2: limiting SATA link speed to 1.5 Gbps
Dec 2 09:08:50 mediavault kernel: [ 1831.269486] ata2: hard resetting link
Dec 2 09:08:52 mediavault kernel: [ 1833.480738] ata2: SATA link down (SStatus 1 SControl 310)
Dec 2 09:08:52 mediavault kernel: [ 1833.484879] ata2.00: disabled
Dec 2 09:08:52 mediavault kernel: [ 1833.493213] ata2: irq_stat 0x00000040, connection status changed
Dec 2 09:08:52 mediavault kernel: [ 1833.497272] ata2: SError: { CommWake DevExch }
Dec 2 09:08:52 mediavault kernel: [ 1833.501183] ata2: hard resetting link
Dec 2 09:08:53 mediavault kernel: [ 1834.382757] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 2 09:08:53 mediavault kernel: [ 1834.425005] ata2.00: ATA-10: ST1000LM048-2E7172, SDM1, max UDMA/133
Dec 2 09:08:53 mediavault kernel: [ 1834.429135] ata2.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Dec 2 09:08:53 mediavault kernel: [ 1834.484307] ata2.00: configured for UDMA/133
Dec 2 09:08:53 mediavault kernel: [ 1834.488274] sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Dec 2 09:08:53 mediavault kernel: [ 1834.492577] sd 1:0:0:0: [sdb] tag#0 Sense Key : 0x5 [current] [descriptor]
Dec 2 09:08:53 mediavault kernel: [ 1834.496755] sd 1:0:0:0: [sdb] tag#0 ASC=0x21 ASCQ=0x4
Dec 2 09:08:53 mediavault kernel: [ 1834.500726] sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x2a 2a 00 01 08 70 00 00 06 28 00
Dec 2 09:08:53 mediavault kernel: [ 1834.504911] blk_update_request: I/O error, dev sdb, sector 17330176
Dec 2 09:08:53 mediavault kernel: [ 1834.509051] sd 1:0:0:0: rejecting I/O to offline device
Dec 2 09:08:53 mediavault kernel: [ 1834.513001] sd 1:0:0:0: [sdb] killing request
Dec 2 09:08:53 mediavault kernel: [ 1834.516833] sd 1:0:0:0: rejecting I/O to offline device
Dec 2 09:08:53 mediavault kernel: [ 1834.520696] blk_update_request: I/O error, dev sdb, sector 16
Dec 2 09:08:53 mediavault kernel: [ 1834.524624] md: super_written gets error=-5
Dec 2 09:08:53 mediavault kernel: [ 1834.528376] md/raid1:md127: Disk failure on sdb, disabling device.
Dec 2 09:08:53 mediavault kernel: [ 1834.528376] md/raid1:md127: Operation continuing on 1 devices.
Dec 2 09:08:53 mediavault kernel: [ 1834.536240] sd 1:0:0:0: rejecting I/O to offline device
Dec 2 09:08:53 mediavault kernel: [ 1834.540072] blk_update_request: I/O error, dev sdb, sector 16
Dec 2 09:08:53 mediavault kernel: [ 1834.543942] md: super_written gets error=-5
Dec 2 09:08:53 mediavault kernel: [ 1834.547728] sd 1:0:0:0: rejecting I/O to offline device
Followed by that last line repeated many, many times. Followed by this:
Code:
Dec 2 09:08:55 mediavault kernel: [ 1836.385630] ata2.00: detaching (SCSI 1:0:0:0)
Dec 2 09:08:55 mediavault kernel: [ 1836.392394] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Dec 2 09:08:55 mediavault kernel: [ 1836.394083] sd 1:0:0:0: [sdb] Stopping disk
Dec 2 09:08:56 mediavault kernel: [ 1837.154695] scsi 1:0:0:0: Direct-Access ATA ST1000LM048-2E71 SDM1 PQ: 0 ANSI: 5
Dec 2 09:08:56 mediavault kernel: [ 1837.157630] sd 1:0:0:0: [sdc] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Dec 2 09:08:56 mediavault kernel: [ 1837.159989] sd 1:0:0:0: [sdc] 4096-byte physical blocks
Dec 2 09:08:56 mediavault kernel: [ 1837.162502] sd 1:0:0:0: [sdc] Write Protect is off
Dec 2 09:08:56 mediavault kernel: [ 1837.164634] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Dec 2 09:08:56 mediavault kernel: [ 1837.164779] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 09:08:56 mediavault kernel: [ 1837.734092] sd 1:0:0:0: [sdc] Attached SCSI disk
Dec 2 09:08:59 mediavault systemd-udevd[4876]: Process '/sbin/mdadm -If sdb --path platform-f8000000.pcie-pci-0000:01:00.0-ata-2' failed with exit code 1.
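Since the same few messages repeat so often, a rough way to get an overview is to count the link events per ATA port. The following is only a sketch: the patterns are copied from the wording of the log lines above, the event labels and the function name `tally_link_events` are my own, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns based on the wording of the syslog lines quoted above.
# The labels on the left are made up for this sketch, not kernel terms.
EVENT_PATTERNS = {
    "fatal_error": re.compile(r"(ata\d+)\.\d+: .*interface fatal error"),
    "hard_reset": re.compile(r"(ata\d+): hard resetting link"),
    "softreset_failed": re.compile(r"(ata\d+): softreset failed"),
    "speed_limited": re.compile(r"(ata\d+): limiting SATA link speed"),
    "link_down": re.compile(r"(ata\d+): SATA link down"),
}


def tally_link_events(log_text):
    """Count how often each event occurs per port, keyed by (port, label)."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), label)] += 1
    return counts


# A few lines taken from the excerpts above, used as sample input.
SAMPLE = """\
Dec 2 09:07:45 mediavault kernel: [ 1766.245638] ata2.00: irq_stat 0x08000000, interface fatal error
Dec 2 09:07:45 mediavault kernel: [ 1766.823698] ata2: hard resetting link
Dec 2 09:07:55 mediavault kernel: [ 1776.824271] ata2: softreset failed (1st FIS failed)
Dec 2 09:08:40 mediavault kernel: [ 1821.838710] ata2: limiting SATA link speed to 3.0 Gbps
Dec 2 09:08:42 mediavault kernel: [ 1824.053694] ata2: SATA link down (SStatus 1 SControl 320)
"""

for (port, label), count in sorted(tally_link_events(SAMPLE).items()):
    print(port, label, count)
```

To run it against a whole log instead of the sample, the text of the file (for example /var/log/syslog, assuming that is where the kernel messages end up) can be read and passed to `tally_link_events`. If one port accounts for nearly all of the events, that at least narrows things down to that port, cable or drive.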
I have no clue what any of this means. This has happened to me before, too; that time I assumed it might be a disk fault and switched to a different (brand new) disk.
Can anyone give me some clues as to what's going on here?