PINE64
Receiving errors about eMMC (?)


Receiving errors about eMMC (?) - acasta - 10-12-2017

TL;DR: Running ayufan's latest eMMC MATE image. Strange errors in dmesg (see snippet below). Need help interpreting them.

----------------

Hi! After some weeks of using my rock64 as bookshelf decoration, I'm finally starting to tinker with it.

At first, I hit some smaller bumps, like using the official MATE image and losing USB power after login.

Now I'm using ayufan's latest eMMC MATE image. Installation to the eMMC module went without a hitch. The first reboot worked fine. I did some basic first-login stuff (keyboard layout, password, ...) and issued a reboot command. The system tried to start, but delivered a black screen with a blinking cursor. I let it sit like this for about fifteen minutes ("just let it work, who knows what's going on inside"), then I did a hard reset. The system started to the graphical login prompt. I typed in my password, the login box vanished, and that was it. Again fifteen minutes, this time with only the blue rock64 background.
I grew a bit concerned, but after another hard reset, the system booted up and everything was fine. And since this last reset it's been stable and useful: no more freezes, quick boot and login, USB power and ethernet working fine, all looking bright and shiny.
 
Then I made the mistake of running the dmesg command. You know, just out of curiosity. Not that I could understand what it tells me. I just wanted to see what's going on.
And there I saw them: errors. Colored bright red, lots of errors about mmcblk0.
 
Small snippet of dmesg output:

Code:
[   23.781898] mmcblk0: retrying using single block read
[   23.825835] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   23.991919] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[   23.991933] mmcblk0: error -110 transferring data, sector 3279888, nr 256, cmd response 0x900, card status 0xb00
[   23.992879] mmcblk0: retrying using single block read
[   24.120340] mmcblk0: error -110 sending stop command, original cmd response 0x900, card status 0x400900
[   24.120355] mmcblk0: retrying because a re-tune was needed
[   24.321852] dwmmc_rockchip ff520000.rksdmmc: Successfully tuned phase to 135
[   24.340830] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   24.506898] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[   24.506914] mmcblk0: error -110 transferring data, sector 265240, nr 232, cmd response 0x900, card status 0xb00
[   24.507855] mmcblk0: retrying using single block read
[   24.577818] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   24.742932] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[   24.742947] mmcblk0: error -110 transferring data, sector 10191680, nr 144, cmd response 0x900, card status 0xb00
[   24.743909] mmcblk0: retrying using single block read
[   24.835882] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   24.850620] tty_port_close_start: tty->count = 1 port count = 2.
[   25.001946] mmcblk0: error -110 transferring data, sector 5610720, nr 32, cmd response 0x900, card status 0x0
[   25.002839] mmcblk0: retrying using single block read
[   25.306353] mmcblk0: retrying because a re-tune was needed
[   25.599857] dwmmc_rockchip ff520000.rksdmmc: Successfully tuned phase to 135
[   35.494818] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   35.660947] mmcblk0: error -110 transferring data, sector 8489664, nr 152, cmd response 0x900, card status 0x0
[   35.661864] mmcblk0: retrying using single block read
[   35.707819] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   35.873936] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[   35.873950] mmcblk0: error -110 transferring data, sector 8486664, nr 224, cmd response 0x900, card status 0xb00
[   35.874895] mmcblk0: retrying using single block read
[   38.238823] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   38.404938] mmcblk0: error -110 transferring data, sector 3006264, nr 256, cmd response 0x900, card status 0x0
[   38.405854] mmcblk0: retrying using single block read
[   38.830921] mmcblk0: retrying because a re-tune was needed
[   39.124857] dwmmc_rockchip ff520000.rksdmmc: Successfully tuned phase to 135
[   39.864829] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[   40.030909] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[   40.030923] mmcblk0: error -110 transferring data, sector 5127832, nr 56, cmd response 0x900, card status 0xb00
[   40.031846] mmcblk0: retrying using single block read

Seeing similar errors after reboot(s) made me really concerned. I tried my google-fu, but with limited success: 
  • Googling the "dwmmc [...] Unexpected command timeout [...]" line did not bring up any results.
  • Googling the "error -115" line also did not yield useful hints (or I did not recognise them as such).
  • Googling the "error -110" line was to some extent successful:
I stumbled across an NXP forum thread discussing a very similar error message. Their accepted answer was that the SD card was "too fast". I don't know whether this is correct or not, and their problem description is also missing the "error -115". (And I am using the 32GB eMMC module, not an SD card.)
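
For anyone else searching for these numbers: they look like negated Linux errno values. Below is a minimal sketch for looking them up, assuming the standard Linux numbering (110 = ETIMEDOUT, 115 = EINPROGRESS). The lookup table and the describe_mmc_error helper are made up for illustration and are not part of any kernel tool. What the driver actually means by each code in this situation is a separate question that this does not answer.

```python
import re

# Assumption: standard Linux errno numbering. The table is hard-coded so the
# result does not depend on the operating system running this script.
LINUX_ERRNO = {
    110: ("ETIMEDOUT", "Connection timed out"),
    115: ("EINPROGRESS", "Operation now in progress"),
}

# Matches the two mmcblk error messages quoted in this thread.
ERROR_RE = re.compile(r"error (-\d+) (sending stop command|transferring data)")


def describe_mmc_error(line):
    """Return (errno name, description, failed step) for an mmcblk error line.

    Returns None if the line is not one of the two known error messages.
    """
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = -int(match.group(1))  # "-110" -> 110
    name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in the table above"))
    return name, text, match.group(2)


if __name__ == "__main__":
    sample = ("[   23.991933] mmcblk0: error -110 transferring data, "
              "sector 3279888, nr 256, cmd response 0x900, card status 0xb00")
    print(describe_mmc_error(sample))
```

So under that assumption the "-110" lines are timeouts, and the "-115" lines say that the stop command was still marked as in progress when the driver gave up on it.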

While playing around with the rock64 and dmesg, I made some further observations: 
  • When idling, usually no further errors occur. When doing stuff (opening programs, ...), errors continue.
  • The number of errors is random. I even had one completely error-free boot.
  • In the dmesg output, at ~305 sec runtime, I get three EXT4-fs entries reporting the number of errors since the last fsck.
  • Only the "-110" error is colored red, the "-115" is not. (Maybe less dangerous?)
 
To clarify: I have a tiny bit of Linux experience, but really tiny. Someone showed me "dmesg", and now all I know about errors and logging in Linux is dmesg. I also have a Raspberry Pi, and installed Owncloud on it (by following a tutorial). But that's really where my abilities end. For the most part I have no idea what's going on; it feels like stumbling around in the fog. But I am willing to learn about Linux and the Rock64 (even though I really should invest my time in other things right now).

I would be very grateful if someone could shed some light on these errors.

-ac

_______
Side note to board administrators: I could not decide whether to put this thread under "Linux" or under "Hardware", so I ended up in "General Discussion" - just to be on the wrong side either way. Feel free to move this thread around. I am very new to forums (at least as an active participant) and I am sorry.


RE: Receiving errors about eMMC (?) - acasta - 10-19-2017

...A small update:

Over the last few days, I kept an eye on this issue: restarted the Rock64 several times, let it sit idle, ran some tasks, then let it idle again...

All I found was: 
  • Doing tasks that access flash (e.g. loading programs) triggers errors.
  • Doing cpu-bound tasks (e.g. some lightweight linear algebra in iPython) creates very few errors.
  • Error distribution over time fluctuates heavily (some boot-ups without any errors, on other occasions the log is bursting with errors). 
...but none of this comes unexpectedly.
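
To put numbers on "bursting with errors" instead of eyeballing the log, something like the following could be used. It is only a sketch: the patterns are copied from the messages quoted in this thread, while the category names, the tally function and the 60-second bucket size are my own choices.

```python
import re
from collections import Counter

# Patterns copied from the log lines quoted in this thread.
# The category names on the left are arbitrary labels.
PATTERNS = {
    "command_timeout": re.compile(r"Unexpected command timeout"),
    "stop_error": re.compile(r"error -\d+ sending stop command"),
    "data_error": re.compile(r"error -\d+ transferring data"),
    "single_block_retry": re.compile(r"retrying using single block read"),
    "retune": re.compile(r"Successfully tuned phase to \d+"),
}

# Kernel timestamp such as "[   23.781898]" (seconds since boot).
TIMESTAMP_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def tally(lines, bucket_seconds=60):
    """Count matching messages per category and per time bucket."""
    kinds = Counter()
    per_bucket = Counter()
    for line in lines:
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                kinds[kind] += 1
                stamp = TIMESTAMP_RE.search(line)
                if stamp:
                    bucket = int(float(stamp.group(1)) // bucket_seconds)
                    per_bucket[bucket] += 1
                break  # each line belongs to at most one category
    return kinds, per_bucket


if __name__ == "__main__":
    sample = [
        "[   23.781898] mmcblk0: retrying using single block read",
        "[   23.825835] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3",
        "[   23.991919] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00",
        "[   23.991933] mmcblk0: error -110 transferring data, sector 3279888, nr 256, cmd response 0x900, card status 0xb00",
    ]
    kinds, per_bucket = tally(sample)
    for kind, count in sorted(kinds.items()):
        print(kind, count)
    for bucket, count in sorted(per_bucket.items()):
        print("minute", bucket, "->", count, "messages")
```

The sample list could be swapped for the lines of a saved dmesg dump. Comparing the per-minute counts of an idle period with those of a period of heavy flash access would show whether the two observations above really hold.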


For the last two days, the Rock64 was just idling on my desk while I was out of town, and this is what I came back to:

Code:
[  305.218470] EXT4-fs (mmcblk0p7): error count since last fsck: 5
[  305.218553] EXT4-fs (mmcblk0p7): initial error at time 4: ext4_journal_check_start:56
[  305.218592] EXT4-fs (mmcblk0p7): last error at time 1507828026: ext4_find_entry:1450: inode 6816
[86809.295631] EXT4-fs (mmcblk0p7): error count since last fsck: 5
[86809.295703] EXT4-fs (mmcblk0p7): initial error at time 4: ext4_journal_check_start:56
[86809.295741] EXT4-fs (mmcblk0p7): last error at time 1507828026: ext4_find_entry:1450: inode 6816
[89235.510964] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[89235.677149] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[89235.677179] mmcblk0: error -110 transferring data, sector 6998016, nr 256, cmd response 0x900, card status 0xb00
[89235.678237] mmcblk0: retrying using single block read
[173318.501166] EXT4-fs (mmcblk0p7): error count since last fsck: 5
[173318.501237] EXT4-fs (mmcblk0p7): initial error at time 4: ext4_journal_check_start:56
[173318.501275] EXT4-fs (mmcblk0p7): last error at time 1507828026: ext4_find_entry:1450: inode 6816


Again, nothing out of the ordinary. The EXT4-fs messages, 300s after boot and then daily afterwards, are well known. One error while idling for two days. That's nearly okay.
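
A side note on the "time" values in those EXT4-fs lines: as far as I can tell they are plain Unix timestamps (seconds since 1970-01-01 UTC), so they can be turned into dates. A small sketch under that assumption; ext4_error_time is just an illustrative helper name:

```python
from datetime import datetime, timezone


def ext4_error_time(seconds):
    """Interpret the 'time' field of an EXT4-fs error line as a Unix timestamp (UTC)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(ext4_error_time(1507828026).isoformat())  # 2017-10-12T17:07:06+00:00
print(ext4_error_time(4).isoformat())           # 1970-01-01T00:00:04+00:00
```

If that reading is right, the last recorded filesystem error dates from the day of the first post, and the counter has stayed at 5 since then. The "time 4" of the initial error would then simply mean that the system clock had not been set yet when it happened.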

Then, I did some tinkering (copying files, repeatedly loading/exiting ipython):

Code:
[192228.340598] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192228.506799] mmcblk0: error -110 transferring data, sector 2108560, nr 512, cmd response 0x900, card status 0x0
[192228.507897] mmcblk0: retrying using single block read
[192228.596614] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192228.762782] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192228.762812] mmcblk0: error -110 transferring data, sector 2110096, nr 512, cmd response 0x900, card status 0xb00
[192228.763946] mmcblk0: retrying using single block read
[192228.856611] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192229.022789] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192229.022818] mmcblk0: error -110 transferring data, sector 2112656, nr 512, cmd response 0x900, card status 0xb00
[192229.023939] mmcblk0: retrying using single block read
[192229.148618] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192229.314785] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192229.314816] mmcblk0: error -110 transferring data, sector 2264064, nr 512, cmd response 0x900, card status 0xb00
[192229.315958] mmcblk0: retrying using single block read
[192229.424619] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192229.590822] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192229.590852] mmcblk0: error -110 transferring data, sector 2270208, nr 512, cmd response 0x900, card status 0xb00
[192229.591988] mmcblk0: retrying using single block read
[192229.679625] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192229.845811] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192229.845840] mmcblk0: error -110 transferring data, sector 2271232, nr 512, cmd response 0x900, card status 0xb00
[192229.846975] mmcblk0: retrying using single block read
[192229.936631] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192230.102809] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192230.102838] mmcblk0: error -110 transferring data, sector 2273280, nr 512, cmd response 0x900, card status 0xb00
[192230.103979] mmcblk0: retrying using single block read
[192230.192634] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192230.358806] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192230.358835] mmcblk0: error -110 transferring data, sector 2274816, nr 512, cmd response 0x900, card status 0xb00
[192230.359968] mmcblk0: retrying using single block read
[192230.583658] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192230.749834] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192230.749864] mmcblk0: error -110 transferring data, sector 2176512, nr 512, cmd response 0x900, card status 0xb00
[192230.750994] mmcblk0: retrying using single block read
[192230.868652] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192231.034829] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192231.034860] mmcblk0: error -110 transferring data, sector 2184704, nr 512, cmd response 0x900, card status 0xb00
[192231.035993] mmcblk0: retrying using single block read
[192231.108742] mmcblk0: retrying because a re-tune was needed
[192231.310628] dwmmc_rockchip ff520000.rksdmmc: Successfully tuned phase to 135
[192231.483668] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192231.649842] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192231.649873] mmcblk0: error -110 transferring data, sector 2352128, nr 512, cmd response 0x900, card status 0xb00
[192231.651007] mmcblk0: retrying using single block read
[192231.738666] dwmmc_rockchip ff520000.rksdmmc: Unexpected command timeout, state 3
[192231.904848] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
[192231.904878] mmcblk0: error -110 transferring data, sector 2353152, nr 512, cmd response 0x900, card status 0xb00

...And there we are again. The r/w speed while copying files is okay-ish, I guess: using rsync (I have no idea whether this is appropriate for benchmarking) I get more than 50 MB/s. As source and target device are both mmcblk0, I do not expect any world record values. But, being used to class 4 SD cards, I am quite happy.

The only thing that bothers me is my fear of the eMMC dying or getting corrupted. It would make me more than happy if someone could be so kind as to explain to me what's going on with those errors.

Thanks, -ac


RE: Receiving errors about eMMC (?) - gusarg81 - 08-02-2018

Hi!

    I am having the exact same problems, and both my Rock64 and the eMMC module (32GB) are brand new!

Quote:...
Aug  3 00:56:52 home kernel: [ 8038.800524] dwmmc_rockchip ff520000.dwmmc: Unexpected command timeout, state 3
Aug  3 00:56:52 home kernel: [ 8038.966556] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
Aug  3 00:56:52 home kernel: [ 8038.973013] mmcblk0: error -110 transferring data, sector 21641944, nr 16, cmd response 0x900, card status 0xb00
Aug  3 00:56:52 home kernel: [ 8038.979558] mmcblk0: retrying using single block read
Aug  3 00:56:57 home kernel: [ 8043.324614] dwmmc_rockchip ff520000.dwmmc: Unexpected command timeout, state 3
Aug  3 00:56:57 home kernel: [ 8043.490705] mmcblk0: error -115 sending stop command, original cmd response 0x900, card status 0xb00
Aug  3 00:56:57 home kernel: [ 8043.497268] mmcblk0: error -110 transferring data, sector 21640752, nr 8, cmd response 0x900, card status 0xb00
Aug  3 00:56:57 home kernel: [ 8043.503919] mmcblk0: retrying using single block read
Aug  3 00:57:09 home kernel: [ 8055.523671] mmcblk0: error -110 sending stop command, original cmd response 0x900, card status 0x400900
Aug  3 00:57:09 home kernel: [ 8055.530184] mmcblk0: retrying because a re-tune was needed
Aug  3 00:57:09 home kernel: [ 8056.008556] dwmmc_rockchip ff520000.dwmmc: Successfully tuned phase to 104
Aug  3 00:57:15 home kernel: [ 8061.275323] mmcblk0: retrying because a re-tune was needed
Aug  3 00:57:15 home kernel: [ 8061.752670] dwmmc_rockchip ff520000.dwmmc: Successfully tuned phase to 85
...


Too many lines like these... If I had to analyze this, it looks like the typical output of a bad drive, which in this case is the eMMC. But it seems I am not alone, so maybe it is something with the Rock64 hardware?

By the way, I am using Debian Stretch (from an OpenMediaVault image).

Any ideas? Thanks.


RE: Receiving errors about eMMC (?) - mikekehrli - 08-15-2018

I'm getting the same errors too on my rock64. I moved my root partitions to an SSD, so the only card partition still mounted is /dev/mmcblk1p6, at /boot/efi. But my error messages are happening on mmcblk0. I'm not sure why the kernel is trying to access that card at all. But there is definitely some problem with the rock64's ability to access memory cards. This appears to be hardware related, although there could well be a software fix. Maybe a timing issue? I'm not qualified to say. But there are a number of reports of this problem even though new cards are used. Mine is a new Samsung 32GB Evo.
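
One way to double-check which partitions of a given device are actually mounted is to look at /proc/mounts. The sketch below parses text in that format. The helper name and the sample content are hypothetical, and it only covers mounts, so reads caused by swap or by the kernel scanning the partition table would not show up here.

```python
def mounts_on_device(mounts_text, device_prefix):
    """Return (device, mount point, fs type) for mounts whose source starts with device_prefix."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith(device_prefix):
            result.append((fields[0], fields[1], fields[2]))
    return result


if __name__ == "__main__":
    # Hypothetical sample in /proc/mounts format.
    # On a real system, use open("/proc/mounts").read() instead.
    sample = (
        "/dev/sda1 / ext4 rw,relatime 0 0\n"
        "/dev/mmcblk1p6 /boot/efi vfat rw,relatime 0 0\n"
        "tmpfs /run tmpfs rw,nosuid 0 0\n"
    )
    print(mounts_on_device(sample, "/dev/mmcblk0"))
    print(mounts_on_device(sample, "/dev/mmcblk1"))
```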


RE: Receiving errors about eMMC (?) - mikekehrli - 08-15-2018

Acasta, you mentioned your SD card, but mmcblk0 is the eMMC card, not the SD card.  I'm going to try taking mine out and see if it works better on the SD card. I've moved root to an external SSD but the kernel is using the eMMC card for tmpfs. That seems to be where the errors are occurring.


RE: Receiving errors about eMMC (?) - asavah - 08-15-2018

I've been fighting this for a week.
Conclusions:
1) Ayufan's kernel has the eMMC frequency set too high to be stable on some (most?) eMMC modules. https://github.com/ayufan-rock64/linux-kernel/blob/release-4.4/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts#L234 should be 150000000, same as in the rockchip-linux repo.
2) The Foresee eMMC modules sold in the shop are CRAP. I'm currently using a hardkernel (odroid-c2) eMMC, which is made from Samsung flash chips, runs like a charm and is at least two times faster than the Foresee (subjective, I don't have any benchmark results handy to back this claim).

Note: I'm no longer using ayufan's kernels, I build my own from rockchip-linux bsp repo plus tweaked dtsi/dts.
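
To check what a running kernel was actually booted with, the max-frequency property can be read back from the live device tree somewhere under /proc/device-tree/. Such properties are stored as raw 32-bit big-endian cells, so they need decoding. A minimal sketch of the decoding step only; decode_dt_u32 is an illustrative name, and the exact path of the eMMC node depends on the dts and is not spelled out here.

```python
import struct


def decode_dt_u32(raw):
    """Decode a single-cell device-tree property (one 32-bit big-endian integer)."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes, got %d" % len(raw))
    return struct.unpack(">I", raw)[0]


if __name__ == "__main__":
    # 150000000 Hz as it would be stored in a device-tree cell.
    print(decode_dt_u32(bytes.fromhex("08f0d180")))  # 150000000
```

On the board itself, the bytes would come from reading the property file in binary mode, e.g. open(path, "rb").read(), with path pointing at the max-frequency file of the eMMC controller node.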


RE: Receiving errors about eMMC (?) - mikekehrli - 08-15-2018

Hold that thought, Asavah. I took mine apart, used contact cleaner on the eMMC chip and reinstalled it.  I've had no errors now for about 2-1/2 hours. I'm running a process that uses 7.7 GB, so errors should be showing up. So far so good.  I'll post again in a day if it continues error free.


RE: Receiving errors about eMMC (?) - mikekehrli - 08-15-2018

(08-15-2018, 03:09 PM)asavah Wrote: I've been fighting this for a week.
Conclusions:
1) Ayufan's kernel has the eMMC frequency set too high to be stable on some (most?) eMMC modules. https://github.com/ayufan-rock64/linux-kernel/blob/release-4.4/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts#L234 should be 150000000, same as in the rockchip-linux repo.
2) The Foresee eMMC modules sold in the shop are CRAP. I'm currently using a hardkernel (odroid-c2) eMMC, which is made from Samsung flash chips, runs like a charm and is at least two times faster than the Foresee (subjective, I don't have any benchmark results handy to back this claim).

Note: I'm no longer using ayufan's kernels, I build my own from rockchip-linux bsp repo plus tweaked dtsi/dts.

Mine is a Foresee 16 GB.  Again, I'll report back if the contact cleaner fix holds up. I would be interested in some actual memory benchmark comparisons.  But in general I have a lot of confidence in Samsung chips.

I looked at the odroid site. Are all of those versions to do with the pre-installed OS? I would think that an emmc chip should be standard and work in any device like an SD card. Am I missing something?


RE: Receiving errors about eMMC (?) - asavah - 08-16-2018

(08-15-2018, 05:51 PM)mikekehrli Wrote: Hold that thought, Asavah. I took mine apart, used contact cleaner on the eMMC chip and reinstalled it.  I've had no errors now for about 2-1/2 hours. I'm running a process that uses 7.7 GB, so errors should be showing up. So far so good.  I'll post again in a day if it continues error free.

Nope, tried that too. I was having issues with the odroid eMMC too until I changed max-frequency to 150000000.
I was having all types of errors: random hangs, hangs on soft reboot, etc.
I've even tried different u-boots (ayufan vs rockchip vs mainline); currently I stick with ayufan's.

(08-15-2018, 06:05 PM)mikekehrli Wrote: I looked at the odroid site. Are all of those versions to do with the pre-installed OS? I would think that an emmc chip should be standard and work in any device like an SD card. Am I missing something?

Nope, there might be differences in the eMMC module PCB which could cause issues.
Yep, the odroid eMMC comes with an OS preinstalled, but it also includes a very nice free SD-to-eMMC adapter, making it very easy to flash anything.
Edit: now their site states that the adapter is sold separately. Mine came bundled with the eMMC; the odroid-c2 was being used as a paperweight anyway (poor support, crap hacky ancient kernel).
See https://forum.pine64.org/showthread.php?tid=6279
Mine is https://www.hardkernel.com/main/products/prdt_info.php?g_code=G145622510341 (16GB); it came with Android, which I nuked from orbit with dd.


RE: Receiving errors about eMMC (?) - mikekehrli - 08-17-2018

Ok, thanks a lot.  That helped.

I'm still having stability issues with my rock64. It keeps crashing for no apparent reason and no help in any of the logs. I would say it crashes after about 10-12 hours of run time. So, I can't use it yet.  I took the emmc card out and am booting from SD card, but root partition is on a USB ssd. I've heard of stability issues with the usb3 port. Currently I'm testing it by writing 1 GB files back and forth between ssd and SD partitions. That's been running for 1/2 hour without incident.
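
For reference, a back-and-forth copy test like this can also verify the data instead of only watching for crashes. The sketch below is not the script I'm running, just an illustration of the idea: all names are made up, and the two directories would have to point at the SSD and the SD card partitions. Note that reads may be served from the page cache rather than from the device, so a clean run is weaker evidence than a failed one.

```python
import hashlib
import os
import shutil
import tempfile


def write_random_file(path, size_bytes, chunk_size=1 << 20):
    """Write size_bytes of random data in chunks and flush it to the device."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(dir_a, dir_b, size_bytes, rounds=1):
    """Copy a random file from dir_a to dir_b and back; return the number of bad rounds."""
    src = os.path.join(dir_a, "stress_src.bin")
    dst = os.path.join(dir_b, "stress_dst.bin")
    back = os.path.join(dir_a, "stress_back.bin")
    bad_rounds = 0
    try:
        for _ in range(rounds):
            write_random_file(src, size_bytes)
            expected = sha256_of(src)
            shutil.copyfile(src, dst)
            shutil.copyfile(dst, back)
            if sha256_of(dst) != expected or sha256_of(back) != expected:
                bad_rounds += 1
    finally:
        # Clean up the test files even if a copy fails.
        for path in (src, dst, back):
            if os.path.exists(path):
                os.remove(path)
    return bad_rounds


if __name__ == "__main__":
    # Demo with two temporary directories and a small file.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        print("bad rounds:", copy_and_verify(a, b, size_bytes=1 << 20, rounds=2))
```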

I want to get to the bottom of this.  I'm not doing anything fancy with it, but it keeps dying on me.  I'm using debian from ayufan too. I've gotta think there is something wrong at kernel level for it to die without logging anything.