PINE64
Help debug my pine64A+ ubuntu/plex hangs


Help debug my pine64A+ ubuntu/plex hangs - XaRz - 09-18-2017

Hello, 

I'm trying to understand why my pine64+ keeps hanging after 3 days or so. 

I'm using it as a Plex server (direct play) and it works very well, but at some point it becomes unreachable from the network, and I suspect the board hangs. How can I debug the problem? Which logs can help me?
Maybe a temperature problem?
Any hint would be great.

I've installed Ubuntu and Plex as this guide says: http://jez.me/article/plex-server-on-a-pine64-how-to. I also have rpi-monitor installed, and sometimes the temperature is near 90ºC, but normally it is 70-75ºC while playing.

Thanks in advance.


Attachment: logs.zip (Size: 44.09 KB)


RE: Help debug my pine64A+ ubuntu/plex hangs - MarkHaysHarris777 - 09-18-2017

(09-18-2017, 05:17 PM)XaRz Wrote: I'm trying to understand why my pine64+ keeps hanging after 3 days or so ... when it became unreachable from the network and I suspect the board hangs. How can I debug the problem? 
Any hint would be great.

One simple thing you can do is set up a heartbeat monitor ( blinking LED ) on the system LED pins, directly next to the IR port.  Use the code below:

sysled_heartbeat.sh

Code:
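#!/bin/sh
# Blink an LED on the GPIO number given as the first argument.
# Usage: sudo ./sysled_heartbeat.sh 359

GPIO=$1

# Export the pin and configure it as an output.
echo "$GPIO" > /sys/class/gpio/export
echo out > "/sys/class/gpio/gpio$GPIO/direction"

# Release the pin when the script is stopped ( Ctrl-C or kill ).
trap 'echo "$GPIO" > /sys/class/gpio/unexport; exit 0' INT TERM

# Blink until the script is stopped; if the OS hangs, the blinking stops.
while true; do
   echo 0 > "/sys/class/gpio/gpio$GPIO/value"
   sleep .35
   echo 1 > "/sys/class/gpio/gpio$GPIO/value"
   sleep .65
done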


You can run this script with sudo:

      sudo  ./sysled_heartbeat.sh  359

This assumes you have the system LED plugged in;  the ballast resistor is built-in;  use a 3mm low power LED ( white is nice )

The idea is simple,  if the board hangs the light will stop blinking.  The blinking light requires the OS to be functional to provide the ON|OFF sleep cycles.

I suspect your board is NOT hanging ( the light will prove that ).  More likely your network connection has dropped for some reason... is this a wifi connection ?  If so, the wifi will sometimes shut down in idle states to conserve power, and the connection will drop.  Sometimes the eth connection will do the same.  One way to put a stop to this temporarily, while you're getting a handle on the problem, is to set up a script that wakes up once every few minutes and sends three pings to your router ( use crontab ).  Another thing you can do is send part ( or all ) of your dmesg log to another machine using scp every so many minutes ( say thirty ).
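
A minimal sketch of such a script follows. It is only an illustration: the script name keepalive.sh, the log path /var/tmp/keepalive.log, the router address and the scp destination are all assumptions, and the scp step assumes key-based SSH login is already set up so that cron can run it without a password prompt.

```shell
#!/bin/sh
# keepalive.sh (hypothetical name): ping the router and, optionally,
# copy the kernel log to another machine.
# Usage: keepalive.sh ROUTER_IP [SCP_DESTINATION]

# Build one log line from a target ($1) and a ping exit status ($2).
log_line() {
    if [ "$2" -eq 0 ]; then state="reachable"; else state="UNREACHABLE"; fi
    echo "$(date '+%Y-%m-%d %T') $1 $state"
}

# Send three pings to the target; the return value is ping's exit status.
check_link() {
    ping -c 3 -W 2 "$1" > /dev/null 2>&1
}

# Only act when a router address is given on the command line.
if [ $# -ge 1 ]; then
    ROUTER=$1        # e.g. 192.168.1.1 (assumed address)
    DEST=${2:-}      # e.g. user@otherhost:pine64-dmesg.log (assumed)

    check_link "$ROUTER"
    status=$?
    log_line "$ROUTER" "$status" >> /var/tmp/keepalive.log

    # Copy the current kernel log elsewhere so it survives a hang.
    if [ -n "$DEST" ]; then
        dmesg > /tmp/dmesg.txt && scp -q /tmp/dmesg.txt "$DEST"
    fi
fi
```

With crontab -e, a line such as "*/5 * * * * /home/pere/bin/keepalive.sh 192.168.1.1" would send the pings every five minutes, and a second line starting with "*/30" that also passes a destination would ship the dmesg log every thirty.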

If your OS is hanging something is really wrong ( probably corrupted SD card ).  My PineA64+ boards ( both of them ) run 24-7-365 ... I rarely reboot them and they have both been running for several months now.  I have heart-beat monitors on all my boards, and I have a function monitor on my main server.


RE: Help debug my pine64A+ ubuntu/plex hangs - XaRz - 09-19-2017

Thanks for all the information.
Reading your ideas, I remembered that I have a quicker way to pin down the problem: a spare 7" TFT for troubleshooting whether it's the board or the OS.

I'll post my investigations and my outcomes!

Thanks again!


RE: Help debug my pine64A+ ubuntu/plex hangs - XaRz - 09-21-2017

Ok, now I'm fighting with cron to run a script for resetting eth0, because I suspected that the problem was eth0 dropping. The script is this:
Code:
#!/bin/bash
# Check whether eth0 has an IPv4 address; if not, try to bring it back up.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/pere/bin
LOGFILE=/home/pere/network-monitor.log

if ifconfig eth0 | grep -q "inet addr:" ;
then
      echo "$(date "+%m %d %Y %T") : Ethernet OK" >> "$LOGFILE"
else
      echo "$(date "+%m %d %Y %T") : Ethernet connection down! Attempting reconnection." >> "$LOGFILE"
      ifup --force eth0
      OUT=$? # save exit status of ifup to decide what to do next
      if [ $OUT -eq 0 ] ; then
              STATE=$(ifconfig eth0 | grep "inet addr:")
              echo "$(date "+%m %d %Y %T") : Network connection reset. Current state is" $STATE >> "$LOGFILE"
      else
              echo "$(date "+%m %d %Y %T") : Failed to reset ethernet connection" >> "$LOGFILE"
      fi
fi



But while debugging why cron is not executing it, I've seen these lines in syslog:

Code:
Sep 21 10:51:18 pine64 kernel: [  114.459270] Mali: Set gpu frequency to 144 MHz
Sep 21 10:51:19 pine64 kernel: [  115.440931] CPU Budget:update CPU 0 cpufreq max to 1056000 min to 480000
Sep 21 10:51:19 pine64 kernel: [  115.440967] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 10:51:19 pine64 kernel: [  115.440981] gpu cooling callback set freq limit 360
Sep 21 10:51:19 pine64 kernel: [  115.441037] Mali: Set gpu frequency to 360 MHz
Sep 21 10:51:19 pine64 kernel: [  115.932942] CPU Budget:update CPU 0 cpufreq max to 1008000 min to 480000
Sep 21 10:51:19 pine64 kernel: [  115.935187] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 10:51:19 pine64 kernel: [  115.935201] gpu cooling callback set freq limit 144
Sep 21 10:51:19 pine64 kernel: [  115.935256] Mali: Set gpu frequency to 144 MHz

Is there a heat problem with my pine64A+?

By the way, can anyone help me understand why cron is not executing this script?

I set this up with the command: sudo crontab -e


Code:
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command

*/1 * * * *  root /bin/bash /home/pere/bin/./network-monitor.sh



RE: Help debug my pine64A+ ubuntu/plex hangs - XaRz - 09-21-2017

Ok, solved the cron problem:

There is no need for root in the crontab line if I just edit it with sudo crontab -e.
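
For reference, the two crontab formats differ by one field. In a crontab edited with crontab -e, everything after the five time fields is the command, so a leading "root" is itself run as a command and fails. Only the system-wide files ( /etc/crontab and files under /etc/cron.d ) take a user field. The lines below are an illustration using the script path from this thread:

```text
# User crontab ( crontab -e or sudo crontab -e ): no user field.
# m h  dom mon dow   command
* * * * *  /bin/bash /home/pere/bin/network-monitor.sh

# System crontab ( /etc/crontab or /etc/cron.d/* ): a user field precedes the command.
# m h  dom mon dow   user  command
* * * * *  root  /bin/bash /home/pere/bin/network-monitor.sh
```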

Now what about my syslog? Is it a heat problem?

Sep 21 18:54:01 pine64 CRON[2144]: (root) CMD (bash /home/pere/bin/network-monitor.sh)
Sep 21 18:54:19 pine64 kernel: [28631.085162] CPU Budget:update CPU 0 cpufreq max to 1104000 min to 480000
Sep 21 18:54:19 pine64 kernel: [28631.087406] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 18:54:19 pine64 kernel: [28631.087429] CPU Budget:update CPU 0 cpufreq max to 1056000 min to 480000
Sep 21 18:54:19 pine64 kernel: [28631.089643] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 18:54:19 pine64 kernel: [28631.089657] gpu cooling callback set freq limit 360
Sep 21 18:54:19 pine64 kernel: [28631.089713] Mali: Set gpu frequency to 360 MHz
Sep 21 18:54:20 pine64 kernel: [28632.069082] CPU Budget:update CPU 0 cpufreq max to 1104000 min to 480000
Sep 21 18:54:20 pine64 kernel: [28632.069116] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 18:54:20 pine64 kernel: [28632.069131] gpu cooling callback set freq limit 0
Sep 21 18:54:20 pine64 kernel: [28632.069188] Mali: Set gpu frequency to 408 MHz
Sep 21 18:54:22 pine64 kernel: [28634.037028] CPU Budget hotplug: cluster0 min:0 max:4
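
The "CPU Budget:update" lines above appear to show the kernel moving the maximum CPU frequency up and down, which is what thermal throttling would look like. One way to see how often and how far the cap drops is to pull just those lines out of syslog. This is only a sketch: the script name throttle_summary.sh is made up, it assumes the exact message format shown above, and it assumes the values are in kHz as is usual for cpufreq.

```shell
#!/bin/sh
# throttle_summary.sh (hypothetical name): list CPU frequency cap changes
# found in one or more syslog files, or on standard input.

throttle_summary() {
    # For each "CPU Budget:update" line, print the timestamp and the
    # new maximum frequency converted from kHz to MHz.
    awk '/CPU Budget:update/ {
        for (i = 1; i <= NF; i++)
            if ($i == "max" && $(i+1) == "to")
                printf "%s %s %s %d MHz\n", $1, $2, $3, $(i+2) / 1000
    }' "$@"
}

# Only run when at least one log file is given on the command line.
if [ $# -ge 1 ]; then
    throttle_summary "$@"
fi
```

Running it as ./throttle_summary.sh /var/log/syslog prints one line per cap change, which makes it easier to compare busy and idle periods.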


RE: Help debug my pine64A+ ubuntu/plex hangs - MarkHaysHarris777 - 09-21-2017

(09-21-2017, 10:55 AM)XaRz Wrote: Ok solved the cron

no need of root in the crontab line if I just edit with sudo crontab -e.

Now what about my syslog? it's a heat problem?


Usually all that is needed is a passive cooling device;  generally a 14mm x 14mm  aluminum heatsink with 3M thermal tape adhesive.   All of my boards are being used as servers of one type or another , and all of them have active cooling;  soft-pwm driven 5v brushless fan on either a PN2222 or 2N2222 transistor;  a 4N35 optical coupler may be used as well. 

The lack of passive cooling may or may not have anything to do with the eth drop;  throttling can contribute to networking problems, but not necessarily.   Try the heatsink first ( any Raspberry PI heatsink will do ) and see what happens, then decide if you need active cooling also;  if you do, I can help you set that up.


RE: Help debug my pine64A+ ubuntu/plex hangs - XaRz - 09-25-2017

Thanks. I've ordered the parts.

I'll post my results here when they are applied!

Thanks for everything!


RE: Help debug my pine64A+ ubuntu/plex hangs - XaRz - 10-19-2017

Well, yesterday I finally applied the passive cooling device: a 14x14mm aluminum heatsink with 3M thermal tape.

And the system hung at 01:55 AM with no one using Plex (so no high temperatures then). I don't know how to find out why my pine64 A+ keeps hanging.

Any hints?
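
One low-effort way to narrow down when a hang happens is to have cron append a status line to a file every minute; after a hang, the last line shows when the system was last alive and how loaded it was. The sketch below is only an illustration: the script name alive_log.sh and the log path passed to it are assumptions.

```shell
#!/bin/sh
# alive_log.sh (hypothetical name): append a timestamped status line to a file.
# Usage: alive_log.sh /path/to/logfile

status_line() {
    # Uptime in seconds and the 1-minute load average, read from /proc.
    up=$(cut -d' ' -f1 /proc/uptime)
    load=$(cut -d' ' -f1 /proc/loadavg)
    echo "$(date '+%Y-%m-%d %T') uptime=${up}s load1=${load}"
}

# Only write when a log file path is given on the command line.
if [ $# -ge 1 ]; then
    status_line >> "$1"
    sync    # flush to the SD card so the line survives a hard hang
fi
```

A crontab -e entry such as "* * * * * /home/pere/bin/alive_log.sh /home/pere/alive.log" would run it once a minute.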


RE: Help debug my pine64A+ ubuntu/plex hangs - XaRz - 11-02-2017

Seems it was a false alarm!!!

6 days non-stop and counting! It seems that cooling was the problem in the end.

Thanks!