Help debug my pine64A+ ubuntu/plex hangs
#1
Hello, 

I'm trying to understand why my Pine64+ keeps hanging after three days or so.

I'm using it as a Plex server (direct play) and it works very well, but after a while it becomes unreachable from the network, and I suspect the board hangs. How can I debug the problem? Which logs can help me?
Maybe a temperature problem? 
Any hint would be great.

I installed Ubuntu and Plex following this guide: http://jez.me/article/plex-server-on-a-pine64-how-to. I also have rpi-monitor installed; the temperature is sometimes near 90ºC, but it is normally 70-75ºC during playback.

Thanks in advance.


Attachment: logs.zip (Size: 44.09 KB)
#2
(09-18-2017, 05:17 PM)XaRz Wrote: I'm trying to understand why my pine64+ keeps hanging after 3 days or so ... when it became unreachable from the network and I suspect the board hangs. How can I debug the problem?
Any hint would be great.

One simple thing you can do is to set up a heartbeat monitor (a blinking LED) on the system LED pins, directly next to the IR port. Use the code below:

sysled_heartbeat.sh

Code:
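#!/bin/sh
# Blink an LED on the GPIO pin given as $1 for as long as the OS is running.

GPIO=$1
echo "$GPIO" > /sys/class/gpio/export
echo out > "/sys/class/gpio/gpio$GPIO/direction"

# Release the pin when the script is stopped (Ctrl-C or kill).
trap 'echo "$GPIO" > /sys/class/gpio/unexport; exit 0' INT TERM

# Loop forever; the heartbeat should only stop if the OS does.
while true; do
   echo 0 > "/sys/class/gpio/gpio$GPIO/value"
   sleep .35
   echo 1 > "/sys/class/gpio/gpio$GPIO/value"
   sleep .65
done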


You can run this code with sudo:

      sudo ./sysled_heartbeat.sh 359

This assumes you have the system LED plugged in. The ballast resistor is built in; use a 3mm low-power LED (white is nice).

The idea is simple: if the board hangs, the light will stop blinking. The blinking light requires the OS to be functional to provide the on/off sleep cycles.

I suspect your board is NOT hanging (the light will prove that). More likely your network connection has dropped for some reason. Is this a wifi connection? If so, the wifi will sometimes shut down in idle states to conserve power, and the connection will drop. Sometimes the eth connection will do the same.

One way to put a stop to this temporarily, while you're getting a handle on the problem, is to set up a script that wakes up once every few minutes and sends three pings to your router (use crontab). Another thing you can do is to send part (or all) of your dmesg log to another machine using scp every so many minutes (say thirty).
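
For the ping idea, a minimal sketch might look like the following. It assumes the Linux iputils ping (the -c and -W options). The script name keepalive.sh, the log path, and the router address in the usage line are placeholders for illustration, not details from this thread.

```shell
#!/bin/sh
# keepalive.sh (hypothetical name): ping a host three times and log the result.
# LOG is an assumed path; pick any writable file.
LOG="${LOG:-/tmp/keepalive.log}"

# check_link HOST: three pings, 2-second wait per reply (Linux iputils options).
check_link() {
    if ping -c 3 -W 2 "$1" > /dev/null 2>&1; then
        echo "$(date '+%F %T') : $1 reachable" >> "$LOG"
    else
        echo "$(date '+%F %T') : $1 NOT reachable" >> "$LOG"
        return 1
    fi
}

# Only ping when a host is given on the command line.
if [ -n "${1:-}" ]; then
    check_link "$1"
fi
```

Scheduled from root's crontab with something like `*/5 * * * * /bin/sh /path/to/keepalive.sh 192.168.1.1`, the log also helps tell the two failure modes apart: if "NOT reachable" entries keep arriving, the OS is alive and only the link is down; if the entries stop altogether, the board really did hang.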

If your OS is hanging, something is really wrong (probably a corrupted SD card). My PineA64+ boards (both of them) run 24-7-365. I rarely reboot them, and they have both been running for several months now. I have heartbeat monitors on all my boards, and I have a function monitor on my main server.
marcushh777    Cool

please join us for a chat @  irc.pine64.xyz:6667   or ssl  irc.pine64.xyz:6697

( I regret that I am not able to respond to personal messages;  let's meet on irc! )
#3
Thanks for all the information.
Reading your ideas, I remembered that I have a quicker way to pin down the problem: a spare 7" TFT for checking whether it's the board or the OS.

I'll post my investigations and my outcomes!

Thanks again!
#4
OK, now I'm fighting with cron to run a script that resets eth0, because I suspect the problem is eth0 dropping. The script is this:
Code:
#!/bin/bash
# Check whether eth0 has an IPv4 address; if not, try to bring it back up.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/pere/bin
LOGFILE=/home/pere/network-monitor.log

# Append a timestamped message to the log file.
log() {
      echo "$(date "+%m %d %Y %T") : $*" >> "$LOGFILE"
}

# "inet addr:" matches the older ifconfig output format; newer net-tools print "inet " instead.
if ifconfig eth0 | grep -q "inet addr:"; then
      log "Ethernet OK"
else
      log "Ethernet connection down! Attempting reconnection."
      # Use the exit status of ifup to decide what to report.
      if ifup --force eth0; then
              STATE=$(ifconfig eth0 | grep "inet addr:")
              log "Network connection reset. Current state is $STATE"
      else
              log "Failed to reset ethernet connection"
      fi
fi



But while debugging why cron is not executing it, I saw these lines in syslog:

Code:
Sep 21 10:51:18 pine64 kernel: [  114.459270] Mali: Set gpu frequency to 144 MHz
Sep 21 10:51:19 pine64 kernel: [  115.440931] CPU Budget:update CPU 0 cpufreq max to 1056000 min to 480000
Sep 21 10:51:19 pine64 kernel: [  115.440967] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 10:51:19 pine64 kernel: [  115.440981] gpu cooling callback set freq limit 360
Sep 21 10:51:19 pine64 kernel: [  115.441037] Mali: Set gpu frequency to 360 MHz
Sep 21 10:51:19 pine64 kernel: [  115.932942] CPU Budget:update CPU 0 cpufreq max to 1008000 min to 480000
Sep 21 10:51:19 pine64 kernel: [  115.935187] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 10:51:19 pine64 kernel: [  115.935201] gpu cooling callback set freq limit 144
Sep 21 10:51:19 pine64 kernel: [  115.935256] Mali: Set gpu frequency to 144 MHz

Is there a heat problem with my Pine64+?

By the way, can anyone help me understand why cron is not executing this script?

I set this up with the command: sudo crontab -e


Code:
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command

*/1 * * * *  root /bin/bash /home/pere/bin/./network-monitor.sh
#5
OK, solved the cron problem.

There's no need for "root" in the crontab line if I edit the crontab with sudo crontab -e.
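
For anyone landing here with the same problem, the two crontab formats differ roughly as sketched below. The schedule and script path are simply the ones from this thread; the point is the extra user field, which only exists in the system-wide files.

```text
# Per-user crontab (edited with "crontab -e" or "sudo crontab -e"):
# m h  dom mon dow   command
* * * * *  /bin/bash /home/pere/bin/network-monitor.sh

# System crontab (/etc/crontab or a file in /etc/cron.d) has an extra user field:
# m h  dom mon dow   user  command
* * * * *  root  /bin/bash /home/pere/bin/network-monitor.sh
```

In a per-user crontab, a leading "root" is treated as the command to run, so the script itself never starts.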

Now what about my syslog? Is it a heat problem?

Sep 21 18:54:01 pine64 CRON[2144]: (root) CMD (bash /home/pere/bin/network-monitor.sh)
Sep 21 18:54:19 pine64 kernel: [28631.085162] CPU Budget:update CPU 0 cpufreq max to 1104000 min to 480000
Sep 21 18:54:19 pine64 kernel: [28631.087406] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 18:54:19 pine64 kernel: [28631.087429] CPU Budget:update CPU 0 cpufreq max to 1056000 min to 480000
Sep 21 18:54:19 pine64 kernel: [28631.089643] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 18:54:19 pine64 kernel: [28631.089657] gpu cooling callback set freq limit 360
Sep 21 18:54:19 pine64 kernel: [28631.089713] Mali: Set gpu frequency to 360 MHz
Sep 21 18:54:20 pine64 kernel: [28632.069082] CPU Budget:update CPU 0 cpufreq max to 1104000 min to 480000
Sep 21 18:54:20 pine64 kernel: [28632.069116] CPU Budget hotplug: cluster0 min:0 max:4
Sep 21 18:54:20 pine64 kernel: [28632.069131] gpu cooling callback set freq limit 0
Sep 21 18:54:20 pine64 kernel: [28632.069188] Mali: Set gpu frequency to 408 MHz
Sep 21 18:54:22 pine64 kernel: [28634.037028] CPU Budget hotplug: cluster0 min:0 max:4
#6
(09-21-2017, 10:55 AM)XaRz Wrote: Ok solved the cron

no need of root in the crontab line if I just edit with sudo crontab -e.

Now what about my syslog? it's a heat problem?


Usually all that is needed is a passive cooling device, generally a 14mm x 14mm aluminum heatsink with 3M thermal tape adhesive. All of my boards are used as servers of one type or another, and all of them have active cooling: a soft-PWM driven 5V brushless fan switched by either a PN2222 or 2N2222 transistor; a 4N35 optical coupler may be used as well.

The lack of passive cooling may or may not have anything to do with the eth drop; throttling can cause networking problems, but not necessarily. Try the heatsink first (any Raspberry Pi heatsink will do) and see what happens, then decide if you need active cooling as well; if you do, I can help you set that up.
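
To see whether the heatsink makes a difference, it can help to log the temperature over time and compare before and after. A rough sketch follows. It assumes the kernel exposes the SoC sensor at /sys/class/thermal/thermal_zone0/temp, which is common on Linux but not confirmed for this image, and that readings above 1000 are millidegrees. The script name and log path are made up for illustration.

```shell
#!/bin/sh
# templog.sh (hypothetical name): append the SoC temperature to a log file.
# ZONE and TEMPLOG are assumptions; adjust them for your image.
ZONE="${ZONE:-/sys/class/thermal/thermal_zone0/temp}"
TEMPLOG="${TEMPLOG:-/tmp/temperature.log}"

# to_celsius RAW: kernels report either degrees C or millidegrees C.
to_celsius() {
    raw=$1
    if [ "$raw" -gt 1000 ]; then
        echo $((raw / 1000))
    else
        echo "$raw"
    fi
}

# Take one reading and append it with a timestamp.
log_temp() {
    [ -r "$ZONE" ] || { echo "no sensor at $ZONE" >&2; return 1; }
    echo "$(date '+%F %T') : $(to_celsius "$(cat "$ZONE")") C" >> "$TEMPLOG"
}

# Pass "run" as the first argument to take a reading (e.g. from cron).
if [ "${1:-}" = "run" ]; then
    log_temp
fi
```

Run once a minute from cron for a day or two without the heatsink and again with it, the two logs give a direct comparison, and the last entry before a hang shows how hot the board was at that moment.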
marcushh777    Cool

#7
Thanks. I've ordered the parts.

I'll post my results here when they are applied!

Thanks for everything!
#8
Well, yesterday I finally applied the passive cooling: a 14x14mm aluminum heatsink with 3M thermal tape.

And the system hung at 01:55 AM with no one using Plex (so no high temperatures at the time). I don't know how to find out why my Pine64+ keeps hanging.

Any hints?
#9
Seems it was a false alarm!

6 days non-stop and counting! It seems cooling was the problem in the end.

Thanks!

