Network problems (actually bad power supply)
#1
But different from the problems I've read so far.

Using the latest version of Armbian Buster, all the modules boot and are accessible. However, after a couple of hours the modules become unreachable one by one. After ~1 week only 2 of the 7 modules are still reachable over the network. All modules have a static IP configured on the module itself; I have also tried DHCP with a static IP assigned on the gateway. Here is a ping to an "off-line" module:


Code:
C:\WINDOWS\system32>ping 10.10.10.70

Pinging 10.10.10.70 with 32 bytes of data:
Reply from 169.254.1.1: Destination host unreachable.
Reply from 169.254.1.1: Destination host unreachable.
Reply from 169.254.1.1: Destination host unreachable.
Reply from 169.254.1.1: Destination host unreachable.

Ping statistics for 10.10.10.70:
   Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),

C:\WINDOWS\system32>
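
Note that in the output above the replies come from 169.254.1.1 rather than from 10.10.10.70, yet the statistics still report 0% loss. A minimal Python sketch of a check that ignores such "unreachable" notices and only counts echo replies from the target itself is shown here. The function name `summarize_ping` and the returned keys are made up for illustration, and the parsing assumes English-language Windows ping output with IPv4 addresses.

```python
import re

# Hypothetical helper: the name and the result keys are illustrative only.
def summarize_ping(output: str, target: str) -> dict:
    """Count real echo replies from `target` in Windows ping output."""
    replies_from_target = 0
    unreachable_notices = 0
    for line in output.splitlines():
        match = re.match(r"\s*Reply from (\S+):\s*(.*)", line)
        if not match:
            continue
        sender, rest = match.groups()
        if "unreachable" in rest.lower():
            # Error notice from some other host; not an answer from the target.
            unreachable_notices += 1
        elif sender == target and "bytes=" in rest:
            # A genuine echo reply carries a byte count and comes from the target.
            replies_from_target += 1
    return {
        "replies_from_target": replies_from_target,
        "unreachable_notices": unreachable_notices,
        "target_reachable": replies_from_target > 0,
    }

# Reply lines copied from the ping output in this post.
SAMPLE = """Pinging 10.10.10.70 with 32 bytes of data:
Reply from 169.254.1.1: Destination host unreachable.
Reply from 169.254.1.1: Destination host unreachable.
Reply from 169.254.1.1: Destination host unreachable.
Reply from 169.254.1.1: Destination host unreachable.
"""

if __name__ == "__main__":
    print(summarize_ping(SAMPLE, "10.10.10.70"))
```

Run against the sample, this reports four unreachable notices and no replies from the target, which matches the module being off-line despite the "0% loss" line.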

I do not know whether the system is actually running. What kind of serial interface should I be using to debug this?
The extra software I am running on the modules is docker (18.06.3~ce~3-0~debian), containerd.io (latest) and kubernetes (1.15.2).


p.s.
The system date on some of the modules randomly jumps to October 2119 (as reported by the date command), but the hardware clock (hwclock) shows the correct time. A reboot resolves this. I don't know whether it has anything to do with the network problem, but I thought it was important enough to mention.
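
A rough sketch of how such a jump could be flagged, by comparing the system time against a trusted reference such as the hardware clock reading. The function name `clock_skew_suspicious` and the 5 minute tolerance are assumptions for illustration, the example dates are placeholders, and obtaining the reference time (for example from hwclock) is left outside the sketch.

```python
from datetime import datetime, timedelta

# Hypothetical helper; the default tolerance is an arbitrary assumption.
def clock_skew_suspicious(system_time: datetime,
                          reference_time: datetime,
                          tolerance: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if the system clock differs from the reference by more than `tolerance`."""
    return abs(system_time - reference_time) > tolerance

if __name__ == "__main__":
    # Illustrative values: a system clock in October 2119 versus a
    # hardware clock reading in 2019.
    system = datetime(2119, 10, 1, 12, 0, 0)
    reference = datetime(2019, 9, 10, 12, 0, 0)
    print(clock_skew_suspicious(system, reference))
```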

EDIT: After having physical access I think I've found the issue. I believe it is twofold:
1. Misconfigured switch and gateway. Couldn't prove this one, but I believe it caused a part of the issues I had.
2. Faulty PSU. I use the 5V 15A PSU from the Pine64 store. On top of making poor contact between the AC cord and the adapter, it is faulty and unable to sustain all modules under full load. I've asked for a replacement unit.
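
For reference, the nominal budget of a 5V 15A supply shared by 7 modules can be worked out as below. This is only back-of-the-envelope arithmetic: it ignores whatever the board itself and its network switch draw, and the actual per-module consumption under load is not known from this thread. The function name `per_module_budget` is made up for illustration.

```python
# Hypothetical helper for a nominal power budget; ignores board overhead and losses.
def per_module_budget(volts: float, amps: float, modules: int):
    """Return (total watts, amps per module, watts per module) for an evenly shared supply."""
    total_watts = volts * amps
    return total_watts, amps / modules, total_watts / modules

if __name__ == "__main__":
    total, amps_each, watts_each = per_module_budget(5.0, 15.0, 7)
    print(f"total {total:.0f} W, {amps_each:.2f} A and {watts_each:.2f} W per module")
```

That is 75 W in total, or roughly 2.14 A (about 10.7 W) per module if the supply actually delivers its rated current.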

EDIT2: Ordered an ATX PSU, will see if that goes better.
#2
Yeah the clock is wrong on my nodes as well so it causes problems with networking and such (certificate errors basically).

Have you inserted the 2 batteries for the RTC? I also installed chrony on each of mine to ensure the time stays correct.

I only did this the other day and all seem to be staying up ok.
#3
(09-11-2019, 10:31 AM)Dreamwalker Wrote: Yeah the clock is wrong on my nodes as well so it causes problems with networking and such (certificate errors basically).

Have you inserted the 2 batteries for the RTC? I also installed chrony on each of mine to ensure the time stays correct.

I only did this the other day and all seem to be staying up ok.

I'll try that, thanks.
#4
Just thought I would update you. I've not had a node go down since I made the changes, and after an unexpected power-off they all came back up as well. Reboots still don't work :S
#5
Thank you for your update, that's really good to hear. I'm still waiting for a new power supply and will update this thread accordingly.