Network problems (actually bad power supply)
#11
Some additional notes, and a possible fix that doesn't involve an ATX power supply.  During the last week, I ran module 1 with a cron job that resets the system clock from the RTC by running /sbin/hwclock -s at 5-minute intervals.  That module would usually go down within 24 hours, but it has not.  Instead, module 2 was the first and only one to go down.
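
For reference, a minimal sketch of what such a cron entry could look like. The schedule and the command are the ones described above; the /etc/cron.d file name "hwclock-sync" is hypothetical, and the "root" user field is only needed for files in /etc/cron.d (a per-user crontab omits it).

```shell
# Sketch of the cron job described above. The schedule (every 5 minutes)
# and the command (/sbin/hwclock -s) come from the post; the file name
# /etc/cron.d/hwclock-sync mentioned below is an assumption.
# Files in /etc/cron.d take a user field (root here) before the command.
cron_line='*/5 * * * * root /sbin/hwclock -s'

# Print the entry; as root, redirect this into /etc/cron.d/hwclock-sync to install it.
printf '%s\n' "$cron_line"
```

hwclock -s is the short form of hwclock --hctosys, which sets the system clock from the hardware clock.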

The systems also run chronyd to maintain time and are set to synchronize with the RTC every 11 minutes.  This by itself was not sufficient.  After adding the cron job to module 2 as well, both modules are at 2 days of uptime, so it has had a positive effect.  The 5-minute interval is arbitrary; I tried 1 minute, but it confused chronyd.  Five minutes has the effect of keeping the "Update Interval" at 60 seconds, so an internal time server is probably advisable with this setup to avoid frequent polling of external servers.
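
One way to watch the update interval mentioned above is to pull it out of the `chronyc tracking` report. This is only a sketch: the helper name update_interval is made up for illustration, and it assumes the report contains a line of the form "Update interval : 64.2 seconds", as chronyc normally prints.

```shell
# Print the "Update interval" value from `chronyc tracking` output read on stdin.
# update_interval is a hypothetical helper name, not part of chrony.
update_interval() {
    # Split each line on the colon (with optional surrounding spaces)
    # and print the value of the "Update interval" line only.
    awk -F' *: *' '/^Update interval/ { print $2 }'
}

# Usage on a module running chronyd:
#   chronyc tracking | update_interval
```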

Even with this, I still get the following errors, which may be related to the underlying issue.  These messages have only appeared on modules 1 and 2, which so far have been the only devices to exhibit time jumps and network outages.
Code:
[Mon Oct 14 08:00:36 2019] rcu: INFO: rcu_sched self-detected stall on CPU
[Mon Oct 14 08:00:36 2019] rcu:         1-...!: (102 GPs behind) idle=23e/0/0x1 softirq=5005523/5005524 fqs=12 
[Mon Oct 14 08:00:36 2019] rcu:          (t=259866 jiffies g=12543365 q=28)
[Mon Oct 14 08:00:36 2019] rcu: rcu_sched kthread starved for 259842 jiffies! g12543365 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=3
[Mon Oct 14 08:00:36 2019] rcu: RCU grace-period kthread stack dump:
[Mon Oct 14 08:00:36 2019] rcu_sched       I    0    10      2 0x00000028
[Mon Oct 14 08:00:36 2019] Call trace:
[Mon Oct 14 08:00:36 2019]  __switch_to+0x94/0xd8
[Mon Oct 14 08:00:36 2019]  __schedule+0x1e8/0x640
[Mon Oct 14 08:00:36 2019]  schedule+0x24/0x80
[Mon Oct 14 08:00:36 2019]  schedule_timeout+0x90/0x398
[Mon Oct 14 08:00:36 2019]  rcu_gp_kthread+0x550/0x8f8
[Mon Oct 14 08:00:36 2019]  kthread+0x128/0x130
[Mon Oct 14 08:00:36 2019]  ret_from_fork+0x10/0x1c
[Mon Oct 14 08:00:36 2019] Task dump for CPU 1:
[Mon Oct 14 08:00:36 2019] swapper/1       R  running task        0     0      1 0x0000002a
[Mon Oct 14 08:00:36 2019] Call trace:
[Mon Oct 14 08:00:36 2019]  dump_backtrace+0x0/0x1a0
[Mon Oct 14 08:00:36 2019]  show_stack+0x14/0x20
[Mon Oct 14 08:00:36 2019]  sched_show_task+0x160/0x198
[Mon Oct 14 08:00:36 2019]  dump_cpu_task+0x40/0x50
[Mon Oct 14 08:00:36 2019]  rcu_dump_cpu_stacks+0xc0/0x100
[Mon Oct 14 08:00:36 2019]  rcu_check_callbacks+0x594/0x780
[Mon Oct 14 08:00:36 2019]  update_process_times+0x2c/0x58
[Mon Oct 14 08:00:36 2019]  tick_sched_handle.isra.5+0x30/0x48
[Mon Oct 14 08:00:36 2019]  tick_sched_timer+0x48/0x98
[Mon Oct 14 08:00:36 2019]  __hrtimer_run_queues+0xe4/0x1f8
[Mon Oct 14 08:00:36 2019]  hrtimer_interrupt+0xf4/0x2b0
[Mon Oct 14 08:00:36 2019]  arch_timer_handler_phys+0x28/0x40
[Mon Oct 14 08:00:36 2019]  handle_percpu_devid_irq+0x80/0x138
[Mon Oct 14 08:00:36 2019]  generic_handle_irq+0x24/0x38
[Mon Oct 14 08:00:36 2019]  __handle_domain_irq+0x5c/0xb0
[Mon Oct 14 08:00:36 2019]  gic_handle_irq+0x58/0xa8
[Mon Oct 14 08:00:36 2019]  el1_irq+0xb0/0x140
[Mon Oct 14 08:00:36 2019]  arch_cpu_idle+0x10/0x18
[Mon Oct 14 08:00:36 2019]  do_idle+0x1d4/0x298
[Mon Oct 14 08:00:36 2019]  cpu_startup_entry+0x24/0x28
[Mon Oct 14 08:00:36 2019]  secondary_start_kernel+0x18c/0x1c8
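
To get a feel for how long the stall in the log above lasted, the jiffies count (t=259866) can be converted to seconds. This is a rough sketch: the kernel's tick rate is not shown in the post, so the value of 250 used here is an assumption and should be replaced with the CONFIG_HZ of the running kernel.

```shell
# Convert the stall length reported in the log (t=259866 jiffies) to seconds.
# HZ is an assumption: 250 is a common CONFIG_HZ value, but this kernel's
# setting is not shown in the post.
HZ=250
stall_jiffies=259866

# Shell arithmetic is integer-only, so the result is truncated to whole seconds.
stall_seconds=$((stall_jiffies / HZ))
echo "$stall_seconds"
```

With a tick rate of 1000 the same count would be roughly 260 seconds, so the result depends heavily on the assumed HZ.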


Messages In This Thread
RE: Network problems - by Dreamwalker - 09-11-2019, 10:31 AM
RE: Network problems - by Unkn0wn - 09-11-2019, 12:45 PM
RE: Network problems - by Unkn0wn - 09-27-2019, 04:07 AM
RE: Network problems - by Dreamwalker - 09-17-2019, 10:38 AM
RE: Network problems - by Unkn0wn - 09-17-2019, 03:50 PM
RE: Network problems (actually bad power supply) - by venix1 - 10-14-2019, 08:20 AM
