PINE64
Frequent segfaults etc. while compiling

+- PINE64 (https://forum.pine64.org)
+-- Forum: Pinebook Pro (https://forum.pine64.org/forumdisplay.php?fid=111)
+--- Forum: Linux on Pinebook Pro (https://forum.pine64.org/forumdisplay.php?fid=114)
+--- Thread: Frequent segfaults etc. while compiling (/showthread.php?tid=10437)


Frequent segfaults etc. while compiling - mfritsche - 06-26-2020

Is it possible that my Pinebook Pro suffers from an issue similar to the one described in https://forum.pine64.org/showthread.php?tid=7387 ?

When I try to compile large projects (using Manjaro), e.g. GHC (the Haskell compiler), I get segfaults and similar errors. Most of the time (if the previous object files weren't broken), just restarting helps.

I already extended the cooling pads. Using make -j 6 yields errors more quickly, but single-threaded compiling doesn't fully fix the issue. Different governors give the same picture: with powersave instead of the performance ones, the errors show up later, but they don't go away.

Anyone else seeing the same? Is there a way to limit the RAM clock on the pinebook pro?

Best regards, Markus


RE: Frequent segfaults etc. while compiling - slyecho - 06-27-2020

Hmm, obvious question but, how much RAM do you use when compiling? I had some issues when compiling large C++ projects, especially when using more make jobs.

The only way to fix it was to add swap.


RE: Frequent segfaults etc. while compiling - mfritsche - 06-27-2020

(06-27-2020, 05:25 AM)slyecho Wrote: Hmm, obvious question but, how much RAM do you use when compiling? I had some issues when compiling large C++ projects, especially when using more make jobs.

The only way to fix it was to add swap.

Yah, did that.

https://dynamic.reauktion.de/nextcloud/index.php/s/SBGb3RFW5RzrP9B

I guess bernoulli is bound for decommissioning :-(


RE: Frequent segfaults etc. while compiling - xmixahlx - 06-27-2020

zswap will help.


RE: Frequent segfaults etc. while compiling - Der Geist der Maschine - 06-27-2020

(06-27-2020, 09:18 AM)xmixahlx Wrote: zswap will help.

He needs to replace soldered memory. So much for hackability.

Maybe your system overheated? What do the temperature sensors read at the time of the memory errors?

Can you try to figure out whether the errors always hit the same memory addresses? If so, you can try to exclude them with CONFIG_PHYSICAL_START - see https://www.kernel.org/doc/Documentation/kdump/kdump.txt [Update: Stackexchange has a smarter approach https://unix.stackexchange.com/questions/439755/what-can-i-do-with-the-output-of-memtester-when-it-shows-bad-memory]


RE: Frequent segfaults etc. while compiling - mfritsche - 06-27-2020

I already have zswap enabled. The problem isn't that I don't have enough RAM; the problem is that the RAM becomes faulty under load. It starts with one faulty cell, and once that happens, it goes downhill from there...

Is there a way to underclock the RAM?


RE: Frequent segfaults etc. while compiling - mfritsche - 06-27-2020

(06-27-2020, 09:36 AM)Der Geist der Maschine Wrote:
(06-27-2020, 09:18 AM)xmixahlx Wrote: zswap will help.

He needs to replace soldered memory. So much for hackability.

Maybe your system overheated? What do the temperature sensors read at the time of the memory errors?

Can you try to figure out whether the errors always hit the same memory addresses? If so, you can try to exclude them with CONFIG_PHYSICAL_START - see https://www.kernel.org/doc/Documentation/kdump/kdump.txt [Update: Stackexchange has a smarter approach https://unix.stackexchange.com/questions/439755/what-can-i-do-with-the-output-of-memtester-when-it-shows-bad-memory]


Thanks for the link! No, it's not the same address... Once I reach full RAM usage (GHC is a memory hog), it starts with a few errors. I then restarted memtester and it lit up.

Even during "normal" usage (surfing), the pinebook crashes sometimes and needs to be restarted. I guess it's the same problem.

Temperature of CPU is at most 54°C when that happens, but it also happens when the sensor shows about 45°C.
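
For anyone who wants to log such readings while a build runs, here is a minimal C sketch. It assumes the kernel exposes the sensor through the generic thermal sysfs interface, where a file such as /sys/class/thermal/thermal_zone0/temp holds the value in millidegrees Celsius. The zone index and both helper names are assumptions for illustration, not something taken from this thread.

```c
#include <stdio.h>

/* Convert a raw sysfs reading (millidegrees Celsius) to degrees Celsius. */
double millideg_to_celsius(long millideg)
{
    return millideg / 1000.0;
}

/*
 * Read one thermal zone file and store the temperature in *out.
 * Returns 0 on success, -1 if the file is missing or unreadable.
 * The path is an assumption, e.g. /sys/class/thermal/thermal_zone0/temp.
 */
int read_zone_celsius(const char *path, double *out)
{
    FILE *f = fopen(path, "r");
    long raw;
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", &raw) == 1);
    fclose(f);
    if (!ok)
        return -1;

    *out = millideg_to_celsius(raw);
    return 0;
}
```

Calling read_zone_celsius() in a loop next to the compile job would show whether the errors line up with temperature peaks; which zone belongs to the CPU has to be checked on the machine itself.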


RE: Frequent segfaults etc. while compiling - Der Geist der Maschine - 06-27-2020

A temperature of 54°C is OK, 45°C even more so.

I just realized that you are seeing virtual addresses, which change for the same physical address from one run of memtester to the next.

So, block the first 1GB of physical memory, boot and run memtest. Repeat for the second, third and fourth 1GB. Once you have found the offending 1GB block, refine the search within it. Note: things are not as easy as they sound, because the kernel can't block its own memory, e.g. where it was loaded by uboot.

Probably much better is the kernel option CONFIG_MEMTEST. Here is an experience report: https://raid6.com.au/~onlyjob/posts/MEMTEST_explained/


About under-clocking. The good news: it should be possible. The bad news: nobody has done it before. Here is a related thread about over-clocking: https://forum.pine64.org/showthread.php?tid=10398

I have not really looked into it. Here are some starting points for you:

#1 uboot drivers/clk/rockchip/clk_rk3399.c

Code:
static ulong rk3399_ddr_set_clk(struct rk3399_cru *cru,
                ulong set_rate)
{
    struct pll_div dpll_cfg;

    /*  IC ECO bug, need to set this register */
    writel(0xc000c000, PMUSGRF_DDR_RGN_CON16);

    /*  clk_ddrc == DPLL = 24MHz / refdiv * fbdiv / postdiv1 / postdiv2 */
    switch (set_rate) {
    case 50 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 1, .fbdiv = 12, .postdiv1 = 3, .postdiv2 = 2};
        break;
    case 200 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 1, .fbdiv = 50, .postdiv1 = 6, .postdiv2 = 1};
        break;
    case 300 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 2, .fbdiv = 100, .postdiv1 = 4, .postdiv2 = 1};
        break;
    case 400 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 1, .fbdiv = 50, .postdiv1 = 3, .postdiv2 = 1};
        break;
    case 666 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 2, .fbdiv = 111, .postdiv1 = 2, .postdiv2 = 1};
        break;
    case 800 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 1, .fbdiv = 100, .postdiv1 = 3, .postdiv2 = 1};
        break;
    case 933 * MHz:
        dpll_cfg = (struct pll_div)
        {.refdiv = 1, .fbdiv = 116, .postdiv1 = 3, .postdiv2 = 1};
        break;
    default:
        pr_err("Unsupported SDRAM frequency!,%ld\n", set_rate);
    }
    rkclk_set_pll(&cru->dpll_con[0], &dpll_cfg);

    return set_rate;
}
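
The divider values in this table can be checked against the formula in the driver's own comment. The following is a minimal sketch, assuming the 24 MHz reference named in that comment; dpll_rate_hz is a hypothetical helper for illustration and not a u-boot function.

```c
#include <stdint.h>

/* 24 MHz reference clock, as stated in the comment of the quoted driver. */
#define OSC_HZ 24000000ULL

/*
 * DPLL output per the driver comment:
 *   clk_ddrc = 24MHz / refdiv * fbdiv / postdiv1 / postdiv2
 * 64-bit arithmetic keeps the intermediate product from overflowing.
 */
uint64_t dpll_rate_hz(uint32_t refdiv, uint32_t fbdiv,
                      uint32_t postdiv1, uint32_t postdiv2)
{
    return OSC_HZ / refdiv * fbdiv / postdiv1 / postdiv2;
}
```

For example, dpll_rate_hz(1, 100, 3, 1) gives exactly 800 MHz, while the dividers listed under the 933 MHz and 50 MHz cases work out to 928 MHz and 48 MHz, so those case labels are nominal rates. Any new under-clocked entry would need a divider set that satisfies the same formula.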


#2 the kernel device tree has an attribute for the memory speed

Code:
cru: clock-controller@ff760000 {
...
                        <&cru PLL_CPLL>,
...
                        <800000000>,


Maybe, and I'm just saying maybe, the kernel is reinitializing the memory speed back to 800000000 Hz and you need to make modifications in the device tree?


I hope I confused you :P. Keep us up to date with your experiments.


RE: Frequent segfaults etc. while compiling - Syonyk - 06-27-2020

On the recent 5.7 kernel, the dynamic DRAM clocking capability of the rk3399 is enabled, so the kernel could very well be shifting clocks around independently of what uboot sets. I've not found a good place to read out the current DRAM clock speed though.
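
One place that may be worth checking is the generic devfreq sysfs interface, which provides a cur_freq file for each registered device under /sys/class/devfreq. Whether the RK3399 memory controller actually appears there depends on the kernel and device tree in use, so this is an unverified sketch rather than a known answer; print_devfreq_rates is a hypothetical helper.

```c
#include <dirent.h>
#include <stdio.h>

/*
 * Print cur_freq for every device found under a devfreq class directory
 * (normally /sys/class/devfreq). Returns the number of devices printed,
 * or -1 if the directory cannot be opened.
 */
int print_devfreq_rates(const char *class_dir)
{
    DIR *dir = opendir(class_dir);
    struct dirent *ent;
    int count = 0;

    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        char path[512];
        unsigned long hz;
        FILE *f;
        int n;

        if (ent->d_name[0] == '.')
            continue; /* skip ".", ".." and hidden entries */

        n = snprintf(path, sizeof(path), "%s/%s/cur_freq",
                     class_dir, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue; /* path would not fit, skip this entry */

        f = fopen(path, "r");
        if (!f)
            continue;
        if (fscanf(f, "%lu", &hz) == 1) {
            printf("%s: %lu Hz\n", ent->d_name, hz);
            count++;
        }
        fclose(f);
    }

    closedir(dir);
    return count;
}
```

Calling print_devfreq_rates("/sys/class/devfreq") lists whatever devices the running kernel has registered; an empty list or a return value of -1 simply means this interface is not available on that setup.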


RE: Frequent segfaults etc. while compiling - xmixahlx - 06-27-2020

What happens if you underclock the CPU and use the conservative governor?
sudo cpupower -c 4,5 frequency-set -u 1.80GHz
sudo cpupower -c all frequency-set -g conservative