flipping frames bug and workaround
#1

this is for mobian phosh with kernel 6.1.
https://barrelmem.s3.eu-north-1.amazonaw...ingbug.mp4

i have concluded at this point that this bug has two parts: a powersaving part and a frequency change part. practically all gnu/linux distributions are affected, under both plasma and phosh.

the powersaving part seems to be more straightforward but rarer. it can be avoided with the following.
Code:
# udev rule
KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/autosuspend_delay_ms}="-1"
KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/control}="on"
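
the same two settings can also be applied at runtime by writing to sysfs, which is handy for testing before committing to a udev rule. this is a minimal sketch, assuming the gpu sits at /sys/devices/platform/soc/1c40000.gpu (the path used by the info script later in this thread). the function name and the GPU_DEV variable are made up for illustration. root is required and the setting does not survive a reboot.

```shell
# runtime equivalent of the two udev rules above.
# assumption: the gpu sysfs path is the one used elsewhere in this thread.
# GPU_DEV is a hypothetical override variable, not something the kernel knows.
GPU_DEV="${GPU_DEV:-/sys/devices/platform/soc/1c40000.gpu}"

gpu_disable_powersaving() (
    # "on" keeps the gpu permanently active, so it is never runtime suspended
    printf '%s\n' on > "$GPU_DEV/power/control" || exit 1
    # a negative delay disables autosuspend as well
    printf '%s\n' -1 > "$GPU_DEV/power/autosuspend_delay_ms" || exit 1
)
```

after sourcing this, run gpu_disable_powersaving as root. writing "auto" back to power/control should restore the default behaviour.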

the powersaving part is harder to replicate. you need some screen activity together with some networking activity and possibly cpu activity, and the crash is not immediate.

the frequency change part can be avoided with the following:
Code:
# udev rule
KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{min_freq}="432000000"
KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{max_freq}="432000000"
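
for a quick test without a udev rule, the frequency can also be pinned at runtime. this is a sketch only: it assumes the devfreq directory /sys/class/devfreq/1c40000.gpu from the scripts in this thread and the standard devfreq attribute available_frequencies. the function name and the GPU_DEVFREQ variable are hypothetical.

```shell
# pin the gpu to one frequency by setting min_freq and max_freq to the same value.
# GPU_DEVFREQ is a hypothetical override variable for the devfreq directory.
GPU_DEVFREQ="${GPU_DEVFREQ:-/sys/class/devfreq/1c40000.gpu}"

gpu_pin_freq() (
    freq="${1:-432000000}"
    # refuse values that the driver does not list as available
    avail=$(cat "$GPU_DEVFREQ/available_frequencies") || exit 1
    case " $avail " in
        *" $freq "*) ;;
        *) echo "unsupported frequency: $freq (available: $avail)" >&2; exit 1 ;;
    esac
    printf '%s\n' "$freq" > "$GPU_DEVFREQ/min_freq" || exit 1
    printf '%s\n' "$freq" > "$GPU_DEVFREQ/max_freq" || exit 1
)
```

run gpu_pin_freq 432000000 as root. like the udev rule, this trades power consumption for stability, and it is lost on reboot.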

however, in my situation, the frequency bug happens only on the 1.2a hardware version and not on 1.2b. so i don't rule out a possible hardware problem in the 1.2a version or in my specific individual phone.

if you are tech savvy enough and you have a regular pinephone, could you test the frequency part of this bug and report your hardware version (1.2, 1.2a, 1.2b etc.)? apply only the powersaving rules, then run glxgears in a user interface with auto suspend, screen locks and screen saving disabled. then run the following script in the background as root.
Code:
while true
do echo 432000000 > /sys/class/devfreq/1c40000.gpu/min_freq
echo 432000000 > /sys/class/devfreq/1c40000.gpu/max_freq
echo 312000000 > /sys/class/devfreq/1c40000.gpu/max_freq
done

generic copy paste:
Code:
# udev rule. this prevents the flipping frames bug
# example file location /lib/udev/rules.d/98-preventflippingbug.rules
# powersaving part
KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/autosuspend_delay_ms}="-1"
KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/control}="on"
# frequency part, may not be needed
KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{min_freq}="432000000"
KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{max_freq}="432000000"
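
to confirm that the rules took effect after a reboot, the values can be read back and compared. a minimal sketch, assuming the sysfs paths used by the info script in this thread. check_attr, check_flipping_workaround, GPU_DEV and GPU_DEVFREQ are names invented for this example.

```shell
# read back the four attributes set by the udev rules above and compare them.
# GPU_DEV and GPU_DEVFREQ are hypothetical override variables.
GPU_DEV="${GPU_DEV:-/sys/devices/platform/soc/1c40000.gpu}"
GPU_DEVFREQ="${GPU_DEVFREQ:-/sys/class/devfreq/1c40000.gpu}"

# print OK, MISMATCH or MISSING for one attribute; $1 = file, $2 = expected value
check_attr() (
    actual=$(cat "$1" 2>/dev/null) || { echo "MISSING  $1"; exit 1; }
    if [ "$actual" = "$2" ]; then
        echo "OK       $1 = $actual"
    else
        echo "MISMATCH $1 = $actual (expected $2)"
        exit 1
    fi
)

# returns non-zero if any of the four attributes differs from the rule file
check_flipping_workaround() (
    rc=0
    check_attr "$GPU_DEV/power/autosuspend_delay_ms" -1 || rc=1
    check_attr "$GPU_DEV/power/control" on || rc=1
    check_attr "$GPU_DEVFREQ/min_freq" 432000000 || rc=1
    check_attr "$GPU_DEVFREQ/max_freq" 432000000 || rc=1
    exit "$rc"
)
```

if only the powersaving rules are installed, the two frequency lines will report a mismatch and can be dropped from the check.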

the attached log file is from mobian phosh.


Attached Files
.txt   dmesg-powersavebug.txt (Size: 28.2 KB / Downloads: 143)
#2
now i have tested two 1.2a hw phones and both have the frequency part of the bug. i am starting to think there is something fishy going on in the hardware when the gpu frequency changes.

fedora f38 with the "megi-kernel-pboot" 6.0.7 kernel does not crash on frequency changes, even on 1.2a. fedora does crash on the powersaving part though (at least it did in the past).
#3
i have further tested two pinephones of the 1.2a version for the frequency bug, using archlinux plasma, glxgears in the u.i. and the gpu frequency changing script.

the older pinephone crashes only if i flip from 432MHz to 312MHz.

the newer pinephone crashes with both 432MHz <> 312MHz and 432MHz <> 120MHz, but not with 312MHz <> 120MHz.

(the powersaving bug is different: both hw versions, 1.2a and 1.2b, crash. it is also more difficult to replicate.)

Code:
[root@danctnix ~]# pacman -Si mesa
Repository      : extra
Name            : mesa
Version         : 22.3.3-1
Description     : An open-source implementation of the OpenGL specification
Architecture    : aarch64
URL             : https://www.mesa3d.org/
Licenses        : custom
Groups          : None
Provides        : mesa-libgl  opengl-driver
Depends On      : libdrm  wayland  libxxf86vm  libxdamage  libxshmfence  libelf  libomxil-bellagio  libunwind  llvm-libs  lm_sensors  libglvnd  zstd  vulkan-icd-loader  libsensors.so=5-64  libexpat.so=1-64  libvulkan.so
Optional Deps   : opengl-man-pages: for the OpenGL API man pages
                  mesa-vdpau: for accelerated video playback
                  libva-mesa-driver: for accelerated video playback
Conflicts With  : mesa-libgl
Replaces        : mesa-libgl
Download Size   : 15.24 MiB
Installed Size  : 59.51 MiB
Packager        : Arch Linux ARM Build System <builder+seattle@archlinuxarm.org>
Build Date      : Fri 13 Jan 2023 12:05:11 PM UTC
Validated By    : MD5 Sum  SHA-256 Sum  Signature

[root@danctnix ~]# pacman -Si linux-megi
Repository      : danctnix
Name            : linux-megi
Version         : 6.0.10-1
Description     : The Linux Kernel and modules - Megous Kernel
Architecture    : aarch64
URL             : https://github.com/megous/linux
Licenses        : GPL2
Groups          : None
Provides        : kernel26  linux=6.0.10
Depends On      : coreutils  kmod  mkinitcpio>=0.7
Optional Deps   : crda: to set the correct wireless channels of your country
Conflicts With  : linux
Replaces        : linux-pine64
Download Size   : 165.98 MiB
Installed Size  : 186.53 MiB
Packager        : DanctNIX Build System <builder@main-key.danctnix.org>
Build Date      : Fri 02 Dec 2022 03:50:38 PM UTC
Validated By    : MD5 Sum  SHA-256 Sum  Signature

[root@danctnix ~]# uname -a
Linux danctnix 6.0.10-1-danctnix #1 SMP PREEMPT_DYNAMIC Fri Dec 2 15:51:28 UTC 2022 aarch64 GNU/Linux
[root@danctnix ~]#

an excerpt from the kernel log at the time of the crash:

Code:
[  222.596592] lima 1c40000.gpu: mmu page fault at 0x4d200c0 from bus id 0 of type read on ppmmu0
[  222.605485] lima 1c40000.gpu: pp task error 0 int_state=0 status=5
[  222.611743] lima 1c40000.gpu: pp task error 1 int_state=0 status=0
[  222.618017] lima 1c40000.gpu: mmu resume
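
when a test runs for hours, it helps to count such lines instead of watching the log. a small sketch: the pattern is derived only from the excerpt above, so other crash messages may look different, and the function name is made up.

```shell
# count lima error lines in kernel log text read from stdin.
# the pattern covers only the two message types seen in the excerpt above.
count_lima_faults() {
    # grep -c exits non-zero when nothing matches, but it still prints 0
    grep -c -E 'lima [^ ]+: (mmu page fault|pp task error)' || true
}
```

typical use would be dmesg | count_lima_faults, run as root if the kernel log is restricted. the "mmu resume" line is deliberately not counted.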

script for displaying the relevant gpu settings:

Code:
echo cat /sys/class/devfreq/1c40000.gpu/trans_stat
cat /sys/class/devfreq/1c40000.gpu/trans_stat
echo cat /sys/class/devfreq/1c40000.gpu/min_freq
cat /sys/class/devfreq/1c40000.gpu/min_freq
echo cat /sys/class/devfreq/1c40000.gpu/max_freq
cat /sys/class/devfreq/1c40000.gpu/max_freq
echo cat /sys/devices/platform/soc/1c40000.gpu/power/autosuspend_delay_ms
cat /sys/devices/platform/soc/1c40000.gpu/power/autosuspend_delay_ms
echo cat /sys/devices/platform/soc/1c40000.gpu/power/control
cat /sys/devices/platform/soc/1c40000.gpu/power/control
echo cat /sys/devices/platform/soc/1c40000.gpu/power/runtime_suspended_time
cat /sys/devices/platform/soc/1c40000.gpu/power/runtime_suspended_time
echo cat /sys/module/lima/parameters/sched_timeout_ms
cat /sys/module/lima/parameters/sched_timeout_ms

example of frequency changing script

Code:
echo 432000000 > /sys/class/devfreq/1c40000.gpu/min_freq
echo 432000000 > /sys/class/devfreq/1c40000.gpu/max_freq

while true
do echo 312000000 > /sys/class/devfreq/1c40000.gpu/max_freq
echo 432000000 > /sys/class/devfreq/1c40000.gpu/max_freq
done
#4
i forgot this link:

https://gitlab.com/postmarketOS/pmaports/-/issues/805



how to reproduce the powersaving bug:

this is more difficult to replicate. as far as i know, all pinephones are affected.

the default gpu powersaving settings are left in place. something visible runs on the screen that uses the gpu somewhat, but in a way that still lets the gpu go into powersaving for a while. in the background, maybe over ssh, one runs something that consumes network and cpu. mobile data and ethernet are more likely to reproduce this bug, wifi less likely. this may take hours.

Code:
# udev rule. this prevents frequency part only
# example file location /lib/udev/rules.d/98-test.rules
# reboot required
# powersaving part
#KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/autosuspend_delay_ms}="-1"
#KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/control}="on"
# frequency part, may not be needed
KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{min_freq}="432000000"
KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{max_freq}="432000000"
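
since this part only triggers when the gpu really goes into powersaving, it can be worth checking that the on-screen load lets it suspend at all. a rough sketch, assuming the runtime_suspended_time counter from the info script earlier in this thread. the function name and GPU_DEV are hypothetical.

```shell
# sample the gpu's runtime_suspended_time counter twice and print the difference.
# GPU_DEV is a hypothetical override variable for the gpu sysfs directory.
GPU_DEV="${GPU_DEV:-/sys/devices/platform/soc/1c40000.gpu}"

gpu_suspended_delta() (
    interval="${1:-10}"   # seconds to wait between the two samples
    f="$GPU_DEV/power/runtime_suspended_time"
    before=$(cat "$f") || exit 1
    sleep "$interval"
    after=$(cat "$f") || exit 1
    # 0 means the gpu never entered runtime suspend during the interval
    echo "$((after - before))"
)
```

gpu_suspended_delta 30 run over ssh while the on-screen load is active should print a value above 0. if it always prints 0, the gpu is never suspended and the powersaving part is unlikely to be exercised.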



how to reproduce the frequency bug:

only some pinephones are affected. the fedora (dirty) kernel is/was not affected by the frequency part.

some gpu settings need to be applied first, because the powersaving part should be ruled out. something needs to run on the screen, maybe glxgears. in the background, for example over ssh, the frequency change script is run. the crash probably happens within minutes.

Code:
# udev rule. this prevents powersaving part only
# example file location /lib/udev/rules.d/98-test.rules
# reboot required
# powersaving part
KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/autosuspend_delay_ms}="-1"
KERNEL=="1c40000.gpu", SUBSYSTEM=="platform", DRIVER=="lima", ATTR{power/control}="on"
# frequency part, may not be needed
#KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{min_freq}="432000000"
#KERNEL=="1c40000.gpu", SUBSYSTEM=="devfreq", ATTR{max_freq}="432000000"

Code:
# stupid bash script to forcibly change the frequency all the time
# root or sudo required
# options are 432000000 312000000 120000000.
# you may need 432<>312 or 432<>120 or 312<>120

echo 432000000 > /sys/class/devfreq/1c40000.gpu/min_freq
echo 432000000 > /sys/class/devfreq/1c40000.gpu/max_freq

while true
do echo 432000000 > /sys/class/devfreq/1c40000.gpu/max_freq
echo 312000000 > /sys/class/devfreq/1c40000.gpu/max_freq
done
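
to compare the three frequency pairs without editing the script each time, the same loop can take the pair as arguments and stop after a fixed number of flips. this is only a sketch that mirrors the script above (min_freq and max_freq set once, then max_freq toggled). the function name, the argument order and GPU_DEVFREQ are invented for this example.

```shell
# toggle max_freq between two values a fixed number of times; root required.
# GPU_DEVFREQ is a hypothetical override variable for the devfreq directory.
GPU_DEVFREQ="${GPU_DEVFREQ:-/sys/class/devfreq/1c40000.gpu}"

toggle_gpu_freq() (
    hi="${1:-432000000}"    # higher frequency of the pair
    lo="${2:-312000000}"    # lower frequency of the pair
    count="${3:-1000}"      # number of lo/hi flips
    printf '%s\n' "$hi" > "$GPU_DEVFREQ/min_freq" || exit 1
    printf '%s\n' "$hi" > "$GPU_DEVFREQ/max_freq" || exit 1
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s\n' "$lo" > "$GPU_DEVFREQ/max_freq" || exit 1
        printf '%s\n' "$hi" > "$GPU_DEVFREQ/max_freq" || exit 1
        i=$((i + 1))
    done
)
```

for example, toggle_gpu_freq 432000000 120000000 5000 tests the 432<>120 pair and leaves max_freq at the higher value when it finishes.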



the frequency bug happens only on some pinephones, so i ask: can you test the frequency part on your pinephone and also report the hardware version?

https://wiki.pine64.org/wiki/PinePhone#H..._revisions

i opened this thread for wider discussion about the mali gpu and its driver. does the mali chip have some deficiencies? does anyone know what's going on in the lima driver? why are some devices affected and others not? is frequency changing reliable?

even related information could be helpful.
#5
I have this happen on 1.2b (Manjaro Edition 3GB), although I don't know what trigger (power saving or frequency change). It seems to happen when there's some sort of high CPU and moderate to high network use situation, e.g. it happens sometimes when I launch multiple apps at once with some already running that use the internet. Somehow, the combined activity of all the launches just makes it trip over something. Edit: actually I transferred a mainboard once but I forgot if it was for this one, so it could also be 1.2a (but not older), if someone knows how to check via software let me know
#6
(02-15-2023, 09:54 AM)e1337 Wrote: I have this happen on 1.2b (Manjaro Edition 3GB), although I don't know what trigger (power saving or frequency change). It seems to happen when there's some sort of high CPU and moderate to high network use situation, e.g. it happens sometimes when I launch multiple apps at once with some already running that use the internet. Somehow, the combined activity of all the launches just makes it trip over something. Edit: actually I transferred a mainboard once but I forgot if it was for this one, so it could also be 1.2a (but not older), if someone knows how to check via software let me know

i cannot help much with checking the hw version. one could use "lshw", but this command does not show sub versions, in this case 1.2, 1.2a or 1.2b. (lshw probably needs to be installed first.) "journalctl -r" also works if you look in the right place near boot up, but this is also inaccurate.
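
another place to look is the device tree model string that the kernel exposes. a minimal sketch, assuming the usual /proc/device-tree/model file is present. the function name and DT_MODEL are made up. it will most likely show only the main revision (for example 1.2) and not the a/b suffix, so it has the same limitation as lshw.

```shell
# print the board model string from the device tree.
# DT_MODEL is a hypothetical override variable; the default is the usual location.
DT_MODEL="${DT_MODEL:-/proc/device-tree/model}"

print_board_model() {
    # the string is NUL terminated, so strip the NUL and add a newline
    tr -d '\000' < "$DT_MODEL" && echo
}
```

no root is needed for this one.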

i do not recommend opening the phone up directly, but one marking may provide a clue about which version you have. the marking is above the pogo pins. if it is just "C", i think the mainboard is 1.2b. if there is a date, it probably indicates the production date. the picture is of a 1.2a. however, i cannot know for sure what the marking means.

just as a pre-emptive warning: if you or others post pictures, pixelate imei codes and similar.

[Image: pmos-12a-pixelized.png]
#7
i went on a bug reporting spree.

https://gitlab.freedesktop.org/mesa/mesa/-/issues/8415
https://gitlab.freedesktop.org/mesa/mesa/-/issues/8410
#8
I haven't faced this problem anymore with Manjaro Phosh Beta 29.
Or more precise (I think) after Manjaro updated to Mesa 22.3.5
This use to be nearly daily issue and now I have not seen it once for many many days - maybe over 20 days or something?
#9
(03-02-2023, 09:31 AM)alaraajavamma Wrote: I haven't faced this problem anymore with Manjaro Phosh Beta 29.
Or more precise (I think) after Manjaro updated to Mesa 22.3.5
This use to be nearly daily issue and now I have not seen it once for many many days - maybe over 20 days or something?

if you are referring to this bug https://gitlab.freedesktop.org/mesa/mesa/-/issues/8198 , i think it is a totally different issue. i encountered that bug as well, but it had already been reported and was fixed soon after, so i didn't pay much attention to it.

btw, i have already crashed manjaro with mesa 22.3.5.
#10
most likely these two reports describe the same bug.

https://salsa.debian.org/Mobian-team/dev.../issues/65
https://gitlab.freedesktop.org/mesa/mesa/-/issues/8415