SPDIF output audio gaps
#1
Hi there,

TL;DR: S/PDIF output is choppy, unclear whether hardware or software issue, any insights appreciated.

I have a ROCKPro64 running Debian stable (with the kernel pulled in from backports). I modified the devicetree to enable SPDIF by disabling the other two I2S blocks and enabling the SPDIF blocks. I also re-routed the SPDIF signal to the Pi2 GPIO header (using the alternate pin which was already provisioned in the devicetree), because I didn't have a compatible 2 mm pin socket available and delivery times for those were on the order of months. I am connecting to the amp via a hand-soldered coax cable. I tried different cables and adapters to rule out the cabling as the source of the problem.

S/PDIF output works in principle, but there are issues. In particular, the amplifier will irregularly fail to recognize the signal for a short moment (it has indicator lights for S/PDIF PCM which turn off briefly), like some kind of synchronization loss. This causes a gap in the audio.

After looking at the amplifier schematics (it is a Denon AVR-1910 and the service manual can be found on the internet), I realized that the signal is tied to GND using a 75 ohm resistor, which would result in significant peak current being sourced from the RK3399 GPIO. To mitigate that, I added a 2N7000 N-FET as driver, powered from the 5V rail with a 75 ohm resistor. This results in 33mA of peak current, I suppose, which should be fine for the 5V rail (any USB device may need more than that).
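For reference, the 33 mA figure follows from Ohm's law if one assumes that the 5 V rail drives the 75 ohm series resistor into the amplifier's 75 ohm termination and that the on-resistance of the 2N7000 is negligible (both are assumptions about the driver, not measured values):

```latex
I_{\text{peak}} \approx \frac{5\,\text{V}}{75\,\Omega + 75\,\Omega} \approx 33\,\text{mA}
```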

That way, I hope to have ruled out any cabling/power issues on the transmission line.

I am currently running nothing except mpd on the box. I tried with pulseaudio in between mpd and ALSA, but that did not fix the issues (although I *think* it reduced them somewhat, not 100% sure).

The rate of issues seems to correlate with the kind of content played back. 44.1 kHz 16 bit content seems to work better than 48 kHz 24 bit content. 96 kHz 24 bit content is completely unplayable (all you can hear are irregular pops on the output, somewhat like light drizzle rain on a metal roof). mpd is not saturating the CPU; it is between 5% and 30%, depending on the ALSA buffer and period time configuration. (With a buffer_time of 0.5s and a period_time of 1ms, I can get 96 kHz content to play somewhat, but with many, many gaps). There's also no significant iowait going on. Swap is turned off.
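To get a feeling for what a buffer_time of 0.5 s and a period_time of 1 ms mean at 96 kHz, the times can be converted to frames. The sketch below is plain arithmetic, not ALSA code; the function names are made up for illustration, and the values ALSA actually negotiates may be rounded to whatever the hardware supports.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an ALSA-style time in microseconds (buffer_time, period_time)
 * into a number of frames at the given sample rate, rounding down. */
uint64_t frames_for_time_us(uint32_t rate_hz, uint64_t time_us)
{
    assert(rate_hz > 0);    /* a zero rate is a caller bug */
    return (uint64_t)rate_hz * time_us / 1000000u;
}

/* Number of period boundaries (and thus potential wakeups) per second
 * for a period of the given length in frames. */
uint32_t periods_per_second(uint32_t rate_hz, uint64_t period_frames)
{
    assert(period_frames > 0);
    return (uint32_t)(rate_hz / period_frames);
}
```

With these assumptions, a 1 ms period at 96 kHz is only 96 frames, so the playback path gets a period boundary roughly 1000 times per second, while the 0.5 s buffer holds 48000 frames.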

I took out my oscilloscope and looked at the signal on the wire and I noticed that most of the time (not always, but that may be my scope failing to trigger properly) when a gap occurs, the S/PDIF signal is interrupted. The smallest gap I saw was 22 microseconds. However, ALSA does *not* report a buffer underrun (via the EPIPE error code) to the application. (I also saw some significant ringing at the rising edges of my simplistic driver, which I removed with a 100pF capacitor in parallel to GND, but that did not change the issue.)
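To put the 22 microsecond gap into perspective, it helps to know how long one S/PDIF frame (one left plus one right subframe, i.e. one sample period) lasts. A minimal sketch, with a hypothetical helper name; it only computes 1/fs and says nothing about what causes the gap:

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one S/PDIF frame (left + right subframe) in nanoseconds,
 * rounded to the nearest nanosecond. One frame is sent per sample period. */
uint32_t spdif_frame_period_ns(uint32_t rate_hz)
{
    assert(rate_hz > 0);
    return (uint32_t)((1000000000ull + rate_hz / 2) / rate_hz);
}
```

This gives about 22.7 microseconds at 44.1 kHz, 20.8 microseconds at 48 kHz and 10.4 microseconds at 96 kHz, so a 22 microsecond gap is on the order of one or two frames.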

I am a bit at a loss as to where to look next. With ALSA not reporting an underrun, I'm not sure if this is a hardware or a software issue.

I saw someone else report a similar problem, but I can't find it anymore in the forums (the search is broken for me).

I'm not 100% sure this is a hardware problem; it may also be in the Linux kernel or elsewhere in software. Feel free to move this thread if appropriate. I am currently trying to reproduce the issue with speaker-test, but so far I haven't been able to (except when tweaking the buffer sizes well below any sensible threshold, at which point -EPIPE *is* returned from ALSA, suggesting that this is a different issue). Because of this, I decided to post in the Linux subforum instead of the hardware one.

I appreciate any kind of input and I'm more than happy to provide scope traces, config files, schematics or anything else which may be useful in debugging this. Because I'm at this point rather clueless where to even start, I decided against uploading a swath of data, but don't hesitate to ask.

kind regards & thank you for reading,
jssfr
#2
Hi there, thanks for this extensive level of looking into it. I hope I can shed some more light from my Quartz64 SPDIF experience: over there it uses pretty much the same driver. I don't know if the controller on the SoC is exactly the same, but it likely is. SPDIF on the Quartz64 Model A works fine.

A few things I've noticed while looking at the schematic:

1. Apparently there's a second SPDIF thing connected to some driver, not on the Pi-2 header? Not sure what that's about.
2. The one on the Pi-2 bus GPIO header doesn't have a low-pass filter. The Quartz64 Model A uses a low-pass filter of 220 MHz.

I don't have an oscilloscope to try and reproduce your measurements (I'm working on it, desk space in my room is at a premium) but you can try using TOSlink instead of coax to optoisolate the whole thing and rule out problems with the transmission line even further. I can send you one of my less well soldered together Quartz64 Model A TOSlink adapters for free if you need a breakout board with a driver IC/LED combo that can handle both 3.3V and 5V, you'll just have to connect to the right pins with jumper wires.

However, the fact that 44.1 kHz works better than 48 kHz, and that 96 kHz doesn't work at all, may point towards some sort of buffer issue. Try adding some debug instrumentation to sound/soc/rockchip/rockchip_spdif.c.

I don't have access to the Part 2 of the RK3399 TRM in which SPDIF is documented, but I do have access to the RK356x TRM which should have the same SPDIF controller.

One thing I can see right now when comparing the TRM with the driver is the following discrepancy: the SPDIF_CFGR has bits 23:16 assigned like this:
[Image: Screenshot_20220724_112808.png]
The driver sets mclk to srate * 128 in rk_spdif_hw_params. However, it doesn't seem to write the register, unless I'm missing something. It does write "val" but "val" has nothing to do with the mclk value being set it seems.

The comment above the clock setting call says /* Set clock and calculate divider */ but it never seems to calculate/set the divider.

Downstream BSP kernel driver in the RK3588 SDK does this seemingly differently:

Code:
static int rk_spdif_hw_params(struct snd_pcm_substream *substream,
                  struct snd_pcm_hw_params *params,
                  struct snd_soc_dai *dai)
{
    struct rk_spdif_dev *spdif = snd_soc_dai_get_drvdata(dai);
    unsigned int val = SPDIF_CFGR_HALFWORD_ENABLE;
    unsigned int mclk_rate = clk_get_rate(spdif->mclk);
    int bmc, div;
    int ret;

    /* bmc = 128fs */
    bmc = 128 * params_rate(params);
    div = DIV_ROUND_CLOSEST(mclk_rate, bmc);
    val |= SPDIF_CFGR_CLK_DIV(div);

    switch (params_format(params)) {
    case SNDRV_PCM_FORMAT_S16_LE:
        val |= SPDIF_CFGR_VDW_16;
        break;
    case SNDRV_PCM_FORMAT_S20_3LE:
        val |= SPDIF_CFGR_VDW_20;
        break;
    case SNDRV_PCM_FORMAT_S24_LE:
        val |= SPDIF_CFGR_VDW_24;
        break;
    default:
        return -EINVAL;
    }

    ret = regmap_update_bits(spdif->regmap, SPDIF_CFGR,
                 SPDIF_CFGR_CLK_DIV_MASK |
                 SPDIF_CFGR_HALFWORD_ENABLE |
                 SDPIF_CFGR_VDW_MASK, val);

    return ret;
}

Whereby the defines used are

Code:
#define SPDIF_CFGR_CLK_DIV_SHIFT    (16)
#define SPDIF_CFGR_CLK_DIV_MASK        (0xff << SPDIF_CFGR_CLK_DIV_SHIFT)
#define SPDIF_CFGR_CLK_DIV(x)        ((x - 1) << SPDIF_CFGR_CLK_DIV_SHIFT)

Interestingly, this doesn't set the mclk rate at all, this is done in rk_spdif_set_sysclk.
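To make the divider arithmetic concrete, here is a small userspace sketch that mirrors the BSP calculation above. The function names and the mclk rates used in the examples are assumptions for illustration (I haven't checked what mclk rate the ROCKPro64 actually ends up with), and div_round_closest is a stand-in for the kernel's DIV_ROUND_CLOSEST macro for unsigned values.

```c
#include <assert.h>
#include <stdint.h>

#define SPDIF_CFGR_CLK_DIV_SHIFT 16
#define SPDIF_CFGR_CLK_DIV_MASK  (0xffu << SPDIF_CFGR_CLK_DIV_SHIFT)

/* Userspace stand-in for the kernel's DIV_ROUND_CLOSEST (unsigned only). */
uint32_t div_round_closest(uint32_t n, uint32_t d)
{
    assert(d > 0);
    return (n + d / 2) / d;
}

/* Compute the CLK_DIV field of SPDIF_CFGR the way the BSP driver does:
 * the biphase mark clock is 128 * fs, and the register stores (div - 1). */
uint32_t spdif_clk_div_field(uint32_t mclk_hz, uint32_t rate_hz)
{
    assert(rate_hz > 0);

    uint32_t bmc = 128u * rate_hz;
    uint32_t div = div_round_closest(mclk_hz, bmc);

    /* The field is 8 bits wide and holds div - 1, so div must be 1..256. */
    assert(div >= 1 && div <= 256);

    return ((div - 1) << SPDIF_CFGR_CLK_DIV_SHIFT) & SPDIF_CFGR_CLK_DIV_MASK;
}
```

For example, with an assumed mclk of 12.288 MHz, 48 kHz needs div = 2 (field value 0x10000), while 96 kHz needs div = 1 (field value 0). If mclk is already set to exactly 128 * fs, the divider is always 1 and the field stays 0.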

Hope this is of some help to you, I just quickly looked over this while having my morning coffee. It definitely looks like a driver issue to me, and it'd be great if we could get it sorted.

Occasional Linux Kernel Contributor, Avid Wiki Updater, Ask Me About Quartz64
Open Hardware Quartz64 Model A TOSLink Adapter
Pi-bus GPIO Extender For ROCKPro64 And Quartz64 Model A
Plebian GNU/Linux
#3
Hi there and thank you for the extensive reply :-).

I wasn't aware that this thread had actually been posted (I expected some kind of notification after the moderation queue), otherwise I would've posted this as resolution earlier.

I found a "fix" for the issue: disabling the cluster-sleep CPU idle state (echo 1 | tee /sys/devices/system/cpu/cpu*/cpuidle/state2/disable) as well as (potentially, not 100% sure yet if this is also required) setting the cpufreq governor to performance. This not only allows fluent 96 kHz playback, but also allows rsync+ssh to transfer at ~80 MiB/s instead of ~65 MiB/s, and allows rsync without SSH (via rsyncd) to reach GbE linerate (~112 MiB/s), instead of also crawling at ~65 MiB/s. I initially thought those were unrelated issues, but they weren't, apparently.
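For anyone who wants to apply the same setting from a program instead of the shell, here is a rough C equivalent of the tee command. It is only a sketch: write_to_matches is a made-up helper, it needs root for the sysfs files, and whether state2 is cluster-sleep depends on the board and kernel, so it is worth checking the name file next to each disable file first.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/* Write `value` to every file matching the glob `pattern`.
 * Returns the number of files successfully written, or -1 if the
 * pattern matched nothing or glob() failed. */
int write_to_matches(const char *pattern, const char *value)
{
    assert(pattern != NULL && value != NULL);

    glob_t g;
    int written = 0;

    if (glob(pattern, 0, NULL, &g) != 0)
        return -1;

    for (size_t i = 0; i < g.gl_pathc; i++) {
        FILE *f = fopen(g.gl_pathv[i], "w");
        if (f == NULL)
            continue;   /* e.g. permission denied when not root */

        int ok = (fputs(value, f) >= 0);
        /* fclose() flushes, so a failed write shows up here at the latest */
        if (fclose(f) == 0 && ok)
            written++;
    }

    globfree(&g);
    return written;
}
```

Calling write_to_matches("/sys/devices/system/cpu/cpu*/cpuidle/state2/disable", "1") should then have the same effect as the command above, and return the number of CPUs for which the write succeeded.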

To me, it thus seems that this is some weird power management issue where the CPU (or the kernel?) decides that cluster-sleep is ok to enter, when it in fact introduces too much latency for audio (and other things) to work reliably. Initially, I tried to disable cluster-sleep only on cpu0, because that was where all the interrupts came in, but that was less reliable than disabling it on all CPUs (potentially because mpd got scheduled to a cpu != 0 and it took too long to wake it up? don't know, and at this point I already sank multiple days in debugging it and I'm not very keen on testing more unless it helps to provide an upstream fix).

Thank you for the TOSLINK offer, but given the resolution above, I'm pretty sure that this is not an electrical issue :-).

kind regards,
jssfr

P.S.: If you wonder how we figured that cluster-sleep thing out … I wanted to try plain aplay again without mpd so I ran ffmpeg to convert a file to wav… and while ffmpeg was running, things were completely smooth. We then thought this was I/O related and tried to keep the source device busy with pv /dev/sdb >/dev/null, but that didn't quite do the trick (and after poking around in power saving stuff of that device, I also had no clue what could possibly be going on). Then I ran stress -c 7 and that fixed everything and so we came up with the idea that this must be in fact CPU powersaving going wrong.

P.P.S.: This also explains why *lowering* the buffer sizes helped in mpd… more cpu load = less cpuidle = less chance to mess the timing up.