More payphone shenanigans

Your periodic reminder that the only planet where 100% of Linux systems have working sound is Mars.

A few weeks ago I griped about GPIO noise in the payphone and someone pointed out that it was most likely caused by me believing that I had a pull-down resistor when I did not. I was using pin 12, you see, and that is one of the Raspberry Pi's pins that arbitrarily does not contain a built-in pull-down resistor. Because every god damned GPIO pin is a god damned special snowflake with its own arbitrary god damned set of magical behaviors. And good luck finding any of this in the documentation. Oh yeah, also the API call that turns on the resistor does not return an error status when said resistor does not exist. How very.

Anyway, having humped that heavy-ass payphone home on my back to work on it, I faced the traditional dilemma: do I upgrade the OS to avoid whatever pile of new security exploits have appeared in the last two years? Or do I leave it alone, so that shit doesn't break?

Reader, I chose poorly.

I upgraded from Raspbian 9.13 to 10.10 and -- wait for it -- sound stopped working. Are you shocked? Don't be shocked. After I figured out the set of random, poorly documented config files where I had to change a "1" to a "2" because they decided to change how audio devices are numbered, I finally got it making use of my USB audio interface again. But then, even more shitfuckery!

See, because the year is still 1991, Linux systems cannot play two sounds simultaneously.

Yes, really.

So if you are using /usr/bin/play to play an MP3 file in the background, you have to wait for it to finish before trying to play another one, or the second one gives you the completely sensible error message, "play WARN alsa: can't encode 0-bit Unknown or not applicable".

This is considered normal, and in fact not batshit insane, by the Linux community.

The Linux Sound Architecture, I am reliably informed, is considered "Advanced".

When I got this working back in the Raspbian 9.13 days of yore, I solved this problem by having my program fork "play" and keep track of its pid; and when it's time to play a new sound, kill the existing "play" process first.

Guess what, that trick no longer works in this modern jetpack future world of Raspbian 10.10.

Now, apparently, even after the play process has exited, and has been waitpid'ed, the audio device still hasn't been unlocked -- audio remains unplayable for somewhere between 250 and 400 milliseconds.
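
A minimal sketch of the kill-then-play trick described above, with a retry bolted on to ride out the delay before the device is usable again. This is not the payphone's actual code: the `SoundPlayer` class, the `settle` and `retries` parameters, and the assumption that the player exits quickly with a nonzero status when the device is still busy are all hypothetical.

```python
import subprocess
import time


class SoundPlayer:
    """Play one sound at a time, killing whatever is already playing."""

    def __init__(self, command=("play",), settle=0.4, retries=3):
        self.command = list(command)  # player argv prefix (assumed: SoX "play")
        self.settle = settle          # seconds to wait before retrying a failed start
        self.retries = retries        # how many times to try starting the player
        self.proc = None

    def stop(self):
        """Terminate and reap the current player process, if any."""
        if self.proc is not None and self.proc.poll() is None:
            self.proc.terminate()
            self.proc.wait()
        self.proc = None

    def play(self, path):
        """Stop the current sound, then start a new one.

        Assumption: a player that cannot open the device exits almost
        immediately with a nonzero status, so an early failure is retried
        after a short pause.
        """
        self.stop()
        for _ in range(self.retries):
            self.proc = subprocess.Popen(self.command + [path])
            time.sleep(0.05)  # give the player a moment to fail
            if self.proc.poll() is None or self.proc.returncode == 0:
                return self.proc
            time.sleep(self.settle)
        return self.proc
```

Calling `SoundPlayer().play("dtmf_1.mp3")` twice in a row kills the first process before starting the second. The 0.4 second default for `settle` is only a guess taken from the 250 to 400 millisecond window mentioned above.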

Someone on Twitter suggests that maybe this is some new kernel fuckery that appears to affect the 10.10 kernel (5.10.52-v7+ #1441) but I sure can't tell what any of them are talking about.

In summary, fuck all of this entirely.

Previously, previously, previously, previously.


49 Responses:

  1. Scott says:

    Ouch, that is really annoying. Sound has always been an issue for me on linux as well. I THOUGHT there was a way to configure Alsa using dmix or jack or some such to allow multiple "streams" at once though.

    Good luck though, that payphone is such a cool hack.

  2. Zygo says:

    It's really weird that you're going directly through alsa. Alsa is the low-level "here's a DMA buffer and some pointers, use the pointers to tell me what part of the buffer I should be hurling out the USB port for the next few seconds" driver. It gives raw access to the hardware and nothing else. Basically the only sane use case for alsa is to be an output driver for pulseaudio or jack. dmix is an old alsa demo whose only useful outcome was to confirm the need to use something like pulse or jack instead.

    Pulseaudio does the multiplexing, mixing, resampling, etc. on top of the raw device. Pulseaudio also gives you stable device IDs, like "alsa_card.usb-0b54_USB_PnP_Audio_Device-00", and a command-line (pacmd) to peer into its tiny mind and correct most of the remaining bugs (like making sure all your streams are actually using said alsa_card.usb-0b54_USB_PnP_Audio_Device-00, and haven't flipped to some other device because the filesystem was full and the configuration database was lost).

    Pulseaudio took more time to debug for production use than it normally takes to make a human and get it licensed to drive, but these days pulse should be good for simple cases like this. You're not even trying to pass bluetooth through a qemu USB passthrough device here, this should be easy.

    • jwz says:

      Well, I got here because /usr/bin/play used to work, mpg321 was unreliable garbage that often left things in such a bad state as to require a reboot, sox didn't work at all, and every other option seemed to require a dozen daemons, dbus, and a GUI clone of half of iTunes circa 2001. So if there was a suggestion in there for what command-line program I should be using instead of "play" I did not see it.

      • Zygo says:

        If it's Debian, 'apt-get install pulseaudio' should do it.

        Yes, it does require dbus, and it recommends (but does not require) the entire bottom half of GNOME desktop. You only need to run dbus and pulseaudio though.

      • Paul N says:

        I use vlc in commandline mode for things like this. I believe my netbook does not have Pulseaudio, and it works (it can even play multiple streams with only ALSA).

        vlc --intf rc --play-and-exit file.mp3

        I don't know about the dozen daemons, dbus, etc.

        • Paul N says:

          I have never used it (yikes!) but it looks like in addition to --intf rc there is a --intf telnet mode that might be more appropriate for your application (since then you won't have the delay of starting and stopping VLC for every file you play). See https://wiki.videolan.org/Console.

          Also: because my VLC was set to repeat all tracks by default, in order to get the example above working I needed --no-loop as well. But if you are starting with a fresh install this probably will not be necessary.

    • Aka_Nabla says:

      In fact, Raspberry Pi OS (née Raspbian) switched officially to Pulseaudio last December...

      https://www.raspberrypi.org/blog/new-raspberry-pi-os-release-december-2020/

      So, time to ditch alsa and stop crying maybe?

      • jwz says:

        I see no information in that link talking about their lovely new GUI taskbar menus and dialogs that answers any of the actual questions I might have.

        • ben says:

          Assuming you have pulseaudio installed (and I believe it is by default on Raspbian 10.10), you can tell play to route audio through pulseaudio. Install libsox-fmt-pulse and set the envvar AUDIODRIVER to pulseaudio.

          If pulseaudio isn't using your USB output by default, pactl is what you'll want to use to fix that. Check out https://shallowsky.com/linux/pulseaudio-command-line.html (not my website).

          • George Dorn says:

            Sound on Linux wouldn't be half the nightmare it is today if the documentation existed and could be trusted to be accurate.

          • jwz says:

            When AUDIODRIVER is set, that just makes play say "play FAIL formats: no handler for given file type `pulseaudio'".

      • Baggypants says:

        Just in time to see other distros start to migrate from PulseAudio to Pipewire then :D

      • Roger Weskins says:

        Advocating for a Poettering 'solution' is never the right answer.

    • Zygo says:

      It's also concerning that the story didn't end at "sound stopped working, so I restored the SD card from the pre-upgrade backup and everything is fine now, wow, dodged a bullet there" for two reasons:

      1) SD cards are mayflies--they don't gracefully degrade, they just pick a random Tuesday and die. Any problem, from SD card failure to bad software upgrade, should be fixable by reverting to the last working copy; otherwise, you're going to be building it from scratch at the least convenient possible time.

      2) If you're running anything on that device that needs an upgrade, you should probably remove it? AFAICT the only thing you need to upgrade on the device is ssh. Almost anything else that opens a network socket can be summarily uninstalled, and firewall rules should let only a handful of admin nodes talk to it, with RSA keys, passwords disabled. You control the mp3 media files it consumes, and you built all the directly connected hardware interfaces, so there's no attack surface there other than what you put into your own code. If you get rid of unnecessary software, you'll only need upgrades once a decade--because the hardware it was running on died, and new hardware doesn't run the old software, so you have to rebuild the whole system anyway.

      If you blindly upgrade to every release, you don't get very much security. It was pretty secure to start with. Blind upgrades will add new security vulnerabilities, in addition to functional regressions. In this particular case you are far better off without them.

      The kernel fuckery from Twitter is probably unrelated--it's just a random bug that is getting its 15 minutes of fame this week.

      Alsa devices stay open after the process is killed because there's no way to shut down some audio devices in mid-buffer. e.g. the audio device is on a serial port, and there's nothing to do at close() but wait until the device reads the data it was already sent. There might be some asynchronous thing going on where the close() returns before the device is ready to use again, but you don't have to know any of this if you use pulse or jack instead of raw alsa.

  3. We made a similar interactive playphone project built around a Pi and lots of swearing. My original solution was like yours - play sound files, track the PID, and kill the currently playing one if I want to interrupt it. Then my coworker who knows better took over the sound part and switched over to PureData.

    I think the answer is that you need a framework to handle the foibles of the OS. It feels like another dependency, and it is, but in practice it's a more reasonable choice. The audio platform is primary and that decides the OS.

    I haven't worked on the project for several months so of course I have no recollection of how it worked, and getting another one running would take serious labor, but in case it helps, https://github.com/futel/audiofone

  4. Glaurung says:

    You do have a restoreable backup of your working install of 9.13, don't you?

    The conventional wisdom that one must upgrade because Malware! Vulnerabilities! overlooks the fact that upgrading is always going to unlock a can of very nasty worms. For most situations, I really very strongly doubt that the reduced risk of malware is enough to compensate for the days of time lost to dealing with upgrade fuckery.

    Just... don't upgrade. A working system is a precious jewel. Protect it, preserve it, and keep using it until you are forced to upgrade. In many situations, any other approach is just madness. It's not like the payphone is going to be surfing the internet and downloading porn from sketchy torrent sites, after all.

    • jwz says:

      That would be a great theory if Linux had any APIs that were stable for longer than 3 months. Eventually you're going to need to recompile some utility that you depend on due to forces outside of your control, and it won't work because your system is "so out of date" and then the whole house of cards comes tumbling down.

      Believe me, I have tried the "never upgrade" approach. That was my go-to for years. It works badly. Which of "never upgrade" or "always upgrade" works more badly is still an area of active research.

      • Glaurung says:

        o.O

        Reading your ongoing travails with Linux over the past two decades convinces me that I made the right choice to continue to completely avoid that OS.

        • tobias says:

          Is it really self abuse if it provides hours of free entertainment?

          I certainly can see the attraction of billing somebody else for one's time on it.

          By completely avoiding it, how can you really know the hilarity you've missed out on?

          Some of us enjoy using sed to change soundcard when the GUI program no longer runs, having been remade and improved.

  5. vc says:

    I still roll with straight ALSA+dmix[0] in modern linux on my laptops. It works fine until you start needing stuff like per-stream volume controls and intricate audio routing like switching between a bluetooth headset and integrated audio without interrupting the streams.

    For a statically configured payphone kiosk I'd expect plain ALSA+dmix to be perfectly sufficient, and even preferable since it's a much simpler stack. But maybe you need to explicitly configure dmix, since it uses heuristics in the driver surrounding hardware mixing to control its automatic enabling.

    [0] https://wiki.archlinux.org/title/Advanced_Linux_Sound_Architecture#Dmix

    • jwz says:

      Well that seems to say that it is on by default, so I fully expect someone to pop in here telling me why it's impossible to make that work on a Pi for some reason.

      • vc says:

        FWIW the `aplay` utility has more alsa-specific functionality which might work better than `play`, it's part of alsa-utils on my distro.

        `aplay --list-pcms` in particular can be helpful in seeing what outputs are available along with the appropriate names to specify with `aplay --device=`.

        `mpv` also has good alsa support, `mpv --audio-device=help` similarly produces possible outputs with the appropriate names for mpv to use.

        Maybe it makes no difference, I don't have a Pi here to poke at and see what outputs are available...

        • jwz says:

          aplay doesn't work for shit:

          Playing raw data 'dtmf_1.mp3' : Unsigned 8 bit, Rate 8000 Hz, Mono
          aplay: set_params:1339: Sample format non available
          Available formats:
          - S16_LE

          • Netluser says:

            aplay only understands simple formats like .wav and .au, not mp3. Like everything else ALSA, aplay is crap- don't use it in production, use it for debugging. It's (apparently) written by the alsa people themselves, so it's actually useful for this purpose.

            The useful switches from the manpage:

            -l, --list-devices
            -L, --list-pcms
            Check what devices and "pcms" in ALSA terminology your setup sees available. Output is also crap, but it's doable. Just pay attention to what's for playback versus recording when parsing. On my minimal Debian box for playback it's just the soundcard itself and the dmix device, dmix being default.

            --dump-hw-params
            See about the sound formats the configured devices support, manpage says "for raw device hw:X" this lists the sound card's capabilities. Might be useful.

            -D, --device=NAME
            Pick one of the nonsense device names to try to send sound to. This makes it easier to debug sound problems because you can try skipping past all the config garbage that's already set up. If you don't have a .wav file of the right samplerate/format, try "speaker-test -D DEVICE --rate RATE --test pink|sine" (try "--frequency FREQ" if you're using sine). See also manpage usage examples in speaker-test(1), or google for examples (been a while since I did this, willing to try figuring out the syntax again myself if you can't).

            -v, --verbose
            Manpage: "Show PCM structure and setup. This option is accumulative. The VU meter is displayed when this is given twice or three times."

            -N, --nonblock
            Exit immediately if the sound device is busy instead of blocking. Useful when debugging getting two things to play at once.

            If you don't have any wav files to use with aplay, Debian has some as part of speaker-test of alsa-utils in /usr/share/sounds/alsa/ - other distros probably the same.

            So all that could help figure out if "play" is sending things to the right place, and what happens if you try running two or more aplay's at the same time per each sound device. Doesn't fix the problem, but might make figuring where the problem is easier. Good luck, hope that helps. If all else fails, man alsa-info.sh may provide a bit more useful and useless info.
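
            A small sketch of the "see what PCMs exist" step described above, for anyone who would rather script it than read the listing by eye. It assumes the usual `aplay -L` layout, where each PCM name starts in column 0 and its description lines are indented; the function names `parse_pcm_names` and `list_pcms` are made up for illustration.

```python
import subprocess


def parse_pcm_names(listing):
    """Return PCM names from `aplay -L` style text.

    Assumed layout: names start in column 0, description lines are indented.
    """
    return [line.strip() for line in listing.splitlines()
            if line and not line[0].isspace()]


def list_pcms():
    """Run `aplay -L` and return the PCM names it reports."""
    out = subprocess.run(["aplay", "-L"], capture_output=True,
                         text=True, check=True)
    return parse_pcm_names(out.stdout)
```

            Each returned name can then be tried in turn with `aplay -D NAME some.wav` to see which ones accept two players at once.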

          • Quante Porter says:

            Is there an alsa-oss/aoss package available?

            As with vc's comments, if sound can be made to work in aplay, it may be that you're in a position where the sox play commands are using some pre-ALSA means of playing sound- possibly wrapping 'play' in aoss might get you over the line?

  6. jwilkes says:

    I'm willing to bet that the goddamned kitchen sink still works:

    cvlc --play-and-exit /usr/share/sounds/freedesktop/stereo/alarm-clock-elapsed.oga &

    cvlc --play-and-exit /usr/share/sounds/freedesktop/stereo/trash-empty.oga

    To whom it may concern: this comment has fulfilled the one and only goal of the freedesktop project. It may be retired now.

  7. Eric says:

    Using Linux is like being a witch in a dimension so unstable that the only way to know the spells that currently work is to gossip with all the other witches.

  8. Carlos says:

    Additional vote for using PulseAudio or something similar on top of ALSA. To make it work outside of a full desktop environment you may need to set the envvar pointing to where PulseAudio is listening, i.e.

    PULSE_SERVER=tcp:localhost:4713

    I use this to forward a remote machine's audio through a reverse SSH port forward back to my desktop. It should work with anything of Raspberry Pi vintage; I'm using it with an audio app that isn't compileable on anything newer than 2009's Ubuntu Lucid. Said app is, of course, light years ahead of any audio app actually distributed for Linux today.

    I get the impression that if jwz is ever found wandering the streets of SOMA with glazed eyes and bare feet he will be muttering under his breath something like

    Raspberry Pi ... every god damned GPIO ... god damned special snowflake ... god damned set of magical behaviors ...
    ... upgrade the OS ... chose poorly ... random, poorly documented config files ... even more shitfuckery ...
    ... play WARN alsa: can't encode 0-bit Unknown ... batshit insane ... "Advanced" ... modern jetpack future world ...
    ... new kernel fuckery ... fuck all of this entirely

    C.

  9. jwz says:

    Also, fun story, even after maybe resolving the floating GPIO issue, I am still getting "ghost" typing on the USB keyboard, e.g., \000\000\002\006\000 showing up at random from a keyboard that doesn't even have a control key on it. And yes, there is a ferrite lump at each end of the cable.

    • James C. says:

      Clearly it’s possessed. Put the whole thing in the middle of the street and set it on fire.

    • lpgl says:

      raspberry pi is a toy. It's better not to use it for reliable stuff.

      • jwz says:

        Go piss up a flagpole, troll.

      • vc says:

        My Pi Zero W security cameras have proven very reliable for approaching three years now, used continuously in a desert environment without AC even.

      • Carlos says:

        Except the Pi has become the single most commonly deployed PLC on the planet.

        It's plenty reliable. I know people running critical (but not life-or-death) systems on them -- hundreds of systems. Replace maybe one Pi a year.

        C.

  10. k3ninho says:

    The twitter fuckery is whole-kernel fuckery, only happened because Android 12 is going to use Linux Kernel 5.10 and the semantics of 'edge triggered' poll wait were not actually edge-triggered for sleeping tasks, they were always triggered and often multiple times on data arrival. This change means your wakeup is delayed.

    "[A] write to a pipe would unconditionally wake any waiting readers; indeed, it could wake them multiple times in a single system call. The fix changed this behavior to only perform the wakeup if the pipe buffer was empty at the start of the operation; a write to a pipe that already contained data waiting to be read would simply add the new data without causing the wakeup."
    This quote is from a fuller and better explanation at Linux Weekly News -- here's my subscriber link to The Edge-Triggered Misunderstanding -- which will be available to non-subscribers in a few days.

    K3n.

    • jwz says:

      I'm still not sure what I could have taken from that link other than "the less I know about the kernel's design and politics, the happier I am."

      In particular, if it answers the question "is this what broke /usr/bin/play and would switching to a different kernel release fix that, and if so which", I can't tell.

      • Bill Paul says:

        I'm pretty sure the answer is: "no, it isn't what broke /usr/bin/play, and no, switching to another kernel version won't fix it."

        The problem described in the above link specifically relates to I/O on pipes. But all you're doing is using the SoX play(1) command to dump data into the sound device (/dev/dsp or whatever). Unless play(1) is filtering data through another process, no pipes should be involved.

        So no, this isn't the droid you're looking for. I mean, it might break other shit, but not this in particular.

        As for why you're getting that extra 250-400ms delay, I don't know. I dug back and found this Previously link that says you're using a Raspberry Pi 2 B and a board from Adafruit that uses a Cirrus Logic WM5102 chip. This looks to be the documentation for it:

        http://www.farnell.com/datasheets/1928993.pdf

        And miraculously, some of the embedded links for the chip documentation still work.

        This looks like it's connected to the GPIO header on the RPi (as opposed to being a USB device, which was something I wasn't sure about).

        It looks like the audio path from the Pi to the WM5102 is via I2S. There are also other SPI and I2C pathways for chip management (for the WM5102 and the other chips as well, e.g. volume control on the WM8804).

        I2S is basically a serial bus, which has typically 4 pins:

        - master clock, to keep both sender and receiver in sync
        - left/right clock, which actually runs at the audio sample rate
        - data out (speakers/headphones/etc...)
        - data in (if you have a mic)

        On the last project I built, we used a CS4344 chip as a codec, which is nice in that it doesn't require you to fondle it via I2C in order to make it work -- it's all automatic. But there are some caveats: 1) there has to be a ratio of 256 between the master clock and left/right clock (e.g. 4MHz master clock gets you 15.625kHz audio rate), and 2) it takes it a little time to sense the master clock when you first start the I2S controller. I rigged my I2S controller driver to keep the master clock on all the time, wait for about 200ms at boot time after first turning the master clock on to give the codec a chance to get its brains together, and then I would gate the left/right clock pin on and off to actually control sound playback.
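
        The clock arithmetic above, written out as a quick check. Only the ratio of 256 and the 4MHz example come from the comment; the function names and the 44.1kHz case are added for illustration.

```python
MCLK_RATIO = 256  # master clock : left/right clock ratio quoted above


def lrclk_from_mclk(master_clock_hz, ratio=MCLK_RATIO):
    """Audio sample rate (left/right clock) produced by a given master clock."""
    return master_clock_hz / ratio


def mclk_for_rate(sample_rate_hz, ratio=MCLK_RATIO):
    """Master clock needed to get a given audio sample rate."""
    return sample_rate_hz * ratio


print(lrclk_from_mclk(4_000_000))  # 15625.0 Hz, i.e. 15.625 kHz
print(mclk_for_rate(44_100))       # 11289600 Hz for CD-rate audio
```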

        Now, I don't think the WM5102 has the same limitation, but the point is that a lot of the device behavior depends on the driver code. So there may be a kernel-side issue here, but it's likely in the driver for this card.

        I also see where 2016 Jamie said:

        [...]
        You know what this means, right? I voluntarily and of my own free will recompiled my kernel in order to get my audio card to work. I don't even know who I am any more. I don't even know what year this is. "And you did this on your home phone?"

        Because of course Raspbian doesn't come with the modules pre-built. Of course it doesn't.
        [...]

        Did you have to do that again after you upgraded, or did the new OS actually come with the modules you needed this time?

        I also see where past Jamie said:

        [...]
        The Pi has an audio output, but it's shit, and it's not powerful enough to drive the speaker in this handset.
        [...]

        Okay, maybe it blows, but does the new OS have the driver for it present, and if so, can you try dumping audio to it (even if you can't hear it) and see if it also blocks for 250-400ms between attempts? If it does, then that might point to a problem in the Linux sound framework code, but if not, then it's likely something in the specific driver for this card.

        Unfortunately getting this thing to work takes more than just sending data over I2S: you also have to configure the chips on the card via I2C/SPI/etc... in order to get them into the right state to start playing, which is a painful and error-prone juggling act.

        • jwz says:

          Stock kernel this time -- that was back when I was using that Cirrus Logic piece of shit, which was a terrible idea. These days I'm just using a little USB headset dongle. Also I think it's a Pi3b now, not a 2.

          • Bill Paul says:

            Oh, well crap. That changes things. Now you've dragged the USB stack into the picture. Who the hell knows what could be going wrong there.

            But I think the RPi3B also still has an on-board audio out controller, and it may still be worth it to see what happens when you use play(1) with that instead of the USB thinger.

  11. David Konerding says:

    I'm not going to make any technical suggestions for low-latency, high-control concurrent access to sound devices (and definitely not on how to live with raspberry pi and linux churn without going crazy), but this is an area I've explored before. I was amazed at how quickly my question "how do I receive several simultaneous requests to play audio, some of which should cancel previous audio, and others which should mux inline" lands me on pages that are basically state-of-the-art audio synthesis research (for example,

    One of the most important outcomes of the past 30 years is that each of the major OSes has low-level libraries for doing extremely sophisticated multichannel audio with low latency, fallbacks to CPU, Fm synthesis, and yet, most of us sort of clumsily use unix processes to manage concurrent access to a hardware device. Sadly, it's hard to exceed what Cycling 74, PureData, and PortAudio give you out of the box without investing a lot of software development time.

  12. andyjpb says:

    Try paplay.

    • andyjpb says:

      (It needs a .wav file.)

    • andyjpb says:

      I've done some more digging.

      `aplay` and `paplay` can both work for me but are unreliable in cron.

      I have an X Server running and if I run them from an xterm then they both work.

      If I strip both DISPLAY and XDG_RUNTIME_DIR out of the environment with unset then it stops working.

      Adding XDG_RUNTIME_DIR back in again makes it work again, and also work from cron.

      So I added `export XDG_RUNTIME_DIR=/run/user/$(id -u)` to the top of my script.
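
      The same fix sketched for a program that launches the player itself rather than going through a shell script. The helper names `pulse_env` and `play_wav` are hypothetical, and the `/run/user/<uid>` path is simply the one from the `export` line above.

```python
import os
import subprocess


def pulse_env(base=None, uid=None):
    """Copy an environment and make sure XDG_RUNTIME_DIR is set.

    An existing value is kept; otherwise /run/user/<uid> is used.
    """
    env = dict(os.environ if base is None else base)
    uid = os.getuid() if uid is None else uid
    env.setdefault("XDG_RUNTIME_DIR", f"/run/user/{uid}")
    return env


def play_wav(path):
    """Start paplay on a .wav file with the patched environment."""
    return subprocess.Popen(["paplay", path], env=pulse_env())
```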

  13. kwk says:

    I'd be very, very tempted to put in a tape player with an endless loop cassette tape.

  14. Matthew says:

    I have also struggled to build an advanced sound architecture atop linux (because unfortunately the software I need to use for audio requires linux). I can't just give you the answer because there isn't one but I can perhaps help. The most important thing is to get away from ALSA immediately and the best of the universally awful answers to how to do that seems to be the almost totally undocumented jack. Jack support is almost but not quite universal in linux audio software.

    For example here is how I persuade pulseaudio to start on the one machine which is forced to have it:

    speakers="sink connect=0 client_name=pulse-out channels=2"
    mic="source connect=0 client_name=pulse-in channels=1"
    daemon="--start --load=module-always-sink --load=module-always-source --exit-idle-time=0 --disallow-exit --load=module-native-protocol-unix"

    # --start
    # Start PulseAudio if it is not running yet. This is different
    # from starting PulseAudio without --start which would fail if PA
    # is already running. PulseAudio is guaranteed to be fully ini-
    # tialized when this call returns. Implies --daemonize.
    #
    # "guaranteed" as defined in the delusional mind of the <swearing>.

    while ! pgrep -f pulseaudio; do
      pulseaudio -n $daemon --load=module-jack-"$speakers" --load=module-jack-"$mic"
      sleep 2
    done # what a pile of wank

    It still doesn't quite work 100% of the time. On the other hand starting jack is "pgrep -f ^jackd || jackd -R -d dummy -p 256 &" and that part (there are problems elsewhere) does work 100% of the time. So far.

    The problems jack will solve, once you've got the start/restart routine suited to your environment, are: mix multiple audio streams regardless of source and without interacting with the kernel; respond in "real"-time (real real time if the stars are aligned); and not require exhaustive pissing about with ALSA or anything made by the Potter. It will also work the same way on every reboot.

    As mentioned it may as well not have documentation but I can figure out (and even document!) the necessary incantation if you like.

    Note that I've only done this on x86 (Debian) but since jack operates entirely in userspace the difference in linking it to ALSA on a pi should be (famous last words?) minor.

  15. islandnut says:

    This error?
    islandnut@blackhole:~$ play song.mp3
    play WARN alsa: can't encode 0-bit Unknown or not applicable

    Does this help?
    apt install libsox-fmt-mp3

    I mean I know this is a really bad way to inform you sox can't play mp3 anymore without plugin...

  16. aka_nabla says:

    @jwz So, after the previous Herp, derp, did you finally manage to play multiple streams simultaneously?

  17. Phil says:

    I think the expected approach is to use a library that does mixing / sound management in your app instead of shelling out to external binaries.

    libSDL / libSDL_mixer being the obvious choice I would guess. There’s Perl support on CPAN: https://metacpan.org/pod/SDL

    The Perl gamedev pdf book looks like it has useful sample code: https://raw.githubusercontent.com/PerlGameDev/SDL_Manual/master/dist/SDL_Manual.pdf
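
    For a sense of what "mixing in your app" amounts to, here is a bare-bones sketch: add the samples together and clip. It is not SDL_mixer's API or anything from the linked manual, just the underlying idea, assuming both sounds are already decoded to signed 16-bit samples at the same rate. The `mix` function is hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(*streams):
    """Mix signed 16-bit sample sequences by summing and clipping.

    Shorter streams are treated as silent once they run out.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

    With something like this in the program, only one process ever holds the audio device, so the "second player can't open it" problem does not come up.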
