2023 twenty-four merry days of Perl Feed

Trimming audio files with Audio::Nama

Audio::Nama - 2023-12-18

Warped audio waveforms

Our task

The elves' workshop has been as busy as ever this year. To help them pass the time when not gossiping or recounting heroic pranks of Christmases passed, Igvik Graybeard, one of the IT elves, has been piping music to the office, canteen and tool benches using cabling left over from an obsolete phone service.

In the last years the elves have gotten into radio plays, and Igvik wants to edit some old recordings he has to remove unwanted segments. In other words, he wants to extract relevant parts of an audio file, stitching them together into a new file. Igvik runs Linux and prefers to do his work in the terminal wherever possible. We've decided to help.

Our tools

We'll be accomplishing this with Audio::Nama, a multitrack recording mixing and audio-processing application written in perl, and Ecasound, a general-purpose audio engine. We also need git, which Nama uses to manage project state, provide for branching, undo, etc. nama is the executable program script and also man page, so "man nama" for the docs, or use nama's internal help.

Nama configures Ecasound for recording, mixing and a variety of tasks via an audio network definition called a chain setup. Aside from housekeeping duties, Nama's main task is to regenerate the chain setup (and re-configure the engine) whenever the graph changes due to user input, such as deactivating a track or changing an output -- a bit like how a spreadsheet recalculates as necessary.

The first run of nama creates a $HOME/nama directory for project files and a configuration file $HOME/.namarc.

Nama assumes a linux environment: that you'll be recording or listening from the default ALSA soundcard device, usually your computer's built-in soundcard. You can change this setting in namarc.

section divider

Create a new project, add a track and import an audio file

We'll create a new project called 'christmas'.

$ nama -c christmas

We get a startup banner, then an empty track listing:

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  MON     Main bus              CH 1/2        100    50
          2  Mixdown               OFF     track Main            --             --    --

        nama christmas Mixdown >

In the prompt, 'christmas' is the project, 'Mixdown', the current track.

We create a separate track to accommodate our audio file, then import the file.

        nama christmas Mixdown > add-track carol
        nama christmas carol > import ~/music/christmas-carol.mp3

        format: s16_le,2,12000
        importing /home/jroth/music/christmas-carol.mp3 as /home/jroth/nama/christmas/.wav/carol_1.wav, converting to s16_le,2,44100,i

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  MON     Main bus              CH 1/2        100    50
          2  Mixdown               OFF     track Main            --             --    --
          3  carol                 PLAY    carol_1.wav           Main bus      100    50

We can now press SPACE to start/stop playback.

In the track listing above we see that track 'carol' with source carol_1.wav is set to PLAY. The output of 'carol' feeds the 'Main' track via the 'Main' bus. The 'Main' track serves as the master fader with output to the sound card. Like 'carol', all new tracks start as members of the 'Main' bus.

section divider

Define our clips

We will use a specialized type of bus called a 'sequence' to stitch together clips from our radio play. First we need to mark the beginning and end of each clip we want to keep.

We do that by listening to the track and using the clip-here command (bound to the F1 key) to mark clip starts and ends.

After some trial and error, we have the correct marks. In this example: four five-minute sections of content separated by 40-second breaks.

        nama christmas carol > list-marks

        0 40.0 clip-start-0020
        1 340.0 clip-end-0021
        2 380.0 clip-start-0022
        3 680.0 clip-end-0023
        4 720.0 clip-start-0024
        5 1020.0 clip-end-0025
        6 1060.0 clip-start-0026
        *7 1360.0 clip-end-0027

The final gathering

We can now assemble our clips.

        nama christmas carol > gather

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  MON     Main bus              CH 1/2        100    50
          2  Mixdown               OFF     track Main            --             --    --
          3  carol                 MON     carol sequence        Main bus      100    50

Note that the source for track 3 has changed from carol_1.wav to carol sequence.

Having listened to the output and resolved any hiccups (there were none!) we are ready to render it to an audio file.

        nama christmas carol > mixdown
        Enabling mixdown to file

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  MON     Main bus              Mixdown       100    50
          2  Mixdown               REC v1  Main                  --             --    --
          3  carol                 MON     carol sequence        Main bus      100    50

        Now at: 0:00
        Engine is ready.
        Press SPACE to start the mixdown.

        Engine is running
        Engine is finished at 16:27

        recorded: Mixdown_1.wav
        Setting mixdown playback mode.

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  OFF     Main bus              --            100    50
          2  Mixdown               PLAY    Mixdown_1.wav         --             --    --
          3  carol            MON  OFF     carol sequence        --            100    50

The mixdown step also generates ogg and mp3 encodings with oggenc and lame, respectively, if desired. These files appear in the project directory:

~/nama/christmas/christmas_1.mp3 ~/nama/christmas/christmas_1.ogg ~/nama/christmas/christmas_1.wav is a symlink to ~/nama/christmas/.wav/Mixdown_1.wav

section divider

How the sausage is made

Chain setup graph

An Ecasound chain setup is made of signal-processing chains--edges in the processing graph--and loop devices, which are nodes capable of summing their inputs.

A chain represents a stream with one input and one output.

Here is a setup of one chain, omitting general options related to buffer sizes, etc.

        # audio inputs

        -a:3 -f:s16_le,2,44100 -i:/home/jroth/nama/christmas/.wav/carol_1.wav

        # audio outputs

        -a:3 -o:alsa,default

There is one chain, with label 3. Its input is from a .wav file, and its output goes to the default soundcard. There is an -f argument that describes the audio signal bit width, channel count and sample frequency.

Chains can terminate at an audio file, a sound device or a loop device. A loop device is a buffer stage that can sum the signal from one or more chains. It supplies the sum to another chain (or possibly multiple chains.) A loop device with its sources and an output chain can represent a mixer or bus.

Significantly for our task, a chain can represent an audio clip, part of an audio file that plays at a particular time.

        -a:5 -f:s16_le,2,44100 -i:playat,40,select,80,300,/home/jroth/nama/christmas/.wav/carol_1.wav

The 'select,80,300' object extracts 300 seconds of audio starting at 0:80. The 'playat,40' object plays the audio after a delay of 40 seconds.

Chains also perform effects processing. They can host plugins and plugin controllers called chain operators included in Ecasound and provided by numerous LADSPA and LV2 authors.

section divider

Basic mixer configuration

Each time a change in track status occurs, Nama generates a Chain setup file and commands Ecasound to load it.

Let's start with the mixer configuration when we imported our radio play.

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  MON     Main bus              CH 1/2        100    50
          2  Mixdown               OFF     track Main            --             --    --
          3  carol                 PLAY    carol_1.wav           Main bus      100    50

This is the corresponding chain setup:

        nama christmas carol > chains

        # audio inputs

        -a:1 -f:s16_le,16,44100,i -i:loop,Main_in
        -a:3 -f:s16_le,2,44100 -i:/home/jroth/nama/christmas/.wav/carol_1.wav

        # audio outputs

        -a:1 -o:alsa,default
        -a:3 -f:s16_le,16,44100,i -o:loop,Main_in

Above we have two chains, labeled 1, and 3, corresponding to our tracks 'Main' and 'carol'.

Chain 3 reads carol_1.wav and sends its output to loop device Main_in. Chain 1 read the stream from Main_in and supplies it to the soundcard.

Debug output

By starting Nama with the -L ChainSetup logging option, we can see the steps from Nama's representation of the routing graph to the final chain setup. Initially, tracks in the graph are allowed to connect directly. Then the graph is simplified. Dead branches are removed. Finally, loop devices are introduced between tracks. I only include the graph where it changes.

        [1513] ChainSetup  (L 106) Graph after bus routing:
        wav_in-carol, carol-Main

        [1514] ChainSetup  (L 111) Graph after aux sends:
        [1514] ChainSetup  (L 114) Graph with paths from Main:
        wav_in-carol, carol-Main, Main-soundcard_out

        [1514] ChainSetup  (L 117) Graph with mixdown mods:
        [1514] ChainSetup  (L 216) Graph after simplify_send_routing:
        [1514] ChainSetup  (L 218) Graph after remove_out_of_bounds_tracks:
        [1514] ChainSetup  (L 220) Graph after recursively_remove_inputless_tracks:
        [1515] ChainSetup  (L 222) Graph after recursively_remove_outputless_tracks:
        [1515] ChainSetup  (L 122) Graph after pruning unterminated branches:
        [1515] ChainSetup  (L 126) Graph after adding loop devices:
        wav_in-carol, carol-Main_in, Main_in-Main, Main-soundcard_out

        [1515] ChainSetup  (L 130) Graph with inserts:

So this is our final graph:

        wav_in-carol, carol-Main_in, Main_in-Main, Main-soundcard_out

At this point, one end of each edge is a track. An edge--associating one track and one terminal--is sufficient to write an input or output line in the chain setup.

This happens through dispatch on the type of terminal, and generates IO objects, each used to write one line of the chain setup: a different class for each type of terminal.

        [1515] ChainSetup  (L 360) dispatch: wav_in -> carol
        [1515] ChainSetup  (L 362) name: carol, endpoint: wav_in, direction: input
        [1516] ChainSetup  (L 360) dispatch: Main -> soundcard_out
        [1516] ChainSetup  (L 362) name: Main, endpoint: soundcard_out, direction: output
        [1516] ChainSetup  (L 360) dispatch: carol -> Main_in
        [1516] ChainSetup  (L 362) name: carol, endpoint: Main_in, direction: output
        [1516] ChainSetup  (L 360) dispatch: Main_in -> Main
        [1516] ChainSetup  (L 362) name: Main, endpoint: Main_in, direction: input


        bless( {
          chain_id_ => 3,
          endpoint_ => "wav_in",
          track_ => "carol",
        }, 'Audio::Nama::IO::from_wav' )

        bless( {
          chain_id_ => 1,
          endpoint_ => "soundcard_out",
          track_ => "Main",
        }, 'Audio::Nama::IO::to_alsa_soundcard_device' )

        bless( {
          chain_id_ => 3,
          device_id_ => "loop,Main_in",
          endpoint_ => "Main_in",
          track_ => "carol",
        }, 'Audio::Nama::IO::to_loop' )

        bless( {
          chain_id_ => 1,
          device_id_ => "loop,Main_in",
          endpoint_ => "Main_in",
          track_ => "Main",
        }, 'Audio::Nama::IO::from_loop' )

These classes provide methods that can be overridden by field values annotated on the graph edges or graph nodes, with the edge (chain) values having the highest priority.

Overrides are used for mixdown and anywhere values for an IO object need to differ from those derived from the parent track.

Assembling our audio clips

We'll look at the project after we issued the gather command and entered <mixdown>, ready to render the sequenced clips. The output of track 'Main' feeds track 'Mixdown' which will write Mixdown_1.wav.

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ================================================================================
          1  Main                  MON     Main bus              Mixdown       100    50
          2  Mixdown               REC v1  Main                  --             --    --
          3  carol                 MON     carol sequence        Main bus      100    50

If we show hidden tracks, we see our four clips, each represented as a track. The source for each is the same audio file.

        nama christmas carol > sha # show all tracks

         No. Name       Requested  Status  Source                Destination   Vol   Pan
        ===============================================================================
          1  Main                  MON     Main bus              CH 1/2        100    50
          2  Mixdown               OFF     track Main            --             --    --
          3  carol                 MON     carol sequence        Main bus      100    50
          4  carol-1-carol-v       PLAY    carol_1.wav           carol sequen  100    50
          5  carol-2-carol-v       PLAY    carol_1.wav           carol sequen  100    50
          6  carol-3-carol-v       PLAY    carol_1.wav           carol sequen  100    50
          7  carol-4-carol-v       PLAY    carol_1.wav           carol sequen  100    50

In the chain setup below we see that select and playat objects are used in the inputs to define a region of the audio file and play it with a given delay. The values were generated from our list of marks.

        # audio inputs

        -a:1 -f:s16_le,16,44100,i -i:loop,Main_in
        -a:3 -f:s16_le,16,44100,i -i:loop,carol_in
        -a:4 -f:s16_le,2,44100 -i:select,40,300,/home/jroth/nama/christmas/.wav/carol_1.wav
        -a:5 -f:s16_le,2,44100 -i:playat,300,select,380,300,/home/jroth/nama/christmas/.wav/carol_1.wav
        -a:6 -f:s16_le,2,44100 -i:playat,600,select,720,300,/home/jroth/nama/christmas/.wav/carol_1.wav
        -a:7 -f:s16_le,2,44100 -i:playat,900,select,1060,300,/home/jroth/nama/christmas/.wav/carol_1.wav
        -a:Mixdown -f:s16_le,16,44100,i -i:loop,Main_out

        # audio outputs

        -a:1 -f:s16_le,16,44100,i -o:loop,Main_out
        -a:3 -f:s16_le,16,44100,i -o:loop,Main_in
        -a:4,5,6,7 -f:s16_le,16,44100,i -o:loop,carol_in
        -a:Mixdown -f:s16_le,2,44100,i -o:/home/jroth/nama/christmas/.wav/Mixdown_1.wav

Some overriding was involved, as we see -a:Mixdown, a chain identifier that is not a track number. I think there was some technical reason, but it does stand out.

Thanks for reading!

Showcase

Jeanette Claassen has produced a large number of original songs using Nama in combination with various other music software and hardware. She began recording music using Ecasound and migrated to Nama as it became sufficiently featureful.

Notable dependencies

Gravatar Image This article contributed by: Joel Roth <joelz@pobox.com>