Getting Started with ALSA Audio API


This article attempts to provide some introduction to the ALSA audio API. It is not a complete reference manual for the ALSA API, nor does it cover many of the unique problems that more complex software needs to solve. However, it does attempt to provide skilled programmers who are unfamiliar with the ALSA API with sufficient background and information to write simple programs that use the ALSA API.

Understanding audio interfaces

Let’s start by reviewing the basic design of an audio interface. As an application developer, you don't need to worry about this level of operation - it's all handled by the device driver (which is one of the components provided by ALSA). However, if you want to write efficient and flexible software, you do need to understand what's going on at a conceptual level.

An audio interface is a device that allows a computer to receive data from and send data to the outside world. Inside a computer, audio data is represented by a stream of bits, just like any other kind of data. However, audio interfaces can send or receive audio in the form of an analog signal (a voltage that changes over time) or a digital signal (a stream of bits). In either case, the collection of bits that the computer uses to represent a particular sound will need to be converted before being passed to the outside world. Likewise, the external signal received by the interface will also need to be converted before the computer can use it. These two conversions are the raison d’être of audio interfaces.

There is an area within the audio interface called the "hardware buffer". When an audio signal arrives from the outside world, the interface converts it into a bit stream usable by the computer and stores it in the part of the hardware buffer used to send data to the computer. When it has collected enough data, the interface interrupts the computer to tell it that data is ready. A similar process happens in reverse for data sent from the computer to the outside world: the interface interrupts the computer to tell it that there is space in the hardware buffer, and the computer stores more data there. The interface later converts these bits into whatever form is needed for the outside world, and passes them on. It is important to understand that the interface treats this buffer as a "circular buffer": when it reaches the end of the buffer, it wraps around to the beginning.

There are many variables that need to be configured for this process to work properly. They include:

  • What format should the interface use when converting between a bitstream used by the computer and a signal used by the outside world?
  • At what rate should samples move between the interface and the computer?
  • How much data (and/or space) should accumulate before the device interrupts the computer?
  • How big should the hardware buffer be?

The first two questions are central to controlling audio data quality. The latter two affect the "latency" of the audio signal. This term refers to the delay between:

  1. the time data arrives at the audio interface from the outside world and the time it is available to the computer ("input latency")
  2. the time data is delivered by the computer and the time it is passed to the outside world ("output latency")

Both delays are important to a variety of audio software, although some programs don't need to care about them.

What a typical audio application does

A typical audio application has the following rough structure:

      open_the_device();
      set_the_parameters_of_the_device();
      while (!done) {
           /* one or both of these */
           receive_audio_data_from_the_device();
           deliver_audio_data_to_the_device();
      }
      close_the_device();

The device that receives audio data and the device that sends audio data to the outside world may or may not be the same.

An audio application may only need to send data to the outside world (that is, play audio), only need to receive data from the outside world (that is, record audio), or need to do both.

A minimal playback program

This program opens an audio interface for playback, configuring it as stereo, 16-bit, 44.1 kHz, interleaved, with conventional read/write access. It then delivers a chunk of (uninitialized, hence random) data to it and exits. It represents the simplest possible use of the ALSA audio API and is not a real-world program.

#include <stdio.h>
#include <stdlib.h>
#include <alsa/asoundlib.h>
	      
int main(int argc, char *argv[]) {
  int i;
  int err;

  snd_pcm_t *playback_handle;
  snd_pcm_hw_params_t *hw_params;

  unsigned int sample_rate = 44100;
  static const int32_t kFrameLength = 44100 / 100; /* frames per 10 ms */
  int16_t buf[kFrameLength * 2];                   /* 2 channels, interleaved */

  if ((err = snd_pcm_open(&playback_handle, argv[1], SND_PCM_STREAM_PLAYBACK, 0))
      < 0) {
    fprintf(stderr, "cannot open audio device %s (%s)\n", argv[1],
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_malloc(&hw_params)) < 0) {
    fprintf(stderr, "cannot allocate hardware parameter structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_any(playback_handle, hw_params)) < 0) {
    fprintf(stderr, "cannot initialize hardware parameter structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_access(playback_handle, hw_params,
      SND_PCM_ACCESS_RW_INTERLEAVED)) < 0) {
    fprintf(stderr, "cannot set access type (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_format(playback_handle, hw_params,
      SND_PCM_FORMAT_S16_LE)) < 0) {
    fprintf(stderr, "cannot set sample format (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_rate_near(playback_handle, hw_params, &sample_rate,
      0)) < 0) {
    fprintf(stderr, "cannot set sample rate (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_channels(playback_handle, hw_params, 2))
      < 0) {
    fprintf(stderr, "cannot set channel count (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params(playback_handle, hw_params)) < 0) {
    fprintf(stderr, "cannot set parameters (%s)\n", snd_strerror(err));
    exit(1);
  }

  snd_pcm_hw_params_free(hw_params);

  if ((err = snd_pcm_prepare(playback_handle)) < 0) {
    fprintf(stderr, "cannot prepare audio interface for use (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  for (i = 0; i < 10000; ++i) {
    if ((err = snd_pcm_writei(playback_handle, buf, kFrameLength)) != kFrameLength) {
      fprintf(stderr, "write to audio interface failed (%s)\n",
          snd_strerror(err));
      exit(1);
    }
  }

  snd_pcm_close(playback_handle);
  exit(0);
}

To run this program, pass the name of an audio device as its first argument. On Linux, the available audio interfaces can be discovered through the API provided by udev.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <libudev.h>

/* struct userdata and the helper functions streq(), path_get_card_id()
 * and remove_card() are assumed to be defined elsewhere. */

static const char* retrieve_device_name(struct userdata *u, struct udev_device *dev) {
  const char *path;
  const char *t;

  path = udev_device_get_devpath(dev);
  if (!(t = udev_device_get_property_value(dev, "PULSE_NAME"))) {
    if (!(t = udev_device_get_property_value(dev, "ID_ID"))) {
      if (!(t = udev_device_get_property_value(dev, "ID_PATH"))) {
        t = path_get_card_id(path);
      }
    }
  }

  return t;
}

void process_device(struct userdata *u, struct udev_device *dev) {
  const char *action, *ff;
  if (udev_device_get_property_value(dev, "PULSE_IGNORE")) {
    printf("Ignoring %s, because marked so.\n",
        udev_device_get_devpath(dev));
    return;
  }

  if ((ff = udev_device_get_property_value(dev, "SOUND_CLASS"))
      && streq(ff, "modem")) {
    printf("Ignoring %s, because it is a modem.\n",
        udev_device_get_devpath(dev));
    return;
  }

  action = udev_device_get_action(dev);

  if (action && streq(action, "remove")) {
    remove_card(u, dev);
  } else if ((!action || streq(action, "change"))
      && udev_device_get_property_value(dev, "SOUND_INITIALIZED")) {
    retrieve_device_name(u, dev);
  }
}

static void process_path(struct userdata *u, const char *path) {
  struct udev_device *dev;

  if (!path_get_card_id(path)) {
    return;
  }

  printf("process_path path %s\n", path);
  if (!(dev = udev_device_new_from_syspath(u->udev, path))) {
    printf("Failed to get udev device object from udev.\n");
    return;
  }

  process_device(u, dev);
  udev_device_unref(dev);
}

int setup_udev(struct userdata *u) {
  int fd = -1;
  struct udev_enumerate *enumerate = NULL;
  struct udev_list_entry *item = NULL, *first = NULL;

  if (!(u->udev = udev_new())) {
    printf("Failed to initialize udev library.\n");
    goto fail;
  }
  if (!(u->monitor = udev_monitor_new_from_netlink(u->udev, "udev"))) {
    printf("Failed to initialize monitor.\n");
    goto fail;
  }

  if (udev_monitor_filter_add_match_subsystem_devtype(u->monitor, "sound", NULL)
      < 0) {
    printf("Failed to subscribe to sound devices.\n");
    goto fail;
  }

  errno = 0;
  if (udev_monitor_enable_receiving(u->monitor) < 0) {
    printf("Failed to enable monitor: %s\n", strerror(errno));
    if (errno == EPERM)
      printf("Most likely your kernel is simply too old and "
          "allows only privileged processes to listen to device events. "
          "Please upgrade your kernel to at least 2.6.30.\n");
    goto fail;
  }

  if ((fd = udev_monitor_get_fd(u->monitor)) < 0) {
    printf("Failed to get udev monitor fd.\n");
    goto fail;
  }

  if (!(enumerate = udev_enumerate_new(u->udev))) {
    printf("Failed to initialize udev enumerator.\n");
    goto fail;
  }

  if (udev_enumerate_add_match_subsystem(enumerate, "sound") < 0) {
    printf("Failed to match to subsystem.\n");
    goto fail;
  }

  if (udev_enumerate_scan_devices(enumerate) < 0) {
    printf("Failed to scan for devices.\n");
    goto fail;
  }

  first = udev_enumerate_get_list_entry(enumerate);
  udev_list_entry_foreach(item, first) {
    process_path(u, udev_list_entry_get_name(item));
  }

  udev_enumerate_unref(enumerate);

  return 0;

  fail:
  if (enumerate)
    udev_enumerate_unref(enumerate);
  return -1;
}

The path of an audio device in the sysfs file system can be obtained through udev's API; it looks similar to the following:

/sys/devices/pci0000:00/0000:00:11.0/0000:02:02.0/sound/card0

The device ID can be obtained from the device's path in the sysfs file system: it is the number following "card" in the final component of the path. In this example, the device ID is 0. Given the device ID, information about the current state of the device can be found in the procfs file system. The information related to a specific audio interface is located in the /proc/asound/card[id] directory, where [id] is the device ID. In this example, the relevant information for this audio interface is located in the /proc/asound/card0 directory. The contents of this directory are similar to the following:

$ ls /proc/asound/card0/
audiopci  codec97#0  id  midi0  pcm0c  pcm0p  pcm1p

You can view the status of an audio interface's sub-devices through the /proc/asound/card[id]/pcm[XX]/sub[Y]/status files. For example, when the device is not open, the contents of this file are as follows:

$ cat /proc/asound/card0/pcm0p/sub0/status 
closed

When the audio device is opened by the ALSA program above, the contents of the /proc/asound/card[id]/pcm[XX]/sub[Y]/status file look like this:

$ cat /proc/asound/card0/pcm0p/sub0/status 
state: RUNNING
owner_pid   : 180338
trigger_time: 1079770.211430254
tstamp      : 0.000000000
delay       : 16384
avail       : 0
avail_max   : 15758
-----
hw_ptr      : 129536
appl_ptr    : 145920

Several other files in the /proc/asound/card[id]/pcm[XX]/sub[Y] directory also contain important information about the audio device. For example, the contents of the info file are as follows:

$ cat /proc/asound/card0/pcm0p/sub0/info 
card: 0
device: 0
subdevice: 0
stream: PLAYBACK
id: ES1371/1
name: ES1371 DAC2/ADC
subname: subdevice #0
class: 0
subclass: 0
subdevices_count: 1
subdevices_avail: 0

The contents of the sw_params file are as follows:

$ cat /proc/asound/card0/pcm0p/sub0/sw_params 
tstamp_mode: NONE
period_step: 1
avail_min: 221
start_threshold: 1
stop_threshold: 16384
silence_threshold: 0
silence_size: 0
boundary: 4611686018427387904

The contents of the hw_params file are as follows:

$ cat /proc/asound/card0/pcm0p/sub0/hw_params 
access: RW_INTERLEAVED
format: S16_LE
subformat: STD
channels: 2
rate: 44100 (1445100000/32768)
period_size: 221
buffer_size: 16384

Under the /proc/asound/card[id] directory, among the directories whose names begin with "pcm", those ending in "p" contain information about the playback sub-devices of the audio interface, and those ending in "c" contain information about its capture sub-devices.

The retrieve_device_name() function above gets the device name of the device from the struct udev_device structure; it looks similar to the following:

pci-0000:02:02.0

The device name parameters accepted by the ALSA API are built from device IDs and templates. For example, the templates defined in PulseAudio's pulseaudio/src/modules/alsa/mixer/profile-sets/default.conf include the following:

front:%f
iec958:%f
front:%f
front:%f
surround21:%f
surround40:%f
surround41:%f
surround50:%f
surround51:%f
surround71:%f
iec958:%f
a52:%f
a52:%f
dca:%f
hdmi:%f
hdmi:%f
hdmi:%f
dcahdmi:%f
hdmi:%f,1
hdmi:%f,1
hdmi:%f,1
dcahdmi:%f,1
hdmi:%f,2
hdmi:%f,2
hdmi:%f,2
dcahdmi:%f,2
hdmi:%f,3
hdmi:%f,3
hdmi:%f,3
dcahdmi:%f,3
hdmi:%f,4
hdmi:%f,4
hdmi:%f,4
dcahdmi:%f,4
hdmi:%f,5
hdmi:%f,5
hdmi:%f,5
dcahdmi:%f,5
hdmi:%f,6
hdmi:%f,6
hdmi:%f,6
dcahdmi:%f,6
hdmi:%f,7
hdmi:%f,7
hdmi:%f,7
dcahdmi:%f,7
front:%f
front:%f

Replacing "%f" in a template with the device ID yields the device name parameter accepted by the ALSA API's snd_pcm_open(), such as front:0.

A minimal capture program

This program opens an audio interface for capture, configuring it as stereo, 16-bit, 44.1 kHz, interleaved, with conventional read/write access. It then reads a chunk of data from it and exits. It represents the simplest possible use of the ALSA capture API and is not a real-world program.

#include <stdio.h>
#include <stdlib.h>
#include <alsa/asoundlib.h>
	      
int main(int argc, char *argv[]) {
  int i;
  int err;
  snd_pcm_t *capture_handle;
  snd_pcm_hw_params_t *hw_params;

  unsigned int sample_rate = 44100;
  static const int32_t kFrameLength = 44100 / 100; /* frames per 10 ms */
  int16_t buf[kFrameLength * 2];                   /* 2 channels, interleaved */

  if ((err = snd_pcm_open(&capture_handle, argv[1], SND_PCM_STREAM_CAPTURE, 0))
      < 0) {
    fprintf(stderr, "cannot open audio device %s (%s)\n", argv[1],
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_malloc(&hw_params)) < 0) {
    fprintf(stderr, "cannot allocate hardware parameter structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_any(capture_handle, hw_params)) < 0) {
    fprintf(stderr, "cannot initialize hardware parameter structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_access(capture_handle, hw_params,
      SND_PCM_ACCESS_RW_INTERLEAVED)) < 0) {
    fprintf(stderr, "cannot set access type (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_format(capture_handle, hw_params,
      SND_PCM_FORMAT_S16_LE)) < 0) {
    fprintf(stderr, "cannot set sample format (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_rate_near(capture_handle, hw_params, &sample_rate,
      0)) < 0) {
    fprintf(stderr, "cannot set sample rate (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_channels(capture_handle, hw_params, 2))
      < 0) {
    fprintf(stderr, "cannot set channel count (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params(capture_handle, hw_params)) < 0) {
    fprintf(stderr, "cannot set parameters (%s)\n", snd_strerror(err));
    exit(1);
  }

  snd_pcm_hw_params_free(hw_params);

  if ((err = snd_pcm_prepare(capture_handle)) < 0) {
    fprintf(stderr, "cannot prepare audio interface for use (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  for (i = 0; i < 1000; ++i) {
    if ((err = snd_pcm_readi(capture_handle, buf, kFrameLength)) != kFrameLength) {
      fprintf(stderr, "read from audio interface failed (%s)\n",
          snd_strerror(err));
      exit(1);
    }
  }

  snd_pcm_close(capture_handle);
  exit(0);
}

A minimal interrupt-driven program

This program opens an audio interface for playback, configuring it as stereo, 16-bit, 44.1 kHz, interleaved, with conventional read/write access. It then waits until the interface is ready for playback data, and at that point delivers a chunk of random data to it. This design makes it easy to port your program to systems that rely on a callback-driven mechanism, such as JACK, LADSPA, CoreAudio, VST and many others.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <poll.h>
#include <alsa/asoundlib.h>
	      
snd_pcm_t *playback_handle;

int playback_callback(snd_pcm_sframes_t nframes) {
  int err;
  short buf[4096 * 2]; /* room for up to 4096 interleaved stereo frames */
  printf("playback callback called with %ld frames\n", nframes);

  /* ... fill buf with data ... */

  if ((err = snd_pcm_writei(playback_handle, buf, nframes)) < 0) {
    fprintf(stderr, "write failed (%s)\n", snd_strerror(err));
  }
  return err;
}

int main(int argc, char *argv[]) {
  snd_pcm_hw_params_t *hw_params;
  snd_pcm_sw_params_t *sw_params;
  snd_pcm_sframes_t frames_to_deliver;
  int err;

  unsigned int sample_rate = 44100;
  if ((err = snd_pcm_open(&playback_handle, argv[1], SND_PCM_STREAM_PLAYBACK, 0))
      < 0) {
    fprintf(stderr, "cannot open audio device %s (%s)\n", argv[1],
        snd_strerror(err));
    exit(1);
  }
  fprintf(stderr, "open audio device %s (%p)\n", argv[1],
      playback_handle);

  if ((err = snd_pcm_hw_params_malloc(&hw_params)) < 0) {
    fprintf(stderr, "cannot allocate hardware parameter structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_any(playback_handle, hw_params)) < 0) {
    fprintf(stderr, "cannot initialize hardware parameter structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_access(playback_handle, hw_params,
      SND_PCM_ACCESS_RW_INTERLEAVED)) < 0) {
    fprintf(stderr, "cannot set access type (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_format(playback_handle, hw_params,
      SND_PCM_FORMAT_S16_LE)) < 0) {
    fprintf(stderr, "cannot set sample format (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_rate_near(playback_handle, hw_params,
      &sample_rate, 0)) < 0) {
    fprintf(stderr, "cannot set sample rate (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params_set_channels(playback_handle, hw_params, 2))
      < 0) {
    fprintf(stderr, "cannot set channel count (%s)\n", snd_strerror(err));
    exit(1);
  }

  if ((err = snd_pcm_hw_params(playback_handle, hw_params)) < 0) {
    fprintf(stderr, "cannot set parameters (%s)\n", snd_strerror(err));
    exit(1);
  }

  snd_pcm_hw_params_free(hw_params);

  /* tell ALSA to wake us up whenever 4096 or more frames
   of playback data can be delivered. Also, tell
   ALSA that we'll start the device ourselves.
   */

  if ((err = snd_pcm_sw_params_malloc(&sw_params)) < 0) {
    fprintf(stderr, "cannot allocate software parameters structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }
  if ((err = snd_pcm_sw_params_current(playback_handle, sw_params)) < 0) {
    fprintf(stderr, "cannot initialize software parameters structure (%s)\n",
        snd_strerror(err));
    exit(1);
  }
  if ((err = snd_pcm_sw_params_set_avail_min(playback_handle, sw_params, 4096))
      < 0) {
    fprintf(stderr, "cannot set minimum available count (%s)\n",
        snd_strerror(err));
    exit(1);
  }
  if ((err = snd_pcm_sw_params_set_start_threshold(playback_handle, sw_params,
      0U)) < 0) {
    fprintf(stderr, "cannot set start mode (%s)\n", snd_strerror(err));
    exit(1);
  }
  if ((err = snd_pcm_sw_params(playback_handle, sw_params)) < 0) {
    fprintf(stderr, "cannot set software parameters (%s)\n", snd_strerror(err));
    exit(1);
  }

  /* the interface will interrupt the kernel every 4096 frames, and ALSA
   will wake up this program very soon after that.
   */

  if ((err = snd_pcm_prepare(playback_handle)) < 0) {
    fprintf(stderr, "cannot prepare audio interface for use (%s)\n",
        snd_strerror(err));
    exit(1);
  }

  while (1) {
    /* wait till the interface is ready for data, or 1 second
     has elapsed.
     */
    if ((err = snd_pcm_wait(playback_handle, 1000)) < 0) {
      fprintf(stderr, "poll failed (%s)\n", strerror(errno));
      break;
    }

    /* find out how much space is available for playback data */

    if ((frames_to_deliver = snd_pcm_avail_update(playback_handle)) < 0) {
      if (frames_to_deliver == -EPIPE) {
        fprintf(stderr, "an xrun occurred\n");
        break;
      } else {
        fprintf(stderr, "unknown ALSA avail update return value (%ld)\n",
            frames_to_deliver);
        break;
      }
    }

    frames_to_deliver = frames_to_deliver > 4096 ? 4096 : frames_to_deliver;

    /* deliver the data */
    if (playback_callback(frames_to_deliver) != frames_to_deliver) {
      fprintf(stderr, "playback callback failed\n");
      break;
    }
  }

  snd_pcm_close(playback_handle);
  exit(0);
}

A minimal full-duplex program

Full duplex can be achieved by combining the playback and capture designs shown above. Although many existing Linux audio applications use this design, in the author's opinion it has serious flaws. The interrupt-driven example represents a fundamentally better design for many situations. However, extending it to full duplex is quite complex. That is one more reason why I suggest you forget everything presented here.

Terminology

Capture
Receiving data from the outside world (in contrast to "recording", which implies storing that data somewhere; storing it is not part of the ALSA API).

Playback
Delivering data to the outside world, where it may, though not necessarily, be heard.

Full duplex
Simultaneous capture and playback on the same interface.

xrun
Once the audio interface starts running, it continues to run until told to stop. It produces data for the computer to use and/or sends data from the computer to the outside world. For various reasons, your program may not be able to keep up with it. For playback, this can mean that the interface needs new data from the computer and none is there, forcing it to reuse old data left in the hardware buffer. This is called an "underrun". For capture, the interface may have data to deliver to the computer but nowhere to store it, so it must overwrite part of the hardware buffer that contains data the computer has not yet collected. This is called an "overrun". For simplicity, we use the generic term "xrun" to refer to either condition.

PCM
Pulse Code Modulation. This phrase (and acronym) describes a method of representing analog signals in digital form. It is the method used by almost all computer audio interfaces, and it is used in the ALSA API as a shorthand for "audio".

Channel
A single stream of samples within an audio signal, such as the left or right side of a stereo stream.
Sample
A value describing the amplitude of an audio signal at a single point in time, on a single channel.

Frame
When working with digital audio, we often want to refer to the data representing all channels at a single point in time. This is a collection of samples, one per channel, and is generally called a "frame". When we talk about the passage of time in frames, it is roughly equivalent to measuring it in samples, but more precise; more importantly, when we talk about the amount of data needed to represent all channels at a point in time, the frame is the only meaningful unit. Almost every ALSA audio API function uses frames as its unit of measurement for data.

Interleaved
A data layout arrangement in which samples from each channel played simultaneously follow each other in sequence. See "non-interleaved".

non-interleaved
A data layout in which the samples of a single channel follow each other sequentially; samples from other channels are either in another buffer or in another part of the same buffer. Contrast with "interleaved".

Sample clock
A timing source used to mark the times at which samples should be delivered to and/or received from the outside world. Some audio interfaces allow you to use an external sample clock, either a "word clock" signal (commonly used in many studios) or "auto-sync", which derives a clock from incoming digital data. All audio interfaces have at least one sample clock source present on the interface itself, usually a small crystal oscillator. Some interfaces do not allow the clock rate to be changed, and some interfaces' clocks do not actually run at exactly the rate you would expect (44.1 kHz, etc.). No two sample clocks can be expected to run at exactly the same rate: if you need two streams of samples to stay in sync with each other, they must be driven by the same sample clock.

How to do it…

Opening a device

ALSA separates capture and playback…

Setting parameters

We mentioned above that there are many parameters that need to be set in order for the audio interface to do its job. However, since your program doesn't actually interact directly with the hardware, but with the device driver that controls the hardware, there are actually two different sets of parameters:

Hardware parameters

These are parameters that directly affect the audio interface hardware.

Sample Rate
If the interface has analog I/O, this controls the rate at which A/D and D/A conversion is done. For an all-digital interface, it controls the speed of the clock used to move digital audio data to/from the outside world. On some audio interfaces, other device-specific configuration may mean that your program has no control over this value (for example, when the interface is told to use an external word clock source to determine the sample rate).

Sample Format
This parameter controls the sampling format used to transmit data to and from the interface. It may or may not correspond to a format directly supported by the hardware.

Number of Channels
Hopefully this is self-explanatory.

Data Access and Layout
This parameter controls the way the program transfers data to or from the interface. There are two independent choices, giving four possible settings. The first choice is between the "read/write" model, in which explicit function calls transfer the data, and "mmap mode", in which data is transferred by copying between areas of memory, and API calls are needed only to note when the transfer begins and ends.
The second choice is whether the data layout is interleaved or non-interleaved.

Interrupt Interval
This parameter determines how many interrupts the interface generates per complete traversal of its hardware buffer. It can be set by specifying the number of periods and the period size. Since this determines how many frames of space/data must accumulate before the interface interrupts the computer, it is central to controlling latency.

Buffer Size
This parameter determines how big the hardware buffer is. It can be specified in units of time or number of frames.

Software parameters

These are parameters that control the operation of the device driver rather than the hardware itself. Most programs that use the ALSA audio API will not need to set any of them; some applications may need to set a few.

When to start the device
When you open an audio interface, ALSA ensures that it is not active: no data is being moved to or from its external connectors. Presumably, you want data transfer to begin at some point. There are several options for how this happens.
The control point here is the start threshold, which defines the number of frames of space/data required before the device starts automatically. If set to some non-zero value for playback, it is necessary to prefill the playback buffer before the device will start. If set to zero, the first data written to the device (or the first attempt to read from a capture stream) starts the device.
You can also start the device explicitly using snd_pcm_start, but this requires that the buffer be prefilled in the case of a playback stream. If you attempt to start the stream without doing this, you will get the return code -EPIPE, indicating that there is no data waiting to be delivered to the playback hardware buffer.

What to do during xruns
If an xrun occurs, the device driver can, if requested, take some action to handle it. Options include stopping the device, or silencing all or part of the hardware buffer used for playback.

Stop threshold
If the number of frames of available data/space meets or exceeds this value, the driver stops the interface.

Silence threshold
If the number of frames of available space in a playback stream meets or exceeds this value, the driver fills part of the playback hardware buffer with silence.

Silence size
When the silence threshold is reached, this determines how many frames of silence are written into the playback hardware buffer.

Minimum available space/data for wakeup
Programs that use poll(2) or select(2) to determine when audio data can be transferred to or from the interface can set this to control the point, relative to the state of the hardware buffer, at which they wish to be woken up.

Transfer Block Size
This parameter determines the number of frames used when transferring data to/from the device hardware buffer.

There are some other software parameters, but we don't need to worry about them here.

Why can you forget everything here

In a word: JACK.

In short, it is usually best not to program against a low-level audio API like ALSA directly. The open source community has several very good Linux sound server implementations: JACK, PulseAudio, and PipeWire. The interfaces of these sound servers are simpler and more convenient to use.

References
A Tutorial on Using the ALSA Audio API

PipeWire Late Summer Update 2020

How to Use PipeWire to replace PulseAudio in Ubuntu 22.04

Recording and playback under Linux

Use PipeWire to replace PulseAudio

What’s so good about PipeWire after 200 days of use?

How to use PulseAudio and JACK?

Writing an ALSA driver: PCM Hardware Description

Analysis of Android ALSA Audio System Architecture (1) - - Understanding Audio from Loopback

Origin blog.csdn.net/tq08g2z/article/details/125023630