Basic Usage of the SoX Audio Processing Tool

1 Introduction

SoX can read and write audio files in most popular formats and can optionally apply sound effects to them along the way. It can combine multiple input sources and synthesize audio, and on many systems it can also act as a general-purpose audio player or a multi-track recorder.

On most Linux systems, SoX can be installed directly through the package manager.
On Ubuntu: sudo apt-get install sox
On macOS, with Homebrew: brew install sox

All of SoX's features are available through the single sox command and its options. SoX also provides the play command for playing audio files, the rec command for recording audio, and the soxi command for reading the information stored in an audio file's header.

The basic syntax of these commands is as follows:

SYNOPSIS
       sox [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options] outfile
            [effect [effect-options]] ...
            
       play [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options]
            [effect [effect-options]] ...

       rec [global-options] [format-options] outfile
            [effect [effect-options]] ...
       
       soxi [-V[level]] [-T] [-t|-r|-c|-s|-d|-D|-b|-B|-p|-e|-a] infile1 ...

2. Basic use

2.1 Getting audio file metadata (the file header)

On Linux you can use either soxi or sox --i; on Windows only sox --i is available. Both parse the header of an audio file and print its metadata (for example the number of channels, sample rate, and encoding).

$ soxi Faded.wav

Input File     : 'Faded.wav'
Channels       : 2
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:03:32.63 = 9376836 samples = 15947 CDDA sectors
File Size      : 37.5M
Bit Rate       : 1.41M
Sample Encoding: 16-bit Signed Integer PCM
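
The Duration line reports the same length in three units. As a rough cross-check, the figures can be recomputed with shell arithmetic. This is only a sketch: it assumes that one CDDA sector holds 1/75 s of 44100 Hz audio (588 samples per channel), and the function names are made up for illustration.

```shell
#!/bin/sh
# duration_seconds SAMPLES RATE
# Length in seconds = samples per channel / sample rate (hypothetical helper).
duration_seconds() {
    awk -v s="$1" -v r="$2" 'BEGIN { printf "%.2f\n", s / r }'
}

# cdda_sectors SAMPLES
# Assumption: 588 samples per CDDA sector (44100 / 75).
cdda_sectors() {
    echo $(( $1 / 588 ))
}

# Values copied from the soxi output for Faded.wav.
duration_seconds 9376836 44100   # prints 212.63, i.e. 00:03:32.63
cdda_sectors 9376836             # prints 15947
```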

Passing a specific option to soxi prints only the corresponding field. For example, to show only the bit rate of Faded.wav:

$ soxi -B Faded.wav
1.41M

All the options soxi supports, and their meanings, are as follows:

$ soxi
Usage: soxi [-V[level]] [-T] [-t|-r|-c|-s|-d|-D|-b|-B|-p|-e|-a] infile1 ...

-t  Show detected file-type
-r  Show sample-rate
-c  Show number of channels
-s  Show number of samples (0 if unavailable)
-d  Show duration in hours, minutes and seconds (0 if unavailable)
-D  Show duration in seconds (0 if unavailable)
-b  Show number of bits per sample (0 if not applicable)
-B  Show the bitrate averaged over the whole file (0 if unavailable)
-p  Show estimated sample precision in bits
-e  Show the name of the audio encoding
-a  Show file comments (annotations) if available

With no options, as much information as is available is shown for
each given file.

2.2 Getting audio statistics

Use sox <inputfile> -n stat to obtain statistics for an audio file. For example:

$ sox Faded.wav -n stat
Samples read:          18753672
Length (seconds):    212.626667
Scaled by:         2147483647.0
Maximum amplitude:     0.977417
Minimum amplitude:    -0.977478
Midline amplitude:    -0.000031
Mean    norm:          0.229415
Mean    amplitude:    -0.000006
RMS     amplitude:     0.302594
Maximum delta:         1.765564
Minimum delta:         0.000000
Mean    delta:         0.202369
RMS     delta:         0.273320
Rough   frequency:         6339
Volume adjustment:        1.023

2.3 Playback and recording

The play and rec commands provide basic playback and recording.

  • Play: $ play existing-file.wav
  • Record: $ rec new-file.wav

These are equivalent to the following sox commands:

  • Play: $ sox existing-file.wav -d
  • Record: $ sox -d new-file.wav

Here the -d option stands for the default audio device, which is used for playback or recording when no other device is specified.

In other words:

  • sox existing-file.wav -d reads audio data from existing-file.wav and writes it to -d (the default audio device, here the speakers) for playback;
  • sox -d new-file.wav reads audio data from -d (the default audio device, here the microphone) and writes (records) it to new-file.wav.

In fact, every invocation follows the same basic form, sox <input> <output>, where <input> and <output> can each be either an audio file or an audio device.

While playing or recording you can also apply effects or editing operations. Before applying an effect to a file, you can use the play command to preview the result.

The trim effect cuts a segment out of an audio file; with sox, the segment is written to the specified output file.

With play, trim plays just the specified interval:

$ play foo.wav trim 10.0 5.0
or
$ play foo.wav trim 10.0 =15.0

Both commands play the part of foo.wav between 10 s and 15 s: the first gives a start position and a length, the second a start position and an absolute end position.

Play Faded.wav with the echo effect:

$ play Faded.wav echo 0.8 0.88 200.0 0.4

Faded.wav:

 File Size: 37.5M     Bit Rate: 1.41M
  Encoding: Signed PCM
  Channels: 2 @ 16-bit
Samplerate: 44100Hz
Replaygain: off
  Duration: 00:03:32.63

In:12.1% 00:00:25.82 [00:03:06.81] Out:1.14M [-=====|=====-] Hd:2.7 Clip:0

2.4 Audio Format Conversion

2.4.1 Audio file format properties

An audio data format is described by the following four properties:

  • Sample rate: the number of samples taken per second from the continuous signal when the sound is converted from analog to digital. Audio CDs use 44100 Hz, digital audio tape and many computer systems use 48000 Hz, and professional audio systems often use 96000 Hz.
  • Sample size (precision): the number of bits used to store each audio sample. Today 16-bit samples are the most widely used; 24-bit is used mainly in professional audio.
  • Data encoding: the way each audio sample is represented (encoded). Common encodings include floating point, μ-law, ADPCM, signed-integer PCM, MP3, and FLAC.
  • Channels: the number of audio channels contained in the file. Mono (one channel) and stereo (two channels) are the most common; surround sound typically has six or more channels.

In addition, the bit rate of an audio file measures the storage occupied by the encoded audio signal per unit of time. Its value can depend on all four of the properties above.

MP3-encoded stereo music typically has a bit rate of 128-196 kbps; FLAC-encoded stereo music typically has a bit rate of 550-760 kbps.
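
For uncompressed PCM the bit rate follows directly from the other properties: sample rate times sample size times number of channels. The helper below is a hypothetical illustration, not part of SoX; it reproduces the 1.41M that soxi reports for Faded.wav.

```shell
#!/bin/sh
# pcm_bitrate RATE BITS CHANNELS
# Bit rate of uncompressed PCM in bits per second (hypothetical helper).
pcm_bitrate() {
    echo $(( $1 * $2 * $3 ))
}

# Faded.wav: 44100 Hz, 16-bit, 2 channels.
pcm_bitrate 44100 16 2    # prints 1411200, shown by soxi as 1.41M
```

Compressed encodings such as MP3 and FLAC do not follow this formula, which is why their typical bit rates are much lower.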

2.4.2 Format Conversion

The simplest form of the sox command takes two file names as arguments, for example:

$ sox Faded.wav Faded.mp3

This converts the WAV file Faded.wav to MP3.

When this command runs, SoX reads the audio data from Faded.wav and writes it to Faded.mp3. SoX infers the format of each file from its file name extension and header, and transcodes automatically as it copies the audio data.

SoX can handle both self-describing and raw audio file formats.

Self-describing formats (e.g., WAV, FLAC, MP3) include a header that describes the attributes of the encoded signal; raw (or headerless) formats contain no such information.

So when a raw audio file is the input, its signal and encoding properties must be given through format options on the sox command line.

Common audio format options:

Option                    Description
-b, --bits BITS           Number of bits in each encoded sample
-c, --channels CHANNELS   Number of channels in the audio file
-e, --encoding ENCODING   Encoding type of the audio file
-r, --rate RATE           Sample rate of the audio file
-t, --type FILE-TYPE      Type of the audio file

These options can be applied to either the input or the output file. They are mainly used to describe a raw (headerless) input file, or to specify the format of the output file in a conversion.

$ sox -r 48k -e float -b 32 -c 2 input.raw output.wav

Converts a raw audio file with the given format to WAV.

$ sox Faded.wav Faded.raw

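Converts Faded.wav to a raw audio file.

$ play -r 44100 -b 16 -e signed-integer -c 2 Faded.raw

Plays the raw audio file. The format options must match the original audio (Faded.wav is 44100 Hz, 16-bit signed-integer PCM, 2 channels), because the raw file carries no header.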
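
Because a raw file contains nothing but sample data, its expected size can be estimated from the header information of the source file. The sketch below assumes that Faded.raw keeps the 16-bit, 2-channel format of Faded.wav; raw_bytes is a made-up helper name.

```shell
#!/bin/sh
# raw_bytes SAMPLES CHANNELS BITS
# Size in bytes of headerless PCM data (hypothetical helper).
raw_bytes() {
    echo $(( $1 * $2 * $3 / 8 ))
}

# Faded.wav: 9376836 samples per channel, 2 channels, 16 bits per sample.
raw_bytes 9376836 2 16    # prints 37507344, about the 37.5M soxi reports
```

If the size of the raw file on disk differs noticeably from this estimate, the format options are probably not the ones the file was written with.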

$ sox Faded.wav -c 1 Faded-mono.wav

Converts Faded.wav to mono (-c 1).

3. Audio effects

SoX can apply any number of effects to the audio data it processes.

Use the following command to see help for all effects:

$ sox --help-effect all | less
sox:      SoX v

Effect usage:
allpass frequency width[h|k|q|o]
band [-n] center [width[h|k|q|o]]
bandpass [-c] frequency width[h|k|q|o]
bandreject frequency width[h|k|q|o]
bass gain [frequency(100) [width[s|h|k|q|o]](0.5s)]
bend [-f frame-rate(25)] [-o over-sample(16)] {start,cents,end}

You can also view the usage of a specific effect directly:

$  sox --help-effect echo

sox:      SoX v
Effect usage:
echo gain-in gain-out delay decay [ delay decay ... ]

4. Usage examples by scenario

4.1. Changing the number of channels

The sox command can change the number of channels in an audio file, for example converting mono audio to two channels:

$ sox foo.wav foostereo.wav channels 2
or
$ sox foo.wav -c 2 foostereo.wav

However, this does not create true stereo audio: the mono channel is simply duplicated into two identical channels in the output file.

With the -M option, sox merges separate left and right mono files into one two-channel file:

$ sox -M left.wav right.wav stereo.wav

Conversely, a two-channel file can be averaged down to mono:

$ sox original.wav mono.wav channels 1
or
$ sox original.wav -c 1 mono.wav

The remix effect

The remix effect of sox can also mix channels together or extract individual channels.

Extract each channel of a two-channel file as a separate mono file:

$ sox stereo.wav left.wav remix 1     # extract the left channel
$ sox stereo.wav right.wav remix 2    # extract the right channel

Mix the two channels of a stereo file down to mono:

$ sox stereo.wav mono.wav remix 1,2
or
$ sox stereo.wav mono.wav remix 1-2

remix can also mix channels that come from several input files.

For example, use -M to merge two stereo files into four channels, then use remix to mix those channels in pairs, producing an output file with only two channels:

$ sox -M stereo1.wav stereo2.wav output.wav remix 1,3 2,4

4.2 Changing the volume

The -v option of sox changes the volume by multiplying it by the given factor:

$ sox -v 0.5 foo.wav bar.wav

This scales the volume of foo.wav by a factor of 0.5 and writes the result to bar.wav.

Volume scaling can be combined with the stat effect.

sox foo.wav -n stat -v prints the factor by which the volume of foo.wav can be multiplied to make it as loud as possible without clipping:

$ sox foo.wav -n stat -v 2> vc
$ sox -v `cat vc` foo.wav foo-maxed.wav
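
The factor reported by stat -v appears to be the reciprocal of the largest absolute sample value. This is an assumption, but it is consistent with the Faded.wav statistics in section 2.2, where the maximum and minimum amplitudes are 0.977417 and -0.977478 and the volume adjustment is 1.023. A minimal sketch, with max_gain as a made-up helper name:

```shell
#!/bin/sh
# max_gain MAX_AMPLITUDE MIN_AMPLITUDE
# Largest factor that keeps the peak within full scale,
# assuming it equals 1 / max(|max|, |min|) (hypothetical helper).
max_gain() {
    awk -v hi="$1" -v lo="$2" 'BEGIN {
        if (hi < 0) hi = -hi
        if (lo < 0) lo = -lo
        peak = (hi > lo) ? hi : lo
        printf "%.3f\n", 1 / peak
    }'
}

# Amplitudes taken from the stat output for Faded.wav.
max_gain 0.977417 -0.977478    # prints 1.023
```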

There is also a --norm option for normalizing the audio level. To normalize the audio to -1 dB, pass -1 as its value:

sox --norm=-1 <inputfile> <outputfile>

4.3 Extracting part of a file

The trim effect of sox cuts a segment out of the input audio and writes it to the output file.

trim takes two parameters: the start position of the segment and its duration.

Positions can be given as a number of samples, written as an integer followed by s, or as a time in the form ((hh:)mm:)ss(.fs). A plain number is interpreted as seconds.

$ sox Input.wav Half1.wav trim 0 30:00        # extract the first 30 minutes of the input file
$ sox Input.wav Half2.wav trim 30:00 30:00    # extract the audio from minute 30 to minute 60
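
To see how the time notation relates to seconds and samples, the conversion can be written out by hand. The sketch below uses a made-up helper, to_seconds, and assumes a 44100 Hz sample rate for the sample count, since the rate of Input.wav is not stated.

```shell
#!/bin/sh
# to_seconds TIME
# Convert ((hh:)mm:)ss(.fs) to seconds (hypothetical helper).
to_seconds() {
    echo "$1" | awk -F: '{
        t = 0
        for (i = 1; i <= NF; i++) t = t * 60 + $i
        print t
    }'
}

to_seconds 30:00      # prints 1800
to_seconds 1:00:00    # prints 3600

# The same position as a sample count, assuming a 44100 Hz file.
echo "$(( $(to_seconds 30:00) * 44100 ))s"    # prints 79380000s
```

Under that assumption, trim 0 30:00, trim 0 1800 and trim 0s 79380000s would all select the same segment.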

4.4. Splicing file

The opposite of extracting a segment is concatenation: sox can also join two or more audio files.

$ sox Half1.wav Half2.wav Full.wav

This concatenates Half1.wav and Half2.wav into Full.wav. Note that the input files must agree in properties such as sample rate and number of channels before they are joined.

4.5 Audio synthesis

With the synth effect, sox can synthesize many standard waveforms and types of noise.

$ sox -n sine.wav synth 1.0 sine 1000.0

This synthesizes a 1000 Hz sine wave lasting 1 second and saves it to sine.wav.

synth supports the waveform types sine, square, triangle, sawtooth, trapetz (trapezoidal), exp (exponential), whitenoise, pinknoise, and brownnoise.

4.6 Silence

sox can create silent audio clips: the -n option means there is no input file, and trim specifies the length of silence required.

$ sox -n -r 48000 silence.wav trim 0.0 0.250

This creates silence.wav, 250 ms of silence at a sample rate of 48000 Hz.

4.7 Mixing audio

The -m option of sox mixes two audio files together to produce the output file.

$ sox -m sine100.wav sine250.wav sine100-250.wav

This mixes sine100.wav and sine250.wav into sine100-250.wav.

$ sox -m -v0.5 music.mp3 -v2 speech.wav presentation.wav

This mixes the background music (music.mp3) at half volume with the speech (speech.wav) amplified to twice its volume.

If you are unsure how the mix will sound, you can preview it first with play, using the same parameters:

$ play -m -v0.5 music.mp3 -v2 speech.wav

Note:

Compared with -M, the -m option mixes the channel data: two mono files mixed with -m still produce mono output, and that single output channel contains the content of both inputs.

The -M option merges the files instead and, by default, does not mix any channel data. Two mono files merged with -M therefore produce two-channel output by default, with each output channel corresponding to one of the input files, unless the number of output channels is set manually with the -c option.
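
The difference can be summarized as a rule of thumb for the number of output channels: merging adds the channel counts of the inputs, while mixing keeps the largest one. The helpers below are hypothetical and only model that rule, assuming no -c option or remix effect changes the result afterwards.

```shell
#!/bin/sh
# merge_channels N1 N2 ...
# Output channels for -M: the sum of the input channel counts.
merge_channels() {
    total=0
    for n in "$@"; do
        total=$(( total + n ))
    done
    echo "$total"
}

# mix_channels N1 N2 ...
# Output channels for -m: the largest input channel count.
mix_channels() {
    most=0
    for n in "$@"; do
        if [ "$n" -gt "$most" ]; then
            most=$n
        fi
    done
    echo "$most"
}

merge_channels 1 1    # two mono files with -M: prints 2
mix_channels 1 1      # two mono files with -m: prints 1
merge_channels 2 2    # two stereo files with -M: prints 4
```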

4.8 Changing the playback speed

The stretch effect changes the playback speed of an audio file without changing its pitch. Its argument is the factor by which the duration is multiplied, so 0.5 halves the length.

For example, to play Faded.wav at 2x speed:

$ play Faded.wav stretch 0.5

The speed effect also adjusts the playback speed, but the pitch changes along with it:

$ play Faded.wav speed 2

The pitch effect adjusts the pitch of the audio, in units of cents.

$ play Faded.wav pitch 200

This raises the pitch of Faded.wav by 200 cents, i.e. two semitones (one semitone equals 100 cents).
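
Since an octave (a doubling of frequency) spans 1200 cents, a shift in cents corresponds to a frequency ratio of 2^(cents/1200). The sketch below computes that ratio; cents_to_ratio is a made-up helper name and is not part of SoX.

```shell
#!/bin/sh
# cents_to_ratio CENTS
# Frequency ratio for a pitch shift in cents: 2^(cents / 1200)
# (hypothetical helper).
cents_to_ratio() {
    awk -v c="$1" 'BEGIN { printf "%.4f\n", exp(log(2) * c / 1200) }'
}

cents_to_ratio 200     # prints 1.1225, two semitones up
cents_to_ratio 1200    # prints 2.0000, one octave up
```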


Origin blog.csdn.net/weixin_38819889/article/details/104067720