NAME
    soxexam - SoX Examples (CHEAT SHEET)

CONVERSIONS
    Introduction

    In general, SoX will attempt to take an input sound file format and
    convert it into a new file format using a similar data type and sample
    rate.  For instance, "sox monkey.au monkey.wav" would try to convert
    the mono 8000 Hz u-law sample .au file that comes with SoX to an
    8000 Hz u-law .wav file.

    If an output format doesn't support the same data type as the input
    file then SoX will generally select a default data type to save it in.
    You can override the default data type selection by using command line
    options.  This is also useful for producing an output file with higher
    or lower precision data and/or sample rate.

    Most file formats that contain headers can automatically be read in.
    When working with header-less file formats, the user must manually
    tell SoX the data type and sample rate using command line options.

    When working with header-less files (raw files), you may take advantage
    of the pseudo-file types of .ub, .uw, .sb, .sw, .ul, and .sl. By using
    these extensions on your filenames you will not have to specify the
    corresponding options on the command line.

    Precision

    The following data types and formats can be represented by their total
    uncompressed bit precision.  When converting from one data type to
    another, care must be taken to ensure the target has an equal or
    greater precision.  If not, the audio quality will be degraded. This
    is not always a bad thing when you're working with things such as
    voice audio and are concerned about disk space or bandwidth of the
    audio data.

     Data Format     Precision
     _____________   _________
     unsigned byte    8-bit
     signed byte      8-bit
     u-law           14-bit
     A-law           13-bit
     unsigned word   16-bit
     signed word     16-bit
     ADPCM           16-bit
     GSM             16-bit
     unsigned long   32-bit
     signed long     32-bit
     _____________   _________
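
    The loss from converting to a lower precision can be seen with plain
    integer arithmetic. The following is a minimal Python sketch, not
    SoX's own conversion code; the function name requantize is made up
    for this illustration, and it assumes signed integer samples that are
    narrowed by dropping their low-order bits.

```python
def requantize(sample, from_bits, to_bits):
    """Convert a signed integer sample between bit depths by shifting.

    Illustrative assumption: narrowing simply drops the low-order bits,
    widening pads with zero bits.
    """
    if to_bits >= from_bits:
        return sample << (to_bits - from_bits)
    return sample >> (from_bits - to_bits)


original = 12345                          # a signed word (16-bit) sample
narrowed = requantize(original, 16, 8)    # signed word -> signed byte
restored = requantize(narrowed, 8, 16)    # signed byte -> signed word

# The low-order bits dropped while narrowing cannot be recovered.
print(original, narrowed, restored)
```

    Going from a signed byte to a signed word and back is exact, but the
    other direction is not: the restored value differs from the original.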

    Examples

    Use the '-V' option on all your command lines. It makes SoX print out
    its idea of what is going on. '-V' is your friend.

    To convert from unsigned bytes at 8000 Hz to signed words at 8000 Hz:

  sox -r 8000 -c 1 filename.ub newfile.sw

    To convert from Apple's AIFF format to Microsoft's WAV format:

  sox filename.aiff filename.wav

    To convert from mono raw 8000 Hz 8-bit unsigned PCM data to a WAV file:

  sox -r 8000 -u -b -c 1 filename.raw filename.wav

    SoX may even be used to convert sample rates. Downconverting will
    reduce the bandwidth of a sample, but will also reduce the storage
    space it needs on your disk.  All such conversions are lossy and will
    introduce some noise. You should really pass your sample through a
    low-pass filter prior to downconverting, as this will prevent alias
    signals (which would sound like additional noise). For example, to
    convert from a sample recorded at 11025 Hz to a u-law file at 8000 Hz
    sample rate:

  sox infile.wav -t au -r 8000 -U -b -c 1 outputfile.au

    To add a low-pass filter (note use of stdout for output of the first
    stage and stdin for input on the second stage):

  sox infile.wav -t raw -s -w -c 1 - lowpass 3700 |
   sox -t raw -r 11025 -s -w -c 1 - -t au -r 8000 -U -b -c 1 ofile.au
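
    Why the low-pass filter matters can be worked out by hand: any tone
    above half the new sample rate folds back to a lower frequency. The
    Python sketch below uses the standard folding rule for an ideal
    sampler; the function name alias_frequency is made up for this
    illustration and nothing here is SoX code.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after ideal sampling."""
    folded = tone_hz % sample_rate_hz
    # Everything above half the sample rate is mirrored back down.
    return min(folded, sample_rate_hz - folded)


# A file recorded at 11025 Hz can contain tones up to 5512.5 Hz, but an
# 8000 Hz file can only represent tones up to 4000 Hz.
for tone in (3000, 3700, 5000):
    print(tone, "Hz is heard at", alias_frequency(tone, 8000), "Hz")
```

    A 5000 Hz tone left in the signal would land on 3000 Hz in the 8000 Hz
    file, which is why the filter cutoff used above (3700) sits below half
    the new sample rate.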

    If you hear some clicks and pops when converting to u-law or A-law,
    reduce the output level slightly. For example, this will decrease it
    by 20%:

  sox infile.wav -t au -r 8000 -U -b -c 1 -v .8 outputfile.au
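
    A linear volume factor such as .8 can be related to decibels with the
    usual amplitude formula. This is a small Python sketch of that
    arithmetic only; the helper names are made up for this illustration.

```python
import math


def gain_to_db(factor):
    """Linear amplitude factor -> decibels."""
    return 20.0 * math.log10(factor)


def db_to_gain(db):
    """Decibels -> linear amplitude factor."""
    return 10.0 ** (db / 20.0)


# A 20% reduction in amplitude is a change of a little under 2 dB.
print(round(gain_to_db(0.8), 2), "dB")
```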

    SoX is great to use along with other command line programs by passing
    data between the programs using pipelines. The most common example is
    to use mpg123 to convert mp3 files into wav files. The following
    command line will do this:

  mpg123 -b 10000 -s filename.mp3 |
   sox -t raw -r 44100 -s -w -c 2 - filename.wav

    When working with totally unknown audio data, the "auto" file format
    may be of use. It attempts to guess what the file type is, and you
    may then save it into a known audio format.

  sox -V -t auto filename.snd filename.wav

    It is important to understand how the internals of SoX work with
    compressed audio including u-law, A-law, ADPCM, or GSM.  SoX takes ALL
    input data types and converts them to uncompressed 32-bit signed data.
    It will then convert this internal version into the requested output
    format. This means additional noise can be introduced from
    decompressing data and then recompressing.  If applying multiple
    effects to audio data, it is best to save the intermediate data as PCM
    data. After the final effect is performed, then you can specify it as
    a compressed output format. This will keep noise introduction to a
    minimum.
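
    The size of the error added by one trip through a compressed format
    can be estimated with the textbook mu-law curve. The Python sketch
    below is an illustration only: it assumes the continuous mu-law
    formula with 127 levels per sign, not the segmented table a real
    u-law codec uses, and the function names are made up.

```python
import math

MU = 255.0      # standard mu-law constant
LEVELS = 127    # assumption: 127 code levels for each sign


def compress(x):
    """Continuous mu-law curve for a sample in the range -1.0 .. 1.0."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def store_as_ulaw(x):
    """Round a sample to the nearest code and decode it again."""
    code = round(compress(x) * LEVELS)
    return expand(code / LEVELS)


sample = 0.3
stored = store_as_ulaw(sample)
print(sample, stored, abs(stored - sample))
```

    Storing an already stored value again changes nothing, but every
    effect applied in between moves the samples off the code grid, so each
    compressed intermediate file adds a fresh rounding error. Keeping the
    intermediate files as PCM avoids that.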

    The following example applies various effects to an 8000 Hz ADPCM input
    file and ends up with the final file as 44100 Hz ADPCM.

  sox firstfile.wav -r 44100 -s -w secondfile.wav
  sox secondfile.wav thirdfile.wav swap
  sox thirdfile.wav -a -b finalfile.wav mask

    Under a DOS shell, you can convert several audio files to a new output
    format using something similar to the following command line:

  FOR %X IN (*.RAW) DO sox -r 11025 -w -s -t raw %X %X.wav
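
    The same batch idea can be scripted in other environments. The Python
    sketch below only builds and prints the command lines rather than
    running them; the helper names are made up for this illustration, and
    the SoX options are copied unchanged from the DOS example above, so
    they assume the same SoX version as this manual.

```python
def build_convert_command(raw_name):
    """Argument list for converting one raw file, as in the DOS example."""
    return ["sox", "-r", "11025", "-w", "-s", "-t", "raw",
            raw_name, raw_name + ".wav"]


def build_batch(names):
    """One command per file whose name ends in .raw (case-insensitive)."""
    return [build_convert_command(name)
            for name in names
            if name.lower().endswith(".raw")]


# Hypothetical file names, used only to show the generated commands.
for command in build_batch(["DRUMS.RAW", "notes.txt", "voice.raw"]):
    print(" ".join(command))
```

    Each argument list could be handed to a process launcher once the
    printed commands look right.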

EFFECTS
    Special thanks go to Juergen Mueller (jmueller@uia.ua.ac.be) for this
    write up on effects.

    Introduction:

    The core problem is that you need some experience in using effects
    before you can make any old sound file sound absolutely hip with
    them. There isn't any rule-based system that tells you the correct
    setting of all the parameters for every effect. But after some time
    you will become an expert in using effects.

    Here are some examples which can be used with any music sample. (For a
    sample where only a single instrument is playing, extreme parameter
    settings may produce well-known "typical" or "classical" sounds.
    Likewise for drums, vocals or guitars.)

    Single effects will be explained and some given parameter settings that
    can be used to understand the theory by listening to the sound file
    with the added effect.

    Using multiple effects in parallel or in series can result either in a
    very nice sound or (mostly) in a dramatic overload of sound
    variations, such that your ear may follow the sound but you will feel
    unsatisfied. Hence, when using effects for the first time, try to
    combine as few of them as possible. We don't cover the composition of
    effects in the examples because too many combinations are possible and
    you really need a very fast machine and a lot of memory to play them
    in real-time.

    However, real-time playing of sounds will greatly speed up learning
    and/or tuning the parameter settings for your sounds in order to get
    that "perfect" effect.

    Basically, we will use the "play" front-end of SoX since it is easier
    to listen to sounds coming out of the speaker or earphone than to look
    at cryptic data in sound files.

    For easy listening of file.xxx ("xxx" is any sound format):

    play file.xxx effect-name effect-parameters

    Or more SoX-like (for "dsp" output on a UNIX/Linux computer):

    sox file.xxx -t ossdsp -w -s /dev/dsp effect-name effect-parameters

    or (for "au" output):

    sox file.xxx -t sunau -w -s /dev/audio effect-name effect-parameters

    And for data freaks:

    sox file.xxx file.yyy effect-name effect-parameters

    Additional options can be used. However, in this case, for real-time
    playing you'll need a very fast machine.

    Notes:

    I played all examples in real-time on a Pentium 100 with 32 MB and
    Linux 2.0.30 using a self-recorded sample (3:15 min long in "wav"
    format with 44.1 kHz sample rate and stereo 16 bit). The sample should
    not contain any of the effects. However, if you take any recording of a
    sound track from radio or tape or CD, and it sounds like a live concert
    or ten people are playing the same rhythm with their drums or
    funky-grooves, then take any other sample. (Typically, a sample with
    fewer than four different instruments and no synthesizer is suitable.
    Likewise the combination of vocal, drums, bass and guitar.)

    Effects:

    Echo

    An echo effect can be naturally found in the mountains: standing
    somewhere on a mountain and shouting a single word will result in one
    or more repetitions of the word (if not, turn a bit around and try
    again, or climb to the next mountain).

    The time difference between shouting and repeating is the delay
    (time), and the loudness of the repetition is the decay. Multiple
    echos can have different delays and decays.

    It is very popular to use echos to play an instrument together with
    itself, like some guitar players (Brian May from Queen) or vocalists
    are doing. For music samples of more than one instrument, echo can be
    used to add a second sample shortly after the original one.

    This will sound as if you are doubling the number of instruments
    playing in the same sample:

    play file.xxx echo 0.8 0.88 60.0 0.4

    If the delay is very short, then it sounds like a (metallic) robot
    playing music:

    play file.xxx echo 0.8 0.88 6.0 0.4

    Longer delay will sound like an open air concert in the mountains:

    play file.xxx echo 0.8 0.9 1000.0 0.3

    One mountain more, and:

    play file.xxx echo 0.8 0.9 1000.0 0.3 1800.0 0.25
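
    What delay and decay do to the samples can be sketched in a few lines
    of Python. This is an illustration, not SoX's implementation: it
    assumes the parameters above are read as gain-in, gain-out and then
    pairs of delay (in milliseconds) and decay, it works on a plain list
    of numbers, and it does not extend the output to let the last echo
    ring out. The function names are made up.

```python
def ms_to_samples(delay_ms, rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * rate_hz / 1000.0)


def echo(samples, gain_in, gain_out, taps):
    """Mix each input sample with delayed, attenuated copies of the input.

    taps is a list of (delay_in_samples, decay) pairs.
    """
    out = []
    for n, x in enumerate(samples):
        y = gain_in * x
        for delay, decay in taps:
            if n >= delay:
                y += decay * samples[n - delay]
        out.append(gain_out * y)
    return out


# A single click followed by silence makes the echo easy to see.
click = [1.0, 0.0, 0.0, 0.0]
print(echo(click, 0.8, 0.88, [(2, 0.4)]))
print(ms_to_samples(60.0, 44100), "samples for a 60 ms delay at 44.1 kHz")
```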

    Echos

    Like the echo effect, echos stands for "ECHO in Sequel"; that is, the
    first echos takes the input, the second takes the input and the first
    echos, the third takes the input and the first and second echos, and
    so on. Care should be taken when using many echos (see the
    introduction); a single echos has the same effect as a single echo.

    The sample will be bounced twice in symmetric echos:

    play file.xxx echos 0.8 0.7 700.0 0.25 700.0 0.3

    The sample will be bounced twice in asymmetric echos:

    play file.xxx echos 0.8 0.7 700.0 0.25 900.0 0.3

    The sample will sound as if played in a garage:

    play file.xxx echos 0.8 0.7 40.0 0.25 63.0 0.3
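
    The chaining described for echos can be modelled with one delay line
    per stage. The Python sketch below follows the description in this
    section rather than the SoX source, so the real effect may differ in
    detail; it assumes every delay is at least one sample, and the
    function name and parameter layout are made up for this illustration.

```python
def echos(samples, gain_in, gain_out, stages):
    """Sequential echoes: each stage is fed the input plus the previous echo.

    stages is a list of (delay_in_samples, decay) pairs, delays >= 1.
    """
    lines = [[0.0] * delay for delay, _ in stages]   # one FIFO per stage
    out = []
    for x in samples:
        taps = [line[0] for line in lines]           # oldest value per stage
        y = gain_in * x
        for (_, decay), tap in zip(stages, taps):
            y += decay * tap
        out.append(gain_out * y)
        for k, line in enumerate(lines):
            # Stage 0 sees only the input; later stages also see the
            # echo coming out of the stage before them.
            feed = x if k == 0 else x + taps[k - 1]
            line.pop(0)
            line.append(feed)
    return out


click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(echos(click, 1.0, 1.0, [(2, 0.5), (3, 0.25)]))
```

    With a single click as input, the second stage answers twice: once
    for the click itself and once for the first stage's echo of it.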

    Chorus

    The chorus effect has its name because it will often be used to make a
    single vocal sound like a chorus. But it can be applied to other
    instrument samples too.

    It works like the echo effect with a short delay, but the delay isn't
    constant. The delay is varied using a sinusoidal or triangular
    modulation. The modulation depth defines the range over which the
    modulated delay is played before or after the delay. Hence the delayed
    sound will sound slower or faster; that is, the delayed sound is tuned
    around the original one, like in a chorus where some vocals are a bit
    out of tune.

    The typical delay is around 40ms to 60ms, the speed of the modulation
    is best near 0.25Hz and the modulation depth around 2ms.
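
    How the delay moves over time can be sketched as follows. This Python
    fragment takes the description above literally (the delay swings by
    the modulation depth on either side of the base delay); SoX itself may
    map the depth differently, and the function name is made up for this
    illustration.

```python
import math


def modulated_delay_ms(t_s, base_ms, depth_ms, speed_hz, shape="sine"):
    """Delay in milliseconds at time t_s for a sine or triangle modulation."""
    phase = (t_s * speed_hz) % 1.0
    if shape == "sine":
        lfo = math.sin(2.0 * math.pi * phase)
    elif shape == "triangle":
        # 0 at phase 0, +1 at 0.25, 0 at 0.5, -1 at 0.75
        lfo = 1.0 - 4.0 * abs(((phase + 0.25) % 1.0) - 0.5)
    else:
        raise ValueError("shape must be 'sine' or 'triangle'")
    return base_ms + depth_ms * lfo


# 55 ms delay, 2 ms depth, 0.25 Hz modulation: one full swing every 4 s.
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, modulated_delay_ms(t, 55.0, 2.0, 0.25, "triangle"))
```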

    A single delay will make the sample more overloaded:

    play file.xxx chorus 0.7 0.9 55.0 0.4 0.25 2.0 -t

    Two delays of the original samples sound like this:

    play file.xxx chorus 0.6 0.9 50.0 0.4 0.25 2.0 -t 60.0 0.32 0.4 1.3 -s

    A big chorus of the sample is (three additional samples):

    play file.xxx chorus 0.5 0.9 50.0 0.4 0.25 2.0 -t 60.0 0.32 0.4 2.3 -t \
    40.0 0.3 0.3 1.3 -s

    Flanger

    The flanger effect is like the chorus effect, but the delay varies
    between 0ms and a maximum of 5ms. It sounds like wind blowing,
    sometimes faster or slower, including changes of the speed.

    The flanger effect is widely used in funk and soul music, where the
    guitar sound often varies slowly or a bit faster.

    The typical delay is around 3ms to 5ms, the speed of the modulation is
    best near 0.5Hz.

    Now, let's groove the sample:

    play file.xxx flanger 0.6 0.87 3.0 0.9 0.5 -s

    Listen carefully for the difference between sinusoidal and triangular
    modulation:

    play file.xxx flanger 0.6 0.87 3.0 0.9 0.5 -t

    If the decay is a bit lower, then the effect sounds more popular:

    play file.xxx flanger 0.8 0.88 3.0 0.4 0.5 -t

    The drunken loudspeaker system:

    play file.xxx flanger 0.9 0.9 4.0 0.23 1.3 -s

    Reverb

    The reverb effect is often used in audience halls which are too small
    or contain so many visitors that they disturb (dampen) the reflection
    of sound at the walls. Reverb will make the sound be perceived as if
    it were in a large hall. You can try the reverb effect in your
    bathroom, garage or sports hall by loudly shouting some words. You'll
    hear the words reflected from the walls.

    The biggest problem in using the reverb effect is setting the (wall)
    delays correctly, so that the sound is realistic and doesn't sound
    like music playing in a tin can or suffer from overloaded feedback,
    which destroys any illusion of playing in a big hall. To obtain
    realistic reverb effects, first decide how long the reverb should
    last before it is too quiet to be registered by your ears. This is
    done by varying the reverb time "t".  To simulate small halls, use
    200ms. To simulate large halls, use 1000ms. Clearly, the walls of
    such a hall aren't far away, so you define the hall by giving every
    wall its own delay time. However, if a wall is too far away for the
    reverb time, you won't hear its reverb, so the nearest wall will be
    best at "t/4" delay and the farthest at "t/2". You can try other
    distances as well, but it won't sound very realistic. The walls
    shouldn't stand too close to each other, nor at integer multiples of
    each other's distance (so avoid walls like 200.0 and 202.0, or
    something like 100.0 and 200.0).
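
    The rules of thumb above are easy to check mechanically before
    listening. The following Python sketch is only a helper for that; the
    function name is made up, and the 5 ms threshold for "too close" is an
    assumption chosen for illustration, since the text gives no exact
    figure.

```python
def check_wall_delays(reverb_ms, delays_ms, min_gap_ms=5.0):
    """Return a list of warnings for wall delays that break the guidelines."""
    problems = []
    nearest, farthest = reverb_ms / 4.0, reverb_ms / 2.0
    for d in delays_ms:
        if not nearest <= d <= farthest:
            problems.append(f"{d} ms is outside {nearest}..{farthest} ms")
    ordered = sorted(delays_ms)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b - a < min_gap_ms:
                problems.append(f"{a} and {b} ms are too close")
            ratio = b / a
            if abs(ratio - round(ratio)) < 1e-9:
                problems.append(f"{b} ms is an integer multiple of {a} ms")
    return problems


# Wall delays for a reverb time of 600 ms.
print(check_wall_delays(600.0, [180.0, 200.0, 220.0, 240.0, 280.0, 300.0]))
print(check_wall_delays(600.0, [100.0, 200.0]))
```

    An empty list means the delays follow the guidelines; it says nothing
    about whether the hall will sound good.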

    Since audience halls do have a lot of walls, we will start designing
    one beginning with one wall:

    play file.xxx reverb 1.0 600.0 180.0

    One wall more:

    play file.xxx reverb 1.0 600.0 180.0 200.0

    Next two walls:

    play file.xxx reverb 1.0 600.0 180.0 200.0 220.0 240.0

    Now, why not a futuristic hall with six walls:

    play file.xxx reverb 1.0 600.0 180.0 200.0 220.0 240.0 280.0 300.0

    If you run out of machine power or memory, then stop as many
    applications as possible (every interrupt consumes CPU time that
    bigger halls absolutely need).

    Phaser

    The phaser effect is like the flanger effect, but it uses a reverb
    instead of an echo and does phase shifting. You'll hear the difference
    in the examples comparing both effects (simply change the effect name).
    The delay modulation can be sinusoidal or triangular; the latter is
    preferable for multiple instruments. For single instrument sounds, the
    sinusoidal phaser effect will give a sharper phasing effect.  The
    decay shouldn't be too close to 1.0, which will cause dramatic
    feedback. A good range is about 0.1 to 0.5 for the decay.

    We will take a parameter setting as for the flanger before (gain-out is
    lower since feedback can raise the output dramatically):

    play file.xxx phaser 0.8 0.74 3.0 0.4 0.5 -t

    The drunken loudspeaker system (now less alcohol):

    play file.xxx phaser 0.9 0.85 4.0 0.23 1.3 -s

    A popular sound of the sample is as follows:

    play file.xxx phaser 0.89 0.85 1.0 0.24 2.0 -t

    The sample sounds as if ten springs are in your ears:

    play file.xxx phaser 0.6 0.66 3.0 0.6 2.0 -t

    Compander

    The compander effect allows the dynamic range of a signal to be
    compressed or expanded. For most situations, the attack time (response
    to the music getting louder) should be shorter than the decay time
    because our ears are more sensitive to suddenly loud music than to
    suddenly soft music.

    For example, suppose you are  listening to Strauss' "Also Sprach
    Zarathustra" in a noisy environment such as a car. If you turn up the
    volume enough to hear the soft passages over the road noise, the loud
    sections will be too loud. You could try this:

    play file.xxx compand 0.3,1 -90,-90,-70,-70,-60,-20,0,0 -5 0 0.2

    The transfer function ("-90,...") says that very soft sounds between
    -90 and -70 decibels (-90 is about the limit of 16-bit encoding) will
    remain unchanged. That keeps the compander from boosting the volume on
    "silent" passages such as between movements. However, sounds in the
    range -60 decibels to 0 decibels (maximum volume) will be boosted so
    that the 60-dB dynamic range of the original music will be compressed
    3-to-1 into a 20-dB range, which is wide enough to enjoy the music but
    narrow enough to get around the road noise. The -5 dB output gain is
    needed to avoid clipping (the number is inexact, and was derived by
    experimentation).  The 0 for the initial volume will work fine for a
    clip that starts with a bit of silence, and the delay of 0.2 has the
    effect of causing the compander to react a bit more quickly to sudden
    volume changes.
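
    The transfer function can be read as a set of points joined by
    straight lines in decibels. The Python sketch below evaluates it that
    way for a steady level; it ignores attack, decay and delay, assumes
    levels below the first point pass through unchanged, and uses made-up
    names. It is a reading of the numbers above, not SoX's code.

```python
# (input dB, output dB) pairs from the example: -90,-90,-70,-70,-60,-20,0,0
POINTS = [(-90.0, -90.0), (-70.0, -70.0), (-60.0, -20.0), (0.0, 0.0)]


def transfer_db(level_db, points=POINTS):
    """Output level in dB for a steady input level, by linear interpolation."""
    first_in, first_out = points[0]
    if level_db <= first_in:
        # Assumption: below the first point the level is shifted 1:1.
        return first_out + (level_db - first_in)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if level_db <= x1:
            return y0 + (level_db - x0) * (y1 - y0) / (x1 - x0)
    return points[-1][1]


def compand_level(level_db, gain_db=-5.0):
    """Transfer function followed by the output gain from the example."""
    return transfer_db(level_db) + gain_db


for level in (-80.0, -60.0, -30.0, 0.0):
    print(level, "->", transfer_db(level), "dB before the output gain")
```

    Between -60 and 0 decibels every 3 dB of input change becomes 1 dB of
    output change, which is the 3-to-1 compression described above.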

    Changing the Rate of Playback

    You can use stretch to change the rate of playback of an audio sample
    while preserving the pitch. For example, to play at 1/2 the speed:

    play file.wav stretch 2

    To play a file at twice the speed:

    play file.wav stretch .5

    Other related effects are "speed", which changes the speed of playback
    (and changes the pitch accordingly), and "pitch", which alters the
    pitch of a sample. For example, to speed up a sample so it plays in
    1/2 the time (for those Mickey Mouse voices):

    play file.wav speed 2

    To raise the pitch of a sample by one semitone (100 cents):

    play file.wav pitch 100
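
    The numbers used with these effects follow simple arithmetic, sketched
    here in Python. The helper names are made up for this illustration;
    the cents formula is the standard one (1200 cents per octave), and the
    duration helpers just restate the factors used in the examples above.

```python
def cents_to_ratio(cents):
    """Frequency ratio for a pitch change given in cents."""
    return 2.0 ** (cents / 1200.0)


def stretched_duration(duration_s, factor):
    """Playing time after 'stretch factor' (2 plays at 1/2 the speed)."""
    return duration_s * factor


def sped_duration(duration_s, factor):
    """Playing time after 'speed factor' (2 plays in 1/2 the time)."""
    return duration_s / factor


# A 3:15 min sample is 195 seconds long.
print(cents_to_ratio(100))            # frequency ratio of one semitone
print(stretched_duration(195.0, 2))   # seconds after "stretch 2"
print(sped_duration(195.0, 2))        # seconds after "speed 2"
```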

    Other effects (copy, rate, avg, stat, vibro, lowp, highp, band, reverb)

    The other effects are simple to use. However, an "easy to use manual"
    should be given here.

    More effects (to do !)

    There are a lot of effects around like noise gates, compressors,
    wah-wah, stereo effects and so on. They should be implemented, making
    SoX more useful in sound mixing techniques coming together with a
    great variety of different sound effects.

    Combining effects by using them in parallel or serially on different
    channels needs some easy mechanism which is stable for use in
    real-time.

    Really missing are the changing of the parameters and the
    starting/stopping of effects while playing samples in real-time!

    Good luck and have fun with all the effects!

   Juergen Mueller    (jmueller@uia.ua.ac.be)

SEE ALSO
   sox(1), play(1), rec(1)

AUTHOR
    Juergen Mueller   (jmueller@uia.ua.ac.be)

    Updates by Anonymous.