VSpace Script Language Documentation

Overview

VSpace, MNLib and its documentation are Copyright 1999-2000 Richard W.E. Furse (all rights reserved).

This document provides an exhaustive description of the VSpace script language. For learning purposes it may be easier to start out with an example script and use this document as a reference. An example script, with relevant input files, is available for download.


Command Line

vspace [options] script

VSpace expects to be passed at least a script file on the command line. The syntax of this file is described below.

Options supported are:

-e This option disables the early reflection engine.
-l This option disables the late reflection engine.
-s seconds This option causes VSpace to skip a number of seconds of audio output. This can be useful when working with a large script. This is cumulative with the skip command and has a similar effect (see below).
-v Switch to verbose operation. The application provides additional information about what it is doing. This includes speaker locations and decode matrices.

VSpace Scripts

A VSpace script normally has the file suffix .vsp. The script contains the following sections, in this order:

  1. General Settings (distance unit, tempo, early reflection quality)
  2. Room Declaration (room dimensions, surface reflections, reverb characteristics)
  3. Recording Declarations (locations, microphone types, additional material to mix in)
  4. Tracks (sound files to mix to assemble the track audio, motion of the track audio within the room)

Comments may be placed on any line in the script. Comments are indicated by beginning the line with a hash symbol, for example:

# This is a comment.

Sound Files

VSpace works only with RIFF Wave files at the current time and only at a single sample rate. VSpace is usually built to run at 44.1kHz. If you need to work with other file formats you will need to convert them. SOX is a useful program for this task.

This document holds to the convention of using the file suffix .wav to indicate mono and stereo Wave files, .wxyz for four-channel first-order Ambisonic (B-Format) Wave files and .fmh for nine-channel second-order Ambisonic (Furse-Malham Higher Order Format) Wave files. Ambisonic Wave files cannot be played back correctly by conventional audio players without decoding first. See the MNLib Ambisonic Player and Ambisonic Decoder.

Coordinate System

VSpace uses a standard three-dimensional Cartesian coordinate system. Sounds and microphones are located using vectors of the form <x,y,z>. By convention the three axes are interpreted relative to the origin (<0,0,0>) so that positive X is forwards, negative X is backwards, positive Y is to the left, negative Y is to the right, positive Z is upwards and negative Z is downwards. By default the system uses metres as its unit of distance, so the vector <1,0,0> indicates a location one metre directly ahead of the origin and <3,2,1> indicates a location three metres ahead, two metres to the left and one metre up from the origin. Placing your main microphone at the origin pointing ahead can make these vectors easier to read. The unit of distance can be changed to the foot or yard using the distance unit general setting.

Note that it is possible to interpret the coordinate system in different ways; however, the convention above is assumed by the encodings produced with Ambisonic microphones in this system. If you are only using other microphones you could in principle interpret the axes in other ways, but in the opinion of the author this is likely to cause confusion.


General Settings

Each setting in this section of the script ends in a semicolon. All general settings have default values and are optional, although it is sensible always to include early reflection time and early reflection minimum gain lines, as they can be used to control the quality of processing used during previewing and high-quality sound production.

Distance Unit

distance unit [metre|foot|yard];

This sets the unit which is used for all distances throughout the script. By default the distance unit used by VSpace is the metre.
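As a rough illustration of how the three units relate, the following Python sketch converts a vector written in the script's distance unit into metres. The function name and the conversion table are assumptions made for this example and are not part of VSpace; the factors are the standard definitions of the foot and the yard.

```python
# Metres per script distance unit (standard international definitions).
# This table is an illustration only; it is not taken from VSpace itself.
METRES_PER_UNIT = {"metre": 1.0, "foot": 0.3048, "yard": 0.9144}

def to_metres(vector, unit="metre"):
    """Convert an <x,y,z> vector in the given distance unit to metres."""
    scale = METRES_PER_UNIT[unit]
    return tuple(component * scale for component in vector)

# <3,2,1> in a script that uses "distance unit yard;"
print(to_metres((3, 2, 1), "yard"))
```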

Early Reflection Time

early reflection time time;

Early reflection time is specified in seconds.

The acoustic model in VSpace uses an accurate early reflection model for the first part of the response to a sound in the room. This model is used up to the time specified by this configuration item, after which the late reflection (traditional reverb) model takes over. This time defaults to 0.25. Note that large early reflection times can quickly require a very large number of actual early reflections. Verbose operation will provide feedback on the number of early reflections in use. As more early reflections are introduced the system will run proportionally slower. The number of reflections can be reduced by removing particularly faint reflections using the early reflection minimum gain setting.

Early Reflection Minimum Gain

early reflection minimum gain gain;

The early reflection minimum gain cuts out faint early reflections to save computation time. Higher values cut out more early reflections. Values should be between zero and one.

The gain of a room image is calculated by starting from 1 and multiplying by the reflection value of each wall the sound has been reflected off (see room reflections below). The result is then divided by the distance from the centre of the room image to the centre of the real room. If this gain is greater than the early reflection minimum gain then the reflected sound images in that room image are included in the output of the model. The number of room images in use is reported in verbose mode and has a very direct effect on the amount of time it takes for vspace to run.

This minimum gain can be used in conjunction with early reflection time to produce a significant set of early reflections for a room without having to compute a large number of insignificant early reflections.

The default setting for this minimum is 0.001.
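The gain test described above can be sketched in Python as follows. This is only an illustration of the rule as worded in this section, not the VSpace implementation: the function names are hypothetical, and the position of the example image is worked out by mirroring the default room in its front wall, which is an assumption about how room images are laid out.

```python
import math

def image_gain(wall_reflections, image_centre, room_centre):
    """Gain of one room image, following the description in the text."""
    gain = 1.0
    # Multiply by the reflection value of every wall the sound bounced off.
    for reflection in wall_reflections:
        gain *= reflection
    # Divide by the distance from the image centre to the real room centre.
    return gain / math.dist(image_centre, room_centre)

def image_is_used(wall_reflections, image_centre, room_centre, minimum_gain=0.001):
    """True if the image is louder than the early reflection minimum gain."""
    return image_gain(wall_reflections, image_centre, room_centre) > minimum_gain

# Default room (dimensions -10, 20, -9, 11, -1, 8): centre at <5,1,3.5>.
# Assumed image: one bounce off the front wall (x = 20, reflection 0.5)
# mirrors the room to x = 20..50, giving an image centre of <35,1,3.5>.
room_centre = (5.0, 1.0, 3.5)
front_image_centre = (35.0, 1.0, 3.5)
print(image_gain([0.5], front_image_centre, room_centre))
print(image_is_used([0.5], front_image_centre, room_centre))
```

Raising the minimum gain above the value printed for this image would drop it from the model, which is how higher settings reduce the amount of computation.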

Disable Early Reflections

disable early reflections;

This disables the early reflection engine.

Late Reflection LPF Cutoff Frequency

late reflection cutoff frequency;

During late reflection reverb the signal is processed using a low-pass filter. This option allows the cutoff frequency of this filter to be changed. By default this setting is derived from the shape of the room.

Late Reflection Gain

late reflection gain gain;

This gain modifies the late reflection reverb level of the room. A value of 1 leaves the calculated reverb level unchanged, values above 1 increase the level of the reverb and values below 1 reduce it.

Setting late reflection gain to 0 not only silences the late reflection reverb output but stops the late reflection reverb unit from running at all. This can save time when previewing audio.

Late Reflection Time

late reflection time time;

Late reflection (or reverb) time is automatically calculated based on room characteristics. However late reflection time can be overridden using this setting. The late reflection time is the time taken (in seconds) for the late reflection level to decay by 60dB.
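To give a feel for what a 60dB decay time means, the sketch below estimates the reverb level at a given moment. It assumes the level falls at a steady rate in dB over the late reflection time, which is the usual idealisation and not necessarily how the VSpace reverb unit behaves; the function names are made up for this example.

```python
def late_level_db(seconds, late_reflection_time):
    """Level change in dB after `seconds`, assuming a steady decay of
    60dB over the late reflection time (an idealised assumption)."""
    return -60.0 * seconds / late_reflection_time

def late_gain(seconds, late_reflection_time):
    """The same level change expressed as a linear amplitude gain."""
    return 10.0 ** (late_level_db(seconds, late_reflection_time) / 20.0)

# With "late reflection time 2;" the reverb is 30dB down after one second.
print(late_level_db(1.0, 2.0))
```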

Late Reflection Echo Density

late reflection echo density density;

This allows the echo density to be scaled for the late reflection reverb. Echo density determines the number of late reflections that occur per second. A value of 1 leaves the echo density unchanged at a value approximated for the room. Values above 1 increase the echo density and values below 1 reduce it.

Late Reflection Frequency Density

late reflection frequency density density;

This allows the frequency density to be scaled for the late reflection reverb. Frequency density controls the spacing of late reflections in frequency terms. A value of 1 leaves the frequency density unchanged, values above 1 increase the frequency density and values below 1 reduce it.

Disable Late Reflections

disable late reflections;

This disables the late reflection engine.

Tempo

tempo bpm;

This sets the tempo (in beats per minute) for the entire script. All times in the script are interpreted as beats relative to this tempo. The default tempo is 60bpm, so by default beat values are second values.
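The relationship between beats and seconds is simple arithmetic, sketched here in Python. The function names are only for illustration.

```python
def beats_to_seconds(beats, bpm=60.0):
    """Length of `beats` beats in seconds at the given tempo."""
    return beats * 60.0 / bpm

def seconds_to_beats(seconds, bpm=60.0):
    """Number of beats that fit into `seconds` at the given tempo."""
    return seconds * bpm / 60.0

# With "tempo 120;" a sound starting at beat 8 begins four seconds in.
print(beats_to_seconds(8, bpm=120))  # 4.0
```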

Skip

skip beats;

This causes the processor to skip to a particular point in the script and start producing audio from there. It is designed to save time when working on a later section of a long script. Note that the skip is not precise, particularly for the first couple of seconds of output. This feature is designed for auditioning purposes only.

This command is cumulative with any skip operation set on the command line. Note that the command line option is measured in seconds whereas the skip instruction is measured in beats.
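Because the two skips use different units, it can help to work out the combined amount by hand. The sketch below takes "cumulative" to mean that the two amounts are simply added once both are expressed in seconds; that reading, and the function name, are assumptions made for this example.

```python
def total_skip_seconds(command_line_seconds, skip_beats, bpm=60.0):
    """Combined skip in seconds: the -s option (seconds) plus the
    script's skip instruction (beats, converted using the tempo)."""
    return command_line_seconds + skip_beats * 60.0 / bpm

# "-s 10" on the command line, "tempo 120;" and "skip 16;" in the script.
print(total_skip_seconds(10, 16, bpm=120))  # 18.0
```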

Verbose

verbose;

This enables verbose operation. This results in more information being displayed during audio production. In particular signal peaks are indicated.


Room Declaration

The room declaration block consists of the keyword room followed by a collection of room configuration lines enclosed in curly braces. Each room configuration line is ended with a semicolon. The room declaration is optional as a default room is available.

Room Dimensions

room {
...
dimensions back_x, front_x, right_y, left_y, floor_z, ceiling_z;
...
}

This setting determines the size and shape of the virtual acoustic space. The space is assumed to be a box shape with the origin within the box and with the walls of the box parallel to the three axes. Thus front_x indicates the X value corresponding to the front wall - setting front_x to four indicates that the front wall will be four metres ahead of the origin. By convention the room is assumed to enclose the origin thus front_x, left_y and ceiling_z are all expected to be positive and back_x, right_y and floor_z are all expected to be negative. This convention makes it easier to remember to place all microphones and sound sources within the confines of the room.

The default room dimensions of the acoustic space are:

dimensions -10, 20, -9, 11, -1, 8;

Assuming the normal listener location is at the origin, this places the listener in a hall thirty metres long, ten metres from the back. The hall is twenty metres wide and the listener is slightly to the right of the centre of the hall. The ceiling is eight metres above the listener and the floor a metre below so the room is nine metres high.
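The arithmetic behind this description, together with a check that a location lies inside the room, can be sketched in Python. The function names are hypothetical and the check is only a planning aid; nothing here is part of VSpace.

```python
def room_size(back_x, front_x, right_y, left_y, floor_z, ceiling_z):
    """Length, width and height of the room from a dimensions line."""
    return (front_x - back_x, left_y - right_y, ceiling_z - floor_z)

def inside_room(point, dimensions):
    """True if an <x,y,z> location lies strictly within the room."""
    back_x, front_x, right_y, left_y, floor_z, ceiling_z = dimensions
    x, y, z = point
    return (back_x < x < front_x
            and right_y < y < left_y
            and floor_z < z < ceiling_z)

DEFAULT_DIMENSIONS = (-10, 20, -9, 11, -1, 8)
print(room_size(*DEFAULT_DIMENSIONS))               # (30, 20, 9)
print(inside_room((0, 0, 0), DEFAULT_DIMENSIONS))   # True
print(inside_room((25, 0, 0), DEFAULT_DIMENSIONS))  # False
```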

Room Reflections

room {
...
reflections back, front, right, left, floor, ceiling;
...
}

This setting determines how much audio is reflected off each wall. Values should be between 0 (no reflection) and 1 (complete reflection).

The default room reflection settings for the acoustic space are:

reflections 0.5, 0.5, 0.5, 0.5, 0.2, 0.6;

These settings produce moderate reflections from all walls, slightly higher reflection from the ceiling and little reflection from the floor.


Recording Declarations

A recording declaration block consists of the keyword recording followed by a filename and then configuration lines enclosed in curly braces. Each recording configuration line is ended with a semicolon. At least the recording device, location and direction (where appropriate) must be specified.

A number of recording devices are available. They record sound in the virtual acoustic space and write the recording to the sound file. The number of channels recorded depends on the microphone (or microphones in the case of `cardioid pair') in use.

A typical coincident `stereo pair' recording declaration (facing forward with 90 degree separation) is:

recording stereo.wav {
device cardioid pair;
location <0,0,0>, <0,0,0>;
direction <1,1,0>, <1,-1,0>;
}

A typical Ambisonic recording declaration is:

recording ambisonic.wxyz {
device ambisonic;
location <0,0,0>;
}

Recording Device

recording filename {
...
device device_type;
location location1[, location2];
[direction direction1[, direction2];]
...
}

The device keyword declares the kind of recording device in use. A recording device must be specified. Different recording devices need different location and direction information. Recordings made with Ambisonic microphones can be played back using the MNLib Ambisonic Player. The recording devices available are:

cardioid A single cardioid microphone is used to record a mono sound file. A location and direction are required.
cardioid pair Two cardioid microphones are used to record a stereo sound file. Two locations and directions are required. See above for an example.
omnidirectional An omnidirectional microphone is used to record a mono sound file. A location is required.
simple omnidirectional An omnidirectional microphone is used to record a mono sound file. A location is required. This microphone is a cut-down version of the normal omnidirectional microphone. It operates faster than any other microphone. No delay lines are used in signal generation so no early reflections or Doppler shift is induced. This microphone is included for preview purposes during composition.
figure-of-eight A figure-of-eight microphone is used to record a mono sound file. A location and direction are required.
ambisonic This microphone is used to record a four-channel first-order (B-Format) Ambisonic sound file. These files are normally suffixed .wxyz. A location is required. See also the `simple ambisonic', `second order ambisonic' and `simple second order ambisonic' microphones.
second order ambisonic This microphone is used to record a nine-channel second-order Ambisonic sound file. These files are normally suffixed .fmh. A location is required. See also the `ambisonic', `simple ambisonic' and `simple second order ambisonic' microphones.
simple ambisonic This microphone is used to record a four-channel first-order (B-Format) Ambisonic sound file. A location is required. This microphone is a cut-down version of the `ambisonic' microphone. It operates much more quickly. No delay lines are used in signal generation so no early reflections or Doppler shift is induced. This microphone is intended for preview purposes only during composition.
simple second order ambisonic This microphone is used to record a nine-channel second-order Ambisonic sound file. A location is required. This microphone is a cut-down version of the normal `second order ambisonic' microphone. It operates much more quickly. No delay lines are used in signal generation so no early reflections or Doppler shift is induced. This microphone is intended for preview purposes only during composition.

Note that there is no dummy head implementation in this version of MN. This is because I don't have a suitable set of HRTFs to work from. I played with the MIT Kemar set of impulse responses with the MLib3 audio processing library but though reasonably effective, they colour the audio too much to be musically useful (IMHO). If you have something better, please let me know and if I like them I'll reimplement the dummy head recorder in MNLib and VSpace.

Note that directions are specified using a vector in the relevant direction. The magnitude of this vector is ignored.
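Since only the direction of the vector matters, <1,1,0> and <2,2,0> describe the same microphone orientation. The Python sketch below normalises a direction vector and measures the angle between two directions, which confirms the 90 degree separation of the stereo pair example earlier in this section. The helper names are made up for this illustration.

```python
import math

def unit(vector):
    """Scale a direction vector to length 1, keeping its direction."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)

def angle_degrees(a, b):
    """Angle between two direction vectors, in degrees."""
    dot = sum(p * q for p, q in zip(unit(a), unit(b)))
    # Clamp to guard against rounding just outside the range of acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

# The two directions used in the `cardioid pair' example.
print(angle_degrees((1, 1, 0), (1, -1, 0)))
```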

Simple Microphones and Early Reflections

When using simple microphones it is usually a good idea to turn off early reflections. This is because simple microphones do not use delay lines, so the early reflections arrive simultaneously. The early reflections will therefore take extra time to compute and will merely confuse the gain of the original sound. The gain usually decreases as immediate early reflections are in antiphase to the direct signal.

Core Radius

recording filename {
...
core radius radius;
...
}

Core radius settings can be used to break the laws of physics to arrange stable microphone behaviour when sources are very close to the microphone.

This sets the radius of the `core' sphere. The core sphere is the volume in which the inverse square law is not applied. For an Ambisonic microphone the gain of the sound is smoothly limited and directional information is made vague to make transitions across the core sphere more continuous. For other microphones, the gain simply remains constant within the core radius.

The default core radius is 1m for Ambisonic microphones and 1cm for other microphones.
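For microphones other than the Ambisonic ones, the effect of the core radius can be pictured with the Python sketch below. It rests on two assumptions that this document does not spell out: that the amplitude gain outside the core is 1/distance (unity at one metre), and that the constant gain inside the core equals the value at the edge of the core. The smooth limiting used by the Ambisonic microphones is not modelled, and the function name is hypothetical.

```python
def distance_gain(distance, core_radius=0.01):
    """Assumed amplitude gain for a source `distance` metres from a
    non-Ambisonic microphone: 1/distance outside the core sphere,
    held constant at the edge value inside it."""
    return 1.0 / max(distance, core_radius)

print(distance_gain(2.0))    # 0.5
print(distance_gain(0.002) == distance_gain(0.01))  # True: inside the core
```

Without the core radius the gain would grow without limit as a source passes through the microphone position.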

Recording Gain

recording filename {
...
gain gain;
...
}

This allows the recording level to be changed to make full use of the S/N ratio available from the microphone and sound file format.

When the recording level is set to its default value (a gain of 1) sounds will pass through VSpace without a level change if they are 1m from the microphone and no reverb is in use. Note that changing the distance will affect the recorded level as will using reverb or changing the gain on the sound source.


Tracks

Any number of tracks may be included in a script, but large numbers will slow the program down. Each track has the following form:

[mute] track {
mix {
...mix lines...
}
motion {
...motion lines...
}
[direction {
...direction lines...
}]
}

Mix lines specify which mono sound files are to be played back to make up the track's audio, and when and how loudly they are to be played. This section determines the mono track to be spatialised. The motion lines determine where the track is within the room and how it moves as time progresses. Sound sources can be made directional by the addition of the direction section above.

For instance, to play a sound file `input.wav' back one metre to the right of the origin the following track description could be used:

track {
mix { 0 input.wav; }
motion { 0 fixed <0,-1,0>; }
}

More complex constructs are often required.

Inclusion of the mute keyword excludes the entire track from playback.

Mix Lines

start_beat sound_filename [gain gain] [from start_point] [to end_point];

This specifies that the requested mono sound file will be played back starting at the start beat. Use of the gain keyword allows a gain to be set for the sound playback. By default the whole of the mono sound file will be played back, but the section of it to be played can be selected by using the from and to keywords. These specify the start and end points of the audio to be played back in beats.

Note that although gain, from and to are optional, if they appear they must appear in the above order.
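The ordering rule can be made concrete with a small Python sketch that recognises a mix line using a regular expression. This is not the VSpace parser. It assumes file names contain no spaces and that all numbers can be read as ordinary decimal values, and the field names in the result are invented for this example.

```python
import re

# start_beat sound_filename [gain gain] [from start_point] [to end_point];
MIX_LINE = re.compile(
    r"^\s*(?P<start_beat>\S+)\s+(?P<filename>\S+)"
    r"(?:\s+gain\s+(?P<gain>\S+))?"
    r"(?:\s+from\s+(?P<from_point>\S+))?"
    r"(?:\s+to\s+(?P<to_point>\S+))?"
    r"\s*;\s*$"
)

def parse_mix_line(line):
    """Split a mix line into its parts, rejecting out-of-order keywords."""
    match = MIX_LINE.match(line)
    if match is None:
        raise ValueError("not a valid mix line: " + line)
    fields = match.groupdict()
    result = {"start_beat": float(fields["start_beat"]),
              "filename": fields["filename"]}
    # Optional parts are only reported when present in the line.
    for key in ("gain", "from_point", "to_point"):
        if fields[key] is not None:
            result[key] = float(fields[key])
    return result

print(parse_mix_line("0 input.wav;"))
# {'start_beat': 0.0, 'filename': 'input.wav'}
```

A line such as `4 drum.wav to 3 gain 0.5;` is rejected by this sketch because to appears before gain.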

Motion Lines

Fixed location

beat fixed location_vector;

This line causes the sound to jump to the specified location instantly at the specified beat.

Linear Motion

start_beat to end_beat line start_vector to end_vector;

This instruction causes the sound to move between the two points over the time period leaving the sound at the end point when the end time is reached.
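A minimal Python sketch of this kind of motion is given below. It assumes the sound moves at constant speed along the line, which the description above suggests but does not state outright, and the function name is hypothetical.

```python
def line_position(beat, start_beat, end_beat, start_vector, end_vector):
    """Position of a sound on a line instruction at the given beat."""
    if beat < start_beat:
        raise ValueError("the line has not started yet at this beat")
    # Fraction of the journey completed, held at 1 once the end beat
    # has passed so the sound stays at the end point.
    fraction = min(1.0, (beat - start_beat) / (end_beat - start_beat))
    return tuple(s + fraction * (e - s)
                 for s, e in zip(start_vector, end_vector))

# Halfway through "0 to 4 line <4,0,2> to <-4,0,4>;"
print(line_position(2, 0, 4, (4, 0, 2), (-4, 0, 4)))  # (0.0, 0.0, 3.0)
```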

Arcs and Circular Motion

start_beat to end_beat arc centre centre_vector start start_vector to to_vector in to_time;

This instruction causes the sound to move steadily in a circular arc during the time period between the start and end beats. The sound starts at the start_vector and moves in a circle around the centre_vector. The direction and speed of rotation is determined by the to_vector and to_time values chosen. The to_vector is a vector on the edge of the circle. The to_time is the amount of time taken for the sound to move from the start_vector to the to_vector using the shortest path possible on the circle. Generally these two vectors are chosen to be at right-angles to each other. For instance:

0 to 16 arc centre <0,0,1> start <4,0,1> to <0,4,1> in 2;

Assuming the listener is at the origin looking along the X axis then this produces a horizontal circular path centred one metre above the listener. The sound starts four metres ahead (and one metre up) and moves around in the first two beats to a point four metres to the left (and one metre up). For the rest of the sixteen beats the sound will continue on this path forming two complete circles.

If the to_vector chosen is not the same distance from the centre of the circle as the start_vector then it will be scaled until it is.
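The geometry of the arc instruction can be sketched in Python as follows. This is one reading of the description above, not the VSpace source: the sound rotates at constant angular speed in the plane containing the centre, the start_vector and the to_vector, and the to_vector is scaled onto the circle first. The function name is hypothetical.

```python
import math

def arc_position(beat, start_beat, centre, start_vector, to_vector, to_time):
    """Position on an arc instruction at the given beat."""
    s = [a - c for a, c in zip(start_vector, centre)]
    t = [a - c for a, c in zip(to_vector, centre)]
    radius = math.sqrt(sum(x * x for x in s))
    t_length = math.sqrt(sum(x * x for x in t))
    u = [x / radius for x in s]          # unit vector towards the start
    t_unit = [x / t_length for x in t]   # to_vector scaled onto the circle
    cos_theta = max(-1.0, min(1.0, sum(p * q for p, q in zip(u, t_unit))))
    theta = math.acos(cos_theta)         # shortest angle from start to "to"
    # Unit vector at right angles to u, in the plane of the circle.
    w = [q - cos_theta * p for p, q in zip(u, t_unit)]
    w_length = math.sqrt(sum(x * x for x in w))
    if w_length == 0.0:
        raise ValueError("start and to vectors do not define a plane")
    v = [x / w_length for x in w]
    # theta is covered in to_time beats, so rotate in proportion.
    angle = theta * (beat - start_beat) / to_time
    return tuple(c + radius * (math.cos(angle) * p + math.sin(angle) * q)
                 for c, p, q in zip(centre, u, v))

# "0 to 16 arc centre <0,0,1> start <4,0,1> to <0,4,1> in 2;"
for beat in (0, 2, 4, 8):
    print(beat, arc_position(beat, 0, (0, 0, 1), (4, 0, 1), (0, 4, 1), 2))
```

Up to rounding error, the positions printed for beats 0 and 8 coincide, showing that one full circle takes eight beats in this example.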

Direction Lines

Direction lines have much the same format as motion lines with the addition of the magnitude construct (see below); however, the vector path generated is not used to guide the sound's location. Instead it is interpreted as the direction the sound source is pointing, with its magnitude determining the cardioid behaviour of the sound source: a magnitude of 0 (the default when no directionality is specified) indicates that a sound is not directional, and a vector magnitude of 1 indicates that the source will demonstrate a cardioid response. Values above 1 will produce an even louder `front' effect and the beginnings of an anti-phase effect at the back of the source. Note that direction lines can use the fixed, line and arc constructs.

The magnitude construct can be used after a vector (in fact any vector in VSpace) to change its magnitude while keeping its direction. For instance <21,28,0> magnitude 5 will produce vector <3,4,0>. This can be used to help translate a set of motion lines to produce a set of directions depending on the direction of the sound's motion:

track {
mix {
0 track_audio.wav;
}
motion {
0 to 4 line <4,0,2> to <-4,0,4>;
9 to 10 line <-4,0,4> to <4,0,2>;
}
direction {
0 fixed <-8,0,2> magnitude 0.5;
9 fixed <8,0,-2> magnitude 0.5;
}
}

This produces a weak cardioid shape pointing in the direction of the most recent motion. Note that the direction of the sound source will change abruptly at beat 9.
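The arithmetic of the magnitude construct, and the way the direction vectors in the example follow from its motion lines, can be checked with a short Python sketch. The function names are invented for this illustration.

```python
import math

def with_magnitude(vector, magnitude):
    """Rescale a vector to the given magnitude, keeping its direction."""
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0.0:
        raise ValueError("a zero vector has no direction")
    return tuple(c * magnitude / length for c in vector)

def direction_of_motion(start_vector, end_vector, magnitude):
    """Direction vector pointing along a line motion."""
    travel = tuple(e - s for s, e in zip(start_vector, end_vector))
    return with_magnitude(travel, magnitude)

print(with_magnitude((21, 28, 0), 5))  # (3.0, 4.0, 0.0)

# The first motion line runs from <4,0,2> to <-4,0,4>, so the sound
# travels along <-8,0,2>, the vector used in the first direction line.
print(direction_of_motion((4, 0, 2), (-4, 0, 4), 0.5))
```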

Direction lines can be confusing, but they can produce useful effects. For instance, playing a sound pointing away from the recording device can bring out the early reflections in use and produce a more reverberant effect, as in a real hall.


Links

The author Richard Furse can be emailed as richard@muse.demon.co.uk.

"Ambisonics" is a registered trademark of Nimbus Communications International.