

This page concerns itself with techniques for reproducing three-dimensional sound images. The focus is on the use of multiple speakers to reproduce complete 'fields' of sound in which the listener can sit or move around. This is rather different to HRTF-based techniques, which rely either on headphones or on a motionless listener facing in the correct direction at the 'sweet spot' of a listening area. Tools are also introduced which construct more conventional mono or stereo recordings.
Ambisonics is a technique developed initially by Michael Gerzon in the early 1970s. It provides a way to encode (sometimes by recording) three-dimensional soundfields. These encoded soundfields can then be reproduced over various different speaker arrangements. This is known as Ambisonic decoding.
Ambisonics, unlike some other 'surround' techniques, is based on solid mathematics. It records an approximation to the complete soundfield, with the Ambisonic 'order' indicating what level of accuracy is in use. (For the mathematicians out there, each order corresponds to an order of spherical harmonic.) Zeroth order corresponds to mono, and first order to B-Format, the prevailing form in use at present. B-Format uses a four-channel encoding which is usually decoded over a square or cube of speakers. Note that the four channels that make up B-Format are not themselves speaker feeds, merely an efficient way to carry the soundfield information.
I've set up a company specialising in Ambisonics and related technologies, Blue Ripple Sound.
"Ambisonics" is a registered trademark of Nimbus Communications International.
The Furse-Malham Set (sometimes known as FuMa or FMH) is a channel
layout for Ambisonics allowing extension to second (9 channel) or
third (16 channel) order. The extra channels allow extra detail to be
captured in the sound image. These formats (along with some "mixed
order" subsets) are supported with tags in the WAVE format, which is
typically indicated using a .amb file suffix.
To produce valid FuMa audio, "pan" using the first four channels for first order (equivalent to traditional B-Format), the first nine channels for second order, or the full set of sixteen for third order and place the corresponding channels of audio in the same sequence within a soundfile.
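The channel selection and ordering described above can be sketched as follows. This is a minimal illustration only: the helper names `channels_for_order` and `interleave` are hypothetical, the channel labels are those used in the table on this page, and the sketch does not write the WAVE tags that a real .amb file needs.

```python
# FuMa channel labels in file order, as listed in the table on this page.
FUMA_CHANNELS = "WXYZRSTUVKLMNOPQ"


def channels_for_order(order):
    """Return the channel labels used for a full first, second or third order set."""
    if order not in (1, 2, 3):
        raise ValueError("this sketch only handles orders 1, 2 and 3")
    # 4 channels at first order, 9 at second, 16 at third.
    return FUMA_CHANNELS[: (order + 1) ** 2]


def interleave(channel_buffers):
    """Interleave per-channel sample lists into one frame-by-frame sample list.

    channel_buffers must already be in FuMa order (W first) and of equal length.
    """
    lengths = {len(buf) for buf in channel_buffers}
    if len(lengths) > 1:
        raise ValueError("all channels must have the same number of samples")
    flat = []
    for frame in zip(*channel_buffers):
        flat.extend(frame)
    return flat


if __name__ == "__main__":
    print(channels_for_order(2))
```

Mixed order subsets are deliberately left out here, since their channel selection is not described on this page.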
| Label | Order | Angle/Elevation Representation | Cartesian Representation |
|---|---|---|---|
| W | 0 | sqrt(1/2) | sqrt(1/2) |
| X | 1 | cos(A)cos(E) | x |
| Y | 1 | sin(A)cos(E) | y |
| Z | 1 | sin(E) | z |
| R | 2 | (1/2)(3sin(E)sin(E)-1) | (1/2)(3zz-1) |
| S | 2 | cos(A)sin(2E) | 2zx |
| T | 2 | sin(A)sin(2E) | 2yz |
| U | 2 | cos(2A)cos(E)cos(E) | xx-yy |
| V | 2 | sin(2A)cos(E)cos(E) | 2xy |
| K | 3 | (1/2)sin(E)(5sin(E)sin(E)-3) | (1/2)z(5zz-3) |
| L | 3 | sqrt(135/256)cos(A)cos(E)(5sin(E)sin(E)-1) | sqrt(135/256)x(5zz-1) |
| M | 3 | sqrt(135/256)sin(A)cos(E)(5sin(E)sin(E)-1) | sqrt(135/256)y(5zz-1) |
| N | 3 | sqrt(27/4)cos(2A)sin(E)cos(E)cos(E) | sqrt(27/4)z(xx-yy) |
| O | 3 | sqrt(27/4)sin(2A)sin(E)cos(E)cos(E) | sqrt(27)xyz |
| P | 3 | cos(3A)cos(E)cos(E)cos(E) | x(xx-3yy) |
| Q | 3 | sin(3A)cos(E)cos(E)cos(E) | y(3xx-yy) |
In the table, sqrt() denotes a square root, A is the angle measured anticlockwise from the front (in the horizontal plane) and E is the elevation from horizontal. If you're using the Cartesian representation, then x*x+y*y+z*z=1 (i.e. these are the direction cosines), with x forwards, y to the left and z upwards. Note that you must use the direction cosines (i.e. a unit vector) here; using the actual sound coordinates will not work.
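As a rough illustration of the encoding equations in the table, here is a minimal Python sketch that computes the sixteen third order gains in both representations. The function names `fuma_gains_angular`, `fuma_gains_cartesian` and `encode` are hypothetical, angles are assumed to be in radians, and normalising the Cartesian input to a unit vector is a convenience added in this sketch.

```python
import math

C_LM = math.sqrt(135.0 / 256.0)  # weight on channels L and M
C_NO = math.sqrt(27.0 / 4.0)     # weight on channels N and O


def fuma_gains_angular(a, e):
    """Gains W..Q for angle a (anticlockwise from front) and elevation e, in radians."""
    ca, sa, ce, se = math.cos(a), math.sin(a), math.cos(e), math.sin(e)
    return [
        math.sqrt(0.5),                             # W
        ca * ce, sa * ce, se,                       # X, Y, Z
        0.5 * (3 * se * se - 1),                    # R
        ca * math.sin(2 * e),                       # S
        sa * math.sin(2 * e),                       # T
        math.cos(2 * a) * ce * ce,                  # U
        math.sin(2 * a) * ce * ce,                  # V
        0.5 * se * (5 * se * se - 3),               # K
        C_LM * ca * ce * (5 * se * se - 1),         # L
        C_LM * sa * ce * (5 * se * se - 1),         # M
        C_NO * math.cos(2 * a) * se * ce * ce,      # N
        C_NO * math.sin(2 * a) * se * ce * ce,      # O
        math.cos(3 * a) * ce ** 3,                  # P
        math.sin(3 * a) * ce ** 3,                  # Q
    ]


def fuma_gains_cartesian(x, y, z):
    """Gains W..Q for a direction (x forwards, y left, z up)."""
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    # Reduce to direction cosines first; the table requires a unit vector.
    x, y, z = x / norm, y / norm, z / norm
    return [
        math.sqrt(0.5),
        x, y, z,
        0.5 * (3 * z * z - 1),
        2 * z * x,
        2 * y * z,
        x * x - y * y,
        2 * x * y,
        0.5 * z * (5 * z * z - 3),
        C_LM * x * (5 * z * z - 1),
        C_LM * y * (5 * z * z - 1),
        C_NO * z * (x * x - y * y),
        math.sqrt(27.0) * x * y * z,
        x * (x * x - 3 * y * y),
        y * (3 * x * x - y * y),
    ]


def encode(sample, gains, order=3):
    """Pan one mono sample into a FuMa frame of the given order (1, 2 or 3)."""
    return [sample * g for g in gains[: (order + 1) ** 2]]
```

For example, `encode(0.5, fuma_gains_angular(math.radians(90), 0.0), order=1)` pans a sample hard left into a four-channel B-Format frame. Comparing the two functions for the same direction is one way to check an implementation against the table.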
Dave Malham also provides a further discussion of this encoding, including its relationship with other formats such as N3D/SN3D, which software typically uses internally for Ambisonic maths.
Note also that incorrect versions of the third order equations have appeared in some software. We've hopefully found most of these now, but please be careful and check the actual encoding equations in use. We've not seen issues at second order.
Speaker decoding equations and other information for the Furse-Malham Set at first and second order are available on this site.
B-Format soundfields can be encoded using Ambisonic Microphones and it may not be long before second order recordings can be made in this way. For the moment, second and third order encodings are most easily generated in software or hardware, live or by batch process. VSpace can be used to generate first or second order recordings using soundfiles and a script language.
The encoded soundfield can be worked with in its own right. Soundfields can be mixed by simple linear combination and rotated by matrix operations. Other manipulations are possible.
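Mixing and rotation can be sketched for the simplest case: a first order (B-Format) frame rotated about the vertical axis. The helper names `encode_first_order`, `mix` and `rotate_z` are hypothetical and angles are assumed to be in radians. Higher orders and other rotation axes need larger matrices, which are not shown here.

```python
import math


def encode_first_order(sample, a, e):
    """Encode a mono sample at angle a and elevation e (radians) as [W, X, Y, Z]."""
    return [
        sample * math.sqrt(0.5),
        sample * math.cos(a) * math.cos(e),
        sample * math.sin(a) * math.cos(e),
        sample * math.sin(e),
    ]


def mix(*frames):
    """Mix soundfield frames by summing them channel by channel."""
    return [sum(channel) for channel in zip(*frames)]


def rotate_z(frame, theta):
    """Rotate a [W, X, Y, Z] frame anticlockwise by theta about the vertical axis.

    W and Z are unaffected; X and Y are transformed by a 2x2 rotation matrix.
    """
    w, x, y, z = frame
    c, s = math.cos(theta), math.sin(theta)
    return [w, c * x - s * y, s * x + c * y, z]


if __name__ == "__main__":
    front = encode_first_order(1.0, 0.0, 0.0)
    left = encode_first_order(0.5, math.radians(90), 0.0)
    scene = mix(front, left)
    # Turn the whole scene a quarter turn anticlockwise.
    print(rotate_z(scene, math.radians(90)))
```

Rotating a frame encoded at angle A by theta should give the same result as encoding directly at A + theta, which makes a convenient sanity check.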
