Controlling Amplitude and Loudness (Learn Web Audio from the Ground Up, Part 3)

Posted on Tuesday Aug 30, 2016 by Tero Parviainen (@teropa)

In the previous article we discussed how we can change the frequencies of sounds to alter their pitch. Now we're going to talk about stretching sound waves along another axis, to change their amplitudes. This affects their loudness.

This post is part of a series I'm writing about making sounds and music with the Web Audio API. It is written for JavaScript developers who don't necessarily have any background in music or audio engineering.

Part 0: What Is the Web Audio API?
Part 1: Signals and Sine Waves
Part 2: Controlling Frequency and Pitch
Part 3: Controlling Amplitude and Loudness
Part 4: Additive Synthesis And the Harmonic Series

The amplitude of an audio signal is a measure of how high (and low) the wave extends from the x axis. For the sine waves we've been working with so far, this has been 1, since 1 (and -1) has been the maximum value these waves take.

Changing the amplitude of a signal is straightforward: we just need to multiply each sample by some constant number. In the case of sine waves, if what we want is a wave with amplitude a, our wave function becomes y = a * Math.sin(x). It's like changing the "size" of the oscillator that produces the wave.

More precisely, what we're talking about here is the peak amplitude of the sound wave, which measures how far from the x axis the highest points of the signal are. There are many other measures of amplitude as well. All of them measure the "height" of the signal, but they do it in different ways.
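As a rough sketch of the idea in plain JavaScript (no Web Audio involved), we can generate one cycle of a sine wave, scale it by an amplitude, and measure the result. The names sineCycle, peakAmplitude, and rmsAmplitude are made up for this illustration. RMS (root mean square) is one of those other common measures of amplitude; for a sine wave it comes out as the peak divided by the square root of 2.

```javascript
// Generate one full cycle of a sine wave scaled to the given amplitude.
function sineCycle(amplitude, sampleCount) {
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    const x = (2 * Math.PI * i) / sampleCount;
    samples[i] = amplitude * Math.sin(x);
  }
  return samples;
}

// Peak amplitude: the largest distance from the x axis.
function peakAmplitude(samples) {
  let peak = 0;
  for (const s of samples) {
    peak = Math.max(peak, Math.abs(s));
  }
  return peak;
}

// RMS amplitude: the square root of the mean of the squared samples.
function rmsAmplitude(samples) {
  let sumOfSquares = 0;
  for (const s of samples) {
    sumOfSquares += s * s;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

const halfWave = sineCycle(0.5, 1024);
console.log(peakAmplitude(halfWave)); // 0.5
console.log(rmsAmplitude(halfWave));  // close to 0.5 / Math.sqrt(2)
```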

When we listen to a sound signal, the amplitude controls how loud we perceive the signal to be. Larger amplitude means louder sound. Zero amplitude means no sound.

How Do I Set Signal Amplitude in Web Audio?

A Web Audio sine wave oscillator always produces values between 1 and -1. This peak amplitude of 1 is constant and cannot be changed.

Instead, we use a separate gain control to modify the amplitude. There is a Web Audio node called GainNode exactly for this purpose.

To get a sine wave with half the amplitude of the default, we can connect an OscillatorNode to a GainNode whose gain parameter is set to 0.5. The result is a sine wave that oscillates between -1 * 0.5 = -0.5 and 1 * 0.5 = 0.5:

let audioCtx = new AudioContext();

// Create nodes
let osc = audioCtx.createOscillator();
let gain = audioCtx.createGain();

// Set parameters
osc.frequency.value = 440;
gain.gain.value = 0.5;

// Connect graph
osc.connect(gain);
gain.connect(audioCtx.destination);

// Schedule start and stop
osc.start();
osc.stop(audioCtx.currentTime + 2);

What we have built here is a Web Audio graph with three nodes in it. The audio signal is generated by the oscillator, and flows through the gain before reaching the destination. The gain alters the signal by halving its amplitude.

What Are the Limits of Amplitude and What Happens If I Exceed Them?

In theory, a sound wave can have any amplitude, and we can boost a wave by multiplying its samples with any arbitrarily large number. But in practice the Web Audio API limits amplitude to a certain threshold, as most digital audio systems do.

In Web Audio, all signals that reach the destination node should be between -1 and 1. [-1, 1] is the range inside which all of our audio signals should fall.

This means that the sine wave coming from an OscillatorNode is already at the maximum amplitude and we should only ever attenuate it, not intensify it.

But if you do send signals beyond the [-1, 1] limit to the destination, you may start to hear clipping. This is the result of the sine wave's peaks being squared off when they fall beyond the supported range. The resulting waveform is very different from the original and also sounds completely different.

I cheated while making this demo. Current versions of Chrome and Firefox do not actually clip the signal beyond [-1, 1], but instead seem to play a compressed version of the original signal, whatever its peak amplitude may be. Safari, on the other hand, does clip the signal. In order to demonstrate the clipping effect consistently across browsers, I ran the signal through a ScriptProcessorNode.

The Web Audio spec leaves the handling of values outside the [-1, 1] range undefined, and apparently different browser vendors have chosen different paths. Some have clipping, some don't. This alone is a good reason to always try to make sure your final audio signal does not go beyond the specified range.
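To get a feel for what clipping does to the sample values, here is a small sketch in plain JavaScript. It is an illustration only, not the ScriptProcessorNode code behind the demo, and hardClip is a made-up helper name. It simply clamps each sample into [-1, 1], which is the squaring-off described earlier.

```javascript
// Clamp every sample into the [-1, 1] range.
function hardClip(samples) {
  return samples.map(s => Math.max(-1, Math.min(1, s)));
}

// Eight samples of one sine wave cycle, boosted to a peak amplitude of 2.
const boosted = [];
for (let i = 0; i < 8; i++) {
  boosted.push(2 * Math.sin((2 * Math.PI * i) / 8));
}

const clipped = hardClip(boosted);
console.log(boosted.map(s => s.toFixed(2)));
console.log(clipped.map(s => s.toFixed(2)));
```

In the clipped output, the three samples around each peak all sit at exactly 1 (or -1). The rounded top of the sine wave has become a flat plateau.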

How Do I Control Amplitude Changes Over Time?

In the previous article we discussed different ways of changing the frequency parameter of an OscillatorNode over time in order to play different pitches. Everything we discussed there also applies to controlling the gain parameter of a GainNode over time. We can set it to specific values at specific moments and ramp it up or down.

The similarity between the OscillatorNode.frequency and GainNode.gain APIs comes from the fact that they're actually both instances of the same Web Audio interface: AudioParam. This is a powerful abstraction that allows controlling different kinds of values with the exact same API.

So, we can set the gain using setValueAtTime:

let audioCtx = new AudioContext();

// Create nodes
let osc = audioCtx.createOscillator();
let gain = audioCtx.createGain();

// Set parameters
osc.frequency.value = 440;
gain.gain.value = 1;

// Connect graph
osc.connect(gain);
gain.connect(audioCtx.destination);

// Schedule start, change, and stop
osc.start();
gain.gain.setValueAtTime(0.5, 1);
osc.stop(audioCtx.currentTime + 2);

The problem with controlling gain this way, though, is that you often hear a very audible "pop" sound at the moment of the amplitude change. That's usually not what you want, but it's a direct consequence of the fact that there's a discontinuity in the sound wave when the amplitude changes. The wave instantaneously jumps from one y value to a different one.

This behavior also seems to differ between browser implementations. I hear a much more pronounced "pop" on Firefox than I do on Chrome. In any case it's better to avoid instant changes in gain because they may present a problem.

For this reason, we want to use a ramp even for loudness changes that should sound instantaneous. If we use a ramp duration of something like 30 milliseconds, we will eliminate the pops but still get a seemingly instant change, which is exactly what we want:

let audioCtx = new AudioContext();

// Create nodes
let osc = audioCtx.createOscillator();
let gain = audioCtx.createGain();

// Set parameters
osc.frequency.value = 440;
gain.gain.value = 1;

// Schedule parameter changes
// A ramp from 1 to 0.5 between times 0.97 and 1
gain.gain.setValueAtTime(1, 1 - 0.03);
gain.gain.linearRampToValueAtTime(0.5, 1);

// Connect graph
osc.connect(gain);
gain.connect(audioCtx.destination);

// Schedule start and stop
osc.start();
osc.stop(audioCtx.currentTime + 2);
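If you find yourself repeating this pattern, it could be wrapped in a small helper. The sketch below is one possible shape for it and not part of the Web Audio API: rampTo and its arguments are made-up names, and it assumes the parameter is sitting at fromValue when the ramp begins. It relies only on the setValueAtTime and linearRampToValueAtTime methods of AudioParam, so it should work for gain.gain and osc.frequency alike.

```javascript
// Schedule a short linear ramp on any AudioParam so that it arrives at
// toValue at endTime. Assumes the param holds fromValue until the ramp starts.
function rampTo(param, fromValue, toValue, endTime, rampSeconds = 0.03) {
  const startTime = Math.max(0, endTime - rampSeconds);
  param.setValueAtTime(fromValue, startTime);
  param.linearRampToValueAtTime(toValue, endTime);
}

// Usage in a browser, with the nodes from the example above:
//   rampTo(gain.gain, 1, 0.5, 1);
```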

And of course, ramps can also be used for an actual "fade in" or "fade out" effect, by just making them longer.

let audioCtx = new AudioContext();

// Create nodes
let osc = audioCtx.createOscillator();
let gain = audioCtx.createGain();

// Set parameters
osc.frequency.value = 440;
gain.gain.value = 0;

// Schedule parameter changes
// Fade in during the first second
gain.gain.setValueAtTime(0, 0);
gain.gain.linearRampToValueAtTime(1, 1);

// Connect graph
osc.connect(gain);
gain.connect(audioCtx.destination);

// Schedule start and stop
osc.start();
osc.stop(audioCtx.currentTime + 2);

Just like with frequencies though, exponential ramps often sound better than linear ones. But the problem with exponential ramps is that they don't really work when zero values are involved, as they are if we want to fully fade something in or out. This is because of the way the exponential formula is defined in the Web Audio spec. The source value is used as a divisor which means there would be a division by zero.
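To see the problem concretely, here is the interpolation formula the spec gives for exponentialRampToValueAtTime, transcribed into a plain function. This is only a sketch for experimenting with numbers, not what browsers literally run, and exponentialRampValue is a made-up name.

```javascript
// Value at time t of an exponential ramp from v0 (at time t0) to v1 (at time t1):
// v(t) = v0 * (v1 / v0) ^ ((t - t0) / (t1 - t0))
function exponentialRampValue(v0, v1, t0, t1, t) {
  return v0 * Math.pow(v1 / v0, (t - t0) / (t1 - t0));
}

// Fading in from a small non-zero value works fine:
console.log(exponentialRampValue(0.001, 1, 0, 1, 0.5)); // roughly 0.0316

// Fading in from exactly zero does not: 1 / 0 is Infinity in JavaScript,
// and 0 * Infinity is NaN.
console.log(exponentialRampValue(0, 1, 0, 1, 0.5)); // NaN
```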

We could use fades with numbers very close to but not exactly zero to work around this, but we can also use an alternative exponential ramping function that AudioParam provides: setTargetAtTime. It uses a different formula and requires a bit more work because you need to figure out an additional "time constant" argument. But with a suitable value we get a ramp that we perceive as a very smooth fade:

let audioCtx = new AudioContext();

// Create nodes
let osc = audioCtx.createOscillator();
let gain = audioCtx.createGain();

// Set parameters
osc.frequency.value = 440;
gain.gain.value = 0;

// Schedule parameter changes
// Start ramping up to 1 right away, with a suitable time
// constant.
gain.gain.setValueAtTime(0, 0);
gain.gain.setTargetAtTime(1, 0, 5);

// Connect graph
osc.connect(gain);
gain.connect(audioCtx.destination);

// Schedule start and stop
osc.start();
osc.stop(audioCtx.currentTime + 2);
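To help pick a time constant, it may be useful to look at the curve that setTargetAtTime follows. The sketch below transcribes the formula from the spec into a plain function. The name targetValueAt is made up, and the function is for exploring numbers rather than something to feed to Web Audio. After one time constant has elapsed, the value has covered about 63% of the distance to the target; after five time constants, over 99%.

```javascript
// Value at time t of a parameter that started approaching `target`
// from `startValue` at `startTime`, with the given time constant:
// v(t) = target + (startValue - target) * e ^ (-(t - startTime) / timeConstant)
function targetValueAt(startValue, target, startTime, timeConstant, t) {
  return target + (startValue - target) * Math.exp(-(t - startTime) / timeConstant);
}

// Fade in from 0 towards 1 with a time constant of 0.5 seconds.
for (const t of [0, 0.5, 1, 2.5]) {
  console.log(t, targetValueAt(0, 1, 0, 0.5, t).toFixed(3));
}
```

By the same formula, a time constant of 5 leaves the gain at only about 0.33 after two seconds, so a shorter time constant is worth trying if the fade should complete before the oscillator stops.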

What About Decibels?

We've seen how the perceived loudness of a sound can be controlled by changing its amplitude. But usually when people talk about loudness, they use a different measure: They talk about how many decibels the volume level of a sound is.

A tech rider for the band Sunn O))) mentions they operate at "125dB on stage". They play some fantastically loud shows.

Decibels indeed measure the loudness of sounds and it is often useful to use them in Web Audio applications as well. But there are two key points to understand about how decibel scales work:

Decibels are always a relative measure.
There's an exponential relationship between the amplitude of a sound wave and its loudness in decibels.

What Are Decibel Measurements Relative To?

When you say how loud an audio signal is in decibels, you're always comparing it to some other audio signal. One signal is 80dB relative to something else.

So theoretically there's an infinite number of different decibel scales you could use, because you could compare any two possible sound signals with each other.

If this were all there was to it, it would be very difficult to talk about loudness though. To talk about loudness in a meaningful way you need shared reference points.

The reference point we usually use to measure sounds in the physical world is the Sound Pressure Level, or dB SPL. Its reference sound is a very quiet one: The lower threshold of human hearing, or, "roughly the sound of a mosquito flying 3 m away". So effectively, when we want to talk about loudness in decibels, we always compare it to the sound that mosquito at the other end of the room is making.

In Web Audio, and digital audio in general, this isn't quite as useful a measurement as you might think though. There is no way for us to make a sound in Web Audio that's, say, "ten times louder than a mosquito flying 3 m away". This is because the sound volume that a user actually hears depends on the air pressure changes caused by soundwaves reaching their eardrums. And there's no JavaScript API for that!

To begin with, pretty much every laptop and smartphone in the world has a "master volume" control which impacts how loud everything is. Not only that, but the volume a user perceives also depends on how far away from their speakers they happen to be sitting. That's because soundwaves in the air become weaker the farther they get. These are only some of the reasons why dB SPL is not a useful measure for us in Web Audio.

Because of external factors such as master volume controls, we cannot do much about the actual perceived volume in Web Audio. Photo: Ondřej Vokoun.

All we can really do in Web Audio is measure wave amplitudes and let the user's audio system determine how loud they actually end up being. But what we can do is define how loud our sounds are relative to each other. For this purpose, a useful decibel measurement for us is Decibels relative to full scale (dBFS), which is anchored on the maximum peak level possible in the system.

As we already discussed, in Web Audio our signal peak level is 1, since that is the highest sample value we can have. This is our anchoring point for dBFS. The wave coming from an OscillatorNode is at the peak level already, so its level is exactly 0 dBFS: a signal that matches the reference sits at zero on any decibel scale.

But what about levels other than 1? What is our dBFS scale?

The Relationship Between Amplitude and Decibels

In the previous article we established that there's an exponential relationship between the frequency of a soundwave and the pitch that we hear when we listen to it. Interestingly enough, there's a very similar exponential relationship that relates to loudness: When we make a sound a certain amount of decibels louder, the underlying sound wave's peak amplitude grows exponentially.

The specific formula for the decibel level of a sound wave is

d = 20 * log₁₀(a / a₀)

The a here is the amplitude of the wave and a₀ is the amplitude of the reference wave - the sound we are anchoring to. In dB SPL (the mosquito three meters away) a₀ is 0.00002, and in dBFS (which we commonly use in Web Audio) it is 1. In the latter case the formula conveniently simplifies to

d = 20 * log₁₀(a).

We can also flip the formula around, so that if we know the desired decibel level (dBFS), we can get the corresponding amplitude from that:

a = 10^(d / 20)

What these formulas are saying is that as we increase the decibel level d, the sound wave amplitude grows exponentially, in proportion to 10^(d / 20): every additional 20 dB multiplies the amplitude by 10. Or, conversely, as we increase the amplitude, the decibel level grows only logarithmically. We need a large change in amplitude to get an audible difference in the sound level.
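A few concrete numbers may make this easier to see. The following sketch applies the two formulas to a handful of amplitudes. The names amplitudeToDb and dbToAmplitude are illustrative, and the reference amplitude defaults to 1, as in dBFS.

```javascript
// d = 20 * log10(a / a0)
function amplitudeToDb(a, a0 = 1) {
  return 20 * Math.log10(a / a0);
}

// a = a0 * 10^(d / 20)
function dbToAmplitude(d, a0 = 1) {
  return a0 * Math.pow(10, d / 20);
}

for (const a of [1, 0.5, 0.1, 0.01]) {
  console.log(a, amplitudeToDb(a).toFixed(2) + ' dBFS');
}
// 1 -> 0.00, 0.5 -> -6.02, 0.1 -> -20.00, 0.01 -> -40.00
```

Halving the amplitude costs only about 6 dB, while dropping it to a hundredth of the original takes us down by 40 dB.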

And what's with the magic number 20? We can think of it as consisting of two factors: 10 * 2. The 10 comes from the unit of measure, decibel, or "one tenth of a bel". The 2 comes from the square relationship between signal power and amplitude.

The main practical benefit of using decibels is that now we can talk about loudness differences in straightforward linear terms. If we "decrease loudness by 3 dBFS", it's going to mean roughly the same thing whether it's from 0 to -3 or from -12 to -15, even though in terms of wave amplitude changes these two are very different.
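This can be checked with a little arithmetic. In the sketch below (dbfsToAmplitude is an illustrative helper that inverts the dBFS formula), both 3 dB drops multiply the amplitude by the same factor of about 0.71, even though the absolute change in amplitude is roughly four times larger in the first case.

```javascript
// Invert d = 20 * log10(a) to get the amplitude for a dBFS level.
function dbfsToAmplitude(d) {
  return Math.pow(10, d / 20);
}

// Compare a 3 dB drop from 0 dBFS with one from -12 dBFS.
for (const [from, to] of [[0, -3], [-12, -15]]) {
  const before = dbfsToAmplitude(from);
  const after = dbfsToAmplitude(to);
  console.log(
    `${from} -> ${to} dBFS:`,
    `ratio ${(after / before).toFixed(3)},`,
    `difference ${(before - after).toFixed(3)}`
  );
}
```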

How Do I Set The Decibel Level in Web Audio?

In JavaScript, if we know our gain and want to see what it means in dBFS, we can just get the logarithm and multiply it by 20:

let dbfs = 20 * Math.log10(gain);

Math.log10() was only added to JavaScript in ES2015. If you need to support older browsers and don't have a polyfill, you need to define it in terms of the natural logarithm first.
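One way to do that, as a sketch: the base 10 logarithm of x equals ln(x) / ln(10), and JavaScript has long provided ln(10) as the constant Math.LN10. The name log10 below is just a local helper.

```javascript
// Use the built-in Math.log10 when available, otherwise fall back to
// the natural logarithm divided by ln(10).
const log10 = Math.log10 || function (x) {
  return Math.log(x) / Math.LN10;
};

let dbfs = 20 * log10(0.5);
console.log(dbfs.toFixed(2)); // -6.02
```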

But what we more often want to do is adjust the volume in decibels and then convert that to a gain that we can give to a Web Audio GainNode, because GainNode does not understand decibels. For this we can invert the decibel formula to

let gain = Math.pow(10, dbfs / 20);

Here we start at maximum gain and step down by 12 dB twice, at one-second intervals:

function dBFSToGain(dbfs) {
  return Math.pow(10, dbfs / 20);
}

let audioCtx = new AudioContext();

// Create nodes
let osc = audioCtx.createOscillator();
let gain = audioCtx.createGain();

// Set parameters
osc.frequency.value = 440;
gain.gain.value = 1;

// Schedule parameter changes
// Drop to -12 dBFS at ~1s
gain.gain.setValueAtTime(1, 1 - 0.03);
gain.gain.linearRampToValueAtTime(dBFSToGain(-12), 1);
// Drop to -24 dBFS at ~2s
gain.gain.setValueAtTime(dBFSToGain(-12), 2 - 0.03);
gain.gain.linearRampToValueAtTime(dBFSToGain(-24), 2);

// Connect graph
osc.connect(gain);
gain.connect(audioCtx.destination);

// Schedule start and stop
osc.start();
osc.stop(audioCtx.currentTime + 3);

Tero Parviainen is an independent software developer and writer.

Tero is the author of two books: Build Your Own AngularJS and Real Time Web Application development using Vert.x 2.0. He also likes to write in-depth articles on his blog, some examples of this being The Full-Stack Redux Tutorial and JavaScript Systems Music.
