Recording Voiceovers From Home — Part 3: Pushing the Big Red Button

As I mentioned back in part one of this series—Recording Voiceovers From Home — Part 1: Choosing A Microphone—the software I use to record my voiceover work is Adobe Audition. It’s certainly not the only option, but being part of the Creative Cloud suite means that a lot of us have it to hand.

That said, I’m going to assume that the tools and principles detailed here are not unique to Audition and that you can find them (or at least something very similar) in other digital audio applications. So it shouldn’t take too much effort to translate this into your software of choice.

Recording parameters

So now that we’ve got our recording environment under control, it’s time to get things moving. Open Adobe Audition. Click on File > New > Audio File (Cmd/Ctrl+Shift+N) to choose the recording parameters for your voiceover. Shift+Spacebar will also work if you don’t have a file open already.

Sample rate

Sample rate is the number of times per second that your source audio is captured. Think of it as your audio file’s frame rate. Setting this value also determines the highest audio frequencies your interface can detect, which might need a little explanation.

Rise and fall

Sound is a wave, so the positive and negative stages both need to be captured to establish its frequency, which is the number of times per second it moves through both. Because of this, you need to divide the sample rate by two in order to calculate the maximum audio frequency attainable. For example, a 48kHz sample rate will be able to “hear” frequencies up to 24kHz, while a 44.1kHz sample rate won’t be able to capture anything beyond 22,050Hz.
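If you'd rather see the divide-by-two rule as arithmetic, here's a minimal sketch. The function name is my own invention for illustration; it isn't part of Audition or any audio library.

```python
def nyquist_frequency_hz(sample_rate_hz: float) -> float:
    """Return the highest frequency a given sample rate can represent."""
    # Each cycle needs at least two samples (one per half of the wave),
    # so the ceiling is half the sample rate.
    return sample_rate_hz / 2


for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz sample rate -> up to {nyquist_frequency_hz(rate):.0f} Hz")
```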

As we’ve already discussed, the range of human hearing is (at best) 20Hz to 20kHz, so should we really care about anything beyond this mark? Some would say yes. But, for voiceover work, I disagree. And here’s why.

Male voices occupy the 100Hz to 8kHz range, and female voices live in the 350Hz to 17kHz space. Extreme sibilance can take you up to the 30kHz range.

But even if you could capture that 30kHz sound, it’s unlikely you’d want to keep it and your export format is likely to strip it out, anyway—especially if your target is a video streaming platform. So there’s really no need to sample voiceover work at anything above 48kHz unless you’re planning to stretch it out in post. (Like you’d shoot video at 120fps if you were planning to slow-mo it in the edit.)

Bit depth

Bit depth is harder to squeeze into a video metaphor as it affects both the dynamic range of your audio and the resolution at which it’s captured. But higher bit depths are better, right?

The simple answer is yes, but maybe not for the reason you think. Spoken vocals have a narrow dynamic range so they won’t sound all that different as a result of being recorded in the audio equivalent of HDR. And while more bits means that your source audio can be reproduced more accurately, I’ve never been able to hear a difference between spoken vocals recorded in 16- and 24-bit.

16-bit uncompressed WAV

24-bit uncompressed WAV

Instead, higher bit depths increase the signal-to-noise ratio (SNR), which means that the noise floor (which is as close to silence as your equipment allows) can potentially be quieter in relation to your vocals. Given that silence is our goal when we’re not speaking, this might be worth the extra storage and processing requirements that come with more bits. But, like higher sample rates, it’s a case of diminishing returns.
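To put rough numbers on that, the textbook figure for an ideal converter is about 6dB of signal-to-noise per bit (more precisely 6.02 × bits + 1.76dB for a full-scale sine wave). The sketch below just evaluates that formula. Treat the results as a theoretical ceiling: real microphones, preamps, and rooms will fall well short of it, and the function name is purely illustrative.

```python
import math


def theoretical_snr_db(bit_depth: int) -> float:
    """Ideal quantization SNR for a full-scale sine wave at this bit depth."""
    # 20*log10(2**bits) is roughly 6.02 dB per bit; 10*log10(1.5) adds ~1.76 dB.
    return 20 * math.log10(2 ** bit_depth) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(f"{bits}-bit: about {theoretical_snr_db(bits):.1f} dB")
```

On paper that's a large gap between 16- and 24-bit, but most of the extra headroom sits below the noise your equipment and room generate anyway.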

It’s also worth noting that some audio drivers (like low-latency ASIO) may default to the native bit depth of the hardware, in which case it can’t be changed.

A case of mono

Oh, and in case you didn’t know, all microphones are monophonic (single-channel). Stereo microphones like the one shown below are just a matched pair in a single array.

So leave the Channels setting on Mono. Changing it to Stereo will just duplicate the signal onto an identical pair of left and right channels. This offers no benefit and will take up screen space that you’ll need later on.
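Duplicating the signal also doubles the size of the file. Here's a rough sketch of how sample rate, bit depth, and channel count add up for uncompressed audio. It ignores the small WAV header, and the function name is hypothetical.

```python
def uncompressed_size_bytes(sample_rate_hz: int, bit_depth: int,
                            channels: int, seconds: float) -> int:
    """Approximate size of uncompressed PCM audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return int(sample_rate_hz * bytes_per_sample * channels * seconds)


one_minute = 60
for label, channels in (("mono", 1), ("stereo", 2)):
    size = uncompressed_size_bytes(48_000, 24, channels, one_minute)
    print(f"48kHz/24-bit {label}: {size / 1_000_000:.2f} MB per minute")
```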

Maxed out

If you’re not sure what’s best, just dial in the maximum settings supported by your interface, so that your recording matches the input signal as closely as possible. Audition’s default is to limit settings to those that are supported, and there’s no benefit to be had from overriding this behavior.

Setting your recording level

Input level is determined by your operating system, interface, and microphone, and isn’t something you can adjust from inside Audition. So you’ll need to either use the gain controls on your hardware, or dive into the settings for your device in Windows or macOS and use the sliders to adjust your input.

What we’re aiming for is an input level for your spoken vocals that typically sits between -18dBFS and -4dBFS on the Levels meter, and never goes beyond the 0dBFS mark (see below), as this causes clipping and distortion. (dBFS stands for decibels relative to full scale, with zero being the maximum level possible in a digital recording.)

 
Adobe Audition Levels Meter
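If the dBFS scale feels abstract, this sketch shows how a linear sample value maps onto it. I'm assuming samples normalized so that 1.0 is full scale, and I'm applying the -18 to -4 window to peak values for simplicity. Both function names are made up for the example.

```python
import math


def to_dbfs(amplitude: float, full_scale: float = 1.0) -> float:
    """Convert a linear sample amplitude to dBFS (0 dBFS = full scale)."""
    if amplitude == 0:
        return float("-inf")  # digital silence has no finite dB value
    return 20 * math.log10(abs(amplitude) / full_scale)


def in_target_range(level_dbfs: float, low: float = -18.0, high: float = -4.0) -> bool:
    """True if the level sits inside the window suggested in the text."""
    return low <= level_dbfs <= high


for peak in (0.05, 0.25, 0.5, 1.0):
    level = to_dbfs(peak)
    print(f"peak {peak:.2f} -> {level:6.1f} dBFS, in target range: {in_target_range(level)}")
```

Note that halving the amplitude only drops the level by about 6dB, which is why the meter's scale looks so compressed near the top.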

Double-click the Levels meter or hit Opt/Alt+I to enable live monitoring of your input, then right-click on it and choose Static Peaks. Grab your script and start a sample read-through. Keep an eye on the readout, but try to focus on your performance rather than fixating on the levels at this stage.

Because you’ve enabled Static Peaks, the peak indicator (the thin yellow line in the example below) will remain at the highest level reached during your read-through, unless your levels exceed 0dBFS, in which case you’ll get a red indicator at the top of the meter.

 
Enabling Static Peaks will fix the peak indicator at the highest level reached during playback.

Using this as your guide, try and get your vocals sitting around the yellow section of the levels meter. But if in doubt, it’s safer to set the levels too low as this can be fixed in post with gain. Setting your levels too “hot” is the same as over-exposing a photograph. It’s data you can’t recover.
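Here's a toy illustration of why too quiet is fixable and too hot isn't. It's a simplified model that I've made up for the purpose (a handful of sample values, a hard clip at full scale, no noise), not a description of what Audition does internally.

```python
def apply_gain_db(samples, gain_db):
    """Scale samples by a gain expressed in decibels."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


def record(samples, input_gain_db, full_scale=1.0):
    """Simulate capture: apply input gain, then hard-clip at full scale."""
    boosted = apply_gain_db(samples, input_gain_db)
    return [max(-full_scale, min(full_scale, s)) for s in boosted]


performance = [0.1, -0.4, 0.8, -0.2]

# Recorded 12 dB too low, then raised by 12 dB in post: the shape survives.
too_quiet = record(performance, -12.0)
fixed_in_post = apply_gain_db(too_quiet, 12.0)

# Recorded 6 dB too hot: the 0.8 peak hits the ceiling and is flattened.
too_hot = record(performance, 6.0)
pulled_down = apply_gain_db(too_hot, -6.0)

print("original:     ", performance)
print("quiet + gain: ", [round(s, 3) for s in fixed_in_post])
print("hot - gain:   ", [round(s, 3) for s in pulled_down])
```

In a real recording, adding gain in post also lifts the noise floor, so "too low" isn't free. It's just recoverable in a way that clipping never is.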

Here are a couple of tips before you hit that red button.

Limit noise makers

Take off any loose jewelry that might introduce noise, especially bracelets and rings that can knock against the desk you’re sitting at. Clear your desk, and turn off any noisy devices like fans, air conditioners, mobile phones, and computer system alerts.

Also, if you print your script out, make sure that no sentences span the pages, otherwise you’ll find yourself having to turn the page while speaking. And, if you’re reading off a screen, make sure you pause while you’re using the mouse to scroll. Using a mobile device with a touchscreen can be a better approach.

Loosen up

Vocal exercises really do help to get you prepared for recording, and can reduce the number of corrections you’ll need for a decent take. There are dozens of techniques to choose from, but I favor these:

Sighing

Breathe deeply through your nose, relaxing your diaphragm and stomach muscles so your belly bulges out. Hold for a moment, then slowly release through your mouth, bringing tension back to your diaphragm and stomach. Repeat five times, and if you find that this induces a yawn, lean into it.

Cheek puffing

Close your mouth and puff your cheeks out as much as you can. Move your jaw and cheeks like you’re rinsing your mouth with water, only with air. Repeat five times.

Lip flapping

Take a deep breath and briefly hold it. Then relax your lips and exhale so that the expelled air causes your lips to flap vigorously. Do this two or three times.

Stay hydrated

It stands to reason that you should have some water on hand during the recording (no ice: it rattles, and some believe that the cold can tighten your vocal cords). But it’s just as important to be hydrated well in advance, as this will help to stop your mouth from drying out while speaking. So drink plenty of liquids at least two hours prior.

And, to be on the safe side, don’t eat or drink anything other than water from T-30min onwards. Especially coffee, or sugary, milky drinks. Or anything that contains nuts. (It should also go without saying that smoking or vaping is off the list, too, unless you want to deliberately alter your vocal timbre and are willing to accept the significant health risks involved.)

Use a pop filter/talk past the mic

Microphones are highly sensitive to vibration, so the wall of air that leaves your mouth during words that contain plosives (like ‘p’ and ‘b’) can cause audible thumps or “pops” in the recording. Placing a pop filter between your mouth and the microphone will diffuse this air and significantly reduce the chances of this happening.

If you still get pops through your filter, talking “past” the microphone rather than directly at it will help, as the air will avoid the microphone pickup rather than hitting it square on. Just make sure that you don’t go so far off-axis that it causes an unwanted change to the audio quality.

(I’ve also heard that you can improvise a pop filter by affixing a pencil in front of the pickup. Never tried it.)

Rehearse. Rehearse. Rehearse.

You’ll be reading off a script, so you don’t need to learn your lines. But you absolutely need to rehearse. And this means reading it out loud. This will build a sense of familiarity with the flow and content of the script, allowing you to focus less on what’s coming next, and more on your modulation, emphasis, and timing.

If it’s a short piece, read it out at least twice before you start. This will also help to warm up your voice.

Get in position. And stay there.

As we’ve discussed, the proximity effect can dramatically color your vocals, but voiceovers need consistency. So use your headphones and monitor the live feed (if your interface offers it) to establish the best position relative to the microphone, and try to stay in that area for the entire take.

A lot of artists use the “hang ten” hand gesture to set the distance between microphone and mouth (don’t measure from the pop filter), or the distance of two clenched fists. But that’s only a guide—you do you.

And if you’re sitting, make sure that you’re up straight and not resting against the back of your chair.

Breathe easy

It’s tempting to take a deep breath at the beginning of a new sentence or paragraph. Don’t. Filling your lungs with air will significantly increase the speed at which it’s released during the first part of the sentence, and will make your first few words much louder than the rest.

Instead, take a medium breath through your nose or mouth (whichever’s quieter) and pause very slightly before you start to speak. This will result in a more balanced delivery, and the pause makes it easier to isolate and remove any unwanted breath noise later on.

If you find yourself running out of breath during a sentence, don’t be tempted to push on to the end, as your voice will sound starved. Just stop at a punctuation point or pause in the script, take another breath, and move on.

Put it all in one take

When you’ve done enough of this kind of work, you’ll no longer be surprised at how different your voice sounds at different times, even in a controlled studio environment. Sometimes even between takes.

The time of day, the air quality, your emotional or energetic state, changes to the script—just about anything can alter the tone and timbre of your voice. So my advice is to get everything you need in a single take, even if it means constantly repeating and retrying phrases and words to get them right. It’s a lot easier to cut the bad bits out of a single recording than to patch one together using takes that were recorded separately.

Did I say rehearse?

Yeah. Maybe run through it one more time. Just to be on the safe side.

Okay, NOW you can hit record

It’s your chance to shine, so hit Shift+Space and start recording. You might want to minimize Audition during the recording so you can focus on your performance, but that’s up to you.

Pro tip: Let the recording run for up to ten seconds before you start talking. Having a clean sample of the noise caused by your environment and equipment will prove useful during cleanup.
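As a rough idea of what that clean sample gives you, the sketch below measures the average (RMS) level of the opening seconds of a take. It assumes the audio has already been loaded as a list of samples normalized to the -1.0 to 1.0 range. The function names and the tiny fake "recording" are purely illustrative.

```python
import math


def rms_dbfs(samples):
    """Average (RMS) level of a block of normalized samples, in dBFS."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def noise_floor_dbfs(samples, sample_rate_hz, lead_in_seconds=10.0):
    """Measure the level of the silent lead-in at the start of a take."""
    lead_in_length = int(sample_rate_hz * lead_in_seconds)
    return rms_dbfs(samples[:lead_in_length])


# Toy data: a deliberately tiny sample rate so the example stays readable.
toy_rate = 10
room_tone = [0.001, -0.001] * 50   # ten "seconds" of quiet hiss
speech = [0.3, -0.3] * 50          # then the voice comes in
take = room_tone + speech

print(f"noise floor: {noise_floor_dbfs(take, toy_rate):.1f} dBFS")
print(f"whole take:  {rms_dbfs(take):.1f} dBFS")
```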

You’re going to have to be your own director, which means paying attention to your delivery, accuracy, pace, emphasis, tone, modulation, repetition… It takes practice to do this well, so don’t be disheartened if you have to record and re-record several times to get your performance sounding the way you want.

As mentioned, it’s best to get everything you need into one take. But you can go back and record over sections by placing the time selector tool (T) at the beginning of the part you want to lose, and hitting Shift+Space.

To limit this overwrite to a specific part of an otherwise good recording (like a single sentence), make sure you use the time selector to highlight the section you want to redo before you hit record. This will limit the recording to the In and Out points of your selection. Otherwise it’ll just keep on going, causing your previous work to be lost, and Bad Words to be spoken.

Pro tip: Audition has a recording mode called Punch and Roll (right-click on the Record button to enable it). This will play back a segment of the preceding audio before your recording starts instead of going in cold. You’ll find more detail on it in this article on time-saving Audition tools.

When you’ve got what you need, save a copy of this raw, original file before you move into post. There’s no way to save a single audio file in Audition as a project that contains the data and any non-destructive effects, and there’s no versioning either (like After Effects’ Increment and Save).
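If you want something closer to Increment and Save, you could script the copy yourself. This is a minimal sketch that runs outside Audition. The function name, the folder name in the usage note, and the _v001 naming scheme are all my own assumptions, so adapt them to however you organize your files.

```python
import shutil
from pathlib import Path


def save_incremented_copy(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir as name_v001.ext, name_v002.ext, and so on."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    version = 1
    while True:
        candidate = backup_dir / f"{source.stem}_v{version:03d}{source.suffix}"
        if not candidate.exists():
            break  # first unused version number wins
        version += 1
    shutil.copy2(source, candidate)  # copy2 also preserves timestamps
    return candidate
```

Calling save_incremented_copy(Path("take.wav"), Path("raw_backups")) after each session would leave the original untouched and add a new numbered copy each time.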

So make a backup now, and then we can move on to our final part—cleanup, edit, and mastering.

Laurence Grayson

After a career spanning [mumble] years and roles that include creative lead, video producer, tech journalist, designer, and envelope stuffer, Laurence is now the managing editor for Frame.io Insider. This has made him enormously happy, but he's British, so it's very hard to tell.