Mixing Spoken Word and Music – Finding the Right Balance for Your Audio Drama

Music and sound effects are a wonderful way to bring stories to life. An original soundtrack can take your fiction podcast to the next level and engage your audience in a deeper way. But music and spoken word can be tricky to blend, and the two can interfere with each other if the mix is not handled carefully. Since the story is the most important aspect of any fiction podcast (science fiction or otherwise), the music should not overpower the words but rather add a layer that helps tell the story.

In this article I would like to show you several techniques that I have found immensely helpful for making sure that the spoken word and the music are both audible, so that the audience can enjoy the story and hear every word spoken.

The first technique, and probably the most important, is using the EQ (equalizer) to remove overlapping frequencies. EQ can be thought of as a frequency-specific volume control. When you add music to spoken word, many of the frequencies in the two sounds overlap and tend to blur together, making both less audible and clear. It is easy to fix this problem with some basic EQ settings. If you are a podcaster, EQ is an essential and easy-to-use tool for improving the quality of your audio recordings.

The human voice tends to sit in specific frequency ranges. The fundamental pitch of higher register voices tends to lie between 165 and 255 Hz, and that of lower register voices between 85 and 180 Hz. When these vocal frequencies and the same frequencies in the music are added together, both become difficult to hear, and the clarity of the voice suffers most. Because the voice sits in this area of the lower mid-range, these frequencies can be turned down in the music to help keep both clear. Open your EQ plugin on the music track and set the points to lie just outside the vocal range. From there you can lower the dB of that particular frequency range (see pictures 1 & 2*). You may have to experiment a bit with how much to lower the volume of those frequencies, but once you find the right balance you will find that the vocal tracks are much easier to hear without interfering with the overall perceived volume of the music.

Picture 1: Example of EQ dip for higher register voice

Picture 2: Example of EQ dip for lower register voice
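
For readers who like to see the numbers behind such a dip, here is a minimal Python sketch of a single bell-shaped EQ cut over the higher register range. It is not how Audacity or any particular plugin implements its EQ; it uses the widely published peaking-filter formulas from Robert Bristow-Johnson's Audio EQ Cookbook. The 44.1 kHz sample rate, the 6 dB depth of the cut, and the function names are all assumptions made for illustration.

```python
import cmath
import math

SAMPLE_RATE = 44100  # Hz; an assumed, common rate for podcast audio


def peaking_eq(center_hz, gain_db, q, sample_rate=SAMPLE_RATE):
    """Return biquad coefficients (b, a) for one bell-shaped EQ band."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    return b, a


def gain_db_at(freq_hz, b, a, sample_rate=SAMPLE_RATE):
    """Gain of the filter in dB at a single frequency."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


# Band edges for a higher register voice, taken from the article.
low_edge, high_edge = 165.0, 255.0
center = math.sqrt(low_edge * high_edge)  # geometric middle of the band
q = center / (high_edge - low_edge)       # narrower band -> higher Q
cut_db = -6.0                             # assumed depth; tune by ear

b, a = peaking_eq(center, cut_db, q)
for freq in (low_edge, center, high_edge, 1000.0, 5000.0):
    print(f"{freq:7.1f} Hz: {gain_db_at(freq, b, a):6.2f} dB")
```

Running this prints the gain applied to the music at a few frequencies. The cut is deepest in the middle of the vocal band, about half as deep at the band edges, and close to zero by the time you reach the upper mids, which is why the music keeps its overall perceived volume. For a lower register voice, swap the band edges for 85 and 180 Hz.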

Another issue that can come up is bass masking, which is also addressed using the EQ. Masking happens when there is a buildup of low frequencies. These frequencies, along with their overtone series, can make the mix sound muddy and get in the way of the clarity of the vocal tracks. To fix this potential issue we again reach for the powerful tool of the EQ. The range of human hearing is roughly 20 Hz to 20 kHz, and most of the bass frequencies that can lead to masking lie at 100 Hz and under. Because we cannot hear frequencies under 20 Hz, we can lower their volume without changing our perception of the bass, while at the same time removing potential masking issues. One might think that removing this information would cause the music to sound less rich, but the truth is that when we convert our audio files to MP3 this frequency range tends to be removed anyway. One might also think that we could just let the MP3 compression take care of removing these frequencies; however, this would leave the overtone series intact, and the potential for masking would remain.

Dealing with masking can be an easy process: simply open your EQ plugin and roll off the frequencies under 20 Hz (see picture 3*). This is often called high-pass filtering, which lets the high frequencies through while reducing the amplitude of the low frequencies or blocking them altogether.

Picture 3: High-pass filtering
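
To make the idea of a roll-off concrete, here is a small Python sketch of a high-pass filter with its cutoff at 20 Hz, applied to two test tones. As with any sketch like this, it is only one way to build such a filter (a second-order design from the Audio EQ Cookbook formulas), not the method used by any particular plugin. The sample rate, the 5 Hz and 200 Hz test tones, and all function names are assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 44100  # Hz; an assumed, common rate for podcast audio


def highpass(cutoff_hz, sample_rate=SAMPLE_RATE, q=1 / math.sqrt(2)):
    """Return normalized biquad coefficients (b, a) for a high-pass filter."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Run the samples through the filter, one sample at a time."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def sine(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Generate a full-scale sine tone as a list of samples."""
    count = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


def rms(samples):
    """Root-mean-square level, a rough stand-in for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


b, a = highpass(20.0)      # roll off everything under 20 Hz
rumble = sine(5.0, 2.0)    # inaudible sub-bass rumble (assumed test tone)
tone = sine(200.0, 2.0)    # a tone well inside the audible range

settle = SAMPLE_RATE       # skip the first second while the filter settles
rumble_ratio = rms(apply_biquad(rumble, b, a)[settle:]) / rms(rumble[settle:])
tone_ratio = rms(apply_biquad(tone, b, a)[settle:]) / rms(tone[settle:])

print(f"5 Hz rumble kept:  {rumble_ratio:.3f} of its original level")
print(f"200 Hz tone kept:  {tone_ratio:.3f} of its original level")
```

The rumble comes out at well under a tenth of its original level, while the 200 Hz tone passes through nearly unchanged. That is the behaviour you want from the roll-off: the inaudible low end is reduced and the part of the music you can actually hear is left alone.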

While these issues can get complicated, particularly the math involved, familiarizing yourself with EQ and playing around with these simple tricks can greatly improve the audio quality of your podcast. By all means experiment and see what the powerful tool of EQ can do for your recordings. It is simple to use, and once you get the hang of it, it can go a long way toward making your podcast listenable and clear.

* The EQ plugin shown in the examples is from Audacity (free, open-source audio software), but these settings should work the same way in any EQ plugin in any DAW.

Jean-Paul L. Garnier

Jean-Paul L. Garnier lives and writes in Joshua Tree, CA, where he is the owner of Space Cowboy Books, a science fiction bookstore, independent publisher, and producer of Simultaneous Times podcast. In 2020 his first novella Garbage In, Gospel Out was released by Space Cowboy Books and in 2018 Traveling Shoes Press released Echo of Creation, a collection of his science fiction short stories. He has also released several collections of poetry: In Iudicio (Cholla Needles Press 2017), Future Anthropology (currently being translated into Portuguese), and Odes to Scientists (audiobook - Space Cowboy Books 2019). He is a two-time Elgin nominee and also appeared in the 2020 Dwarf Stars anthology. His new collection of SF poetry, Betelgeuse Dimming, has just been released and is available as a free download audiobook / ebook at spacecowboybooks.bandcamp.com. He is also a regular contributor to Canada’s Warp Speed Odyssey blog. His short stories, poetry, and essays have appeared in many anthologies and webzines.