Music and Sound Design in Video Games: A 2-Part Series

 

The Melbourne International Games Week took place on October 5th–13th, 2019 and was jam-packed with great conferences and panels. One of them was High Score, which focused on game music composition styles.

The event brought together renowned video game music composers such as Manami Matsumae (Mega Man, Shovel Knight) and Takahiro Izutani (Metal Gear Solid, Bayonetta), as well as newer faces like Chipzel (OctaHedron, Interstellaria).
 
 
 
 
This conference inspired us to explore a topic that’s been a core part of game design since its inception: the influence of video game soundtracks on a player’s immersion. This exploration will be divided into two articles: first, we will discuss the evolution of the technology from the late 70s to current times; in the second part, we will examine the atmospheric implications of music and sound design in video games.
 
 

From the 8-Bit Era to Modern Times

Between Pink Floyd, Stevie Wonder, Led Zeppelin and David Bowie, the 70s sure left their mark on our creative musical history. These golden years also saw the advent of the first gaming consoles and, as opposed to the explosive live scene of the time, early consoles imposed a set of creative restrictions in terms of music and sound design. Developers were aware of the importance of audio in video games very early on and, with these limitations in mind, they had to resort to ingenious techniques to convey the musical atmosphere they desired for their games.

In current times, we tend to think of video game soundtracks as a way to smooth the experience and gameplay while complementing the narrative. But how has video game music changed over time? We’ve benefited from decades of technological improvements and discoveries to get where we are now. Let’s explore some of them!
 
 

Music and sound design in video games were shaped by technological limitations

Let’s jump back in time 42 years. The year is 1977 and while Donna Summer was singing I Feel Love, Digital Audio Workstations (DAWs) were a mere concept. Music production and distribution were a lot more daunting than they are today. Besides writing and composing, musicians had to use tape machines to record their music, and mixing had to be done entirely by hand in real time. Today, most music in games is created on computers thanks to DAWs, software suites that serve as universal tools for writing, composing, recording and mixing music.

But to get this far, DAWs had to overcome two major technological limitations: the high price of storage space and low processor speeds.
 
 
 
 

What was the impact of those two limitations on video game soundtracks?

While we can still hear faint echoes of arcade game music, warping back 42 years most importantly brings us to the advent of the Atari 2600 VCS, a console belonging to the famous category of 8-bit machines. Let’s take a closer look at what “8-bit” actually means and how it shaped audio production and the video game music industry as a whole. For reference, today’s computer processors are either 32- or 64-bit.

To clarify, the processor is the main calculation unit of the machine. It allows programs to run by processing information in the form of electrical signals. We come across the term “bit” every day without even noticing it: 128 kbit/s MP3 files, Mbps internet connections, 64-bit operating systems, etc.
 
 

What exactly is a “bit”?

Bits are binary units of information. One bit stores one piece of information. They’re called binary because the information is either “true” or “false”, encoded by a “1” or a “0” respectively.

Now, let’s use our short trip to the past to remember our IT classes. In a computer, these two pieces of information, 1 or 0, can be represented by the state of a flip-flop circuit. If the bit is set to 1, or true, current runs through the circuit. If the bit is set to 0, or false, no current circulates. By combining bits, complex pieces of information can be assembled, stored and transmitted.

In the case of an 8-bit machine, a combination of 8 bits offers 2^8 = 256 possible values.
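In code, the number of distinct values that n bits can represent is simply two raised to the power of n, which is where the figures in this article come from:

```python
def outcomes(n_bits: int) -> int:
    """Number of distinct values representable with n bits: 2 ** n."""
    return 2 ** n_bits

print(outcomes(8))   # 256 distinct values for an 8-bit quantity
print(outcomes(5))   # 32
print(outcomes(11))  # 2048
```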
 
 
 
 

What does it have to do with early home consoles?

At the time, audio in games was defined not only by the capacity of the processor but also by the capacity of the console’s audio chip. Audio was more often than not synthesized directly by the audio chip rather than pre-recorded as it is today. Bear with us, it’s going to get a bit technical!

In the case of the Atari 2600 VCS, the pitch (or frequency) of a sound was encoded in 5 bits. The resolution of the audio was thus based on a combination of 5 bits, giving 32 possible values for the pitch; in other words, only 32 notes could be played by a given synthesis pattern. A few years later, the NES (also considered an 8-bit console due to its processing capacity) was built with an audio chip that encoded pitch in 11 bits, allowing the console to produce a range of 2048 possible frequencies for one synthesis pattern.
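To make the 11-bit figure concrete, here is a small sketch based on the widely documented behavior of the NES audio chip’s pulse channels, which turn an 11-bit timer value t into an output frequency via f = CPU clock / (16 × (t + 1)):

```python
NTSC_CPU_HZ = 1_789_773  # NES CPU clock (NTSC)

def pulse_freq(timer: int) -> float:
    """Output frequency of an NES pulse channel for an 11-bit timer value."""
    assert 0 <= timer <= 2047  # 11 bits -> 2048 possible timer values
    return NTSC_CPU_HZ / (16 * (timer + 1))

# A timer value of 428 lands near middle C (~261.6 Hz)
print(round(pulse_freq(428), 1))
```

Larger timer values give lower notes, so those 2048 values map onto a fixed grid of playable pitches rather than a continuous range.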
 
 
 
 
A synthesis pattern is a waveform which can take multiple shapes (triangle, sine or pulse) and is used to create sounds.
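These waveform shapes are easy to sketch numerically. Here is a minimal illustration (the sample rate and frequency are chosen arbitrarily for the example):

```python
import math

def sample(shape: str, phase: float) -> float:
    """One sample of a waveform, for a phase in [0, 1)."""
    if shape == "sine":
        return math.sin(2 * math.pi * phase)
    if shape == "pulse":      # square wave, 50% duty cycle
        return 1.0 if phase < 0.5 else -1.0
    if shape == "triangle":   # linear ramp down, then back up
        return 4.0 * abs(phase - 0.5) - 1.0
    raise ValueError(shape)

# One second of a 440 Hz triangle tone at CD sample rate
RATE = 44_100
tone = [sample("triangle", (i * 440 / RATE) % 1.0) for i in range(RATE)]
```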

Here’s an example of the influence of the waveform on the audio output; you’ll mainly observe triangle waveforms, represented via an oscillator.
 
 
 
 
While present in old video game music, real-time synthesis is a technique that you can also still find today in electronic music. To get a feel for it, have a look at live concerts of modular music.
 
 
 
 
In the video below, you can see some of the possibilities that modular synthesis offers. Some typical waveform shapes used to produce the sounds are present on the board of this Michigan Synthworks SY0.5.
 
 
 
 
Another limitation of this era was the number of circuits dedicated to sound synthesis. Audio chips were built with a fixed number of synthesizer circuits, so consoles could only play a certain number of notes simultaneously. To illustrate, each and every note you heard while playing E.T. was synthesized in real time by the console, ultimately limited by its synthesis capacity. If you haven’t played E.T., that’s alright: the game is really bad. But check out the music; it’s quite impressive for an Atari 2600 VCS piece.
 
 
 
 

Programming methods were also a limiting factor

Audio chips weren’t the only things restricting music creation; programming methods also had a part to play. And yes, the music had to be programmed by hand.

It may sound strange, but music was a programmer’s job. The music engines they worked with restricted the notes’ pitch and length and offered no intuitive visual interface for inputting notes, which means programmers created music with numbers and code rather than with guitars and drums. On top of that, storage capacity was lacking at the time, and musical pieces were often repeated throughout a game to save cartridge storage space.
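To make this concrete, here is a hypothetical sketch (the pitch values and frame counts are invented for illustration) of how a melody might be stored as plain data in an 8-bit-era sound engine:

```python
# Each entry is a (pitch_register_value, duration_in_frames) pair, stored as
# raw bytes on the cartridge rather than as anything resembling sheet music.
MELODY = [
    (0x1F, 8),   # hold this pitch register value for 8 video frames
    (0x18, 8),
    (0x14, 16),
]

def total_frames(melody) -> int:
    """Length of the sequence in video frames (60 per second on NTSC)."""
    return sum(duration for _, duration in melody)

print(total_frames(MELODY))  # 32 frames
```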

The musical atmosphere conveyed by games of the 8-bit era was consequently grounded in these technical limitations. But that did not prevent some musical pieces from cementing themselves into our fondest memories. In technical terms, one could say that video game music of the 8-bit era had few notes and audio channels as well as limited tuning and synthesis. So what makes us agree with this statement yet feel quite differently when we hear the first notes of Metroid on the NES?
 
 
 
 

CD-ROMs revolutionized video game soundtracks

Let’s jump in our time machine again and take a look at a piece of data storage equipment that boomed in the 90s. Yup, you guessed it, we’re going to talk about the CD-ROM.
 
 
 
 
CDs enabled one major thing: a large amount of cheap storage space and, with that, the pre-recording of music. No need for real-time synthesis anymore! Nowadays, physical versions of games are sold on discs and rely on playing back music and sounds that have already been recorded by musicians and sound designers.
 
 

What was the impact of CDs on audio quality?

The rise of CDs during the 90s was accompanied by the appearance of MP3 files and other encoding formats like WAV, AAC and FLAC. With these new formats, the quality of the audio output on consoles jumped significantly.

In contrast to the encoding capacity of early consoles’ audio chips, the MP3 format is defined by the number of bits used to compress the music recording before it’s burned onto a CD. In the case of a 128 kbit/s MP3 file, 128 kilobits are used for each second of encoded music. Consequently, the range of possible output frequencies is vastly superior, depending on the sound system. While the 8-bit era was about a pre-defined pattern rendered live by synthesizers, the audio on a CD is simply a playback of pre-recorded music.
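A quick back-of-the-envelope calculation shows why compression mattered: uncompressed CD audio runs at roughly 1,411 kbit/s, so a 128 kbit/s MP3 is about an eleven-fold reduction:

```python
# Uncompressed CD audio bitrate: sample rate * bit depth * channels.
cd_bitrate_kbps = 44_100 * 16 * 2 / 1000   # 1411.2 kbit/s
mp3_bitrate_kbps = 128

print(round(cd_bitrate_kbps / mp3_bitrate_kbps, 1))  # ~11x smaller
```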

Below, you'll find a music comparison between the Sonic the Hedgehog series on the 16-bit SEGA Genesis and Sonic Mania (2017, various platforms).
 
 
 
 

How did this shift influence the ways of audio production?

Presently, audio production in the game industry is a calibrated mix of various areas of expertise based on the layering of two main elements: the music, and the sound effects including dialogue.

For bigger titles, the typical way of putting together a soundtrack is to first produce and record the sound effects and dialogues. In parallel, resources are allocated for composing and recording the music. The audio content will then undergo master mixing and game implementation.

One of the most important things that music composers deal with when it comes to creating music for video games is the natural arrangement of the music and sound effects. More often than not, sound effects and dialogues will prevail over the music. Consequently, music composers have to take into account the vibe conveyed by the sound effects as well as their dedicated audio space in order to produce memorable musical pieces.

Below, Jonathan Mayer (senior music manager at Sony) explores the building blocks of sound design in games.
 
 
 
 
When it comes to managing audio space, there is another factor that comes into play: mixing. A big part of audio implementation is related to mixing, which consists of bringing together all the audio materials made for the game to create a well-balanced and coherent soundtrack. With the right mixing, a game will convey the emotions it was designed to without blasting your ears with over-the-top SFX.

Now, there are various types of methods that can be used to mix the audio of a game (active vs. passive mixing, snapshots, HDR, etc.), but we will not dive deeper into this subject here. Instead, let’s focus on what’s on the horizon for big production titles in terms of audio implementation.
 
 

Adaptive music, the next big step

The advances in audio implementation processes have become a huge focus of the modern video game industry.

We’ll take a closer look at a technique of audio implementation that is changing the way we create and hear soundtracks in games: adaptive music. With adaptive soundtracks, the main challenge is to use in-game cues, chains of events, in-game time or location to dynamically layer parts of the soundtrack, creating an ever-changing audio environment for the player. While interactive music is nothing new (it consists of launching a dedicated musical piece, say a high-pitched melody, during a specific event such as a boss fight), adaptive music takes the concept even further.

To illustrate adaptive music, let’s take the practical example of an open-world game. Say you’re leaving a forest to explore one of the caves below. As you cross over, some of the musical layers of the forest theme (high, medium and low) fade away and are replaced by layers of the cave theme. Then you stumble upon a hidden treasure room, which also has a dedicated theme. Its layers are added on top of the cave theme without breaking musical continuity, because the game is dynamically layering the music by itself. Middleware such as FMOD and Wwise handles this triggering of the dedicated audio mixes created with DAWs.
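A minimal sketch of this layering logic follows. The zone names, stem names and crossfade step are all invented for illustration; in real productions, middleware like FMOD or Wwise does this work:

```python
# Hypothetical mapping from game zones to the music stems that should play there.
ZONE_LAYERS = {
    "forest":   {"strings_high", "pads_mid", "drums_low"},
    "cave":     {"drone_low", "perc_mid"},
    "treasure": {"drone_low", "perc_mid", "bells_high"},  # cave layers + one more
}
ALL_LAYERS = set().union(*ZONE_LAYERS.values())

def target_volumes(zone: str) -> dict:
    """Full volume for the active zone's layers, silence for the rest."""
    active = ZONE_LAYERS[zone]
    return {layer: (1.0 if layer in active else 0.0) for layer in ALL_LAYERS}

def crossfade(current: dict, target: dict, step: float = 0.1) -> dict:
    """Move each layer's volume one step toward its target per game tick."""
    return {layer: current[layer] + max(-step, min(step, target[layer] - current[layer]))
            for layer in current}

vols = {layer: 0.0 for layer in ALL_LAYERS}
vols = crossfade(vols, target_volumes("cave"))  # one tick after entering the cave
```

Because the treasure room’s layer set is a superset of the cave’s, moving between them only fades one stem in or out, which is what preserves the musical continuity described above.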

The video below shows a demo piecing together interactive and adaptive music in FMOD for the game Mirror’s Edge.
 
 
 
 
All in all, storage capacity and processing power had a significant impact on the video game industry and ultimately on game soundtracks. Having retrospectively explored the evolution of audio production and its affiliated technologies over the last decades, we still have to ask what makes the quality of a game’s musical atmosphere, and how sound design and music affect immersion in video games.

While the current industry is making the most of new audio technologies, many recent games are not considered atmospheric or even to have a recognizable atmosphere. On the other hand, some older games, such as the first entries in the Mario, Zelda and Metroid series, are commonly recognized as atmosphere-heavy. We’ll explore the possible reasons in the second part of this series.
 
 
 
 
