Asbjoern Andersen

Some years ago, veteran sound designer and game audio director Zachary Quarles (DOOM, Quake, Killer Instinct, Wolfenstein & many more) wrote an excellent guide on how to do an audio design document – and this is his follow-up to that guide; one that may very well change your game for the better!

Here’s Zachary Quarles, with his insights on what you need to consider when doing an audio design document today – and why it’s such an essential tool for making your game sound its best:


Several years ago, I wrote this blog post over on my webpage. It essentially broke down how important establishing an audio vision is when an audio director starts their project and what high-level topics should be addressed when writing the primary audio design document that should become your roadmap when going into pre-production and subsequently building your game. It’s a fairly long read but it’s pretty fancy. It has bullet points, people…BULLET POINTS. I am so sophisticated.

Well, quite a bit of time has passed. We are knee-deep in a new console cycle where games are even more complex and in need of strong direction and commitment from all disciplines in order to execute at the level that our customers expect. How do we prepare for that? Game production can change at a moment’s notice, which can cause chaos if you’re trying to keep a large design document up to date and relevant throughout a busy production cycle. We need to start with an incredibly strong foundation and yet we need to be nimble, be able to make adjustments very rapidly, and not feel weighed down by a massive tome.

Has this new world of game production changed my approach as an audio director?

Good question. I hadn’t really thought about it much as I’ve been in the thick of a pretty aggressive release cycle with Killer Instinct (the game is broken up by seasons and has been in an active release cadence since 2013…which is a whole different article that I should write at some point) and spinning several other unannounced projects around, but I recently received that very inquiry from Asbjoern via a Twitter direct message. This caused me to take a step back and look at the last few years both as a developer AND a publishing audio director to see if my approach has adjusted.

I present to you: Writing an Audio Design Document: Part II!

Oooohhh…a sequel!

In my original article, I mapped out tons of high-level components that make up a game’s sonic identity and items that should be discussed and planned for when starting a project, moving into pre-production, and continuing throughout production. Obviously, I still believe that is incredibly important…however, as we all know, games change and people are busy. Production cycles aren’t always predictable and the needs of the project evolve over time. So, too, must the audio direction of the game if you want the two to stay in a symbiotic relationship.

This article is an addendum to my original. It’s geared toward the audio director, but it ties into all disciplines. I’ll go a bit more down the rabbit hole in terms of how I personally format audio documentation and feature sets to make sure they are clearly decipherable not only to the audio team but also to the creative director, designers, artists, programmers, and producers. A big part of my job as a publishing audio director is to make sure that everyone is on the same page and has buy-in with what we are doing from an audio standpoint. That means lots of streamlining and having laser-sharp clarity on the needs of other departments.

No one wants to have to read a novel if they are already incredibly busy (which everyone is). So how do you distill your game’s audio vision down to a digestible format that people can completely understand at a glance?

My current approach focuses on three primary phases. Step one in this journey is establishing what we call “Audio Pillars”.


Audio Pillars are high-level “filters” that distill the game’s full audio aesthetic and feature set down to a handful of descriptors. I tend to use 3-5 of these and use somewhat flowery language to give an emotional connection to each pillar, so people can relate to them at a human level and can understand what sort of feature (or set of features) would have to be implemented for the Audio Pillar to be properly realized. Use of additional descriptive text also clues the creative team into the aesthetic that should be established and maintained. I approach this almost like writing prose as opposed to a technical document. The art, design, and programming departments utilize this same process, so all disciplines are in lockstep on what the big-picture goals of the game are. The Pillars serve as your “north star” as you’re plodding through pre-production and on into production.

It usually takes a bit of iteration and interaction with the art, design, and programming departments as we figure out what the game actually “is” to get these Pillars locked down. After they are established, I often print them out and hang them on my wall to make sure they are always there and I’m constantly reminded of where I need to go. This is also helpful for me because I am usually working on multiple projects at any given time, so if I can glance at them quickly, I can drop into the proper headspace for that particular project very easily and realign my brain to live in that world as needed.

Here is an example of a possible Audio Pillar:

The Sound of a Worn-Down World that Evolves With You
Our small community is among the few that still remain on Earth. Our world is overrun with fantastical creatures and supernatural events that can change the surroundings that we inhabit, but, in the end, we are still fighting for our very existence with the weapons and tools that we scrounge and forge together.


On the surface, that’s just some descriptive text…clumsily constructed by yours truly; but if you read further into it, an aesthetic and a set of features begin to emerge. Here are a few that we can glean:

Multiplayer/Co-op – Use of “community” suggests that there is a multiplayer component and that it’s not necessarily competitive. This will require local & non-local functionality and everything that comes with that (dynamic sound bank streaming, latency compensation, voice chat, etc.).

Stylized aesthetic based in reality – Use of “worn-down”, “fantastical”, and “supernatural” for a game that takes place on Earth. So, sound design will lean towards gritty, textured, stylized, and saturated. While the sounds themselves will be hyped and larger than life, they will maintain a certain amount of realism.

Specialized content geared towards an action game – Character Foley/Interaction, Creatures, Weapons, Combat, etc…

An evolving ambient sound system – “…supernatural events that can change the surroundings that we inhabit” suggests that scripted events, enemy/player interactions, and other situations can change the world around you, which would require a system to track what world you’re in and what the world is becoming. For example, you’re in the middle of an abandoned steel mill and you come across a slobbering beast. When it screams, it rips a hole into a different dimension, which pulls in a firestorm from a raging volcano on the other side of the portal. That firestorm ignites the decaying building around you and causes a huge inferno.

Robust crafting system – tools and weapons that are found and constructed by the player(s) and NPCs.
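
To make the “evolving ambient sound system” item above a little more concrete, here is a minimal Python sketch of a tracker that knows which ambience you are in and which one the world is becoming. It is only an illustration under stated assumptions: the class name, the bed names ("steel_mill", "inferno"), and the equal-power crossfade are all hypothetical choices, not something taken from the article or from any particular engine or middleware.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AmbientWorldState:
    """Tracks the ambience the player is in and the one the world is becoming."""

    current: str                   # ambience bed that is fully established
    target: Optional[str] = None   # ambience bed being blended in, if any
    progress: float = 0.0          # 0.0 = all current, 1.0 = all target
    duration: float = 1.0          # seconds the active transition should take

    def begin_transition(self, target: str, duration: float) -> None:
        """Start a blend, e.g. from a scripted event or an enemy interaction."""
        if duration <= 0:
            raise ValueError("duration must be positive")
        if target == self.current and self.target is None:
            return  # already in that world, nothing to do
        # Assumption: interrupting a blend simply restarts from the
        # established bed; a real system would want something smoother.
        self.target = target
        self.duration = duration
        self.progress = 0.0

    def update(self, dt: float) -> None:
        """Advance the blend by dt seconds of game time."""
        if self.target is None:
            return
        self.progress = min(1.0, self.progress + dt / self.duration)
        if self.progress >= 1.0:
            # The world has finished becoming the target.
            self.current, self.target, self.progress = self.target, None, 0.0

    def gains(self) -> Dict[str, float]:
        """Linear gain per ambience bed, using an equal-power crossfade."""
        if self.target is None:
            return {self.current: 1.0}
        angle = self.progress * math.pi / 2
        return {self.current: math.cos(angle), self.target: math.sin(angle)}


# The beast screams in the steel mill and the firestorm pours in.
world = AmbientWorldState(current="steel_mill")
world.begin_transition("inferno", duration=4.0)
world.update(2.0)
print(world.gains())   # both beds are audible halfway through the blend
world.update(2.0)
print(world.current)   # the inferno is now the established bed
```

In a real project the gains would more likely feed a middleware parameter than be applied directly, but the two pieces of state (where you are, where the world is heading) are the part worth agreeing on with design and programming early.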

“Okay, that’s great, but that seems like a pretty esoteric exercise, Quarles. What purpose does this actually serve? Wouldn’t it be more straightforward to map everything out as discrete features and have a style-guide for the aesthetics?”

That’s a fair point, but keep in mind that this isn’t only for the audio department. This is for the rest of the team, and it maintains high-level filters for the project as a whole while you’re going towards the finish line.

If any of you have written a huge audio design document that maps out every single feature and a full style-guide, how many people have you actually gotten to read it? This is more to establish the overall “tone” of the game that the rest of the departments can get behind without having to be mired down in the details.

Feature breakdowns and stuff like that are handled differently, which I’ll address soon…but NOT YET. There is an important step that takes place after the Audio Pillars are written. That is the “Audio Target”.


After all of the disciplines on the team understand what the game actually is and the Audio Pillars are in place, establishing a solid Audio Target is an important step to lay the groundwork for the aesthetic direction in a tangible way. The Audio Target can be any number of things:

A “rip-o-matic” video – This would contain clips of previously released material from multiple sources (i.e., movies, television, other games, etc.) that guides the overall aesthetic direction for sound design, music, voice, and everything else. This can be something that is a few seconds long to a few minutes. It doesn’t really have to make sense in terms of a narrative, unless that is a large component of your game. It’s more like a very high-level reference piece to get the creative juices flowing and to illustrate the direction of the audio vision to the other disciplines using familiar material.

A post-scored video – A video crafted by the audio team using unique content that will help define the game’s aesthetic in a very tangible and actionable way. Again, this can use clips like the rip-o-matic but would not use previously released content as your reference point. You would strip out all audio from the clips and build it yourself. This is a good way to start building up your audio library at the very beginning of a project to start creating a ton of source material that you can use moving forward.

A “beautiful corner” – A slice of gameplay that represents key systems and content. This might be in the final engine that the game uses or it might be something that the audio team can do quickly in middleware to show off a feature in a very streamlined way. Whatever the choice might be, this option should be quick and dirty to basically act as a test-bed to see how complex and worthwhile something might become. I’ve been on projects before where a small strike team of disciplines would work on a vertical slice of a very specific feature in the game for a couple of weeks. The team would work together to make sure it was being treated seriously and would be brought up to representative quality so everyone understood the scope of what it would take to bring a feature from a test to a fully realized shippable component of the game.

A combination of any of these options – The Audio Target doesn’t have to be one specific thing. It can be smaller, digestible pieces of reference that tie to a single feature/content-type or it can be a high-level piece to give an overall aesthetic direction. You will more than likely find yourself doing multiple small audio targets over the course of the game’s pre-production cycle. It’s whatever best serves the game and whatever fits in your particular workflow.

Remember: showing is better than telling. I used to spend lots of time writing up how something should sound and getting frustrated when people wouldn’t “get it”. That was a hole in my communication style. When you give tangible examples, you’ll get buy-in.

After you have established what your Audio Target(s) will be for your title, you need to actually build it. To build it, you need features. To build features, you need cross-discipline support. To get cross-discipline support, you need to map out what you need in a succinct and straightforward way. How do you do this? Well…join me in the next section, won’t you?


When mapping out specific features and systems, it’s important to be as descriptive as possible. This is when you can get very technical with your designs. You want these Audio Feature documents to be as concise as humanly possible. Not only are they for the audio team to come back to as reference over the course of a project and for other disciplines to read and be able to understand very easily, they also need to be clear enough that if you add someone new to your audio team over the course of development, they can get up to speed quickly and without any roadblocks.

If you’re using third-party middleware (Audiokinetic’s Wwise or Firelight Technologies’ FMOD Studio, for example), this is a great place to have the audio team break down exactly how to build a system or feature set without having to roll code in very deeply. You can do the brunt of the work on your own and then you can involve the programming team for specific game hooks. Now, if you’re using proprietary technology or a codebase that doesn’t have an elaborate content creator’s authoring tool, then you might need to get a bit more detailed with the workflow, tool interface mockups, and reference examples of what you’re looking to achieve.

There are a few ways of documenting these Audio Features. I used to create one master document that contained ALL of the ideas and information that I have been discussing in this article—Audio Features, Targets, reference material, etc. But, what I’ve discovered over the years is that people tend to be a bit more responsive if you provide smaller singular chunks of data that they can consume quickly. I like to make things as easy for people to process as humanly possible and if someone starts digging into the audio documents directory, they can find what they are looking for just based off of document title. So, I have started breaking Audio Features and Systems out into individual document files. They might only be a couple of pages long each, so I can have a ton of files on any given project; but if they are named and organized in a sensible manner, I’ve found that people will actually read them.

A high-level template flow that I use when writing up a Feature or a System is as follows:

Name of the feature/system.

Which project this is designed for – This is a bit of a requirement for me and for my job since I ping-pong from project to project. That might not really be a necessity for you if you are primarily focused on a single game. Just a personal preference.

Vision Statement – Emotional design description goes here. The purpose of the emotional design is to give a high-level goal to the feature: for instance, engine roars that create lust, bullet whizzes that make you duck, etc. Any flowery language should go here and guide asset creators in the overall direction.

Technical Design – Break these up into design goal bullet points and write up technical design description sub bullets for each design goal. Go into painstaking detail and explain how you use the asset definitions. If using something like Wwise, you can roll any RTPC (Real-Time Parameter Control) needs into this section. You might only have one or two design goals for any given feature…but…there might be some pretty beefy systems that will require multiple goals. Using bullet points keeps things straightforward and easy to read. Plus, it provides a nice reference point if you need to go back over it in the future.

Event Design – Break down the necessary events that will be required for this feature. This includes Create, Play, Stop, and Destroy events.

References – This is a bit of an optional section, but if you are working on proprietary technology, this would be a good place to have mock-up screenshots, links to similar tools, etc…
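
As one possible way to keep these short per-feature documents consistent, the template above could be captured as a small data structure and rendered to plain text. The Python sketch below is an assumption-laden illustration, not the author's tooling: the class, the `Kind_FeatureName` event naming, the file naming scheme, and the `World_Blend` RTPC name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The four event types named in the Event Design section of the template.
EVENT_KINDS = ("Create", "Play", "Stop", "Destroy")


@dataclass
class AudioFeatureDoc:
    name: str
    project: str
    vision_statement: str
    # Design goal -> technical sub-bullets (RTPC needs can live here too).
    technical_design: Dict[str, List[str]]
    references: List[str] = field(default_factory=list)  # optional section

    def event_names(self) -> List[str]:
        """Hypothetical naming convention: <Kind>_<FeatureNameInCamelCase>."""
        slug = "".join(w[:1].upper() + w[1:] for w in self.name.split())
        return [f"{kind}_{slug}" for kind in EVENT_KINDS]

    def filename(self) -> str:
        """A title that is findable at a glance in a documents directory."""
        return f"{self.project}_AudioFeature_{self.name}.txt".replace(" ", "_")

    def render(self) -> str:
        lines = [
            f"Name: {self.name}",
            f"Project: {self.project}",
            "",
            "Vision Statement",
            self.vision_statement,
            "",
            "Technical Design",
        ]
        for goal, details in self.technical_design.items():
            lines.append(f"  • {goal}")
            lines.extend(f"      - {detail}" for detail in details)
        lines += ["", "Event Design"]
        lines.extend(f"  • {event}" for event in self.event_names())
        if self.references:
            lines += ["", "References"]
            lines.extend(f"  • {ref}" for ref in self.references)
        return "\n".join(lines)


doc = AudioFeatureDoc(
    name="Evolving Ambience",
    project="ExampleGame",
    vision_statement="The world should sound like it is changing under your feet.",
    technical_design={
        "Blend between ambience beds": [
            "Drive the blend with a single RTPC (hypothetical name: World_Blend).",
        ],
    },
)
print(doc.filename())
print(doc.render())
```

Keeping the optional References section out of the output when it is empty helps each file stay at the couple-of-pages length that people will actually read.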

Now, how you organize this stuff and what you choose to include in your Feature documentation is obviously completely up to you. This is my process and it allows me to write up descriptive Audio Features quickly, but in a robust and focused manner. It also allows for easy maintenance and re-direction if the game’s focus changes over the course of pre-pro (or even in the middle of production). The last thing you want to have to do is constantly update documentation if you are neck-deep in shipping a game.


So, I recognize the contradiction of me constantly harping on being streamlined and succinct in your writing and then I go ahead and compose a gargantuan article that drones on and on.

What can I say? I’m a complicated person.

Seriously though. Don’t consider my process to be the “be all, end all” of how to approach a project by any stretch. This is my personal method that I’ve cobbled together after years of being a developer and then making some adjustments once I became a publishing audio director. I’m always open to new ideas, learning new methodology, and trying new practices and techniques.

While my overall goal and philosophy of audio direction haven’t changed much over the last few years, some of the details in distributing information and communicating with other departments have. I try not to be quite as myopic and specialized in my documentation anymore, but rather focus on the human aspect and how audio has to be an anchor for the player experience. I try to show more than tell as often as I can. Talk to me in a couple of years and I’ll probably have another ridiculously long-winded article ready to go about how my process has changed.


A big thanks to Zachary Quarles for sharing his insights on how to create an audio design document!


About Zachary Quarles
Zachary Quarles has worked at companies such as id Software, Raven Software, and Day 1 Studios, and is currently audio director for Microsoft in the Xbox division. He has worked on franchises such as Killer Instinct, DOOM, Quake, RAGE, Wolfenstein, Soldier of Fortune, X-Men: Legends, and Marvel Ultimate Alliance. He occasionally writes the odd blog entry at his website. He also runs the independent game company Winter Night Games with his brother, Josh.
