virtual reality game audio Asbjoern Andersen


Noted audio director Garry Taylor, from Sony Interactive Entertainment Europe, recently gave a talk about the possibilities and challenges for audio in virtual reality – from a non-technical perspective. I’m really happy to be able to share it here on the A Sound Effect blog, and without further ado, here’s Garry Taylor:
 
This is a talk given at the VRX Europe Conference in London in May 2016 to a non-technical audience. It was not aimed at audio designers or engineers, and so I didn’t go in-depth into any of the technical issues. My goal was to get across to a lay audience some of the many audio issues VR development teams may face when developing content.

– Garry Taylor. Audio Director, WWS Creative Services Group, Sony Interactive Entertainment Europe.


 

Someone asked me recently what the difference was between audio for TV-based games and audio for VR. After giving it a bit of thought, I came to the conclusion that getting it wrong on a TV is mildly annoying, but get it wrong in VR and the player will want to kill you. By that I mean badly implemented audio in VR can be so off-putting that it seriously hinders people’s acceptance of their virtual reality, to the point that it may put some people off completely, and this is a big problem.

Michael Abrash at Oculus said that 3D sound in VR is ‘not an addition, it’s a multiplier’. Everybody talks about VR in terms of ‘presence’ and ‘immersion’. The truth is that without a certain level of competence in audio design, there is no presence.  What’s more, because it is a multiplier, there is an extremely fine line between what we would call presence, the illusion that you’re actually there, and annoyance.

Our teams have been experimenting to find out what works, and what doesn’t, in terms of audio.  A lot of the work we’ve done revolves around the player’s acceptance of their virtual environment and the sounds that emanate from it.  We’ve made lots of mistakes, but by making them, we’ve learned where the boundaries are, and how far we can push things before they break.

 

Engineering Immersion

One of the most fundamental problems any developer with no experience of audio on VR will have is externalising sounds.  Let me explain what I mean.

If you listen to any film or TV show that has a narrator, you’ll notice that the sound of the voice of the narrator, due to the way in which it was recorded, sounds very different from the sound of the people you’re actually seeing on screen.  The narrator is in effect, the voice inside your head, and it’s this that we need to avoid if we want things to sound like they’re in the virtual environment.

Next time you watch a film or TV show, listen for it, and notice the difference between perspectives. The narrator’s voice sounds a lot fuller and richer. Generally, it’s recorded closer than ‘on set’ dialogue. With a directional microphone, the closer you are to the microphone, the bassier the recording will be, due to something called ‘the proximity effect’.

Although this is a tad simplistic, if there’s a lot of bass in a voice, your brain will tell you it’s close.  If we want dialogue or indeed any sound to sit in a 3D space, and to sound like it’s part of that space, we need to ensure that the perspectives are convincing.

If something is close, it needs to sound close, and if something is far away it needs to sound far away.  How we give players information about distance is absolutely critical for them to be able to localise something accurately in VR.

If you were to shut your eyes, you’d usually be able to tell what sort of room you were in just by listening to how sounds tail off within it.  When working in VR, we have to recreate that acoustic behaviour in our virtual reality in order for it to be convincing. There’s a delicate balance between the volume of the sound waves that travel directly to your ears, and the ones that bounce around the room, and we use the perceived loudness of the sound, the length of the reverberation, as well as the ratio between the direct and the indirect sound to judge the size and type of space we’re in and the distance between us and whatever it was that made the sound.
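To put rough numbers on that balance, here is a minimal sketch in Python. It is an illustration only, not the method our teams use. It assumes a point source whose direct level falls off with the inverse-distance law, and a diffuse reverberant level that stays roughly constant across the room. The function names and the `reverb_level` value are hypothetical.

```python
import math

def direct_gain(distance_m, reference_m=1.0):
    """Linear gain of the direct sound under the inverse-distance law."""
    # Clamp so sources closer than the reference distance don't blow up.
    return reference_m / max(distance_m, reference_m)

def direct_to_reverb_db(distance_m, reverb_level=0.1):
    """Direct-to-reverberant ratio in dB, assuming a constant reverb level."""
    return 20.0 * math.log10(direct_gain(distance_m) / reverb_level)

for d in (1.0, 2.0, 10.0):
    print(f"{d:5.1f} m  direct-to-reverb = {direct_to_reverb_db(d):6.1f} dB")
```

Under these assumptions the ratio drops by about 6 dB each time the distance doubles, which is one of the cues a listener uses to judge how far away a source is.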

Not only that, but we also have to accurately model how our own heads affect the sounds we hear. We use something called Head-Related Transfer Functions, or HRTFs, which give us the ability to make sounds appear as though they’re behind us, or in fact in any direction, including above or below us. These HRTFs, when coupled with head-tracking, are very, very convincing.
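A full HRTF is a measured pair of filters, one per ear, and is beyond a short example. One of the cues it captures, though, is the tiny difference in arrival time between the two ears, and that can be sketched. The snippet below uses Woodworth's spherical-head approximation for a distant source in the horizontal plane, valid for angles between 0 and 90 degrees. The head radius is an assumed average and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, air at about 20 degrees C
HEAD_RADIUS = 0.0875     # m, an assumed average; real heads vary

def itd_seconds(azimuth_deg):
    """Interaural time difference for a distant source (Woodworth's formula).

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

for angle in (0, 30, 60, 90):
    print(f"{angle:3d} degrees: {itd_seconds(angle) * 1e6:6.1f} microseconds")
```

Even for a source directly to one side the difference is well under a millisecond, which gives some idea of how precise the timing has to be for a sound to sit convincingly in space.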

This is different to how we’ve done audio for games in the past. Now, these might seem like very minor things, but when we get it wrong, it’s little things like this that jar with people. They might not know why something isn’t quite right, but it will flag something up in the back of their mind that says ‘this isn’t convincing’. Audio for VR can be difficult like that. To do it right requires a decent toolset and knowledgeable, experienced sound engineers who know about psychoacoustics: how the brain interprets sound.

 

Information

Audio’s function within any game, film or any other medium is either to give the player or viewer information or to influence their emotional state.

3D audio is a very powerful way of communicating information, be it information conveyed through dialogue or information about the player’s environment conveyed through directional sounds, but there are limits to the amount of auditory information that the brain can process.

You can hear me speaking from this stage.  There’s nothing else really going on, vying for your attention.  Imagine someone else was doing a talk in that corner of the room on something else with their microphone at the same volume.  You would probably be able to pick up limited information on both talks, just about.  Now imagine that a 3rd person was giving another talk on something else in that corner at the same volume.  That’s where you hit a wall.  You wouldn’t be able to distinguish between the 3 separate voices, they would come together to form a single incoherent mess.  It would annoy you.

Walter Murch, the film editor, called this the ‘Law of Two and a Half’. One or two sets of footsteps, for example, can easily be isolated by the brain, but three… instead of being individual elements, three becomes a group of things happening, and out goes your ability to distinguish individual elements.

We’ve found that these limits of the brain to process multiple audio cues together must affect how we design our titles, and the events or situations that happen within those titles.  Too much going on will disorientate the player, or stop them making sense of the information presented to them.  This could be dialogue, or it could be the positions of enemies trying to shoot you, or important audio cues the player needs in order to progress through a game.  Any more than 2 positional cues at a time, and the player may lose the ability to accurately place them in a space.
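One way a team might respect that limit is to rank the cues that are active at any moment and keep only the most important ones fully positional. The sketch below is a hypothetical illustration of that idea, not a description of any shipped system. The `Cue` type, the priority values and the cue names are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Cue:
    name: str
    priority: int  # higher means more important to the player

def select_positional_cues(cues, max_positional=2):
    """Split cues into those kept fully positional and those to de-emphasise."""
    ranked = sorted(cues, key=lambda c: c.priority, reverse=True)
    return ranked[:max_positional], ranked[max_positional:]

active = [Cue("enemy_gunfire", 9), Cue("ambient_traffic", 2),
          Cue("ally_dialogue", 7), Cue("door_creak", 4)]
spatial, background = select_positional_cues(active)
print([c.name for c in spatial])      # the two highest-priority cues
print([c.name for c in background])   # candidates for ducking or delaying
```

What happens to the lower-priority cues (ducking, delaying, or folding them into a general ambience) is a design decision this sketch leaves open.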

Because of this, we need to make sure that audio considerations are taken into account at the early design stage of any project, and care must be taken to respect the limits of the brain’s ability to process auditory information.  

Having said that, if you want to briefly disorientate the player on purpose, it can be used to great effect, but like anything, you need light and shade.  

 

Bending Reality

One of the more interesting things we’ve found is that whilst audio can help to make or break presence or immersion, it also allows us to bend reality without breaking it, and in some instances mask problems in other areas.

In one of our titles called London Heist, the player is a passenger in a van, driving down a road.  Now, there’s nothing stopping the player opening the door and leaning out or putting their head out of the window, and when they do, they hear the wind rushing past them, as you would expect.  However, what happens if the player decides to stand up?

Well, we could create a barrier that would stop the player’s camera from going through the roof of the van, but messing with people’s perception of movement is a dangerous thing, and can cause motion sickness which could put a lot of people off.  However, if we allow the player to put their head through the roof, to break reality, we must also make their experience of doing so consistent with what they would expect if they could actually do it.  So when they do, they also hear the wind rushing past them.  This is surprisingly acceptable.  In some cases, good audio design allows us to ‘paper over the cracks’, and in certain circumstances, as long as the audio is consistent, liberties can be taken in the virtual world.
 


Listen to the player

Most developers have been thinking about how their sound and music functions within VR.  We’ve also been thinking about what sounds we can take from the player, and what we could do with them.

PlayStation VR has a microphone on the bottom of the visor. This allows us to capture sounds or speech from the player and either incorporate them into the world, for example as voice chat, or manipulate that data and use it to control certain parameters within the game.

For example, in the London Heist, there’s a drink on the dashboard. What if the player decides they want to pick it up and drink it? Obviously, they can’t really drink it. That would be silly. But they will try. And if they do try, having the world react in a believable way will increase the sense of immersion. One of our technical designers, Simon Gumbleton, came up with a technique of measuring the power of the microphone input and then using it to trigger a drinking sound.
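The general idea of measuring microphone power and using it as a trigger can be sketched in a few lines. This is not Simon's implementation. It assumes the microphone arrives as blocks of samples between -1 and 1, uses the RMS level as the measure of power, and adds two thresholds so the sound fires once per burst rather than on every block. The class name and the threshold values are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples in the range -1..1."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

class MicTrigger:
    """Fires once when the level reaches on_level, re-arms below off_level."""

    def __init__(self, on_level=0.2, off_level=0.1):
        self.on_level = on_level
        self.off_level = off_level
        self.active = False

    def process(self, samples):
        level = rms(samples)
        if not self.active and level >= self.on_level:
            self.active = True
            return True   # the caller would start the drinking sound here
        if self.active and level < self.off_level:
            self.active = False
        return False

trigger = MicTrigger()
quiet = [0.01] * 256
loud = [0.5, -0.5] * 128
events = [trigger.process(block) for block in (quiet, loud, loud, quiet, loud)]
print(events)
```

The gap between the two thresholds stops a level hovering around a single value from retriggering the sound over and over.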

Again, audio can paper over the cracks.  Audio, this time from the player, will allow another level of interaction between the player and the virtual world.

So the question our teams should ask is: instead of the player just listening to the world the developers have created, how can the player affect the world through the sounds that they make?
 

Linear VR Video

Before I go, I want to speak briefly about audio for linear VR video, as opposed to interactive content. At the moment, most teams I know that are creating VR video are doing it in Unity or some other game engine. This is by necessity: none of the off-the-shelf audio packages currently supports VR. There are, though, quite a few plugin manufacturers working on tools to allow teams to design audio for VR, and as time progresses those tools will improve.

Ambisonics is a 40-year-old sound format for encoding 3D audio that up until recently was considered a bit of a relic, but it translates perfectly to VR, so expect it to make a resurgence in the coming years. This will be helped by Google’s adoption of it for VR and 360 video on YouTube, which was rolled out a couple of weeks ago. The new MPEG-H format also supports 3D audio, but it’s very new and no applications support it at the moment. The same goes for AC-4.
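For anyone curious what Ambisonic encoding looks like, here is a minimal sketch of the first-order case using the traditional B-format equations, where a mono signal is spread across four channels (W, X, Y, Z) according to its direction. The function name is hypothetical, and other conventions, such as AmbiX, order and scale the channels differently.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into traditional B-format (W, X, Y, Z).

    Azimuth is measured anticlockwise from straight ahead, elevation upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                 # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)    # front-back
    y = sample * math.sin(az) * math.cos(el)    # left-right
    z = sample * math.sin(el)                   # up-down
    return w, x, y, z

# A source directly to the listener's left, at ear height.
print(encode_first_order(1.0, 90.0, 0.0))
```

Because the direction lives in the channel mix rather than in speaker feeds, the whole sound field can be rotated to follow the listener's head before it is decoded, which is why the format suits VR so well.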

 

So, to sum up, the switch from developing for screen-based entertainment to VR is not straightforward.  It’s literally a whole new world, and we’re still finding out the rules.

However, because of the fragility of the player’s acceptance of the virtual world within VR, audio should be an integral part of the design process, to be considered from the very start of a project, from both a creative and technical point of view.

If immersion and presence are your goals, your sound team will be the ones who have to deliver them.

 


 

A big thanks to Garry Taylor for his insights on audio in VR!

 
 

One thought on “How to unlock the creative power of audio in VR”:

  1. Thanks for this post!

    Michael Abrash’s quote about 3D sound (‘not an addition, it’s a multiplier’) reminds me of a quote from Akira Kurosawa, the famed film director, sharing his thoughts about the relationship between sound and image: “sound is that which does not simply add to but multiplies the effect of the image.”
