Here, sound artist Todd Baker shares how he managed to deliver the game's equally immersive soundtrack, what the scope of his work was as a one-man band working on both music and sound effects — and reveals some secrets about his success as a game audio freelancer.
Interview by Anne-Sophie Mongeau
Monument Valley 2 Official Trailer
Hi Todd, could you introduce yourself and describe briefly your contribution to the Ustwo games?
Hello! So I’ve been working in music and audio design for over 10 years, mainly on games and interactive projects, but also in other media, and also more recently as a music artist / band member. I worked in-house for a couple of big games studios, and switched to freelance about 5 years ago, forming close relationships with some amazing teams such as Ustwo Games, Media Molecule, Tarsier and The Chinese Room.
I started working with Ustwo on Lands End VR, shortly after the release of the original Monument Valley. This led to the work on Monument Valley 2, for which I created all of the music and audio design.
I’ve been trying to cultivate an ‘artist’ oriented approach to my work in recent years, thinking less in terms of a jack of all trades composer / sound designer for hire, and instead seeking projects that will suit my stylistic strengths, interests and skillset.
On Monument Valley 2, you were the only audio person working on the project, taking on both sound and music, as well as their implementation into the game. Can you describe in more detail what that actually involves?
Essentially it means taking on responsibility for creating and shaping every aspect of the game’s audio. So when and where music and sounds occur, stylistic / artistic decisions, branding and trailer music, systems and solutions for implementation, memory and optimisation, recording, mixing and mastering, occasional overwhelmed panicking…
I’ve been using the word “holistic” a lot recently to describe how I approach music and audio design. To me this means thinking about the complete interactive audio experience, rather than treating music and sound design as separate disciplines. Obviously not every game would suit this approach, but the aesthetic and scope of a project like MV2 lends itself very well to this way of working.
Taking on everything had its downsides, and towards the end of development it became an intense workload for one person. It was definitely on the limit in terms of what I could have handled. On reflection, having an extra pair of hands in the later months may have been wise – in terms of both workload and creative perspective – but it’s also amazing to see a finished game that I have such a personal investment in.
I love working within an audio team, but MV2 felt like an opportunity for me to ‘do my thing’ on a fantastic project that would reach a lot of people. Having been in the industry for a while now, I know you either have to be very lucky or work really hard to get this kind of opportunity!
Can you share your inspirations for MV2’s music and sound aesthetic, and what are your main composition tools?
As the general / high-level concept, I wanted to create a sound aesthetic that felt gentle and spacious, creating a sonic space where players would be soothed and encouraged to appreciate the beauty of the game.
Sonically it sits in between organic and electronic, and has a nostalgic, lo-fi textural edge to it. There’s a very broad palette of sound sources in the game: Lots of acoustic instruments, tuned percussion, nylon string guitar, piano, gamelan and orchestral textures (I actually learned to “play” the Low Whistle for the mother character’s flute instrument), then there’s plenty of electronic / synth / processed audio – but always with an emphasis on more organic and analog-sounding textures. As with Lands End, I felt it was important to steer clear of new-age cliche – particularly with the more electronic ambient sounds.
Bibio’s production style is a big reference point, as is Steve Reich’s minimalism and Brian Eno’s early ambient music. I’d also definitely point to Martin Stig Andersen’s work on Limbo and Inside, particularly in terms of the ‘holistic’ approach I mentioned earlier. I actually wrote a blog piece for Ustwo on some of the specific musical influences, which you can read here!
Reaper is pretty much my main sound design DAW and I’m using it more and more for music – but I still go back to Logic for a lot of music work. One or more of the following plug-ins would have touched every sound in the game:
u-he Satin, XLN RC-20, Valhalla Plate, FabFilter Pro-Q 2
Did you do anything sonically to try and represent the underlying architectural illusions which form the basis of the game? Or the relationship between mother and daughter that is illustrated? In other words, how did you translate some of that narrative into sound?
Storytelling is the most exciting aspect of a project to me (nowadays so much more than fiddling around with audio toys / instruments etc!) Being immersed in places, characters and narratives is what makes an experience meaningful and leaves a lasting impression.
I’m glad MV2 focused on a more defined story structure than the original game, as it gave me the opportunity to shape the audio experience around this journey of the mother and child. It’s a simple story, and is presented in a fairly impressionistic and abstract way, but this is really a sweet spot for sound and music to play their part – with a lot of freedom.
There are some recurring melodic motifs and sonic themes that run throughout the game (tied to particular places and key moments) but I don’t feel this is anything particularly sophisticated or meticulously thought out. The working style of the Ustwo team made this practically impossible (a bit more on this in the next question!).
The main focus for me was to provide an emotional guide for the player, supporting the narrative journey and balancing the contrast from level to level. I wanted to be broad-stroked and impressionistic in the spirit of the art and design, but I think the music and audio also help to add a level of unity to the aesthetic. In the final stages of development, members of the team noted how the audio was acting as a kind of glue, or backbone to an experience where art and design was characterised by the contrasting styles of the different artists working on the game.
While working as an outsourcer with Ustwo Games, how did you manage the relationship with the studio?
I think I have quite an unusual audio set-up, at least in the sense that I’m very portable and have a variety of environments that I work in. I divide my time between working from my studio (a professional recording / monitoring environment), the Ustwo studio, and also a home setup.
In some ways Ustwo Games are incredibly easy to work with. The level of autonomy, trust, respect and support I had from the team was amazing. But there’s also a very spontaneous, iterative (occasionally chaotic and unstructured!) side to their process that can make any kind of planning difficult. For example, over the course of the 18-month development, the characters and story outline were not really locked down until the last 10 weeks, and the final level of the game didn’t even exist about 3 weeks before we went live on the App Store.
On one hand the freedom and flexibility is great, but the lack of structure and reluctance to lock down design at an earlier stage makes it difficult to realise detailed and nuanced ideas with the audio. The key was to really embrace the spontaneity. Many of the audio moments that fans have noted as favourites were ideas that came together very quickly, when I had no time to over-intellectualise and was simply bouncing off inspiration from the art and design work.
Ustwo are always keen to show games in a different light by challenging perceptions of the medium, and reaching out to a wider audience. I certainly share this passion. One of the post-launch events recently was a live session at London’s Victoria & Albert museum, where we performed arrangements of some of the MV2 soundtrack live with my band.
Lydian Collective performing the music of Monument Valley 2 Live
From a technical point of view, what audio engine did you use and did you have an audio programmer in the studio helping you? What were your most important technical challenges?
We were using Wwise with Unity. For a project like this, I think it’s reasonable to say that Wwise is absolutely essential, and I can’t imagine how it would have been possible without it. It empowers me with the tools and insights I need to do my job.
The game’s technical director Manesh Mistry has a background in music technology, and his role in championing adequate support for audio (as well as licensing Wwise) was a huge benefit to the project. Ustwo is a young studio and we’ve worked hard to cultivate a healthy attitude towards audio – it’s great to see this being embraced.
On the Unity side, the programmers put a lot of work into making some excellent bespoke tools for building levels – including a very elegant visual scripting tool for sequencing events and gameplay. We implemented audio functionality directly into these tools, which made it easy for me to plug in my Wwise event hooks and parameters.
On an aesthetically focussed project like MV2, when it comes to implementation I’m keen to keep things simple where possible – particularly with Wwise at my disposal. I’m happy to sacrifice some flexibility / power in a system if it allows me to get audio in easily and quickly, allowing more time to focus on the creative side and making content.
What is your asset creation approach? Do you record a lot of material which you then implement wherever fits best, or do you plan ahead and compose specifically for the various sections of the game? Roughly how many iterations do you make before being satisfied with the result?
I really don’t have a fixed approach or methodology, and try to constantly adapt to the needs of the project. As mentioned previously, embracing spontaneity was a necessity with MV2, and often iteration wasn’t even an option.
A very cool thing we did early in development was to make an entire album of concept music for the game in just a couple of weeks (we even released this digitally under a pseudonym, as a bit of an experiment!). I went into the studio and put down a bunch of ideas, so that we had something to feed into the early concepts that were happening in the design and art. A few of those tracks were adapted to fit into the final game and it proved to be a really useful resource and process.
Can you share some tips on successfully achieving dynamic music and non-linearity? Since MV2 is a puzzle-based game, players can spend any amount of time within the same level, from a few seconds to long minutes. What are your techniques to avoid the music becoming repetitive?
Here’s a breakdown of the audio in one of the ‘chunks’ (sublevels) of MV2:
• The interactive rotating platforms have playful musical layers attached to their movement – as do the two buttons that move the geometry around.
• After a short while the mother character starts to play her flute (which triggers random phrases in a sympathetic musical key).
• As each phase of the puzzle completes, you get a musical sting.
• The global navigation ‘ping’ that confirms the character movement also has its randomised hang drum note (that is tuned sympathetically to each level in the game).
An example like the above elaborates on the holistic approach I mentioned earlier. There’s a mixture of diegetic, non-diegetic, ambient and interactive elements all working together to form the audio experience.
I felt it was important to not overuse musical interactions, and place them at points where we wanted to encourage the player to take a moment to be playful. There are also a lot of interactive objects that have more subtle musical layers, and I would expect many players not to notice the tonality – but hopefully it helps add to the overall feeling of a tactile world.
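As a rough illustration of the randomised, level-tuned navigation note described above, here is a minimal Python sketch. It is not how the game implements it (MV2 used Wwise with Unity); the level names, the note sets and the `pick_ping_note` helper are all hypothetical, and avoiding an immediate repeat is an added assumption rather than something stated in the interview.

```python
import random

# Hypothetical per-level note sets as MIDI note numbers.
# The game's real tunings are not given in the interview.
LEVEL_SCALES = {
    "level_a": [62, 64, 66, 69, 71],  # D major pentatonic: D E F# A B
    "level_b": [57, 60, 62, 64, 67],  # A minor pentatonic: A C D E G
}


def pick_ping_note(level, last_note=None, rng=random):
    """Return a random note from the level's scale.

    The previous note is excluded (an assumption made for this sketch)
    so the same pitch never sounds twice in a row.
    """
    choices = [note for note in LEVEL_SCALES[level] if note != last_note]
    return rng.choice(choices)


if __name__ == "__main__":
    last = None
    for _ in range(5):
        last = pick_ping_note("level_a", last)
        print(last)
```

Because every candidate note belongs to the level's own scale, each ping stays in key with that level's music no matter how often the player taps.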
As an audio designer and composer for mobile games, what was your strategy for mixing? How would you compare working on a mobile game vs. other platforms?
I always mix as I go along, and do a lot of A/B comparison between levels, occasionally profiling the levels in Wwise to monitor the overall headroom and make sure things are fairly consistent.
I’ve worked on many console projects and really there’s no reason why we should treat mobile differently. Obviously it’s mixing in stereo, and the best case scenario is that the player will listen on headphones, or maybe connected to a Bluetooth system. The usual considerations apply: dynamic range and overall master level are really a creative decision to be made according to the needs of a project. I did a bit of testing on phone and tablet speakers towards the end, which are getting surprisingly good.
I’m cautious of treating ‘mobile games’ as a category: Clearly considerations for something like Monument Valley are different to a game like Candy Crush or CSR Racing. Not to mention that our mobile devices are clearly destined to become consoles when connected to a display and controller.
A parting thought: We actually managed to get a Unity analytics hook into the game that detected (upon completing each level) whether or not the player had headphones plugged in.
This felt a bit like opening Pandora’s box… Would the results reveal the worst possible statistic for negotiating the next audio budget? Or would we delight at all the audio-loving listeners using their trendy Skullcandys?
Rather than revealing the results here, I’d be fascinated to ask people to take a guess in the comments below (% of people that played the game with headphones) and I’ll reveal the answer after we’ve seen a few!
Wow, no guesses yet? I’d have absolutely no idea, but let’s say… um… 20% of the time people had headphones? I’m not entirely sure at all, but I’m curious to hear the answer so I thought I might as well take a guess :) For reference, I played both without headphones (just the device speaker) and with bluetooth headphones.
But (since I’m here commenting) firstly, how can I go without saying — AMAZING soundtrack. Absolutely amazing. I Love it. Before the album was released, sometimes I would just leave the game open so I could listen to the music (not surprised to have heard others doing this too!)
My favourites are “Gamelan Rain Melody” (I so love this when there’s the purple scene and the giant rose moon 😍 actually, I’d say this was my absolute favourite), “Child” (this was very meaningful to me), “Interwoven Stories”, and not to mention all the other moods the audio creates (makes me shudder in awe!) like Ro’s flute, and when Ro and her child become separated in that second chapter 😢. There was also some pretty trippy stuff with Ro’s chapters later on in the game! (when the music scratches as you pull things up and down.)
Oh, and I *so* loved the sound effects too — the turning of handles, the new sound for the button in the story text scenes — can’t believe you weren’t just doing the soundtrack but all the audio too! (P.S. I don’t know anything about making music, I’m just commenting as a fan of the game.) It’s so lovely to hear your insights in this interview and on ustwogames’ blog 😄
Anyway, whatever the results/statistics on % of people wearing headphones, *please*, I hope you don’t compromise on the audio, ever! This music was just wonderful.
From, a Monument Valley fan.
I’m gonna say somewhere between 20-30%. Having seen people I know actively look for their headphones to play this game, I’d like to think it’s higher than most.
I played this at home, so I listened with headphones (or into my speaker system) 100% of the time. I wouldn’t do that for every game, but the music and sound in MV2 was really interesting.
This is such a wonderful game. Just played it the whole way through – using headphones. There is a great level where as you slide some of the interactive blocks up and down, the music speeds up, slows down or even reverses depending on how quickly you slide the blocks! Really nice touch. Would love the opportunity to work on something like Monument Valley – so beautiful. I’d also say maybe 60% of users play on headphones – why wouldn’t you?
I played through both games in the arguably best conditions possible: with noise-cancelling headphones in the subway. Everyday commute felt like another dimension for a few days.
My guess is the percentage would be somewhere around 20%. You see a lot of commuters playing mobile games with headphones where I live, but it just happens to (probably) be one of the regions of the planet where Monument Valley is least known.
Couldn’t recommend more playing with headphones though. :)