Interview by Jennifer Walden, photos courtesy of Jumpship
Independent game studio Jumpship has gotten off to a great start with their debut release, Somerville – a sci-fi adventure puzzle game that puts the player in the role of a father who must save his family from an alien invasion.
Sound-wise, the game relies on ambiences and sound design to help tell the story since there’s no dialogue or narration. Sound is also a key component in conveying emotion, since music is used sparingly – more as a respite from the unease than as a way of leading the player to feel uneasy or scared.
Here, Audio Director Matteo Cerquone talks about creating a foreboding atmosphere using ambiences, how he created sound for the aliens (like the alien “sediment” that often impedes the player’s path, the stalkers, and the drones), and how he used sound cues to help players solve the game’s puzzles. He also breaks down his approach to making a mix that sounds more like a film and less like a game – a mix that’s guided by storytelling instead of what you’d expect to hear in reality.
Somerville E3 2021 Trailer
If you had to describe the sound of Somerville using just three adjectives, what would they be?

‘Somerville’ Audio Director Matteo Cerquone
Matteo Cerquone (MC): Probably the first adjective I would use is “cinematic.” A lot of moments and areas in the game were designed and mixed linearly on a DAW, as if Somerville was a movie rather than a game.
We would then break it down into all the different assets and create different systems and game rules (along with our audio coder) to make sure that all of the mix choices we put into our linear design would get translated properly into the game. This linear approach to creating and implementing audio allowed us to focus on the mix and storytelling without being confined to the hard, fixed rules that are usually dictated by the game engine.
The second adjective would be “foreboding.” I remember reading an article in The Guardian where they described the sound of Somerville as “a masterclass in foreboding sound design,” and I thought that was such an on-point description.
I think part of that is because there is a lack of music throughout most of the game. The intention was to leave the viewer with an experience that can feel personal. There is no music to anticipate what is about to happen, and when the game throws music at you, it’s usually more of a reminder to slow down, sit back, and digest what you’ve just experienced.
Lastly, another adjective would be “atmospheric.” Since the game is about experiencing the intimate repercussions of a large-scale conflict, much thought went into the work of building ambiences and the overall sonic atmosphere so that it could feel both familiar and alien at the same time; ambiences frequently became the sonic protagonist.
How did you use sound to provide audio clues to help players figure out the puzzles or help explain what’s happening?
MC: Somerville is packed with little hints, sound cues, and subtle mix changes designed to help guide the player through the different puzzles. Probably the most obvious ones are the puzzles that require you to use light sources. There are sources of light in the game that you can interact with in order to melt or solidify an alien substance called “sediment” that often blocks your path. The sound of these lights intensifies and flickers more frequently as you get closer to them (and this is also accompanied by visual cues). After you interact with these light sources and touch them, the sound changes and intensifies even more. The flickering becomes more pronounced until you finally use your power to melt away the sediment.
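As a rough illustration of the technique he describes – the game runs on Unity with Wwise, as confirmed later in the interview – here’s a minimal sketch of a proximity-driven sound. The component, RTPC name, and distance range are invented for illustration, not Jumpship’s actual implementation:

```csharp
using UnityEngine;

// Hypothetical sketch: drive a Wwise RTPC from the player's distance to a
// light source so its loop grows louder (and, on the Wwise side, flickers
// faster) as the player approaches.
public class LightProximityAudio : MonoBehaviour
{
    [SerializeField] private Transform player;        // the player character
    [SerializeField] private float maxDistance = 20f; // beyond this, intensity is 0

    private void Update()
    {
        float distance = Vector3.Distance(player.position, transform.position);

        // 0 = far away, 1 = right next to the light.
        float intensity = 1f - Mathf.Clamp01(distance / maxDistance);

        // "Light_Intensity" is an invented RTPC name; in the Wwise project it
        // would be mapped to the loop's volume and flicker rate.
        AkSoundEngine.SetRTPCValue("Light_Intensity", intensity * 100f, gameObject);
    }
}
```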
Another situation where we used sound to guide the player’s actions is during our multiple encounters with the purple light beams coming from the alien ships in the sky. We placed different audio emitters in the level that would react to these beams. For example, in the early section of the game, as you traverse the forest and encounter one of these purple lights, you can hear the trees and foliage around you shaking with increasing intensity as the beam gets closer to the audio emitters.
In the festival section, the same thing happens but this time we have camping tents, flags, and shutters reacting to the purple light beams. This helps the player pinpoint what areas are being affected by the incoming danger. It gives a sense of urgency to finding shelter as the beam gets closer to our character and the sound of the environment around you intensifies.
Speaking of shelters, we also have small mix changes when we become shielded from this danger.
We have another subtle hint in the festival area as the player seeks shelter from a purple beam that is sweeping across the map, searching for and abducting survivors. There are parasols that are closed at first, but as the beam sweeps across the area and sucks in anything in its range, it causes the parasols to open up, providing the player with a temporary shelter. The sound of the parasol opening causes the rest of the audio to duck down a few decibels. Again, it’s a subtle change but it helps reveal the solution to the player since the sound of the parasol opening becomes the most important one during this sequence.
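In Wwise terms, a duck like this would typically be authored on the mixer side; the game code only needs to flag the moment. A minimal sketch, assuming invented event and state names (not the game’s actual setup):

```csharp
using UnityEngine;

// Hypothetical sketch: when a parasol opens, post its sound and switch a
// Wwise State so a mixer snapshot ducks everything else by a few decibels.
// The ducking curve itself would live in the Wwise authoring tool.
public class ParasolAudio : MonoBehaviour
{
    public void OnParasolOpened()
    {
        AkSoundEngine.PostEvent("Play_Parasol_Open", gameObject);
        AkSoundEngine.SetState("MixFocus", "ParasolShelter"); // duck the rest of the mix
    }

    public void OnBeamPassed()
    {
        AkSoundEngine.SetState("MixFocus", "Default"); // restore the normal mix
    }
}
```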
What went into the sci-fi/alien sounds? How did you create those? Also, what were some of your favorite sounds for the alien aspects of the game? And what went into creating them?
MC: The sound for the aliens went through a few iterations before we arrived at the final version. Chris, our director, described the whole alien race as beings made out of light, almost as if they’re made of a condensed nuclear core that is ever burning. These beings will then use the sediment (the dark alien substance) both as a shell and as a means for interacting with our world, gaining the ability to walk, run, roll, and change shape, adapting to the environment they are in.
At first, I tried to convey the “light” aspect of the alien. I played with synths and human voices going through heavy processing to make them feel almost ethereal, but it often sounded too abstract; we didn’t feel like it was grounded in the visuals.
After a few unsuccessful attempts, I decided to take a step back and focus more on the “shell” – the sediment/alien substance that can be found scattered all around the map. It can be melted, solidified, and everything in between when using the right power. It’s also used as part of the technology and weaponry that the resistance utilizes to fight the invaders. So, I recorded different materials that could sell the idea of a substance liquefying and hardening over time. These sound sources included cabbage, chard, fennel, flour, oranges, peppers, rice crackers, rotten wood, rubber, and cellophane. These recordings were loaded into a sampler and pitched, stretched, and processed in a multitude of ways. The results yielded a few little hidden gems that would pop up for a few seconds during playback of the processed recordings; these little snippets of sound had a guttural texture and almost sounded like some sort of vocalization. So I went on recording more and more of these materials, trying to squeeze out more variations and stumbling upon more happy accidents. Most of these vocalization-like sounds ended up being the main layer for the “stalkers” (the animal-like aliens that often look for and chase after human survivors).
These recordings also ended up being the sound palette for anything that is made out of sediment. They were used for giving character to the drones (the little harmless aliens that we find rolling around on the ground), for crafting the sound of the resistance’s weaponry and technology, and (probably one of my favorites) for melting and solidifying the sediment itself.
What were some of the most challenging sequences, puzzles, or locations in terms of sound? What made them challenging?
MC: Probably one of the most interesting challenges we had was implementing the audio for the stalkers and the drones. One of the main peculiarities of the stalkers is that there are no animation assets for them, as they all use procedural animation. For us, it meant that we could not place audio triggers into pre-baked animations. All we had was a list of behaviors and commands that the stalker would follow – such as “go to x location,” “wait for x thing to happen,” “look at a specific object,” “attack,” etc. The way they move is all driven by code.
So for audio, the way we implemented these sounds was by using loops that play silently. We would then read the speed and rotation of the different bones that the stalker is made of and use these to drive the volume of our audio loops.
For example, the stalker has this guttural, vocalization-like sound that we talked about earlier. That sound (along with other layers) is a loop that can only be heard when the stalker is rotating its head; the speed of this rotation is what drives the volume, filter, and pitch of these different layers.
We can read the speed at which each stalker is moving and use that to change and re-mix the sound of the stalker in real time, changing footsteps and, overall, giving them a different mood – making them feel rushed when they are running, or more relaxed and less “talkative” when they are idling or scanning the area.
Similarly, the drones have different audio loops that can be heard only when they are rolling. The speed at which they are moving dictates which sounds are played and how loud each is compared to the rest of the layers; other cute sounds can only be heard when the drone turns its head. I’m quite happy with how they turned out since a lot of character comes through when they start running away or when they simply look at the main character with curiosity. Overall, this also turned out to be less time-consuming than hypothetically having to manually tag a whole set of separate animations.
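As a sketch of the silent-loop idea described above – with invented event, RTPC, and component names, since the actual implementation isn’t public – reading a bone’s angular speed and feeding it to the audio engine might look like this:

```csharp
using UnityEngine;

// Hypothetical sketch: a vocalization loop is always playing but silent; the
// angular speed of the procedurally animated head bone drives an RTPC that
// opens up its volume (and, on the Wwise side, filter and pitch).
public class StalkerHeadAudio : MonoBehaviour
{
    [SerializeField] private Transform headBone; // bone moved by procedural animation
    private Quaternion lastRotation;

    private void Start()
    {
        lastRotation = headBone.rotation;
        // The loop starts immediately but stays inaudible until the RTPC rises.
        AkSoundEngine.PostEvent("Play_Stalker_Vocal_Loop", gameObject);
    }

    private void Update()
    {
        if (Time.deltaTime <= 0f) return;

        // Degrees the head has rotated since the last frame, normalized per second.
        float angularSpeed = Quaternion.Angle(lastRotation, headBone.rotation) / Time.deltaTime;
        lastRotation = headBone.rotation;

        // Faster head movement -> louder, brighter, more "talkative" stalker.
        AkSoundEngine.SetRTPCValue("Stalker_Head_Speed", angularSpeed, gameObject);
    }
}
```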
SPOILER ALERT: This section contains spoilers!
In terms of locations, probably one of the areas that gave me a lot to think about – purely in terms of sound design – is the inside of the mothership towards the end of the game. There are areas in the ship where we can find the kids who have been abducted and are now trapped inside pods, living a fake reality. Chris explained to me that these locations should feel very sacred since those kids are actually being kept safe and protected, just like fetuses inside their mothers’ wombs. For this whole area, after some trial and error, I decided to record my own breathing, heartbeat, and blood flow using a stethoscope. I then processed and mixed them with more sediment-like sounds (again, using the same recordings we mentioned earlier). There is also a drastic change in the mix happening here; you can hear the breathing and footsteps of our character becoming muffled to simulate the feeling of being inside a mother’s womb. Going in this direction helped us get away with leaving the pods and the kids inside them completely silent since we were running out of time before the game’s release.
What were some of your challenges in mixing the sound for the game?
MC: The biggest challenge overall was making the mix consistent at first and then, later on, re-shaping it into a more dynamic mix driven by storytelling decisions rather than hard, fixed rules – all of this to embrace one of the audio direction pillars of making the game sound as cinematic as possible. For example, in modern games, we always have an invisible object in the 3D space that is usually referred to as “the listener.” This is what emulates the distance and position of audio sources in the space. In a few words, it’s our virtual ears in the 3D environment: if a sound source is located to the left of the listener, then we will hear it mostly in the left speaker; if it’s positioned far in the distance, then we’ll hear it lower in volume, filtered, more reverberant, etc.
At first, we placed the listener on the main character itself, which made audio sources in the world sound louder or quieter based on their distance and position in relation to the main character. While that’s a good enough scenario for most third-person games out there, we couldn’t get it to work in Somerville. We had a lot of issues and inconsistencies, and that was because of the way we use cameras in our game.
The camera in Somerville does not have a fixed behavior. It constantly changes and interpolates between different settings. For example, at times it sits on a tripod where the only movement allowed is rotation, always facing the direction of our main character. Other times, it moves on a rail (dolly shots), following the player along a predefined path. Lastly, it can also detach at any point and move freely, simply following the player.
Each area of the game also dictates not just how the camera moves but also the position and distance between the camera and the main character, the type of lens used, and how it interpolates with the next area and the next camera settings – all of that happens seamlessly under the direction of our game director.
So going back to our “audio listener,” having it positioned on the character itself meant that at times (like when our character is far in the background), we would still hear his breathing, footsteps, and any sound source near him very loudly and up close, even though he is only a few pixels tall in the background – drowning out all the sounds that, in this case, are placed in the foreground.
So the next attempt was to place the listener on the camera itself so that volume attenuations, reverbs, and the positioning of the different sound sources in the speakers would be driven by the distance between the camera and the emitters themselves. We found that, because cameras in Somerville make use of different lenses with different fields of view, a specific set of lenses would make objects appear much closer or further away than they actually are, creating a lot of inconsistencies with audio across the whole game.
With this last try, it became clear to us that we had to take a step back and look again at one of our audio direction pillars, which was to make Somerville sound like a movie. This meant that sounds are not always supposed to be the loudest based on their position on screen; instead, the mix should be dictated by storytelling decisions. So with that in mind, we changed our way of designing and implementing audio.
We started by using a more linear approach. First, we would take a video of a playthrough, bring it into our DAW of choice (in our case, Reaper), and then edit the sounds there. We’d spend a good amount of time doing a mix pass in our DAW. This way, we were able to design the overall soundscape freely, focusing on storytelling without any restrictions from the rules and confinements dictated by the game engine.
We would then do a breakdown and analyze how the soundscape and the mix would change based on what we did in our DAW. After that, Jay (our audio coder) and I would jump on a call to discuss ways it could be implemented. He would then try to break it down into tools that could be an addition to an existing system or determine whether a new system was needed. This opened a whole Pandora’s box for us and it drastically changed not just the way we worked but also how we hacked, changed, and iterated upon basic audio engine rules that have been taken for granted over the past couple of decades.
We ended up having two audio listeners running at the same time: one for the character and one for the camera. We could then decide for each audio emitter in the game what listener to take into consideration between these two. We exposed the camera’s azimuth and elevation so that we could read from these parameters and feed them into specific sounds in the environment. This allowed us to create and interpolate between different mix snapshots that are dependent on the camera’s position and its facing direction.
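Exposing the camera’s azimuth and elevation is the more straightforward half of this setup (the dual-listener routing itself is deeper engine/Wwise work). A minimal sketch, with invented RTPC names:

```csharp
using UnityEngine;

// Hypothetical sketch: publish the camera's facing direction to the audio
// engine every frame so mix snapshots can interpolate based on where it points.
public class CameraOrientationAudio : MonoBehaviour
{
    private void Update()
    {
        Vector3 forward = transform.forward;

        // Azimuth: heading around the vertical axis, in degrees (-180..180).
        float azimuth = Mathf.Atan2(forward.x, forward.z) * Mathf.Rad2Deg;

        // Elevation: pitch above or below the horizon, in degrees (-90..90).
        float elevation = Mathf.Asin(Mathf.Clamp(forward.y, -1f, 1f)) * Mathf.Rad2Deg;

        // Invented global RTPC names, read by mix snapshots on the Wwise side.
        AkSoundEngine.SetRTPCValue("Camera_Azimuth", azimuth);
        AkSoundEngine.SetRTPCValue("Camera_Elevation", elevation);
    }
}
```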
We also ended up creating a system that can read the actual on-screen size of our character, as well as other specific objects in the game, so that we could have volume attenuations, filters, and reverbs driven by this parameter and bypass the issue of the camera utilizing lenses with different fields of view. We could also read whether a sound source was on-screen or off-screen. Additionally, we created a few extra tools to help us change the mix in real time depending on the character’s location as well as the camera’s location.
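A rough sketch of how such a screen-size parameter could be measured in Unity – projecting the character’s render bounds into viewport space and feeding the result to invented RTPCs; the real system was presumably more refined:

```csharp
using UnityEngine;

// Hypothetical sketch: estimate how tall the character appears on screen and
// feed that to the audio engine, so attenuation tracks apparent size rather
// than raw distance (sidestepping the different lens fields of view).
public class ScreenSizeAudio : MonoBehaviour
{
    [SerializeField] private Renderer characterRenderer;
    [SerializeField] private Camera gameCamera;

    private void Update()
    {
        Bounds bounds = characterRenderer.bounds;

        // Rough height estimate: project the bounds' extremes into viewport space.
        Vector3 bottom = gameCamera.WorldToViewportPoint(bounds.min);
        Vector3 top = gameCamera.WorldToViewportPoint(bounds.max);
        float screenSize = Mathf.Clamp01(Mathf.Abs(top.y - bottom.y));

        // On-screen test: the bounds' center is in front of the camera and
        // inside the 0..1 viewport rectangle.
        Vector3 center = gameCamera.WorldToViewportPoint(bounds.center);
        bool onScreen = center.z > 0f &&
                        center.x > 0f && center.x < 1f &&
                        center.y > 0f && center.y < 1f;

        AkSoundEngine.SetRTPCValue("Character_Screen_Size", screenSize * 100f, gameObject);
        AkSoundEngine.SetRTPCValue("Character_On_Screen", onScreen ? 1f : 0f, gameObject);
    }
}
```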
Somerville was created using the Unity game engine. Was this a good fit for the sound team as well? And did you use middleware?
MC: I personally did not find any major drawbacks or differences between Unity and other game engines when it comes down to implementing sounds as long as the same audio middleware tool is used.
And yes, we used Wwise as our middleware. Often with audio, you are thrown into a project that has already been started long before audio has even been taken into consideration, so sometimes all we can do is adapt and hope for the best.
What was unique about your experience of creating the sound of Somerville?
MC: I can’t really pinpoint a specific unique experience without omitting countless others that this project has given me. Somerville is a very personal project to me. I’ve put a lot of research and countless attempts into trying to make this world feel unique in its own way. It is the first game in which I’ve been involved as the audio director, the first one where I was also credited as one of the composers, and the first one as a developer in a small indie team. I enjoyed the whole process of designing the overall soundscape, chatting with Chris on the direction of the game, and iterating upon his feedback.
Building up a very small team was a blast. It was me and Jay Steen for the most part. Later on, we were joined by Adi Keltsh and Arron Amo-Travers. And towards the very end of the project, we had some more helping hands from Barney Oram and Lewis James. Everyone has contributed enormously to making Somerville sound the way it does today.
A lot of the challenges – and the ways we overcame most of them – will live on in my baggage of past experiences, to be drawn on and iterated upon in the projects to come.
A big thanks to Matteo Cerquone for giving us a behind-the-scenes look at the sound of Somerville and to Jennifer Walden for the interview!