3D-Audio For Sound Designers Part I – Spatial Hearing


If we take a look at the fundamental characteristics of human hearing, we can see that we perceive not only the loudness, pitch, timing, timbre and spectral qualities of a sound, but also, subjectively, its spatial attributes. And since virtual worlds in games are 3D, we need to respect that and integrate audio with a certain amount of realism, in a way that makes sense to the player. I’m not talking about absolute realism in the sense of a simulation, but we will need to go a long way down this path to get audio right in a virtual 3D world.

And since spatialization is basically the attempt to reverse engineer human spatial hearing, we need to tackle this issue first. So-called „Spatial Hearing“ is the ability of our auditory system to track the position of a sound source in space in terms of changes in direction and distance over time (including the information that the source is moving). The human ability to determine the direction and distance of a source depends on various factors, like the position and orientation of the source, the acoustic environment and geometry of the space, as well as the acoustical properties of the source itself. This ability varies from individual to individual and can't be generalized.


Sound Sources in Space

On a very fundamental level, the spatial characteristics of a natural sound come down to two distinctions: whether the sound field is open or enclosed, and whether the sound is perceived as a single source entity or more like an environment.

In an open sound field, sound sources in the distance are more likely to be grouped together without a strong sense of direction. Everything tends to blend together and to be perceived as coming from outside of the listener’s head. The sounds are often damped, because they have traveled a significant distance through the air, and therefore lack the presence of a sound that is only a few meters away and affected by the reflections of a room or hall. Indoors, the sound tends to be strongly altered by reflection, absorption and diffraction as well as by pressure patterns and standing waves. Sources are often only a few meters away, and their perception is strongly dominated by the reflections, which make up a large part of the spatial characteristics.

Besides that, we can categorize sounds in space by whether the source is a discrete emitter, meaning that it can be perceived as a single, localizable entity. Environments, in contrast, often consist of a hard-to-distinguish mash of rather unspecific, general background sounds that are perceived as an enveloping ambience, which is hard to localize because of its diffuse character.[1]


Spatial Hearing in the Open Field

The free field is an acoustic term for an environment in which there are no, or only very few, reflections. In real life you can probably come close if you go outdoors to a mountain top without surroundings (but then again you would have the reflections of the ground). Such environments are rarely experienced in real life. However, this idealized construct is well suited to explain how sound travels in space on a very basic level.

The first conclusion is that sound radiates from an omnidirectional point source.[2] As a consequence, the level drops significantly with distance, because the energy is distributed over a sphere of increasing surface area as it expands away from the source. The surface area of the sphere is determined by its radius r: S = 4πr². Given a source with power P0, we can then derive the sound intensity J1 at a distance r1:

$$J_1 = \frac{P_0}{4\pi r_1^2}$$

If we now double the distance to 2r1, we can see that the sound intensity is reduced to a quarter of its initial value:

$$J_2 = \frac{P_0}{4\pi (2 r_1)^2} = \frac{J_1}{4}$$

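Sound intensity is proportional to the square of the sound pressure. Knowing the sound pressure p, we can derive the sound pressure level from it, relative to the reference pressure of 20 µPa:

$$L_p = 20 \log_{10}\!\left(\frac{p}{p_{\mathrm{ref}}}\right)\ \mathrm{dB}, \qquad p_{\mathrm{ref}} = 20\ \mu\mathrm{Pa}$$
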
This tells us that a drop of the sound intensity to one quarter of its initial value results in a reduction of the sound pressure to one half. This corresponds to a reduction of the sound pressure level by about 6 dB for every doubling of the distance from the source.
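The 6 dB rule can be checked with a few lines of Python. This is only a sketch of the ideal free-field point source described above; the function names and the 1 W source power are assumptions made for illustration.

```python
import math

def intensity(power_w: float, distance_m: float) -> float:
    """Sound intensity in W/m^2 of an ideal point source in the free field."""
    return power_w / (4.0 * math.pi * distance_m ** 2)

def level_change_db(r1_m: float, r2_m: float) -> float:
    """Level change in dB when the listener moves from distance r1 to r2."""
    # The source power cancels out in the ratio, so 1 W is an arbitrary choice.
    return 10.0 * math.log10(intensity(1.0, r2_m) / intensity(1.0, r1_m))

for r in (1.0, 2.0, 4.0, 8.0):
    print(f"{r:4.1f} m: {level_change_db(1.0, r):7.2f} dB relative to 1 m")
```

Each doubling of the distance lowers the level by roughly another 6 dB, which is the behavior that distance attenuation curves in game audio are usually modeled on.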

The second conclusion we can draw here is that sources tend to radiate omnidirectionally at low frequencies and more directionally as the frequency rises.[6] Close to the source (near field), the radiation patterns are more obvious and the level drop can be quite significant; both become less and less obvious as one moves further away (far field).[7] With increasing distance from the source, the curvature of the wavefront decreases, up to a point where it becomes so shallow that the source can't be perceived as directional anymore. It can then be considered a plane wave. How much distance is sufficient for this effect to become noticeable depends on the dimensions of the source, on whether there is only one source or several in a row (wavefront superposition), and lastly on the wavelength of the sound.[8]


Spatial Hearing in Enclosed Spaces

In enclosed spaces with a geometric form and reflective surfaces, we perceive not only the direct part of the auditory event, but also the reflections of the room. From these reflections, the human brain can decode the spatial dimensions of both the environment and the source (plus the source position and distance), as well as information about the source's position relative to the room's geometry (indirect localization). A schematic view of these reflections can be illustrated with a room impulse response[9] in the time domain:



The first component that arrives at the receiver is the direct signal. It is followed by the early reflections, which result from the sound bouncing off reflecting surfaces. They have the shortest delay times because they take the shortest paths.

The pre-delay (or initial time delay) is the time between the direct signal and the first early reflection. The bigger the room dimensions, the longer the pre-delay, because the early reflections need more time to travel through the air. The pre-delay also gives us information about the relative position of the source in the room, as well as its relation to the room geometry. A longer pre-delay means that the source is more likely to be far from a wall or reflective surface; conversely, a shorter pre-delay means that it is most likely near a reflective obstacle. The early reflections will also be sparser in bigger rooms and denser in smaller ones[11]. After that, a lot of reflections come in and blend together into a reverb tail, which carries significantly less energy than the direct signal and the early reflections because of surface absorption and air damping.
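To get a feeling for the numbers, the pre-delay of a single wall reflection can be estimated with a mirror-image source. The following is a minimal 2D sketch under simplifying assumptions (one flat wall at a fixed x position, no other surfaces); the function name, the coordinates and the image-source shortcut are illustrative choices, not taken from the cited literature.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approx. at 20 degrees C

def pre_delay_ms(source, listener, wall_x):
    """Delay between direct sound and the reflection off a wall at x = wall_x."""
    # Mirror the source across the wall: the reflected path has the same
    # length as the straight line from this image source to the listener.
    image = (2.0 * wall_x - source[0], source[1])
    direct = math.dist(source, listener)
    reflected = math.dist(image, listener)
    return (reflected - direct) / SPEED_OF_SOUND * 1000.0

listener = (2.0, 0.0)
source = (4.5, 0.0)
near_wall = pre_delay_ms(source, listener, wall_x=5.0)       # wall 0.5 m behind the source
far_from_wall = pre_delay_ms(source, listener, wall_x=15.0)  # wall 10.5 m behind the source
print(f"near wall: {near_wall:.1f} ms, far wall: {far_from_wall:.1f} ms")
```

With the wall half a meter behind the source, the reflection follows the direct sound after about 3 ms; with the wall more than 10 m away, the gap grows to about 61 ms.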

The tail can be described with a high-frequency decay function, which basically represents the quicker decay of high frequencies over distance. With these properties in mind, we can later on simulate convincing and distinctive reverbs in the game.

Roughly speaking, we can distinguish two kinds of sound fields in enclosed spaces. In the direct sound field, which is at close distance to the source, the direct signal component dominates in level. In the diffuse sound field, the early and late reflections dominate in level.

They are separated by the critical distance, which is the distance from the source at which the direct and reverberant sound components have the same energy[12]. The variables that have to be considered are the directivity factor of the source (D = 1 for an omnidirectional source), the room volume in m³ (V), the equivalent absorption area of the room in m² (A) and the reverberation time in seconds (T60), which is linked to V and A by the Sabine equation:

$$T_{60} \approx 0.161\,\frac{V}{A}, \qquad r_c = \sqrt{\frac{D\,A}{16\pi}} \approx 0.057\,\sqrt{\frac{D\,V}{T_{60}}}$$

The critical distance is a fairly important point of orientation in a room. We can develop a feeling for the ratio of direct and indirect sound in our hearing and subconsciously figure out roughly where the middle of a room is. By moving in and out of the direct and diffuse sound fields, we very quickly get an impression of the reverberation pattern of that room and therefore of its reflective materials (walls), size and geometry. We know from our hearing experience, for example, that a room must have reflective materials and a rather big size when the critical distance gets smaller and the room has a longer RT60 reverberation time.[15]
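As a rough numerical illustration, the critical distance can be estimated from the room volume and the reverberation time. The sketch below assumes the common textbook approximations T60 ≈ 0.161 · V / A (Sabine) and r_c ≈ 0.057 · sqrt(D · V / T60); the function names and the example room values are made up for illustration.

```python
import math

def sabine_rt60(volume_m3: float, absorption_area_m2: float) -> float:
    """Reverberation time in seconds after Sabine's approximation."""
    return 0.161 * volume_m3 / absorption_area_m2

def critical_distance(volume_m3: float, rt60_s: float, directivity: float = 1.0) -> float:
    """Distance in meters at which direct and reverberant energy are equal."""
    return 0.057 * math.sqrt(directivity * volume_m3 / rt60_s)

volume = 200.0  # m^3, hypothetical room
damped = critical_distance(volume, sabine_rt60(volume, absorption_area_m2=40.0))
reflective = critical_distance(volume, sabine_rt60(volume, absorption_area_m2=10.0))
print(f"damped room: {damped:.2f} m, reflective room: {reflective:.2f} m")
```

For the same volume, the more reflective room has the longer reverberation time and the shorter critical distance, so the listener ends up in the diffuse field much sooner.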


Sound Source Localization

When it comes to the basic mechanisms, two terms need to be differentiated. Localization, on the one hand, refers to determining the exact position of the source in three-dimensional space. Lateralization, on the other hand, refers to determining the lateral displacement of an auditory event in a strictly horizontal (one-dimensional) manner along the inter-aural axis.[16]

And finally, we have to take into consideration the spatial attributes of the source, meaning whether it is a single event or consists of multiple auditory events.


Localization Cues for a Single Source

The first academic notes on this subject can be found in Lord Rayleigh's (1842-1919) Duplex Theory.[17] It describes the basic mechanisms of our horizontal hearing (lateralization). He observed that a sound arriving at the listener from one side of the median plane could be located easily, and that the signal at the ear facing the source would be received louder than at the other ear, due to the shadowing effect of the head. He noticed, however, that this was not noticeable at frequencies lower than 1000 Hz. He also noticed that our hearing is sensitive to differences in phase of low-frequency tones at the two ears. And lastly, he figured out that the localization system isn’t always accurate when sound arrives at the same angle from opposite directions. The mechanisms he discovered are the inter-aural time delay and the inter-aural level difference. These mechanisms are closely related to the Precedence Effect (law of the first wavefront), which basically describes the fact that the first wavefront falling on the ear determines the perceived direction of the sound, even if the early reflections in a room are louder than the direct signal.[18]

Furthermore, the distance and size of the source are important too. Small objects appear as point sources that can easily be pinned down, while large objects are more likely to emit from a volumetric extent.[19] Another thing to acknowledge is that, due to our hearing physiology, we localize sounds with higher frequencies and sharp attacks more easily than sustained sounds with lower frequencies.[20]


Binaural Cues

In the same way that our two eyes give us perspective in our vision, our two ears allow us to get a spatial impression of our surroundings. The differences in time, level and spectrum between the signals arriving at the two ears form the main mechanisms that drive our spatial hearing.[21]

The first difference is described in the time domain. The Inter-aural Time Delay[22] is the time difference between the arrival of the same sound at each ear in the horizontal plane.[23] In publications, the average ear distance varies from 17 cm to 18 cm[24]. The maximum possible distance the sound has to travel to get around the head is therefore about 21 cm at an angle of 90°, resulting in a maximum delay between the two ears of 0.21 m / c ≈ 0.61 ms. Depending on the ear distance and speed of sound assumed, published results vary from 0.61 ms to around 0.7 ms[25] (binaural delay). These time differences are registered particularly at the start and end points of a sound event. There is no way to distinguish front and back sources when their angle and distance result in the same delay times (front/back confusion). The ITD works at frequencies below about 1 kHz and is most prominent at around 700-800 Hz, but it is completely ineffective as a phase cue at frequencies beyond 1.5 kHz. The reason why this mechanism only works in that frequency range has to do with the proportion between the size of the head and the wavelength of the perceived signal. When the path length from one ear to the other equals half of the wavelength of the signal (at approx. 700 Hz), the inter-aural phase difference begins to provide ambiguous cues. For frequencies above 1500 Hz, the wavelength is smaller than the dimension of the head. This leads to completely ambiguous time cues in the cross correlation of the two ear signals.
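The figures above are easy to reproduce. The following sketch only restates the arithmetic of this paragraph; the 21 cm path difference and the speed of sound of 343 m/s come from the text, while the function names are illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, approx. at 20 degrees C

def max_itd_ms(path_difference_m: float, c: float = SPEED_OF_SOUND) -> float:
    """Largest inter-aural time delay for a given extra path around the head."""
    return path_difference_m / c * 1000.0

def ambiguity_frequency_hz(path_difference_m: float, c: float = SPEED_OF_SOUND) -> float:
    """Frequency whose half wavelength equals the inter-aural path difference."""
    return c / (2.0 * path_difference_m)

print(f"max ITD: {max_itd_ms(0.21):.2f} ms")
print(f"phase cues become ambiguous near {ambiguity_frequency_hz(0.21):.0f} Hz")
```

With a 21 cm path, the delay comes out at about 0.61 ms and the half-wavelength condition is met at roughly 800 Hz, which lands in the 700-800 Hz region mentioned above. Slightly different head dimensions shift both values.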

When a sound source deviates from the median plane, the sound pressure at the farther ear is attenuated due to the shadowing effect of the head, resulting in a difference in level or intensity of the sound reaching the two ears. This effect sets in at frequencies between about 700 Hz and 1500 Hz and becomes stronger as the frequency rises, so the inter-aural level difference affects lateralization over a large part of the frequency spectrum. Experimental results show that inter-aural level differences above 15-20 dB will move an image completely to one side.[26] When ILD cues contradict ITD cues, the ITD wins for frequencies that are mainly above 1500 Hz.[27]


Summary of Directional Localization Cues

To wrap it up briefly, these are the main mechanisms when it comes to localization cues:

  • For frequencies from around 5-6 kHz up to 16 kHz and above, where the wavelength is smaller than the head size, the main mechanism of localization is the ILD caused by the shadowing effect of the head.
  • The localization effect of spectral cues is particularly prominent in the 5-6 kHz area. The main reason is the dimension of the pinna. Spectral cues mainly provide vertical and front-to-back localization and also help to resolve lateral ambiguities.
  • In the area from 1600 Hz down to 700 Hz, both mechanisms, ITD and ILD, are active. Wavelengths are similar to the head size. At this point (around 1500 Hz) we start to see level differences due to head shadowing as well as inter-aural phase differences.
  • For frequencies from 700 Hz down to 80 Hz, the ITD derived from inter-aural phase differences is the dominant localization cue, as wavelengths become longer than the average head size.
  • Below 80 Hz, wavelengths are so long that it is very hard for us to localize the sound. This is a gradual fade, and it is therefore often overlooked that we are capable of locating subwoofers to some extent.
  • And lastly, we have to consider the dynamic cue introduced by slight head movements, which helps to resolve front-to-back ambiguity and supports vertical localization, and the visual cue that our eyes provide, which can enhance our hearing perception if we have a visual target to focus on.
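For quick reference, the frequency ranges of this summary can be written as a small lookup. This is only a sketch: the hard band edges are a simplification of what are really gradual transitions, and treating the range between 1600 Hz and 5 kHz as ILD-dominated is an assumption that fills a gap the list leaves open.

```python
def dominant_cues(frequency_hz: float) -> list[str]:
    """Return the localization cues that dominate at a given frequency."""
    if frequency_hz < 80.0:
        return ["hard to localize"]
    if frequency_hz < 700.0:
        return ["ITD"]
    if frequency_hz <= 1600.0:
        return ["ITD", "ILD"]
    cues = ["ILD"]  # assumption: ILD also covers the 1.6-5 kHz gap
    if frequency_hz >= 5000.0:
        cues.append("spectral cues")
    return cues

for f in (50, 300, 1000, 3000, 6000):
    print(f"{f:5d} Hz: {', '.join(dominant_cues(f))}")
```

A table like this can help when deciding which sounds in a mix are worth spatializing precisely and which ones (for example deep sub-bass layers) will hardly be localized anyway.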

Spectral Cues

In contrast to the binaural cues, the spectral cues work in a monaural[28] way, through reflection and diffraction at the individual’s own anatomy, mainly the pinna as well as the head and shoulders. Whereas the binaural cues derived from ITD and ILD are mainly related to lateralization, the spectral cues are mostly connected to vertical and front-to-back localization (although the so-called „Head Shadowing Effect“ plays a major part in lateralization). Because of the geometry of our auditory system, we can observe different delays between the direct signal that goes straight into the ear canal and the parts that get reflected, diffracted or absorbed, mainly by the pinna, depending on the location of the source and its angle of incidence. All this acts as a filter that alters the incident sound spectrum, creating direction-dependent notches and peaks.[29] In addition, it must be considered that the geometry of the pinna differs significantly from individual to individual. It is astonishing that our localization capabilities nonetheless do not differ that much. This means that those spectral alterations must be regarded as individualized localization cues that we humans are capable of adapting to. Taking into consideration the dimensions of the pinna (approx. 65 mm on average), we can derive that the related effects will be noticeable at higher frequencies, where the wavelength is comparable to the dimension of the pinna. The effect starts at around 2-3 kHz and is prominent at around 5-6 kHz.[30]


Cone of Confusion and Head Movement

ITD and ILD are the dominant localization cues at low and high frequencies respectively, but they are inadequate for determining a unique position, because an infinite number of spatial positions exist that possess identical ITD and ILD values. In an idealized model, where we disregard the shape of the head and assume that the ears are two separate points in free space, these positions form a cone around the inter-aural axis in three-dimensional space. This is what is called the cone of confusion. An extreme case is when the source is located in the median plane (e.g. at 0°), where sound level and delay are identical at both ears. Another case of identical ITD and ILD is when there are two sound sources at front-back mirror positions (such as at azimuth 45° and 135°). There is no obvious way of distinguishing between those front and rear sources with ITD and ILD only; the binaural cues are ambiguous. This ambiguity can be resolved by head movement (the so-called dynamic cue, first mentioned by Wallach[31]). If we rotate our head clockwise and counterclockwise around the vertical axis, we effectively change ITD, ILD and the sound pressure spectra at the ears. This way we are able to take more measurements and compare the localization cues dynamically to resolve ambiguities.[32]
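The front-back ambiguity and its resolution by head rotation can be reproduced with the idealized two-point ear model. In this sketch the far-field formula ITD = d · sin(azimuth) / c, the ear distance of 17.5 cm and the sign convention (azimuth measured clockwise from straight ahead, positive ITD meaning the right ear leads) are all simplifying assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s
EAR_DISTANCE = 0.175    # m, assumed value between the cited 17 and 18 cm

def itd_ms(source_azimuth_deg: float, head_rotation_deg: float = 0.0) -> float:
    """ITD of a distant source for two point-like ears without a head."""
    relative = math.radians(source_azimuth_deg - head_rotation_deg)
    return EAR_DISTANCE * math.sin(relative) / SPEED_OF_SOUND * 1000.0

front, back = 45.0, 135.0
print(f"head still:  front {itd_ms(front):.3f} ms, back {itd_ms(back):.3f} ms")

# Turning the head 10 degrees to the right changes the two ITDs in
# opposite directions, which is what breaks the tie.
print(f"head turned: front {itd_ms(front, 10.0):.3f} ms, back {itd_ms(back, 10.0):.3f} ms")
```

With the head still, both positions produce the same delay. After the rotation, the ITD of the frontal source shrinks while the ITD of the rear source grows, so the direction of the change tells the listener where the source really is.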

Head Related Transfer Functions

As mentioned before, when a sound from a certain position arrives at our ears after interacting with our anatomical structures, its properties have been changed by reflection, diffraction and absorption, mainly at the pinna, head, shoulders and torso. The various pieces of localization information obtained from the ITD, the ILD and the spectral cues come together for the auditory system to process and to locate the sound source comprehensively. If you take, for example, a fixed point source and a fixed head position, you can describe this transmission process as a linear time-invariant process[33]. The head related transfer function[34] for each ear describes the overall filtering effect imposed by our anatomical structures. It is introduced as the acoustic transfer function of the LTI process and can be written as follows:[35]

$$H_L = \frac{P_L}{P_0} \qquad \text{(Left Ear HRTF)}$$

$$H_R = \frac{P_R}{P_0} \qquad \text{(Right Ear HRTF)}$$

PL and PR represent the complex-valued sound pressures in the frequency domain at the left and right ear. P0 represents the complex-valued free-field sound pressure at the position of the center of the head, with the head absent. Due to our unique anatomical structure, each human has their own individual HRTFs that their hearing relies on. They can be measured in a time-consuming and complicated process. The main problem is that the subject has to stay still with measurement microphones in the ears, while sound sources are recorded from all positions that are needed to create virtual sound sources in the application. In theory, this means that basically everyone would have to have their HRTFs measured and use their personal ones. But there are two reasons why this is not much of an issue in VR. First, we have head tracking and an open field of view with the head-mounted display of such a system. This gives us a directional and a visual cue, so we can resolve uncertainties that may occur due to incompatible HRTFs, and we can slowly learn to get along with them. Secondly, it has been shown that, despite all the differences between the measured HRTFs of individuals, there are some basic patterns that work for all humans. These are the so-called „Directional Bands“[36]. They are basically frequency boosts and attenuations related to source positions. Regions around 0.3-0.6 kHz and 3-6 kHz seem to relate to frontal positions, 8 kHz seems to correspond to the overhead position, and the 1.2 kHz and 12 kHz areas appear to be related to rear perception.[37] This is the reason why binaural reproduction over headphones using averaged HRTFs does work.
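Because the transmission is treated as an LTI process, applying an HRTF in the time domain amounts to convolving the dry signal with a head related impulse response (HRIR) per ear. The sketch below shows that principle in plain Python. The two impulse responses are hypothetical toy values that merely mimic a source on the right (the left ear receives the sound later and quieter); they are not measured data, and a real renderer would use measured HRIR sets and FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# Hypothetical toy HRIRs for a source on the right-hand side.
hrir_right = [1.0, 0.3]                # near ear: immediate and loud
hrir_left = [0.0, 0.0, 0.0, 0.5, 0.2]  # far ear: delayed by 3 samples, attenuated

mono = [1.0, 0.5, -0.25]               # dry mono source signal
left = convolve(mono, hrir_left)
right = convolve(mono, hrir_right)
print("left: ", left)
print("right:", right)
```

The resulting pair of signals carries the time difference, the level difference and whatever spectral coloration is contained in the impulse responses, which is the basic idea behind binaural rendering over headphones.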


Distance and Depth Perception

From an acoustic perspective, distance is a term used to describe how far away a source appears to the listener, whereas depth is used to describe the overall front-to-back distance of a scene and the sense of perspective created.[38]

In terms of distance, there are a few basic rules that our hearing applies to determine how far away a source is (apart from what has been said about the open and enclosed sound field). The most obvious one is that we determine distance from level attenuation. But we need a context for that level difference. When a source is moving, we get a relative loudness cue, meaning a changing scale from which we can extract whether the source is getting nearer or further away. But this mechanism only works effectively for known sound sources. The brain reacts more strongly to relative changes in level and spectrum.[39]

In addition to that, sound loses energy when it travels through the air. And since air absorbs high frequencies more strongly than low frequencies, we perceive sounds that are farther away not only at a lower level, but also as duller, due to the loss of high-frequency energy (air damping).

The strongest cue for our distance perception in enclosed rooms is the ratio between the direct and the reverberant sound component (wet/dry ratio). The ratio between the direct signal, the early reflections and the reverberation tail, as well as the composition of the reverberation components and their timing, tells us a lot about the size and geometry of a room and the position of a source in it. As described in the section on enclosed spaces, the initial time delay is part of that impression.
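How the direct-to-reverberant ratio changes with distance can be sketched under idealized assumptions: the direct sound falls off with the square of the distance, the reverberant level is the same everywhere in the room, and both are equal at the critical distance. The function name and the 1 m critical distance are illustrative values only.

```python
import math

def direct_to_reverberant_db(distance_m: float, critical_distance_m: float) -> float:
    """Direct-to-reverberant ratio in dB in an ideal diffuse field."""
    # Direct energy ~ 1/r^2, reverberant energy constant, equal at r = r_c.
    return 20.0 * math.log10(critical_distance_m / distance_m)

for r in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"{r:4.1f} m: {direct_to_reverberant_db(r, 1.0):6.1f} dB")
```

Inside the critical distance the ratio is positive (the source sounds dry and close), and beyond it the ratio drops by about 6 dB per doubling of distance, which is the kind of curve a distance-dependent reverb send in a game engine tries to approximate.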

The last rule of distance perception that I want to describe is motion parallax. When a sound traverses our sound field very quickly, it is an indication that it is very close to us, because we know, thanks to our hearing experience, that sound travels through air at a certain speed (approx. 343 m/s at 20°C)[40]. Because it can't go faster, we can assume that the source must be very close to be able to cross our hearing field that suddenly. This phenomenon takes place at very close distances of around 0.25 m or less. More about the depth of sound sources follows in the next subchapter.


Source Width and Envelopment

Sound sources in space can be perceived as small, pinpointed events or, subjectively, as events with bigger dimensions. This subjective phenomenon is subsumed under the term „Apparent Source Width“. The ASW[41] has been found to relate closely to a binaural measurement known as inter-aural cross correlation, which measures the degree of similarity between the signals at the two ears, comparing different frequency bands and time windows[42]. If a short time window is measured (early IACC[43]), that is, up to about 80 ms, then we can see that there is a correlation between the measured early reflections and the broadening of the sound source.[44]
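A broadband IACC value can be sketched as the maximum of the normalized cross correlation between the two ear signals within a small lag range. The lag range of ±1 ms used here is the common convention in room acoustics measurements and is an assumption not stated in the text, as are the function name and the toy signals. For an early IACC, only the first 80 ms of the two binaural impulse responses would be passed in.

```python
import math

def iacc(left, right, sample_rate, max_lag_ms=1.0):
    """Maximum normalized cross correlation of two ear signals within +/- max_lag_ms."""
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        return 0.0  # at least one signal is silent
    max_lag = int(sample_rate * max_lag_ms / 1000.0)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best

tone = [math.sin(2.0 * math.pi * 500.0 * t / 48000.0) for t in range(480)]
identical = iacc(tone, tone, 48000)
silent_right = iacc(tone, [0.0] * len(tone), 48000)
print(identical, silent_right)
```

Identical ear signals give a value of 1, which corresponds to a narrow, pinpointed source. The more the early reflections decorrelate the two ear signals, the lower the value gets and the wider the source tends to be perceived.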

In a reverberant environment, it can be hard to tell whether a perceived source is „wide“ or just diffuse and hard to localize. Furthermore, it can be quite difficult to distinguish the individual width of a big sound source from the width of the overall sound stage, which describes the perceived distance between the outer left and right limits of the whole stereophonic scene.

When we try to describe the environmental spatial impression of a sound field, spaciousness can be used to describe the impression of an „open“ space, when a sound appears to exist outside of the listener's personal space, in their surroundings. Envelopment, on the other hand, is used to describe the sense of immersion and involvement in a reverberant sound field, with the sound appearing to come from all around.[45]


Spatial Hearing with Multiple Sources


Summing Localization Law for two Sources

The major mechanism can be described with a phenomenon called two-channel stereophonic localization[46]. If two sound sources emit the same signal at the same level, the listener will locate a virtual sound source symmetrically in the middle between the two sources. When they play at different levels, the virtual source will lean toward the source with the higher level in the panorama. If the level difference is larger than about 15 dB, the virtual source will be located at the position of the louder real source, and its position does not change even if the level difference is increased further. And finally, the stereophonic law of sines states that the spatial position of the virtual source is completely determined by the amplitude ratio between the two loudspeaker signals and the angle that the loudspeakers span with respect to the listener's position; frequency and head radius are irrelevant.
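The law of sines is commonly written as sin(θ_I) = (A_L − A_R) / (A_L + A_R) · sin(θ_0), where A_L and A_R are the loudspeaker amplitudes, θ_0 is the half-angle between the loudspeakers as seen from the listener and θ_I is the angle of the virtual source. The sketch below evaluates this form. The standard ±30° stereo setup, the function name and the sign convention (positive angles point toward the left loudspeaker) are assumptions for illustration.

```python
import math

def virtual_source_angle_deg(gain_left: float, gain_right: float,
                             half_span_deg: float = 30.0) -> float:
    """Perceived angle of the phantom source after the stereophonic law of sines."""
    ratio = (gain_left - gain_right) / (gain_left + gain_right)
    return math.degrees(math.asin(ratio * math.sin(math.radians(half_span_deg))))

for gl, gr in ((1.0, 1.0), (1.0, 0.5), (1.0, 0.0)):
    print(f"L={gl:.1f} R={gr:.1f}: {virtual_source_angle_deg(gl, gr):5.1f} degrees")
```

Equal gains place the phantom source in the center, and silencing one loudspeaker moves it fully onto the other one. Note that neither frequency nor head radius appears in the function, which is exactly the point of the law.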


Cocktail Party Effect

The cocktail party effect is a psychoacoustic effect that refers to our ability to focus attention on the speech of a specific speaker while disregarding irrelevant information coming from the surroundings[47]. Although the sound components are similar in intensity and frequency, our auditory system is still able to separate the desired signal from the interfering noise. „From the physical point of view, one of the predominant elements in the cocktail effect is the spatial separation of noise and speech. In consequence, we know that on the psycho-physiological level, selective listening is governed by our capacity to discriminate sounds from different sources – that is, by our capacity to localize the noise.“[48] That means that the cocktail party effect is a kind of binaural auditory effect associated with the spatial hearing of multiple sources, based on comprehensive processing of the binaural sound information by the high-level neural system.[49]



  1. Rumsey, „Spatial Audio“, P 2.
  2. Rumsey, „Spatial Audio“, P 22.
  3. Görne, „Tontechnik“, 2011, P 36.
  4. Görne, „Tontechnik“, 2011, P 36.
  5. Görne, „Tontechnik“, 2011, P 32.
  6. Rumsey, „Spatial Audio“, P 8.
  7. Görne, „Tontechnik“, 2011, P 38.
  8. Farnell, „Designing Sound“, Pp 53, 56-58.
  9. RIR = Room Impulse Response.
  10. https://upload.wikimedia.org/wikipedia/commons/1/19/Acoustic_room_impulse_response.jpeg
  11. Sandmann, „Effekte und Dynamics“, Pp 60-62.
  12. http://education.lenardaudio.com/en/04_acoustics_2.html
  13. Görne, „Tontechnik“, P 86.
  14. Görne, „Tontechnik“, P 85.
  15. Görne, „Mikrofone in Theorie und Praxis“, Pp 22-23.
  16. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 8.
  17. http://www.diracdelta.co.uk/science/source/d/u/duplex%20theory%20of%20localization/source.html#.WGyuqbElyEI
  18. Görne, „Tontechnik“, 2011, P 128.
  19. Farnell, „Designing Sound“, P 73.
  20. Farnell, „Designing Sound“, P 73.
  21. Rumsey, „Spatial Audio“, Pp 21-26.
  22. ITD = Inter-Aural Time Delay.
  23. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 15; Farnell, „Designing Sound“, P 73.
  24. Görne, „Tontechnik“, 2011, P 126; Raffaseder, „Audiodesign“, P 127.
  25. (0.61 ms) Görne, „Tontechnik“, 2011, P 126; (0.65 ms) Rumsey, „Spatial Audio“, P 22; (0.7 ms) Raffaseder, „Audiodesign“, P 127 and Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 15.
  26. http://www.diracdelta.co.uk/science/source/i/n/interaural%20level%20difference/source.html#.WGywX7ElyEI
  27. Farnell, „Designing Sound“, P 72.
  28. Jin, Corderoy, Carlile, van Schaik, „Spectral Cues in Human Sound Localization“.
  29. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 15. Regarding the different scientific theories on this phenomenon, see also Pp 15-17.
  30. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 15.
  31. Wallach, „The role of head movements and vestibular and visual cues in sound localization“, Pp 339-354.
  32. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 20.
  33. LTI process = Linear time-invariant process.
  34. HRTF = Head related transfer function.
  35. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 20.
  36. Blauert, „Spatial Hearing. The Psychophysics of Human Sound Localization“, Pp 205-213.
  37. Rumsey, „Spatial Audio“, P 25.
  38. Rumsey, „Spatial Audio“, P 35.
  39. Brian Hook, talk „Oculus Connect: Introduction to Audio in VR“. Released on 29.10.2014 at OculusConnect2014: https://www.youtube.com/watch?v=kBBuuvEP5Z4
  40. Weinzierl, „Handbuch der Studiotechnik“, P 23.
  41. ASW = Apparent Source Width.
  42. Rumsey, „Spatial Audio“, P 37.
  43. IACC = Inter-Aural Cross Correlation.
  44. Rumsey, „Spatial Audio“, Pp 36-37.
  45. Rumsey, „Spatial Audio“, P 38.
  46. Boer, „Stereophonic sound reproduction“, Diss.
  47. Weinzierl, „Handbuch der Studiotechnik“, P 118.
  48. Cherry, E. C., „Some experiments on the recognition of speech, with one and two ears“, J. Acoust. Soc. Am. 25, Pp 975–979.
  49. Xie, „Head-Related Transfer Functions and Virtual Auditory Display“, P 28.








Entry in the DiracDelta.co.uk science and engineering encyclopedia.

– Web-Page. Accessed: 22.12.16.

http://www.diracdelta.co.uk/science/source/d/u/duplex%20theory%20of%20localization/source.html#.WGyuqbElyEI

Wakenland, Carl. „The importance of Audio in VR“.

– Article, Web-Page. Accessed: 05.01.17.


Huiberts, Sander and Van Tol, Richard. „IEZA: a framework for game audio“.

– Article, Web-Page. Accessed: 14.12.16.

http://www.gamasutra.com/view/feature/3509/ieza_a_framework_for_game_audio.php

Eren, Aksu. „Future of MR“. FraVR 2016.

– Talk, Web-Page. Accessed: 15.10.16.


Jin, Craig T. and Corderoy, Anna and Carlile, Simon and van Schaik, André. „Spectral Cues in Human Sound Localization“. Neural Information Processing Systems Conference.

– Paper, Web-Page. Accessed: 18.12.16.



Wilkinson, Simon. „18 months experimenting with storytelling in VR“. FraVR 2016.

– Talk, Web-Page. Accessed: 22.12.16.


Carlile, Simon. „Cocktail Parties, Presence and Compelling NextGen Audio“. GDC 2008.

– Talk, Web-Page. Accessed: 05.11.16.


Gumbleton, Simon. „Audio for AAA Virtual Reality Experiences“. GDC 2016.

– Talk, Web-Page. Accessed: 03.01.17.


Hook, Brian. „Oculus Connect: Introduction to Audio in VR“. OculusConnect 2014.

– Talk, Web-Page. Accessed: 27.12.16.


Smurdon, Tom. „3D Audio: Designing Sounds for VR“. Oculus Connect 2 2015.

– Talk, Web-Page. Accessed: 03.12.16


Ward-Foxton, Nicholas. „Environmental Audio and Processing for VR“. GDC2015.

– Talk, Web-Page. Accessed: 11.12.16.


Andersen, Stig. „Playdead INSIDE“. Wwise Tour 2016.

– Talk, Web-Page. Accessed: 04.12.16.


Przybylowicz, Marcin. „CD Projekt Red Witcher“. Wwise Tour 2016

– Talk, Web-Page. Accessed: 06.01.17.


Tsingos, Nicolas and Gascuel, Jean-Dominique. „Fast rendering of sound occlusion and diffraction effects for virtual acoustic environments“.

– Paper, Web-Page. Accessed: 13.10.16


HTC Vive. Commercial Reference.

– Web-Page. Accessed: 18.12.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 18.12.16.


Playstation VR. Commercial Reference.

– Web-Page. Accessed: 18.12.16.


Ableton Live. Commercial Reference.

– Web-Page. Accessed: 21.10.16.


Mathworks. Graphic.

– Web-Page. Accessed: 13.12.20.


Entry in the Cambridge Dictionary.

– Web-Page. Accessed: 05.12.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 18.11.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 12.10.16.


Ableton Live. Commercial Reference.

– Web-Page. Accessed: 13.12.16.


Sonnox Surpressor. Commercial Reference.

– Web-Page. Accessed: 14.11.2016.


Lenard Audio Institute. Article

– Web-Page. Accessed: 19.11.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 19.12.16.


Wwise. Commercial Reference.

– Web-Page. Accessed: 19.11.16.



Wwise. Commercial Reference.

– Web-Page. Accessed: 13.12.16.

https://www.audiokinetic.com/library/edge/?source=WwiseFundamentalApproach&id=understanding_rtpcs

Wwise. Commercial Reference.

– Web-Page. Accessed: 09.11.16.


Boogeyman VR Game. Video. Commercial Reference.

– Web-Page. Accessed: 12.10.16.


Paranormal Activity VR Game. Video. Commercial Reference.

– Web-Page. Accessed: 11.11.16.


Hodgson, Jonathan. Film of a Poem by Charles Bukowski. „The man with the beautiful eyes“.

– Video, Web-Page. Accessed: 16.11.16.


Notes on Blindness VR Experience – ARTE. Video. Commercial Reference.

– Web-Page. Accessed: 10.10.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 30.11.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 31.10.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 30.10.16.


Oculus Rift. Commercial Reference.

– Web-Page. Accessed: 29.11.16.


Softube Classic Channel. Commercial Reference.

– Web-Page. Accessed: 09.01.17.


3DCEPTION. Commercial Reference.

– Web-Page. Accessed: 08.12.16.


Myriad. Commercial Reference.

– Web-Page. Accessed: 02.01.17.


Wwise. Commercial Reference.

– Web-Page. Accessed: 12.12.16.


The Pull – ARTE. Video. Commercial Reference.

– Web-Page. Accessed: 10.11.16.


Interstellar OST. Video. Commercial Reference.

– Web-Page. Accessed: 17.11.16.


Facebook 360 Spatial Workstation. Commercial Reference.

– Web-Page. Accessed: 09.11.16.


International Telecommunication Union. Technical Recommendation.

– Web-Page. Accessed: 28.12.16.


Wwise. Commercial Reference.

– Web-Page. Accessed: 19.11.16.


NI Kinetik Metall. Commercial Reference.

– Web-Page. Accessed: 02.12.16.


LA-610 MK II. Commercial Reference.

– Web-Page. Accessed: 03.01.16.


SE1X. Commercial Reference.

– Web-Page. Accessed: 09.12.16.


Ableton Live. Commercial Reference.

– Web-Page. Accessed: 27.11.16.


Unity3D. Commercial Reference.

– Web-Page. Accessed: 01.11.16.


Unreal Engine. Commercial Reference.

– Web-Page. Accessed: 26.11.16.

https://docs.unrealengine.com/latest/INT/Engine/Audio/Overview/index.html#generalvolumeguidelines

Videogameaudio.com. Graphic.

– Web-Page. Accessed: 04.12.16.

http://videogameaudio.com/FullIndie-Apr2015/GameAudioMiddleware-FullIndie-SchoolOfVideoGameAudio-LPaul-Apr2015.pdf

Arturia. Commercial Reference.

– Web-Page. Accessed: 06.11.16.


Arturia. Commercial Reference.

– Web-Page. Accessed: 27.10.16.




Collins, Karen and Kapralos, Bill and Tessler, Holly. „The Oxford Handbook of Interactive Audio“. Oxford: Oxford University Press, 2014.

Stevens, Richard and Raybould, Dave. „Game Audio Implementation“. Boca Raton FL: CRC Press, 2016.

LaBelle, Brandon. „Background Noise. Perspectives on Sound Art“. New York: Continuum Int. Publishing Group, 2006.

Farnell, Andrew James. „Designing Sound. Practical synthetic sound design for film, games and interactive media using dataflow“. London: Applied Scientific Press, 2008.

Yewdall, David Lewis. „Practical art of motion picture sound – Fourth Edition“. Waltham, MA: Focal Press, 2012.

Raffaseder, Hannes. „Audiodesign – Second Edition“. Hamburg: Hochschule für Angewandte Wissenschaften Hamburg, 2010.

Leenders, Matts Johan. „Sound für Videospiele. Besondere Kriterien und Techniken bei der Ton- und Musikproduktion für Computer- und Videospiele“. Marburg: Schüren Verlag GmbH, 2012.

Augoyard, Jean-François and Torgue, Henry. „Sonic Experience. A guide to Everyday Sounds“. Québec: McGill-Queen's University Press, 2005.

Xie, Bosun. „Head-related transfer function and virtual auditory display – Second Edition“. Plantation, FL: J. Ross Publishing, 2013.

Rumsey, Francis. „Spatial Audio“. Burlington MA: Focal Press, 2003.

Beck, Jay and Grajeda, Tony. „Lowering the boom: critical studies in film sound“. Champaign: University of Illinois at Urbana-Champaign, 2008.

Görne, Thomas. „Tontechnik: Schwingungen und Wellen, Hören, Schallwandler, Impulsantwort, Faltung, Sigma-Delta-Wandler, Stereo, Surround, WFS, Regiegeräte, tontechnische Praxis“. München: Carl Hanser Verlag GmbH & Co. KG, 2006.

Görne, Thomas. „Mikrofone in Theorie und Praxis“. Aachen: Elektor-Verlag GmbH, 2007.

Boll, Monika. „Nachtprogramm: Intellektuelle Gründungsdebatten in der frühen Bundesrepublik“. Münster: Lit Verlag, 2004.

Zindel, Udo and Rein, Wolfgang. „Das Radio-Feature: Ein Werkstattbuch. Inklusive CD mit Hörspielen“. Praktischer Journalismus Band 34, 2. Aufl. Konstanz: UVK, 2007.

Crook, Tim. „Radio Drama. Theory and practice“. London; New York: Routledge, 1999.

Collins, Karen. „Game Sound“. Cambridge MA: MIT Press, 2008.

Herkman, Juha and Hujanen, Taisto and Oinonen, Paavo. „Intermediality and Media Change“. Tampere: Tampere University Press, 2012.

Hand, Richard J. and Traynor, Mary. „The Radio Drama Handbook“. New York: Continuum Books, 2011.

Poe, Edgar Allan. „Edgar Allan Poe: Essays and Reviews“. New York: The Library of America, 1984.

Cherry, E. C. „Some experiments on the recognition of speech, with one and with two ears“. J. Acoust. Soc. Am. 25, Cambridge MA: Massachusetts Institute of Technology, 1953.

Lanza, Joseph. „Elevator Music. A Surreal History of Muzak, Easy-Listening and other Moodsong“. Ann Arbor MI: University of Michigan Press, 2007.

Huiberts, Sander. „Captivating Sound“. Portsmouth: University of Portsmouth, 2010.

Cohen, Annabel J. „Music as a Source of Emotion in Film“. Chapter 13 of Juslin, Patrik N. „Handbook of Music and Emotion: Theory, Research, Applications“. Oxford: Oxford University Press, 2012.

Boer, Klaus. „Stereophonic sound reproduction“. Dissertation. Delft: Institute of Technology, 1940.

Wallach, Hans. „The role of head movements and vestibular and visual cues in sound localization“. Swarthmore College, 1940.

Sandmann, Thomas. „Effekte & Dynamics“. Bergkirchen: PPVMedien, 2001.

Roberts-Breslin, Jan. „Making Media“. Burlington MA: Elsevier, 2008.

Lilli, Waldemar. „Grundlagen der Stereotypisierung“. Goettingen: Hogrefe, 1982.

Chion, Michel. „Audio-Vision: Sound on Screen“. New York: Columbia University Press, 1994.

Kane, Brian. „Sound Unseen. Acousmatic Sound in Theory and Practice“. New York: Oxford University Press, 2014.

Blauert, Jens. „Spatial Hearing. The Psychophysics of Human Sound Localization“. Stuttgart: Hirzel Verlag, 1974.

Marks, Aaron. „The complete guide to Game Audio. For Composers, Musicians, Sound Designers and Game Developers“. Burlington MA: Elsevier, 2009.

Weinzierl, Stephan. „Handbuch der Audiotechnik Band I“. Heidelberg: Springer Verlag, 2008.
