3D AUDIO PROGRAMMING Daryl Sartain
Introduction
Millions of dollars are spent on research and development for better audio and visual graphics for computer games. Why? Why are we so intrigued with mastering the skills of mimicking nature? Can it even be done? Philosophers, poets, and scientists have argued for centuries over the degree to which perception is subjective. Science has proven that the material world is 99.9% empty space. Psychology and neurology have revealed that one individual's perceived experience of objective reality can be as distinct from the next person's as is his or her DNA. Yet we agree on the color of traffic signals, the speed of light, and the many other laws of physics, regardless of our knowledge or ignorance of them.
Whether or not we ever agree that this empirical, material world is the real world, or the best world, we seem to have tacitly consented that most things in this world affect us as perceivers in a very similar way. Perhaps the road to greater perception is paved with the efforts of mastering this one. Perhaps there is a reason behind this fascination with gaming, which some might call a waste of money, energy, and education. Perhaps our current mode of recreation is exactly as meaningful as the etymology suggests - a re-creation. Or, perhaps it is all for naught. Therein lies the mystery, the indulgence, and the magic of programming.
As programmers of sights and sounds, we re-create perception to replicate sensation. In the realm of hearing, the second of our five senses, the only way we can hope to replicate the complex process of auditory comprehension is by understanding both the sensation as we perceive it and the laws of physics that it is compelled to obey. This understanding makes it possible to reproduce technically a sensation that is as close as we have come to replicating any portion of the senses - 3D Sound.
Science and technology multiply around us. To an increasing extent they dictate the languages in which we speak and think. Either we use those languages, or we remain mute.
~ J.G. Ballard ~
In the course of my career, I have been asked to lecture on the subject of audio in many different settings. I have watched as the imaginative ideas of my colleagues have grown into new technologies, contributing crucial building blocks to the sturdy path of progress. I have also seen a great deal of insight, creativity, and midnight oil poured into creations that were washed off the path with the first rain of scrutiny.
What follows is an overview of each of the chapters within this book and an explanation of how this material is intended to be used. The book is organized into three sections, each providing a greater depth of discussion on 3D audio. Section 1 may prove to be a review for you, or it may introduce new terms and concepts on the perception of sound. Section 2 focuses on the programming model for 3D audio, including some practical trade-offs. Finally, Section 3 moves into much of the theory of 3D audio, but also shares some experience gained from applying these techniques.
Chapter 1 addresses the types of sound systems that exist today and how these relate to each other. This explanation then ties into 3D audio systems.
Chapter 2 deals with the physics of sound. This is fundamental to understanding how sound moves through an environment over time, and it introduces many of the terms used throughout this book.
Chapter 3 introduces the human auditory system and its characteristics. With this information, we begin to understand how the hearing system detects location and which sounds are perceived. When building a system whose purpose is to emulate the human auditory system, it is important to understand the mechanisms of this system.
Chapter 4 details the method by which information is interpreted and various relationships that exist in sound. These relationships may be exploited when creating a 3D sound synthesis solution in order to improve the perceived quality of the system.
Chapter 5 leads the next section of this book with an overview of psychoacoustics. Psychoacoustics is the way in which you interpret (psycho) the sounds (acoustic) that you hear. Many sounds are not perceived, while others are only marginally perceived or are strongly perceived. This information helps determine how to build and optimize a 3D sound system.
Chapter 6 is a summary of Microsoft's open standard API (application programming interface) used for standard audio streams and 3D audio streams. This chapter describes the assumptions that a programmer will need to understand for the Windows(tm) operating system.
Chapters 7 and 8 develop a programming example. Chapter 7 focuses on the development of the application and much of the necessary setup, and Chapter 8 focuses on the features of 3D audio.
Chapter 9 provides a mathematical model of a 3D sound synthesis technique known as Head Related Impulse Response (HRIR). This chapter is very technical and introduces many equations and data models. However, the value of this chapter is not limited to those seeking to implement the equations. It also discusses several tradeoffs between time-domain calculations and the other dominant technique, Head Related Transfer Functions (HRTF), which is a frequency-domain technique.
Chapter 10 moves heavily into the analysis of a system used to create 3D audio sounds. This chapter compares various levels of simulation accuracy and the amount of processing required for them.
Chapter 11 presents a relatively new technology in the field of 3D audio called stereo dipole. This technology uses some of the concepts presented in the first section of this book to model another method of generating sounds. As an overview, the technique places the two speakers so close together that they would ordinarily sound as if they were at one location. However, with the help of signal processing, the audio wavefronts generated by the two speakers are able to create a wide area in which the listener may sit and have high-quality 3D effects.
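As a small preview of the time-domain approach mentioned for Chapter 9, the sketch below convolves a mono signal with one impulse response per ear. It is only an illustration under stated assumptions, not the model developed in that chapter: the function names and the two toy impulse responses are hypothetical, and real HRIRs are measured data that are far longer. The frequency-domain (HRTF) route produces the same result by multiplying spectra instead of convolving samples.

```python
def convolve(signal, impulse_response):
    """Direct time-domain convolution: y[n] = sum over k of h[k] * x[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Filter one mono signal with a per-ear impulse response (hypothetical helper)."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Hypothetical toy impulse responses for a source on the listener's left:
# the left ear hears the sound sooner and louder than the right ear.
HRIR_LEFT = [1.0, 0.5]
HRIR_RIGHT = [0.0, 0.0, 0.5, 0.25]

mono = [1.0, 0.0, 0.0]  # a unit impulse as the test signal
left, right = spatialize(mono, HRIR_LEFT, HRIR_RIGHT)
print(left)
print(right)
```

Feeding in a unit impulse simply returns each impulse response padded with zeros, which makes the delay and level difference between the two ears easy to see. The nested loop costs on the order of signal length times response length, which is one reason the choice between time-domain and frequency-domain processing matters in practice.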