
Sensing the News Home


Group Members: Jim Andrews, Sue Johnson, Regina McCombs, Matt Thuesen



What is interactive audio?
One must bear in mind that audio is always already interactive insofar as it does make it to the ear, which cannot be closed like the eye, unless the volume is set to 0. Making it to the brain appears to be a longer trek. Information is always already interactive insofar as we ourselves create meaning. It isn't 'there'. We create it.

That said, 'interactive audio', in the sense we're using it, refers to audio that is under the control of its operators: the user(s) and the programmer. That control can take many forms, from the humble volume control to a finger on the 'record' button, the 'edit' button, the 'filter' button, and the other functions that producers and consumers perform on sound. It could also refer to the ability to change channels, as on a radio. And if the programmer adopts the notion of 'sound as object', then sounds themselves can have incarnations as visual icons, animations, or texts (or whatever), and the audience can drag and drop them, carve them up, and do the other things we associate with operations on visual objects. It is interesting to think of the things we do to visual objects and then look for the analog in 'sound as object': what does it mean to shake or stretch or hit a sound?

What is immersive audio?
One must bear in mind that audio is always already immersive: the distinction we make between 2D and 3D visual images does not carry over precisely to, say, mono versus stereo sound. Also, the depth of 'immersion' may sometimes be more appropriately measured by depth of attention than by the 3Dness of the audio environment.

So let's not limit our usage of the term 'immersive audio' to 3D audio, but make special mention of it when we do use it to refer to 3D audio.

How could interactive audio be used by the news business? Examples of specific use.
Let us say, first of all, that whatever production tools you use or construct for audio, think of them in the hands of your audience as well. If, for instance, you construct a way of searching your audio that you intend to use only as a production tool, stop and ask yourself whether and how it would be useful to your audience to have that ability. Chances are that if you really need it, so do they, in some sense.

Which brings us to a natural application of interactive audio in the news: let your audience search for audio on your site as they would for text. This is less practiced than one might think. It does not require indexing the entire sound file, just key words. This is, in a sense, a move toward 'sound as textual object' insofar as searches are conducted with text.

A good way to enable this is to do what http://www.wbez.org does: put a link to the audio alongside the text of the story it accompanies, if there is text. Their site nicely links text to sound but does not provide search for either; if only they had a search engine. Given how well they link text to sound, searching the text alone would often be enough to find the appropriate audio. WBEZ even divides some of its programs, such as "Eight Forty Eight", into sections for better Internet audio access, i.e., the shows are chopped into audio segments that correspond with the parts of that day's show.
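The approach described above, searching the story text and following its link to the audio, can be sketched as a small keyword index. This is only an illustration under stated assumptions: the `Segment` record, the `tokenize`, `build_index`, and `search` functions, and the example.org URLs are all hypothetical, and nothing here describes how WBEZ or any other site actually works.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One audio segment of a show plus the story text that accompanies it."""
    title: str
    text: str
    audio_url: str  # hypothetical link to the audio file


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def build_index(segments: list[Segment]) -> dict[str, set[Segment]]:
    """Map each word in a segment's title or text to the segments containing it."""
    index: dict[str, set[Segment]] = defaultdict(set)
    for seg in segments:
        for word in tokenize(seg.title + " " + seg.text):
            index[word].add(seg)
    return index


def search(index: dict[str, set[Segment]], query: str) -> list[str]:
    """Return audio URLs of segments whose text contains every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(seg.audio_url for seg in matches)


# Hypothetical segments of one day's show.
segments = [
    Segment("School funding",
            "The city council votes on school funding tonight.",
            "https://example.org/audio/segment1.mp3"),
    Segment("Lake levels",
            "Researchers report falling lake levels this summer.",
            "https://example.org/audio/segment2.mp3"),
]
index = build_index(segments)
print(search(index, "school funding"))
```

Only the text is indexed; the audio itself is never analyzed. That is why this works as a practical shortcut, and also why it cannot find anything that was said on air but never written down.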

Another example of interactive audio is as one part of a configurable multimedia news delivery machine in the browser or on the desktop. Internet radio can be live or archival, continuous or periodically delivered to the doorstep, as it were, via email links or via a toolbar (like the one at www.toolbar.google.com ). Radio is relatively unobtrusive, or can be, in the desktop environment. A toolbar that could be configured to deliver news in a particular medium at intervals determined by the audience would be useful: a sort of news alarm clock (however alarming) plus news search mechanism. Interactive audio, a la the radio, would be an important part of such a toolbar/news machine, but not the only medium it could deliver.

The integration of radio and the Web involves quite a bit of 'sound as object', because the ways that we browse are primarily visual and textual. But we can browse sonically also. Which brings us to another application of interactive audio in the news. It is possible to stream audio to the listener extremely quickly if the sound quality is moderate. So menus could be sound-enabled, i.e., audio describing a story could be triggered on mouseover of the menu link.
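A rough calculation suggests why moderate-quality audio can start quickly enough to work on a mouseover. The sketch below estimates how long a listener waits while an initial buffer of audio downloads. The function name and every figure in it (a 16 kbps stream, a 128 kbps stream, a 2-second buffer, a 56 kbps connection) are illustrative assumptions, not measurements from any real service.

```python
def startup_delay_seconds(audio_kbps: float, buffer_seconds: float,
                          link_kbps: float) -> float:
    """Time to download `buffer_seconds` of audio before playback can begin.

    Ignores network latency and protocol overhead, so it is a lower bound.
    """
    if link_kbps <= 0:
        raise ValueError("link_kbps must be positive")
    kilobits_to_fetch = audio_kbps * buffer_seconds
    return kilobits_to_fetch / link_kbps


# Hypothetical figures: moderate versus higher quality audio on the same link.
moderate = startup_delay_seconds(audio_kbps=16, buffer_seconds=2, link_kbps=56)
higher = startup_delay_seconds(audio_kbps=128, buffer_seconds=2, link_kbps=56)
print(f"moderate quality: {moderate:.2f} s, higher quality: {higher:.2f} s")
```

Under these assumed numbers the moderate-quality clip is ready in a little over half a second, while the higher-quality one takes more than four seconds, which is too long for a menu that is supposed to speak when the mouse passes over it.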

Similarly, hot spots on photos could open audio on particular parts of stories. So could hot spots on other things, such as maps. Clicking the hot spot might also result in a corresponding change in the visuals. But streaming audio can often deliver more information than a change in the visuals can, unless the visuals are streaming too.
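Hot spots of this kind amount to hit-testing: given the position of a click on a photo or map, find the region that contains it and return the audio attached to that region. The following is a minimal sketch assuming rectangular regions in pixel coordinates; the `HotSpot` class, the `find_audio` function, and the clip file names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HotSpot:
    """A rectangular clickable region of an image and its audio clip."""
    left: int
    top: int
    width: int
    height: int
    audio_url: str  # hypothetical clip name

    def contains(self, x: int, y: int) -> bool:
        """True if pixel (x, y) is inside the rectangle (right/bottom edges excluded)."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def find_audio(hot_spots: list[HotSpot], x: int, y: int) -> Optional[str]:
    """Return the clip for the first hot spot containing (x, y), or None."""
    for spot in hot_spots:
        if spot.contains(x, y):
            return spot.audio_url
    return None


# Hypothetical hot spots on a 640x480 news photo.
photo_spots = [
    HotSpot(left=40, top=60, width=200, height=150, audio_url="mayor_quote.mp3"),
    HotSpot(left=400, top=300, width=120, height=100, audio_url="crowd_ambience.mp3"),
]
print(find_audio(photo_spots, 100, 100))  # click inside the first rectangle
print(find_audio(photo_spots, 10, 10))    # click outside every rectangle
```

The same lookup would serve a map, with the rectangles replaced by whatever region shapes the map uses. Returning `None` for a miss leaves it to the caller to decide whether silence or some default clip is appropriate.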

Also, audio can set tone and provide the sounds of the environment of the story, for instance, in a way that introduces the presence of the world with a stronger sense of being there than a photograph or very small video can do, often. And it offers the human voice, the voice of the news, the person, the human reporter/narrator and subjects in the full range of our humanity, if we so desire, in a way that text does not.

In the future, it's likely that in addition to the mouse and the keyboard, voice input will be common to control the computer (via speech recognition). How do we 'talk back' to/at/with the news? Currently, we grumble, mostly, or we correspond concerning it, or act on its knowledge in other ways. One feels that voice recognition software will not be merely voice commands that duplicate menu commands to the letter, but will offer commands that act also on audio, not just on menus and other visual objects. As in, for instance, "Stop right there. Now play that to me again. OK. Now turn it into one of my signature loops and mail it to Tom."

There is excellent use of interactive audio at www.360degrees.org ("perspectives on the U.S. criminal justice system"). The audio is largely independent of the visuals in the sense that we do not have audio from a video stream. But the audio is deeply related to the visuals, so much so that some people actually recall later that they were looking at video. And we can browse the visuals while listening to the audio. The audio provides the voices of the people involved in the story. This piece aspires to provide 360 degrees conceptually around the subject of the story. And indeed it is 'immersive' audio in this sense.

Of course, 'immersive' audio technology is being developed that gives people a sense of being physically immersed in the environment, i.e., 3D sound. This technology will help direct the user's attention, letting them navigate to where the reporter's voice is in 3D panoramic news reports, for instance. It will let people navigate to parts of the visual 3D world that are currently out of view. It will also provide a type of physical/perceptual immersion in the reportage that is different from, but related to, the first type of 'immersion' mentioned.

Consider another view of 'immersion' and audio space expressed by Helen Thorington at http://amsterdam.nettime.org/Lists-Archives/nettime-l-9804/msg00052.html :

"I am convinced that sound on the internet is going to be more and more significant. It is, from my point of view, a grounding material. Sound is a way of creating space. You can create space with sound, so you can in this very immaterial (again: as in radio) area, locate people, temporarily, through the use of sound in a space, a geography. I think this is very important. While we are still geographical people and floating with our feet above the earth it's an instinct to be grounded somewhere. We are loosing this sense, particularly in the corporate world where they are switching people around from one location to another. The sense of belonging to a community anywhere is sort of dissipating in our lives. That does not mean the need for it isn't there. I think sound is one way of creating a space that people can enter and feel that they know where they are, at least imaginatively."

Notice that Thorington is not talking about 3D audio immersion, necessarily. She is talking about both a physical acoustic space and also a mental and emotional space. We don't need 3D to feel immersed or enveloped in audio, but it could be interestingly used.

Impact of Interactive Audio on the Audience, Newsroom, and Journalism Education
Journalists do a lot of searching for information. Some of it is more easily found than the rest. Searching audio recordings is a pain, unlike searching text. The notion of 'sound as object' treats sound as having many properties in common with text and visual objects. One of those would be searchability. More generally, interactive audio tends to treat sound as a material that should be as easy to work with as other media in terms of edit/find/compose functionality. Interactive audio is about giving people control over audio interactively in the same sorts of ways that we have control over other media. And about giving them sonic control over other media besides sound. That's a different attitude toward sound than we have now. Sound is evanescent and is not regarded as being as trustworthy a source of knowledge as the visual. When we start a sentence with "I hear that", we are usually also asking for clarification. When we start a sentence with "I see that", we are expressing considerably more certainty.

Why this difference between the epistemological status of sound and that of the visual? Because sound is evanescent: it isn't written down; it is here and then it is gone. Unless it is recorded. Interactive audio is about turning sound into a material as worthy of epistemological trust as the visual and the written, i.e., recorded sound as an object like other types of objects.

What this means for the newsroom is a different sense of composition concerning sound. A sense of sound not just as the soundtrack of the video, but as independent objects attached to video, or other media, and as keeper of an (acoustic) space that while related to the visual is as different from the visual as what is behind the eye is different from what is in front of the eye. Audio allows you to change points of view more radically than is often evident in the accompanying video.

Research Issues
The question of indexing sound is an important one. A practical way around it was pointed out above: if a work involves both sound and text, make the text searchable and put a link to the audio in the appropriate searchable text. The sound is then quite searchable without being formally indexed. But this is not a replacement for formally indexed audio, which is by no means the norm now.

3D sound is another area: it has been heavily researched, yet the technology does not yet seem to be widely available to the public.

Speech-to-text is another area in which the appropriate hardware and software still seem to be in the offing.

Text-to-speech is common, but the voice tends to be out to lunch or a heavy synth drone. This is largely because we create meaning, i.e., we interpret semantically when we read, and inflect according to our semantic interpretation. But semantic interpretation is a much more difficult computing problem than syntactic parsing of the grammar.

Artists are exploring interactive audio in many ways, on the Web and in installation pieces, for instance. They seek to create new forms of music, i.e., interactive music, in which the audience can compose or experience 'presets' of compositions, and can compose with parts of songs rather than with notes from an instrument. To quote Brian Eno (from http://www.wired.com/wired/3.05/eno.html?pg=4&topic= ):

"What people are going to be selling more of in the future is not pieces of music, but systems by which people can customize listening experiences for themselves. Change some of the parameters and see what you get. So, in that sense, musicians would be offering unfinished pieces of music-pieces of raw material, but highly evolved raw material, that has a strong flavor to it already. I can also feel something evolving on the cusp between "music," "game," and "demonstration"-I imagine a musical experience equivalent to watching John Conway's computer game of Life or playing SimEarth, for example, in which you are at once thrilled by the patterns and the knowledge of how they are made and the metaphorical resonances of such a system. Such an experience falls in a nice new place-between art and science and playing. This is where I expect artists to be working more and more in the future."

From 'we play music' to 'we play with music'. Similarly, there will be quite a bit of movement from 'we play news' to 'we play with news'. We already play with the news in many ways, insofar as playing is what we do when we are having fun or productively interpreting our world. Interactive audio is one part of a wider investigation into the interactive, in which people are given more power to operate on media of all sorts according to their goals.



 
© Regents of the University of Minnesota, 2001. School of Journalism and Mass Communication.