A new way to listen: Expansive audio formats hold promise for cinemas

After a relatively quiet 20 years, cinema sound has rejoined the list of important issues that today’s exhibitors need to consider. While the industry as a whole has been occupied with the conversion to digital projection, cinema sound has remained relatively calm, with occasional and incremental changes in the established soundtrack formats.

Few veteran exhibitors can forget the choices, competitive positioning, and format confusion of the early 1990s, when a host of new 35mm-based digital formats from vendors such as Kodak, DTS, Sony and Dolby burst upon the scene. Regardless of the turmoil, cinema sound converted over to digital in the 1990s, often with dramatic results, and well before digital projection became viable. The adoption of digital sound eliminated 60 years of problems, such as the noise and wear associated with film-based formats, while allowing 35mm prints to carry more audio channels with higher quality.

All of the competing formats of the 1990s were largely built on the idea of extending the creative impact with the addition of more channels, in particular the surround channels with their ability to place and move sounds within the auditorium. From their start, digital sound formats were based on a minimum of three channels behind the screen, a single subwoofer channel to reproduce the low-frequency effects, and a pair of channels—or stereo surrounds—covering the auditorium. The loudspeaker arrangement, generally referred to as the 5.1 configuration, became ubiquitous in practically all cinemas and later became the de facto standard for consumer home derivatives such as DVD and broadcast.

The fundamental issue with digital-on-35mm film formats was bandwidth allocation: only so much data can be reliably recorded on a film print. Sound vendors were forced to choose how to allocate the limited amount of data into the channels that would create the most “bang for the buck” in the auditorium. Sony SDDS focused on improving across-the-screen resolution by accommodating five channels behind the screen. Dolby focused on improving front-to-back and across-the-back resolution by increasing the number of surround channels with Dolby EX 6.1. Others suggested that adding height channels, or increasing the top-to-bottom resolution, would yield worthwhile benefits.

As digital projection gained momentum and took the focus (and investment), the sound vendors continued to make improvements. As standards were established, the basic 5.1-channel configuration was set as the minimum acceptable playback for DCI content. Fortunately, the developers of the DCI specifications also realized that cinema soundtracks would continue to evolve and had the foresight to include room for up to 16 conventional audio channels in the specification.

3D Audio?

The 5.1-channel formats were a radical improvement over previous analog formats and even to this day continue to have widespread acceptance from the industry and audiences around the globe. But their channel limitations have become apparent, especially to those within the sound post-production community.

Although today’s audiences rarely complain that 5.1 or 7.1 is not enough channels, sound designers, particularly those developing complex mixes for 3D blockbusters, have found that the 5.1 arrangement puts restrictions on where sounds can be placed and moved within the auditorium. While audiences need not care or be aware of how many channels are in use, the filmmakers’ ability to create and deliver new and exciting soundtracks keeps audiences surprised and returning for more.

For distributors, the 5.1 format and all its subsequent variations were becoming a bigger problem. The root of this issue is that 5.1 was solely a “channel-based” format, which assumes that the playback environment is virtually identical to that used in the mix room. Sounds are grouped into discrete channels that are expected to be reproduced in the theatre in exactly the same manner. Theatres, however, are different, with a range of auditorium and screen sizes. As exhibitors equipped their auditoriums with various formats, distributors found they had to prepare different mixes on different DCPs to fit the various playback situations, leading the industry further away from the long-term goal of single-inventory distribution.

The increasing number of playback situations to support has created a growing gap between what was intended during the recording and what is possible during playback. Future formats needed to be capable of delivering more channels, or at least more spatial resolution, in the larger auditoriums, but at the same time simplifying the post-production and distribution process so a single soundtrack mix would play correctly over a wide range of playback situations.

Exhibitors know that many of their patrons have 5.1 or more at home, and cinemas need to retain a unique advantage. Naturally, the larger flagship screens want a differentiator to identify their sound. For both technical and marketing reasons, the audio equivalent of a 3D picture was needed.

Channels and Objects

As cinemas and the rest of the media world converted to digital, audio engineers began to think of audio formats in completely new terms. The channel-based formats we use today are a holdover from previous days, when channels were essentially fixed paths between the recorded tracks and the loudspeakers. With digital, there are no physical channels, but instead files or bit streams of combined and encoded data. With advanced techniques, it becomes possible to have each individual sound within the mix act as its own discrete element. Each sound can be individually identified and encoded with its own information about where the director or sound designer intends it to be heard at any moment.

This technique is known as “object-based” mixing. The entire soundtrack can be thought of as a collection of sounds, each carrying its own information about where it should come from. In the theatre, the cinema sound processor quickly matches the individual sounds to the physical configuration of that particular auditorium, a process known as real-time rendering, to achieve the best possible playback based on how many channels are installed. Because the approach is inherently scalable, exhibitors can add to or alter the auditorium’s loudspeaker configuration over time, improving their sound at a pace that matches their own budget.
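To make the idea of real-time rendering concrete, here is a minimal sketch in Python. It is not any vendor's algorithm: it assumes a simple distance-based panning rule, and the function name `render_gains`, the `rolloff` parameter, and the two room layouts are all hypothetical. The point is only that the same object position can be turned into loudspeaker gains for rooms with different numbers of speakers.

```python
import math

def render_gains(obj_pos, speakers, rolloff=2.0):
    """Return one gain per loudspeaker for a sound object at obj_pos.

    Assumed rule (illustrative only): closer speakers get more level,
    and gains are normalized so their squared sum (total power) is 1.
    """
    weights = []
    for spk in speakers:
        d = math.dist(obj_pos, spk)          # straight-line distance
        weights.append(1.0 / (d + 1e-3) ** rolloff)
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]

# Two hypothetical auditoriums (x = left to right, y = screen to back wall).
small_room = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
large_room = small_room + [(0.0, 0.5), (1.0, 0.5), (0.25, 1.0), (0.75, 1.0)]

# The same object metadata is rendered separately for each room at playback.
helicopter = (0.9, 0.5)  # near the right wall, halfway back
small_gains = render_gains(helicopter, small_room)
large_gains = render_gains(helicopter, large_room)
```

In the larger room the object lands almost entirely on the side-wall speaker nearest to it, while the smaller room has to share it between the front-right and rear-right speakers. Nothing in the mix itself changes between the two cases.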

We now take a detailed look at four companies that have pioneered a more expansive audio experience in cinemas.

Iosono

In 2004, Iosono was announced as a spinoff of Germany’s Fraunhofer Institute, the developers of MP3 and more, offering 3D sound for the cinema. Iosono uses a continuous ring of audio channels aligned on a horizontal plane completely surrounding the auditorium, along with ceiling speakers to add height when desired. In production, an object-based mix is created so that each sound object is defined with its directional characteristics. For playback in the cinema, a digital processor renders the sound objects to a flexible number of loudspeakers, with the intent of reconstructing the original sound field with their original intensity, distance and direction.

The Iosono system was introduced to Hollywood with a number of impressive demonstrations, which opened the eyes of many about what is possible in the future. However, the apparent high cost of adding the equipment was enough to stall serious interest from most exhibitors, who were still anxious about the investment into digital cinema and 3D visual systems. So far, Iosono has met with limited success in the cinema market, but has achieved recognition in special-venue applications such as theatre, live events and amusement parks.

Barco’s Auro

In 2010, projector manufacturer Barco announced a partnership for cinema applications with Belgium’s Auro Technologies LLC, a group of audio engineers who had developed Auro-3D, a cinema-specific format that supports 11.1 channels. While a traditional channel-based format, Auro-3D can be thought of as a stacked 5.1 system, with the upper channels mounted near the top of the side and back walls delivering the height audio channels. An auditorium ceiling channel is typically used, bringing Auro-3D to 11.1 channels.

Auro-3D has a clever twist to ensure compatibility with existing 5.1-channel auditoriums. The height information is encoded separately, leaving the full (lower + upper) 5.1 mix in the DCP package essentially untouched. If the Auro-3D equipment is installed, all the audio channels are decoded to yield the lower and upper channel groups.

So far, Auro-3D has been installed in 36 cinemas around the globe, with announced commitments from several exhibition circuits. Their initial rollout had the support of Lucasfilm with their release of Red Tails this past January. More upcoming Auro-3D titles are expected to be announced this fall.

imm Sound

Barcelona is the home of imm Sound, a Spanish startup that announced the development of its “immersive” 3D cinema soundtrack format in 2010. Purely object-based, imm Sound uses a workstation on the mixing stage to encode the individual sound objects so they can be reproduced in different-sized auditoriums with different numbers of loudspeakers while still staying as close as possible to the creative intent.

As of CineEurope 2012, imm Sound had approximately 40 installations around the globe, and a short list of upcoming titles. Dolby acquired imm Sound this past July after they found imm Sound had compatible engineering work in progress which could be used to speed the adoption of their new format, Dolby Atmos.

Dolby Atmos

Dolby announced their new audio technology and brand at CinemaCon 2012 in a top-notch marketing blitz. Quietly in development for several years, Dolby’s Atmos is a hybrid approach, designed to combine the best aspects of both channel-based and object-based soundtrack mixes. Working closely with industry sound designers and recording engineers, Dolby found that the channel-based approach works better with certain sounds, such as “beds” of ambiances and background effects, while the object-based approach lends itself well to discrete sound effects that require point-to-point movement.

During post-production, Dolby Atmos works with the existing mixing tools preferred by the creative team along with new Dolby-supplied hardware to create a master soundtrack file that contains all of the audio essence and artistic intent. This file is later embedded within a single DCP (which also contains standard PCM 5.1 and 7.1) and is rendered during playback in a way that is creatively correct (i.e., as close as possible to the sound approved during the mix) but independent of channel count or loudspeaker locations.

In the cinema, the Dolby Atmos decoder uses pre-set information about the existing auditorium’s loudspeaker layout and capabilities to determine how to best reproduce the soundtrack for that particular auditorium. Dolby has produced guidelines advising exhibitors and installers how to best optimize the number and position of new loudspeakers for individual auditoriums.

On a technical level, the Dolby Atmos format is scalable up to 128 channels, although practical installations will use considerably fewer channels depending on the particular auditorium. The audio coding is uncompressed, using 24-bit 96-kHz samples, considered the gold standard by recording engineers.
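For a sense of scale, the figures quoted above imply a sizable raw data rate. The short Python calculation below is a back-of-the-envelope sketch only: it assumes every channel is carried as plain PCM at the stated bit depth and sample rate and ignores metadata and packaging overhead. The function name `pcm_bit_rate` is hypothetical.

```python
def pcm_bit_rate(channels, bit_depth=24, sample_rate_hz=96_000):
    """Raw PCM bit rate in bits per second (no metadata or packaging)."""
    return channels * bit_depth * sample_rate_hz

full_scale = pcm_bit_rate(128)   # the stated upper limit of 128 channels
five_one = pcm_bit_rate(6)       # a 5.1 mix at the same depth and rate

print(f"128 channels: {full_scale / 1e6:.3f} Mbit/s")      # 294.912 Mbit/s
print(f"5.1 (6 channels): {five_one / 1e6:.3f} Mbit/s")    # 13.824 Mbit/s
```

Under these assumptions the 128-channel ceiling works out to roughly 21 times the raw rate of a six-channel mix, which is one reason practical installations and mixes are expected to use far fewer channels.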

Dolby has chosen not to label Atmos as “3D sound” for several reasons. First, it can be argued that cinema sound loosely had the 3D equivalent ever since the introduction of the behind-the-listener surrounds in the 1950s. Second, Dolby sees the benefits of Atmos extending to all titles and wants to avoid the impression that its benefits are limited to 3D releases.

Dolby rolled out Atmos to the public in approximately 15 theatres for the opening of Disney and Pixar’s Brave in June using pre-production equipment, and plans to release details on how Dolby Atmos will be implemented in their lineup of cinema processors in the future. While going slow and steady through the development cycle, once their cinema-grade equipment is ready, Dolby expects the number of Dolby Atmos titles and installations to accelerate rapidly in 2013.

With the new soundtrack formats from Dolby, Barco, Iosono and potentially others, coupled with the ongoing advancements in digital projection such as high frame rates and laser illumination, the industry will have plenty of new items on the list to stay aware of. Fortunately, at least with sound, the rollout timeline for the technology previews we are seeing today will give exhibitors the opportunity to evaluate each and carefully decide if, and when, to upgrade their sound. As always with sound, it’s important for us to pause, learn and, most importantly, listen.