Open discussion: Immersive audio and the need for standards


On Thursday morning, March 27, at CinemaCon, the International Cinema Technology Association will present the seminar “The Road to Open Standards for Immersive Audio,” which will seek to shed some light on the new competing 3D sound formats and the goal of establishing open standards. Film Journal International invited three panel participants—Brian Claypool of Barco, Dean Bullock of Dolby, and John Kellogg of DTS—to share their views on this hot-button issue for movie exhibitors.

Setting the Record Straight

By Brian Claypool
Senior Director, Strategic Business Development, Entertainment, Barco

With another new year of high-profile movies upon us, it’s no surprise that the trend toward bigger and bolder continues, capitalizing on yet another evolutionary phase in digital cinema technology. But this time, it’s not about the quality of the image or the “resolution” of the native pixel count, frame rate or brightness of the 3D image. Now, the focus is on the latest audio formats designed to deliver heightened realism via immersive sound to bring today’s biggest blockbusters to life on the big screen.

What’s in a name?
But what does immersive sound really mean? A lot of words are being tossed around lately to describe the type of cinema audio that envelops an audience with realistic, lifelike sounds seemingly emanating from their points of origin as depicted on the movie screen. Terms like “immersive sound,” “3D sound,” “object-based,” “channel-based,” “VBAP” and “MDA” are fueling a never-ending vernacular that characterizes the benefits of one approach over the other. There are also “brands” wrapped around these techno terms, specifically Auro 11.1 by Barco and Dolby Atmos.

If we look at the dictionary’s definition of “immersive”—pertaining to digital technology or images that deeply involve one's senses—we can all agree that whether the brand advertised is Auro 11.1 by Barco or Dolby Atmos, both offer a heightened experience over traditional 5.1-channel cinema speaker systems, and both fit the definition of “immersive.”

And for the purposes of this article, let’s agree that all these terms apply to various activities taking place in the industry to create a unique audio deliverable that is not the standard 5.1 or 7.1 mix, and is designed to provide audiences with something special that they can’t, at least today, create at home.

Demystifying object-based cinema sound
So-called “object-based” audio is not new. The idea has been around for the better part of a decade and was first proposed for cinema by Iosono around 2006. However, an important fact that remains accurate to this day is that reproducing true object-based audio in a cinema environment is a wildly expensive proposition. Research has shown that a more compelling “immersive” experience can be achieved with far fewer channels, and that it is the critical placement of these “channels” in an environment that delivers the best experience…and at an economically viable price point.

Object-based playback was not designed to be used in cinema environments, but rather for specialty venues and ultimately for the consumer market to bring “virtual surround” technology to headphones and home cinema. Ironically, both markets compete with cinemas for the same audience attendance!

For content creators, the idea of mixing in an “object-based” format offers flexibility that channel-based mixing does not. However, the advantages stop there. The leading maker of object-based audio has developed a closed proprietary system that requires users to buy compatible equipment for future upgrades, in addition to paying royalties and licensing fees.

The irony is that the primary manufacturer of object-based cinema audio insisted only a decade ago that the preferred method of providing a great experience for the majority of the movie audience was through the use of diffuse surround arrays. After all, you can’t cheat physics and make one circa-1998 “surround speaker” provide adequate sound coverage for an auditorium with 400 seats. It’s just not possible. But you can use an array of such speakers to provide the power and coverage required. And that is exactly the premise behind the success of channel-based cinema audio systems like Auro 11.1 by Barco.

Making good sense of channel-based cinema audio
Not commonly understood is that all these formats are channel-based when they are rendered into the cinema. Whether the audio is rendered into 12 channels or 112 channels becomes fodder for academic discussions regarding what is needed to be truly “immersive.”

For example, the unique positioning of the two extra height layers found in Auro 11.1 by Barco allows for the perfect rendering environment for sounds that could never effectively be captured and played back as “objects,” like a symphony or the rich natural ambience of an event. In these, and frankly all cases, Barco’s unique three-layer approach to immersive sound is the optimum format for providing the most natural immersive experience. Certainly, adding speakers overhead allows for dramatic flyover effects, but in nature, few sounds come from directly overhead. As such, our hearing is less acute for sounds from above us than for sounds around us. Therefore, it’s this critical mezzanine layer, unique to Auro 11.1 by Barco, that bridges the acoustical gap between the existing surround array and the overhead speakers that are critical to relaying a more realistic, natural and comfortable immersive listening experience.

The sound qualities of a channel-based sound system like Auro 11.1 by Barco are not the only advantages of this user-friendly solution. It is also the optimum choice for studios and content producers. Traditionally, production workflows and tools have been created to deal with audio as being channel-based, meaning the mix is created in accordance with the channel count and speaker positions that are installed in commercial cinema environments. Until recently, 5.1 has been the default deliverable and this has not changed for nearly two decades.

The devil in the details

Content creators and post-production houses have a vested interest in maintaining consistent workflows to minimize the cost of creating these new types of soundtracks. Unbeknownst to many, both Auro 11.1 by Barco and Dolby Atmos provide this capability to filmmakers in the accompanying production tools. The difference is that Barco’s approach is based on a proposed “open standard,” which means there will be no future license fees or royalties required. However, the Atmos solution is a closed proprietary system, which almost always means that onerous tariffs are on the horizon.

Like studios, exhibitors can only afford to adopt a technology that can add a meaningful differentiation for their audiences at a price point that generates an ROI in line with their investment in the technology itself. Because of the flexibility and cost efficiencies built into the Auro-3D® sound formats, Auro 11.1 by Barco is a perfect step toward embracing immersive sound.

Ultimately, SMPTE will determine the guidelines for an “open standard” in immersive cinema sound to ensure that anyone can develop technology to improve the experience for moviegoers worldwide, with a minimal financial burden on both studios and exhibitors. After all, this is the primary intention. Barco has aligned, and will continue to align, with industry partners to assist in the creation of a universal audio rendering platform to simplify adoption of immersive sound by both sides of the equation.

The “open standard”—freedom of platform choice

Together with leading providers of audio and cinema sound reproduction equipment, Barco is fully invested in the development of an open-format approach to producing immersive cinema sound. The goal is to protect exhibitors’ freedom of platform choice, ensuring their ability to play any movie regardless of which immersive audio system they procure. These efforts are in response to concerns expressed by the National Association of Theatre Owners (NATO) and the Union Internationale des Cinemas (UNIC), which are imploring the industry to devise solutions that enable theatre owners to present movies on any new 3D audio format designed according to the open standard.

The open-standard nature of how Barco and our partners at Auro Technologies approach the content-creation pipeline fits with studios’ existing workflows, enabling them to easily create immersive mixes with the use of the Auro-3D Creative Tool Suite. The soundtracks that these tools produce can then be played on any legacy or new immersive audio system. The Auro-3D Creative Tool Suite allows the talented people working in immersive sound to create channel- and object-based masters in a single software-based solution, with no external hardware rendering required.

Our entire approach is based on an inclusive, rather than exclusive, philosophy bent on encouraging studios to produce more movies featuring immersive sound while enabling exhibitors to show them in their best light. And, from a cost and scalability perspective, the move toward open-standard cinema audio will be especially critical for empowering smaller theatres to take advantage of immersive sound technologies.

Committed to the future
Auro 11.1 by Barco is but one configuration of Auro-3D sound. When the open standard becomes a reality, Barco will offer exhibitors a user-friendly technology that allows for this new format to be played in all Auro-3D configured auditoriums. As a company invested in providing the latest solutions capable of delivering the most relevant premium cinema experience, cinema audio will remain a core focus for Barco.

Side note: History in the making
Before entering the business of sound for cinema, Barco undertook many research projects together with our partners at Auro Technologies. We confirmed our findings and matched them against research conducted by other parties worldwide. For example, at the 2011 symposium of the German Tonmeisters (VDT) in Detmold, it was concluded that “channel-based sound reproduction based on the speaker layouts defined by the Auro-3D format has a much more natural immersive sound and impact than systems using object-based technology.”

Establishing standards that reflect best practices in the real world

By Dean Bullock, Director, Technology Strategy, Cinema, Dolby Laboratories

The “magic of cinema” is grounded in a lot of hard work. The digital-cinema industry is complex, dynamic and multifaceted. It depends on the contributions and interactions of many entities, processes and devices. In this environment, standards can play a critical role to ensure success.

Dolby Laboratories has been a leading contributor to cinema standards for more than 40 years. And what we’ve learned from that experience is that the most effective standards—those that are technically correct and stand the test of time—are based on what has proven to work best in the real world.

Right now, there is a very healthy debate about open standards for object-based audio. But to be truly effective, good standards must be more than just open. They must:

•    Enable exhibitors to provide better, more compelling and differentiated experiences to draw movie audiences to the cinema, and
•    Ensure that filmmakers’ creative vision, their original intent, is presented as completely as possible, without getting lost or distorted along the way.

Standards that are based on technical white papers, on what works in a lab, or on what can survive a controlled public demonstration will not suffice.

Effective and useful open standards build on the technology know-how and insight that come only from real-world experience—creative experience from working directly with filmmakers to develop better tools to present sound in new ways, and practical operational experience from working with exhibitors around the world.

At Dolby, we are working diligently to ensure that any standards for object-based audio are informed by the experience we have gained from developing and implementing Dolby® Atmos™. Dolby Atmos is the only object-based audio solution that has gained traction from the creative community (more than 100 film titles from some of the world’s best-known filmmakers and sound teams), distributors (all major Hollywood studios), and exhibitor partners around the world (more than 450 screens have been installed or committed to in more than 40 countries).

More than a year and a half ago, Dolby backed a study group within the Society of Motion Picture and Television Engineers (SMPTE) to share our findings and define standard solutions to the challenges presented by object-based audio. More recently, we submitted documents describing two important aspects of object-based audio technology based on real-world implementation of Dolby Atmos. We requested that SMPTE consider both documents for publication as Registered Disclosure Documents.

We share the industry’s desire to establish standards as quickly as possible, while ensuring that they are technically correct, that they work in real-world situations, and that they deliver compelling consumer experiences. However, development of good standards takes time, so Dolby has made open specifications available at no charge to a number of third-party manufacturers, including Christie, CineCert, Doremi, DVS, GDC and Qube, with additional manufacturers currently in testing. In addition, more than 40 Digital Cinema Package (DCP) facilities and more than 55 mixing facilities are now equipped to support Dolby Atmos.

In the meantime, filmmakers like Alfonso Cuarón, Ang Lee, Michael Bay, Danny Boyle, Peter Jackson and others can continue to enjoy the freedom of expression unleashed by Dolby Atmos, and movie audiences can rediscover the transformative nature of cinema with sound that transports them into other worlds.

Dean Bullock has worked in the cinema industry since joining Dolby Laboratories in the 1990s as an engineer, working on the Dolby cinema processor product line. Since 2009, he has actively participated in SMPTE committees and working groups.

Multi-Dimensional Sound and the Role of MDA

By John Kellogg, Senior Director, Corporate Strategy & Development, DTS, and Roger Dressler, RWD Consultants

There is a continuing drive to enhance the theatrical experience. Digital cinema has taken images to higher frame rates, increased resolutions and 3D, bringing new creative tools to filmmakers and more involving experiences to theatregoers.

Likewise, sound has evolved over the last century from simple mono to all manner of special effects formats touting extra channels (Fantasia, 1940) and subwoofers (Sensurround, 1970s). With the transition to digital cinema, the last two decades have settled into 5.1 audio as the dominant soundtrack format worldwide.

Even with 5.1 as the de facto standard, the d-cinema format was built for extensibility on a 16-channel platform. The first step beyond 5.1 was to 7.1, subdividing the two surround arrays into four. As simple as this was to implement, it immediately raised a fundamental question: What happens when a 7.1 soundtrack goes to a 5.1 cinema? One solution is to deliver the 7.1 soundtrack in parallel with the 5.1, so each theatre gets the right version. This works perfectly, though it adds costs along the way. Another option is to downmix the four surrounds back to two. This works well operationally, but aesthetically it is not ideal due to the way direct sounds are emphasized over diffuse sounds. More on that below.

And if 7.1 is good, why not 9.1, 11.1, or 13.1 channels? Just that short list alone implies five different soundtracks to create and deliver. Where does this channels race end? Most of today’s cinemas already have 18 to 24 speakers installed, so why not drive each of them with their own signal? And why not add more speakers on the ceiling for overhead effects? Just imagine the complexity of creating special soundtracks for all those combinations of speakers.

Enter MDA
From the time of its inception in 2009, the goal of MDA was to eliminate the channels race, and to provide a better way to deliver sound to a wide range of playback systems from a single soundtrack. The solution is elegant: MDA scales the sound up or down to use whatever speakers are available to best represent the soundtrack creator’s intent.

Current Mixing Process
To illustrate the concept behind MDA, consider the current movie mixing process. The mixer has access to hundreds of sound elements: music, dialog and effects. For each element, the mixer decides placement or creates movement using a panning tool. That same process happens for most of the other sound elements, so there are perhaps dozens of panners running simultaneously, with their outputs merging together into the final 5.1 soundtrack.

If that same mixer were instead creating a stereo soundtrack, stereo panners would be used, resulting in just two final outputs. The mixer’s intent is the same; his operation of the panners is the same; but it is the playback system that determines how the panners will map his intention to the speakers.
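The stereo case can be sketched in a few lines of Python. This is only an illustration of a fixed-output panner, not any vendor's tool: the constant-power pan law, the function names `stereo_pan` and `mix_to_stereo`, and the sample values are all assumptions made for the example.

```python
import math

def stereo_pan(pan):
    """Constant-power gains (left, right) for a pan value in [-1.0, 1.0].

    Assumption: a sine/cosine pan law; real mixing consoles may use others.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # 0 = hard left, pi/2 = hard right
    return math.cos(angle), math.sin(angle)

def mix_to_stereo(elements):
    """Pan each (samples, pan) element and sum everything into two channels."""
    length = max(len(samples) for samples, _ in elements)
    left = [0.0] * length
    right = [0.0] * length
    for samples, pan in elements:
        gain_left, gain_right = stereo_pan(pan)
        for n, sample in enumerate(samples):
            left[n] += gain_left * sample
            right[n] += gain_right * sample
    return [left, right]

# Hypothetical sound elements: dialog in the center, an effect hard right.
dialog = ([1.0, 1.0, 1.0], 0.0)
effect = ([0.5, 0.5, 0.5], 1.0)
left, right = mix_to_stereo([dialog, effect])
```

Once the panner outputs are summed, the two channels no longer record which element was placed where; the mixer's placement decisions are baked into a soundtrack with exactly two outputs.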

Object-Based Mixing Process
Based on the above, the key to playback compatibility rests with the panner. Panners are built with exactly the number of outputs needed, be it two, five, seven or more.

Rather than asking the mixer to make separate soundtracks for every speaker format, MDA in effect moves the panner into the playback system, where it adapts to the speaker configuration at hand. What is delivered to the MDA playback system in the DCP, then, is not the audio already modified—panned—to specific output channels, but the unmodified audio plus the mixer’s panning instructions describing where the sound should appear in space. MDA’s “panner” (called the renderer) automatically translates that information to the available speakers, no matter the number or configuration, following the mixer’s wishes.
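As a rough sketch of that idea, the Python below carries each sound as unpanned audio plus one piece of panning metadata, and applies the panning only at playback for whatever speaker layout is present. It is not the MDA renderer: the `AudioObject` class, the single azimuth angle, the pairwise constant-power panning on a horizontal ring, and the speaker angles are all assumptions chosen to keep the example small.

```python
import math
from dataclasses import dataclass

@dataclass
class AudioObject:
    samples: list        # unpanned mono audio
    azimuth_deg: float   # hypothetical metadata: 0 = front, positive = right

def pan_gains(azimuth_deg, speaker_azimuths):
    """Constant-power panning between the two speakers adjacent to the source.

    Assumes at least two speakers placed on a horizontal ring.
    """
    order = sorted(range(len(speaker_azimuths)),
                   key=lambda i: speaker_azimuths[i] % 360)
    gains = [0.0] * len(speaker_azimuths)
    source = azimuth_deg % 360
    for k, lo in enumerate(order):
        hi = order[(k + 1) % len(order)]
        start = speaker_azimuths[lo] % 360
        span = (speaker_azimuths[hi] - speaker_azimuths[lo]) % 360 or 360
        offset = (source - start) % 360
        if offset <= span:  # the source lies on the arc from lo to hi
            frac = offset / span
            gains[lo] = math.cos(frac * math.pi / 2)
            gains[hi] = math.sin(frac * math.pi / 2)
            break
    return gains

def render(objects, speaker_azimuths):
    """Mix all objects into one output buffer per available speaker."""
    length = max(len(obj.samples) for obj in objects)
    out = [[0.0] * length for _ in speaker_azimuths]
    for obj in objects:
        gains = pan_gains(obj.azimuth_deg, speaker_azimuths)
        for ch, gain in enumerate(gains):
            for n, sample in enumerate(obj.samples):
                out[ch][n] += gain * sample
    return out

# Illustrative layouts (angles are assumptions, not a published specification).
STEREO = [-30.0, 30.0]                      # L, R
FIVE = [-30.0, 30.0, 0.0, -110.0, 110.0]    # L, R, C, Ls, Rs

mix = [AudioObject([1.0, 0.5], 15.0), AudioObject([0.2, 0.2], -110.0)]
stereo_out = render(mix, STEREO)
surround_out = render(mix, FIVE)
```

The same `mix` is rendered twice without being remixed: only the speaker list changes between the two calls.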

One Soundtrack, Played Anywhere
Operationally, if the playback system used 5.1 speakers, the MDA-rendered output would be the same as a 5.1 mix created the traditional way. That is important, but hardly justifies all the effort in developing an object-based delivery system. The payoff comes from the ability to reproduce that single soundtrack on a variety of speaker configurations, going well beyond conventional stereo, 5.1 or 7.1, to as many speakers in as many locations as desired, including those for imparting height or overhead effects.

The ability to achieve this is due to the flexible nature of the MDA renderer, which pans both horizontally (x-y) and vertically (z) to as many outputs as needed. It is also due to delivering the soundtrack as separate signals (objects) rather than pre-combining them into channels. This allows MDA to avoid traditional channel downmixing, which would be an issue not only for 7.1 to 5.1 playback, but for any extended channel format one may conceive. Downmixing alters the relationships between direct and diffuse sound elements in the mix, which is the age-old mono/stereo compatibility problem. Object rendering, in contrast, inherently maintains the signal levels of each sound source regardless of the number of speakers, thus better maintaining the original intent.
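A small numerical sketch can make the downmixing effect concrete. It assumes a simplified model that the article does not spell out: a "direct" sound is the same signal panned equally into a side and a rear surround channel, a "diffuse" sound is two unrelated signals of equal power, and the downmix simply adds the two channels. The test tones, gains and variable names are all illustrative.

```python
import math

N = 4800  # samples in the analysis window

def tone(cycles):
    """A sine wave completing a whole number of cycles in the window."""
    return [math.sin(2 * math.pi * cycles * n / N) for n in range(N)]

def power(signal):
    return sum(v * v for v in signal) / len(signal)

def db(ratio):
    return 10 * math.log10(ratio)

def downmix(side, rear):
    """Channel downmix: add the side and rear surround signals."""
    return [s + r for s, r in zip(side, rear)]

# Direct sound: one source panned halfway between side and rear (same signal).
g = math.sqrt(0.5)
source = tone(40)
direct_side = [g * v for v in source]
direct_rear = [g * v for v in source]

# Diffuse sound: unrelated signals of equal power in the two channels.
diffuse_side = tone(37)
diffuse_rear = tone(53)

# Level change caused by the downmix, relative to the original total power.
direct_change = db(power(downmix(direct_side, direct_rear))
                   / (power(direct_side) + power(direct_rear)))
diffuse_change = db(power(downmix(diffuse_side, diffuse_rear))
                    / (power(diffuse_side) + power(diffuse_rear)))

# Object rendering: the unpanned source is re-panned to the one merged
# surround output at unity gain, so its power is unchanged.
object_change = db(power(source)
                   / (power(direct_side) + power(direct_rear)))
```

Under these assumptions the downmix raises the direct sound by about 3 dB while leaving the diffuse sound at its original level, whereas re-panning the object changes neither.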

Specifications and Standardization
To promote wide and rapid adoption, MDA is available royalty-free to makers of relevant creation, delivery and playback products. The specifications have been published through SMPTE, where a standard for d-cinema delivery is under development. The MDA specification is also being submitted to ETSI for publication.

Open Standard
The concept of MDA being offered as an open standard comes from the recognition that there is nothing fundamentally special or new about PCM audio objects and metadata. All that is new is the ability to package and exchange files in this format for distribution. MDA developers believe it should be as universally available as PCM or MIDI, thus encouraging broad adoption across the industry, while opening new opportunities for technological development and commerce such as creation tools, codecs, 3D renderers and cinema playback products.

Some Common Questions and Answers
Q.     How is MDA different from Dolby® Atmos™ or Barco® Auro® 11.1?
A.     The primary differences are:
•    MDA comes with no preconceived speaker configuration; it adapts to the needs of the cinema.
•    MDA is an open standard, inviting development from SMPTE and the industry.
•    MDA is royalty-free.

Q.     Isn’t this a throwback to the Dolby®/DTS®/SDDS® format wars?
A.     The film format wars ended when d-cinema adopted PCM as the sole audio format, thus avoiding proprietary audio codecs. MDA parallels that solution with an open PCM plus metadata format.

Q.     How will it be commercialized?
A.     It is expected to follow the same path as the current d-cinema industry, with creation tools and playback systems innovated by independent product developers.

Q.     Will the DCP just come with MDA encoding as another “flavor”?
A.     That is a decision for each studio to make. There are no barriers to playing the MDA content in any MDA-capable system.

Q.     What does it cost? What additional gear do I have to buy to play it back in my theatre?
A.     Since MDA is scalable, costs will vary commensurate with the scope and level of system execution.

Q.     What movies are encoded in this format (or when will content be available)?
A.     The industry is currently working on a unified OBAE delivery format. Once it has been defined, mixing for release can commence.