
Audio Engineering Society European Convention in Warsaw, Poland

Immediately following the intense schedule of the High End show in Munich, Germany, audioXpress attended the 2025 Audio Engineering Society European Convention (May 22-24) in Warsaw, Poland. It was an exciting event, with most of the deeper technical sessions focused on spatial and immersive audio. But what does "immersive" even mean? What is the future of spatial audio for consumers? Those were questions asked and debated at this AES convention, where leading audio engineers and researchers focused on the latest advancements in those audio applications.

Less than a year ago, I attended the Audio Engineering Society (AES) 2024 Automotive Audio Conference at the Geely Uni3 Center in Gothenburg, Sweden. It was one of the most successful, insightful, and relevant AES events ever because of its perfect convergence between the most cutting-edge academic and industry research efforts, demonstrating how the vibrant evolution in the automotive segment is shaping whole audio disciplines.
 

Roger Shively and J. Martins represented audioXpress at this AES European convention in Warsaw. With sometimes four sessions happening concurrently, there were far too many presentations to attend and cover in this report.

For some reason, I had this happy memory in mind as I headed to Warsaw, Poland, for the 2025 AES European Convention (May 22-24), immediately following the intense schedule of the High End show in Munich, Germany. I think it is important to compare these events, just as a reminder that a direct connection between education and development, academia and industry, is critical to ensure the relevance and progress of audio. Without that connection, many perfectly valid research efforts tend to move in circles and never find actual adoption.

This, for me, was in clear evidence when I recently reported from the Global Audio Summit 2025, promoted in March of this year by the China Audio Industry Association (CAIA) in Shanghai. This was a conference where academic, scientific, engineering, and industry efforts perfectly converged, also under the auspices of an audio industry association. The big difference there was that every presentation placed all those research efforts under a product development perspective. That included multiple sessions illustrating failed or less successful efforts, where I witnessed the presenters recommend that attendees not waste their time pursuing that approach and focus instead on new uncharted directions. This, for me, was a very refreshing and unique perspective.

Seeing competing companies discuss actual products that they developed, acknowledge research that had been conducted in universities and other entities, and show slides explaining what failed to gain acceptance by real consumers in actual products changed my whole perception of the content that I normally see presented at international audio conferences. In a different way, the sessions and demonstrations that I witnessed at the AES Automotive Conference in Gothenburg provided a similar sense of connection with real applications that made the whole experience extremely valuable for those who attended.

 

The Audio Engineering Society promoted the 2025 AES Europe Convention, this year taking place in Warsaw, Poland, May 22 – 24. Where noted, official AES photography is by Marcin Krokowski – marcinkrokowski.com

This year’s AES Europe Convention was headed by co-chairs Bozena Kostek, Magdalena Piotrowska, and Jan Zera. Photography by Marcin Krokowski.

As always, Genelec created a dedicated listening space with state-of-the-art immersive audio monitoring, for a continuous program of sessions presented by Thomas Lund (in the photo seen during the immersive audio “Enveloping Masterclass” series), Florian Camerer, Kimio Hamasaki, and Morten Lindberg, among others. The reference listening room was built using seven 8351Bs (The Ones), two 7370 subs, and four 8341As.  Photography by Marcin Krokowski.

Modern Warsaw
I probably should have made a fairer comparison between this 158th AES Convention in Warsaw and the previous AES Europe convention in Madrid, Spain, in 2024. That event was hosted by the Universidad Politécnica de Madrid, a vibrant academic institution, which is a natural environment to promote the AES International conventions, providing a valuable opportunity for researchers and students to interact with the global audio engineering community.

In Madrid (June 15-17, 2024), the focus was as usual on the more creative and artistic aspects of music production and recording, with the technical sessions focusing on immersive audio production and reproduction, room acoustics, and audio signal processing. Among the most recognized papers of that year, we find topics such as multichannel loudspeaker reproduction, spatial sound, acoustic simulation and auralization, and once again some studies oriented toward automotive applications.

As I headed for the ATM Studio — a media production hub in an industrial area of the city of Warsaw — I also had in mind many of the discussions and trends from the Munich show days before, marked by the progress in personal audio (headphones), high-quality reproduction, digital audio, audio electronics, and naturally loudspeaker technology (see my report here). 

 

On the first day of this AES convention, an early session was scheduled with the title “The Advance of UWB for High Quality and Low Latency Audio” and was originally planned to be presented by Jonny McClintock (Audio Codecs). To the surprise of those attending (with an almost full room, unusual for the early hour), this session was presented (remotely) by Jackie Green, Technical Advisor for the UWB Alliance, joining a panel present onsite to discuss the exciting new possibilities unlocked by UWB. Interventions focused on the ongoing AES efforts to propose an ultra-wideband-based audio standard for high-quality transmission, and Jackie Green discussed the UWB Alliance’s plans to launch a certification program for UWB audio devices, aligned with this upcoming AES X-260 standard to ensure device interoperability. Gary Spittle from Sonical also intervened remotely to discuss the upcoming introduction of the Remora Pro platform, enabling multichannel and personalized audio.

“The benefits, tradeoffs, and economics of standard and proprietary digital audio networks in DSP systems” was one of the papers presented by Miguel Chavez (Analog Devices), who also presented a poster detailing the low latency network features of A2B. This one in particular generated enormous interest among students who have little contact with A2B technology. As one of the developers of DSP software and algorithms for Analog Devices SigmaStudio, Miguel Chavez was flooded with interesting questions from attendees.  Photography by Marcin Krokowski.

I know that the Audio Engineering Society has also scheduled the 2025 AES International Conference on Headphone Technology, taking place August 27-29 in Finland. As happened the year before with Automotive Audio, these AES-themed conferences tend to draw the more relevant (and industry-oriented) content away from the general AES Conventions, so I was not expecting to see many sessions addressing hearables – which is probably the most relevant and exciting topic in audio these days (after automotive audio).

Anyway, after studying the recently published program and full schedule of nearly 100 sessions for the 2025 AES European Convention, I found more than enough content to motivate me to attend (it also helped that Munich to Warsaw is such a short flight). And from the moment we arrived at the ATM Studio, the environment couldn’t have been more welcoming, and I was looking forward to the opportunity to mingle with the nearly 400 other attendees. The energy from an audience with a strong local component of industry experts and students was noticeable. Many of the deeper technical sessions by leading audio engineers and researchers focused on the latest advancements in spatial audio and immersive audio applications. In fact, 34 of the sessions focused directly on spatial audio.

One major complaint was the fact that simple black curtains were being used to separate four of the auditoriums (set up within one of the ATM Studio sound stages), making some presentations hard to follow when sound was playing loudly in adjacent sessions.

This AES European Convention benefited largely from the direct involvement of the local AES Polish section, with the support of a very vibrant community of audio companies from Poland, which includes some important manufacturing operations, and above all a very strong academic community represented by institutions such as the Gdańsk University of Technology, AGH University of Science and Technology, Fryderyk Chopin University of Music, Wrocław University of Technology, Warsaw University of Technology, and others. One common thread among the papers and posters presented by students and researchers, independently of the application field, was spatial and binaural audio.

 

Many research posters presented at this convention also focused on spatial audio in the many fields of application, from automotive to VR. This included the presentation of Binamix, a Python library for generating binaural audio datasets. Binamix is an open-source Python library designed to facilitate programmatic binaural mixing using the extensive SADIE II Database, which provides HRIR and BRIR data for 20 subjects. The Binamix library provides a flexible and repeatable framework for creating large-scale spatial audio datasets, making it an invaluable resource for codec evaluation, audio quality metric development, and machine learning model training.
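For readers less familiar with what "programmatic binaural mixing" involves, the core operation can be sketched in a few lines: each mono source is convolved with a left-ear and a right-ear head-related impulse response (HRIR) for its intended direction, and the results are summed per ear. The sketch below is only an illustration of that principle. It does not use the Binamix API or any SADIE II data; the function names and the toy impulse responses are made up for the example.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_mix(sources):
    """Mix mono sources into a (left, right) pair of ear signals.

    `sources` is a list of (mono_signal, hrir_left, hrir_right) tuples,
    one per source, where the HRIR pair encodes that source's direction.
    """
    length = max(len(sig) + max(len(hl), len(hr)) - 1 for sig, hl, hr in sources)
    left = [0.0] * length
    right = [0.0] * length
    for sig, hl, hr in sources:
        for n, v in enumerate(convolve(sig, hl)):
            left[n] += v
        for n, v in enumerate(convolve(sig, hr)):
            right[n] += v
    return left, right


# Toy example (hypothetical values, not measured data): a click placed to
# the listener's left reaches the right ear later and quieter.
click = [1.0, 0.0, 0.0]
hrir_left_toy = [1.0, 0.5]
hrir_right_toy = [0.0, 0.0, 0.6]
left, right = binaural_mix([(click, hrir_left_toy, hrir_right_toy)])
```

With measured HRIRs or BRIRs in place of the toy responses, the same sum-of-convolutions structure is what makes large, repeatable datasets possible, since every source position is just a different pair of impulse responses.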

Posters are always an intriguing format at these AES conventions. Sometimes presenters are surrounded by participants and answer a continuous flow of questions. Other times they are simply a passive way to promote ongoing research that is not yet ready for primetime. We noted topics such as “Neural 3D Audio Renderer for acoustic digital twin creation” and “Reconstructing Sound Fields with Physics-Informed Neural Networks: Applications in Real-World Acoustic Environments,” both of which sounded intriguing and contained fascinating insights.

Immersiveness
When visiting an audio show such as AXPONA in the US, it’s always a special thing to find a room showcasing a multichannel system or home theater installation. It’s 99% a show focused on stereo. At the High End in Munich, where the convergence with a broader set of home audio and lifestyle-oriented brands is more prominent, we can find some really outstanding multichannel rooms, but again, we can say that the show is 90% stereo. This year, it was a notable event that the Kii Audio room had an impressive Dolby Atmos setup, including presentations by music producers, mastering engineers, and artists (Anette Askvik) detailing their vision for immersive audio. A welcome exception.

Is there a disconnect? Not really. We are going through a transition, the results of which are not yet clear. As Prof. Karlheinz Brandenburg (now founder and CEO at Brandenburg Labs GmbH) stated to the audience at one of the sessions of this AES Warsaw convention, “consumers have not yet experienced spatial audio.” His new company is now determined to make that a possibility – at least over headphones – but they say we are at least two years away from that. And the key, the company also details, lies in combining elements that have already been studied, such as low-latency head-tracking, with Binaural Room Impulse Responses (BRIRs), which capture the cues our brain uses to know not just where a sound comes from, but also what kind of space we’re in.

 

In one of the most enlightening moments of this 2025 AES Europe convention, Dr. Jürgen Herre, Professor of Audio Coding at International Audio Laboratories Erlangen, gave the Richard C. Heyser Memorial Lecture “It’s All About Perception – A Personal Research Journey.” He recapped his 36-year journey from a young DSP engineer at Fraunhofer IIS to a pioneer of perceptual audio coding technologies like MP3, AAC, and AI-driven audio coding. It was a very interesting presentation, detailing how the step from MP3 to AAC introduced significant advances that remain relevant today.

“Binaural Audio Reproduction Using Loudspeaker Array Beamforming” was the title of the workshop presented by Marcos Simón and Jacob Hollebon (Audioscenic), explaining how crosstalk cancellation combined with beamforming and listener tracking delivers immersive spatial sound without being limited to headphones. The live demonstrations showed consistent and accurate binaural playback from a simple soundbar, even as the listener moves.
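The principle behind crosstalk cancellation can be illustrated with a deliberately simplified sketch: at a single frequency, the paths from two loudspeakers to two ears form a 2x2 matrix, and filtering the binaural signal through the inverse of that matrix means each ear receives only its intended channel. Everything below is an assumption made for illustration (free-field point sources, no head model, one frequency, hypothetical distances and function names); it is not Audioscenic's implementation, which uses loudspeaker arrays and beamforming.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def path_gain(freq_hz, distance_m):
    """Free-field point-source transfer: 1/r attenuation plus propagation phase."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return cmath.exp(-1j * k * distance_m) / distance_m


def plant(freq_hz, d_near, d_far):
    """2x2 matrix H: rows are ears (L, R), columns are loudspeakers (L, R)."""
    direct = path_gain(freq_hz, d_near)
    cross = path_gain(freq_hz, d_far)
    return [[direct, cross], [cross, direct]]


def invert_2x2(m):
    """Inverse of a 2x2 complex matrix; these are the cancellation filters."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical geometry: each ear is 1.00 m from the near loudspeaker
# and 1.05 m from the far one, evaluated at 1 kHz.
H = plant(1000.0, d_near=1.00, d_far=1.05)
C = invert_2x2(H)
ears = matmul_2x2(H, C)  # close to the identity: the crosstalk terms vanish
```

The matrix H depends on the distances to the listener, which is why tracking matters: when the listener moves, the paths change and the filters have to be recomputed for the new position.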

Maybe I am privileged to have already experienced a demo of the Brandenburg Labs Okeanos Pro system, which is now available for scientific research institutions and studios who want to achieve “life-like simulations of real speaker systems, allowing audio engineers to precisely control the spatial image of their mixes on headphones for the first time.” I know this works. Convincingly. But I still prefer to have the same level of experience in a room surrounded by speakers, or even in front of a speaker array with a convincing 3D spatialization implementation such as the solutions offered by Audioscenic.

Contradicting what Karlheinz Brandenburg stated in Warsaw, millions of consumers have experienced real Spatial Audio, but in the form of Dolby Atmos movie soundtracks they heard at movie theaters. But that’s something very few people even associate with listening to music. What he meant is that a truly convincing binaural representation of Spatial Audio isn’t yet available as such – at least for consumers.

Nor is it clear what the intention or purpose is of the whole discussion around Spatial Audio, immersive sound, 3D sound, and more that is taking place at forums such as this AES Convention. There is no consensus about what “Spatial Audio” is for music – as Apple knows well, even with all the efforts to promote Dolby Atmos Music on streaming services. The experience remains unconvincing for consumers over AirPods and other headphones, even with head-tracking. In contrast, it is very convincing when experienced inside a new electric vehicle fitted with a Dolby Atmos speaker system, as the automotive industry is now promoting.

 

“The Future Of Spatial Audio For Consumers” panel was converted from the announced workshop format into a question motivating an interesting debate. It was one of the most interactive sessions we witnessed, showing how spatial audio is currently a focus of research. 

Note that the title was turned into a question. The panel answering it included Jake Hollebon (Audioscenic), Jan Skoglund (Google), Markus Zaunschirm (atmoky), Jean-Marc Jot (Virtuel Works), and Hyunkook Lee (Applied Psychoacoustics Lab, University of Huddersfield), moderated by Marcos Simón, Audioscenic.

All this was under discussion during the 2025 AES European Convention, where intense research from multiple angles is trying to meet the requirements for immersive audio experiences in entertainment, gaming, virtual reality (VR), and augmented reality (AR) applications. There is a general view in some circles that “immersive sound” and realistic 3D soundscapes will be preferred to stereo by consumers on their listening devices. This has a valid angle but doesn’t meet either the production format criteria or the consistency of experiences that most of the efforts focused on music and creative content are pursuing. Discussions around binaural recording and spatial sound processing are sometimes moving in conflicting directions in terms of the actual industry adoption of immersive audio for applications such as sound reinforcement and live sound events, to name one example.

One thing is certain: when so many bright minds, such as all the audio researchers who presented at this AES convention, converge in the same direction, something is bound to happen. I saw multiple references to Apple’s Vision Pro as a completely new platform that goes beyond everything that was tried before in VR. There is also a sense of the huge potential to explore binaural rendering of audio scenes captured with microphone arrays, creating realistic sound experiences to which consumers will respond enthusiastically. As Brandenburg Labs states, the only problem is that the (reproduction) devices to do that are not yet available to consumers.

In a conference where Dolby was mentioned in most of the sessions, there was no one there to speak for the company. Which also explains why there was a panel where the question of “What Is the Future of Spatial Audio for Consumers?” was debated from five completely different perspectives. The panel, which was originally announced as a workshop panel, was actually one of the most interesting moments of this AES Europe convention and included Jake Hollebon (Audioscenic), Jan Skoglund (Google), Markus Zaunschirm (atmoky), Jean-Marc Jot (Virtuel Works), and Hyunkook Lee (Applied Psychoacoustics Lab, University of Huddersfield). 

The standing-room-only session, moderated by Marcos Simón, Audioscenic CEO, reflected on the strong momentum for Spatial Audio technology and the assumption that “…we are at a tipping point where Spatial Audio availability is going to become mainstream.” Jacob Hollebon, Principal Research Engineer at Audioscenic, tried to explain how the key for this to happen is to make 3D spatial audio reproduction work in everyday products, from soundbars to TVs and laptops. Something that Audioscenic is actually turning into a commercial reality using its beamforming on loudspeaker arrays in combination with listener-adaptive position tracking.

 

Prof. Karlheinz Brandenburg (Brandenburg Labs GmbH) stands up to intervene during the “The Future Of Spatial Audio For Consumers” panel. In his view, we are still two years away from being able to offer a convincing Spatial Audio experience over headphones to consumers.

Another standing-room-only session at the end of the second day.  A presentation of the ECHO Project for Immersive Microphone Array Techniques for Orchestral Recording, presented by Hyunkook Lee, Katarzyna Sochaczewska, Morten Lindberg, Mark Willsher, and Simon Ratcliffe.

This was a “reality” contrast with what Jan Skoglund described, basically reflecting the perspective of the Eclipsa Audio “open” immersive audio format that Google wants to use on YouTube, and which is based on the new Immersive Audio Model and Formats (IAMF) specification available under a royalty-free license from the Alliance for Open Media. As always – in my opinion – something that will be funded and promoted only until Google gets tired of it. And resulting in another format that basically intends to translate any immersive content, allowing everyone to continue to produce their 7.1.4 mixes to the Dolby Atmos standard.

Of course, the view of spatial audio from Markus Zaunschirm, the co-founder of atmoky, is also based on the very specific perspective of creating content for virtual/extended reality and gaming, where the boundaries continue to be the playback devices, but where production creativity has no boundaries. A broader perspective regarding applications was offered by Jean-Marc Jot, currently at Virtuel Works, who has previously worked on Spatial Audio efforts for companies such as iZotope, Magic Leap, DTS, Creative, and IRCAM, having left a clear mark on the industry with many of the tools that are currently in use for production and live sound.

Among those is SPAT (from Spatialisateur in French), the object-based and perceptual immersive mixing solution originally researched at the French Institute IRCAM, and that motivated Harman Professional to acquire FLUX Software Engineering, the spin-off entity created to commercialize SPAT. From Jean-Marc Jot’s perspective, the focus needs to be on embracing format-agnostic solutions for “true-to-life, timbre-preserving personal spatial audio experiences” that can apply to both entertainment and business applications and with any binaural audio rendering engine.

And probably the most colorful and inspiring presentation was from Hyunkook Lee, the notable University of Huddersfield Professor and founder of the Applied Psychoacoustics Lab (the creators of the Virtuoso binaural renderer), who provoked the audience by questioning everything and teasing with the concept that even mono can be immersive, depending on how plausible, interesting, and interactive an experience can be. A conceptual model that he expanded on from the day before, in his Keynote presentation during the convention’s opening ceremony.

 

Prof. Hyunkook Lee (University of Huddersfield, Applied Psychoacoustics Lab (APL)) delivered the keynote speech at the opening ceremony, titled “Dimensions of Immersive Audio Experiences.” Official AES photography is by Marcin Krokowski.

In his keynote “Dimensions of Immersive Audio Experiences,” Hyunkook Lee questioned the lack of consensus for what “immersive” truly means.

Hyunkook Lee proposes an interesting conceptual model of what broadly defines an immersive experience, which is independent of reproduction systems and formats, and can easily be applied to single sources and stereo.

In his keynote, titled “Dimensions of Immersive Audio Experiences,” Hyunkook Lee characterized the whole complexity of what is taking place in music, film, gaming, and virtual/augmented reality, where the term “immersive” is used as an umbrella term for spatial and 3D audio. He says this lack of understanding of what “immersive” truly means leads to consumer confusion and disappointment when content is labelled but not perceived as “immersive.” In response, he expanded on his field of research and the fundamentals for a conceptual model, but left more open questions than consensus.

Overall, the 2025 Warsaw AES convention was extremely rich in content addressing the technological pieces of this Spatial Audio and immersive puzzle, providing a good indication that something is bound to happen very soon in multiple fields. But we are probably, as Prof. Brandenburg said, about two years away from having something convincing for consumers. This is in line with the time frame in which modern vehicles fitted with Dolby Atmos factory sound systems are expected to become mainstream in more regions of the world than just China. I sincerely hope that the upcoming 2025 AES International Conference on Headphone Technology in Finland offers a clearer picture of this becoming a commercial reality in that time frame.

Meanwhile, a reminder that the 2025 AES Show will be back on the West Coast and takes place in Long Beach, CA, from October 23-25. aX

 

One of the most interesting tutorials, on the final day of this AES convention in Warsaw, was “The Gentle Art of Dithering” presented by none other than Bob Stuart (Meridian) and Peter Craven (Algol), the two pioneers of data compression, Adaptive Differential Pulse-Code Modulation (ADPCM), MLP lossless compression, and the MQA technologies. Peter Craven was also a co-designer of the Soundfield Microphone. Their tutorial stressed how transparency (high-resolution audio fidelity) depends on the preservation of micro-sounds: those small details that are easily lost to quantization errors, but which can be preserved by using dithering. Photography by Marcin Krokowski – marcinkrokowski.com.
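The point about micro-sounds can be checked with a small numerical experiment, sketched below under assumptions of my own rather than taken from the tutorial: a sine wave smaller than half a quantization step is rounded to the nearest step, once without dither and once with triangular (TPDF) dither added before rounding. The step size, amplitude, sample count, and function names are all hypothetical choices for the illustration.

```python
import math
import random

STEP = 1.0        # quantizer step (1 LSB), arbitrary units
AMPLITUDE = 0.4   # sine amplitude in LSB, deliberately below half a step
PERIOD = 100      # samples per cycle
N = 20000         # number of samples


def quantize(x):
    """Round to the nearest multiple of STEP."""
    return STEP * round(x / STEP)


def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/- 1 LSB (difference of two uniforms)."""
    return STEP * (rng.random() - rng.random())


def sine_amplitude(samples, reference):
    """Least-squares estimate of how much of `reference` is present in `samples`."""
    num = sum(s * r for s, r in zip(samples, reference))
    den = sum(r * r for r in reference)
    return num / den


rng = random.Random(0)  # fixed seed so the run is repeatable
reference = [math.sin(2.0 * math.pi * n / PERIOD) for n in range(N)]
signal = [AMPLITUDE * r for r in reference]

plain = [quantize(x) for x in signal]
dithered = [quantize(x + tpdf_dither(rng)) for x in signal]

lost = sine_amplitude(plain, reference)       # undithered: the sine is gone
kept = sine_amplitude(dithered, reference)    # dithered: close to AMPLITUDE
```

Without dither every sample rounds to zero, so the detail is erased entirely. With dither the output is noisier sample by sample, but the sine is still present at close to its original level, buried in noise rather than destroyed.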

This article was originally published in The Audio Voice newsletter (#517), June 12, 2025.
 

 

