
DSP Hardware vs. Software in Remote Meetings: Who Ensures Sound Quality?

The Audio Processing Conflicts Between Zoom, Teams, and Professional Conference Microphones
Vergil
July 15, 2025
12 min read

As remote meetings become routine in today's work environment, audio quality is crucial for effective communication. Modern video conferencing setups face a persistent challenge: hardware and software audio processing often operate independently—and unintentionally at cross purposes. On one hand, conference room hardware like microphone arrays and external DSP units conduct their own digital signal processing (DSP). On the other, client applications such as Zoom and Microsoft Teams apply software algorithms—typically echo cancellation (AEC), noise suppression (NS), and automatic gain control (AGC). When hardware and software perform overlapping processing unaware of each other, a processing conflict occurs. This kind of redundant or duplicate processing is frequently the source of complaints about audio dropouts, distortion, or unstable volume. For audio product managers and engineers, the fundamental issue explored in this article is how to avoid these processing conflicts when deploying across diverse platforms.

Real-World Example: DSP Hardware vs. Zoom/Teams Interaction Issues

In practical deployments, conflicts between hardware and software audio processing are common and often frustrating. Consider a university using Shure’s ceiling microphone arrays in Zoom-based classrooms. The Shure arrays incorporate onboard DSP, providing localized echo cancellation and AGC. However, because many participants are unaware that they need to activate Zoom’s “original sound” mode (which disables Zoom’s built-in echo cancellation), Zoom continues to process incoming audio too. The result is that audio already optimized at the hardware level is subjected to a second round of software processing. Students on the far end experience issues such as voices being excessively noise-suppressed (sounding choppy) or unstable volume caused by conflicting gain controls—significantly degrading the learning experience.

The same complaints surface with Microsoft Teams: some professional users wish to bypass Teams’ built-in gain control and echo cancellation, but the platform doesn’t offer an option. As one audio engineer put it, “We use professional mics and in-ear monitoring, so software-based echo cancellation is unnecessary. Unlike Zoom, which allows you to disable processing, Teams doesn’t provide any way to turn off AEC” [1].

This isn’t isolated. On Reddit’s audio forums, there is frequent debate about managing microphone arrays (e.g., Shure MXA920), external DSP processors (like Biamp or QSC), and video conferencing software (such as Zoom)—all potentially running echo cancellation simultaneously [2]. Experienced professionals emphasize that only one component should ever process echo cancellation, with all other AEC disabled. For setups where an external DSP takes care of AEC, the microphone’s onboard AEC should also be turned off, and ideally, the software platform as well.

But in reality, maintaining this clarity is difficult: for example, Zoom’s default meeting mode always applies noise suppression and echo cancellation, unless the user manually enables “original sound” [3]. Most everyday users remain unaware, so their audio—already processed by hardware—gets reprocessed by software. Conversely, disabling hardware DSP just to avoid redundancy sacrifices the very performance advantages of those algorithms. Such inconsistencies lead to redundant processing at both hardware and software stages, ultimately damaging audio fidelity. The problem is so widespread that some savvy users explicitly recommend disabling all Zoom “enhancements”—because microphones such as the Shure MV7 already deliver top audio quality through their own DSP [4].

Microsoft Teams’ approach is even more inflexible: the standard Teams client offers no user-facing way to disable audio processing. An engineer with Teams deployment experience explains, “Teams does not allow disabling acoustic echo cancellation (AEC). If the connected device isn’t officially Teams certified, Teams assumes the device can’t be trusted and always applies its own AEC logic” [5]. As a result, even premium professional hardware is subjected to mandatory Microsoft processing, much to the frustration of advanced users.

Why Does Redundant Processing Hurt Audio Quality?

[Diagram: hardware and software running echo cancellation simultaneously, creating a processing conflict and an over-processed audio signal]

When both hardware and software independently process the same audio stream, overlapping algorithms can have detrimental effects on sound quality, specifically:

  • Duplicate Echo Cancellation Causes Speech Loss or Artifacts: Only one segment of the audio chain should handle echo cancellation. If both a microphone array and meeting software apply their own AEC, the secondary stage is fed audio that’s already been altered, increasing the risk that valid speech elements are misclassified as echo and suppressed. The sound may become muffled, uneven, or difficult to understand.

  • Cascading Noise Suppression Degrades Intelligibility: Running separate noise reduction algorithms in hardware and software can result in legitimate speech—especially softer portions—being accidentally filtered out. For instance, initial hardware noise reduction might lower the volume of quiet speech, which software then further suppresses as background noise, leading to dropouts or clipped audio. Many users have observed Zoom truncating instrument sounds or the ends of words, the direct result of this “stacked noise gating.”

  • Clashing AGC Algorithms Create Volume Swings: The purpose of AGC is to stabilize input volume by adjusting gain as needed. But if both hardware and software adjust gain simultaneously, the result is often “pumping” or “breathing”: hardware raises a quiet input, software brings it back down, and the listener hears erratic volume. Competing AGCs destroy volume consistency.
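The cascading noise-suppression effect in the second bullet can be sketched numerically. This is a toy illustration with invented thresholds and a simple level-based gate, not any platform's actual algorithm:

```python
def noise_gate(samples, threshold=0.15, attenuation=0.5):
    """Attenuate samples whose level falls below the gate threshold."""
    return [s if abs(s) >= threshold else s * attenuation for s in samples]

# Envelope of a phrase: loud syllables plus one quiet word at level 0.12
speech = [0.80, 0.45, 0.12, 0.60]

once = noise_gate(speech)   # hardware NS: quiet word drops to 0.06
twice = noise_gate(once)    # software NS re-gates it, down to 0.03

# The quiet word survives one stage, but the second, uncoordinated stage
# treats the already-attenuated speech as noise and pushes it toward
# inaudibility -- the "stacked noise gating" users hear as clipped words.
```

Each stage in isolation behaves reasonably; it is the cascade, with neither stage aware of the other, that compounds the attenuation.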

At its root, the problem is that multiple uncoordinated DSP stages in the signal chain cause cumulative distortion and loss of clarity, especially when duplicating functions like echo cancellation or noise suppression. As DSP expert Harald Steindl puts it: “Multiple, unknown audio DSPs in series are disastrous for audio quality; this must be avoided” [6]. Professional guides from manufacturers such as Shure consistently advise disabling all software filtering when using high-quality microphones with onboard DSP, precisely to prevent these negative interactions [7]. In short, redundant processing is recognized across the industry as a major cause of unsatisfactory sound.

Platform and Device Compatibility: Mechanisms and Gaps

[Screenshots: Zoom’s Original Sound setting and Teams’ High-fidelity Music Mode, the user-facing controls for bypassing software audio processing]

The ideal solution is clear coordination between hardware and software: either fully trust hardware and disable all software processing, or use raw output from hardware and let software handle everything. Many vendors and platforms now offer mechanisms to support this, but notable limitations remain.

Zoom’s Original Sound Feature

Zoom allows for more manual intervention than most platforms. By default, Zoom enables “optimized audio” with its own noise suppression and echo cancellation to serve general users [8]. Professional users, musicians, or those with high-end microphones can instead enable “Original Sound” (including High-fidelity Music Mode) to bypass Zoom’s filters. This disables Zoom’s AEC and post-processing, raises the audio sampling rate to professional levels, and even lets users toggle AEC completely off (recommended only when no loudspeaker is in use, to prevent feedback). Essentially, this “Original Sound” mode acts as a manual switch: if users trust their own hardware and environment to handle echo and noise, they can bypass Zoom’s built-in processing to avoid redundancy.

Going further, Zoom Rooms (the enterprise conference room variant) can automatically detect device types and adjust processing appropriately [9]. If the system detects a single external DSP endpoint (e.g., a USB audio device providing both mic and speaker and reporting AEC support), it disables Zoom Rooms’ software processing and leaves echo cancellation to the hardware. In contrast, if input and output are on different devices (such as a stand-alone microphone and a display speaker), Zoom Rooms enables software AEC by default. Zoom’s documentation notes that certified peripherals—such as devices from Logitech or Poly—trigger automatic deactivation of Zoom’s in-app echo cancellation and noise suppression. However, if a user manually changes device settings, software DSP may be reactivated and must be disabled again as needed. In general, Zoom offers flexible hardware-software compatibility—both auto-detection and manual override—which many AV integrators appreciate [10].
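The Zoom Rooms detection behavior just described reduces to a small decision rule. The sketch below is a simplification of the documented behavior, not Zoom's actual code; the function name, parameters, and device names are invented for illustration:

```python
def software_aec_enabled(mic_device: str, speaker_device: str,
                         device_reports_aec: bool) -> bool:
    """Decide whether the client should run its own echo cancellation.

    Mirrors the behavior described above: a single USB endpoint that
    provides both mic and speaker and reports AEC support is trusted to
    handle echo cancellation itself; split input/output paths fall back
    to software AEC by default.
    """
    if mic_device == speaker_device and device_reports_aec:
        return False  # one DSP endpoint: defer echo cancellation to hardware
    return True       # separate input/output devices: software AEC

# A combined mic/speaker bar that reports AEC -> software AEC off
software_aec_enabled("VideoBar", "VideoBar", True)        # False
# A ceiling mic paired with display speakers -> software AEC on
software_aec_enabled("CeilingMic", "DisplaySpeakers", False)  # True
```

The key point the rule captures: the platform only steps aside when one device owns the entire echo path (both capture and playback) and declares AEC capability.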

Teams and the Device Certification Gate

By contrast, Microsoft Teams adopts a much more restrictive approach. The standard Teams client does not expose any settings to disable audio processing, forcing users to accept default processing. Instead, Teams relies on a device certification whitelist: only officially Teams-certified devices are trusted to provide echo cancellation, as indicated by the USB audio terminal type “Echo-Cancelling Speakerphone” (code 0x0405) [11].

In theory, Teams should offload AEC to such devices. However, simply self-reporting as code 0x0405 isn’t sufficient—Teams checks against a Microsoft-maintained whitelist of specific certified devices (by vendor and product ID). High-quality but uncertified peripherals, even if fully standards-compliant, may have their reports ignored, with the Teams client enforcing its own audio processing and creating redundant processing.
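The gatekeeping contrast between the two policies can be sketched as follows. The 0x0405 terminal type comes from the USB audio device class specification; the whitelist entries, function name, and flag are purely illustrative placeholders, not Microsoft's actual implementation:

```python
ECHO_CANCELLING_SPEAKERPHONE = 0x0405  # USB audio output terminal type

# Illustrative (vendor_id, product_id) whitelist -- not real entries
CERTIFIED_DEVICES = {(0x1111, 0x0001), (0x2222, 0x0002)}

def client_applies_own_aec(terminal_type, vendor_id, product_id,
                           trust_self_report=False):
    """Return True if the client should run its own AEC anyway."""
    if terminal_type != ECHO_CANCELLING_SPEAKERPHONE:
        return True   # device does not even claim onboard AEC
    if trust_self_report:
        return False  # permissive policy: honor the descriptor as-is
    # Strict policy: the self-report alone is ignored; the device must
    # also appear on the maintained whitelist of certified hardware
    return (vendor_id, product_id) not in CERTIFIED_DEVICES
```

Under the strict policy, a standards-compliant but unlisted device still gets a second, redundant AEC pass, which is exactly the failure mode described above.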

This creates practical headaches. Integrators and IT can’t override Teams’ judgment: the decision is fully on Microsoft’s side. The message to the marketplace is unambiguous: “Only certified devices are allowed to bypass platform-side processing.” This policy steers users toward the certification ecosystem or leaves them subject to forced software processing—even if their hardware is fully capable.

Microsoft likely prefers this approach to ensure a reliable baseline audio quality, erring on the side of over-processing rather than risking poor user experiences due to hardware variability. For audio professionals, however, “one-size-fits-all” platform-level DSP is unsatisfying. Teams recently introduced a “High-fidelity Music Mode,” which exposes toggles to disable noise suppression and echo cancellation for meetings involving music and other high-performance audio [12]. But these must be activated manually, are not default, and mainly target specialized use.

Device-Based Compatibility Strategies

In addition to platform-level features, many hardware vendors have adopted strategies to minimize conflicts:

  • Device Mode Declaration: Some conference audio devices present both a "hands-free" mode (for echo cancellation) and a "recording" mode (for raw signal). For example, certain microphones or DSP processors can identify to the computer as “Echo-Cancelling Speakerphone,” signaling to video software that they provide their own AEC and that the software’s AEC should be bypassed [13]. USB speakerphones often adhere to this standard, so Zoom and Teams automatically delegate echo cancellation to hardware when such devices are detected—provided they’re on the platform’s trusted list.

  • Dynamic Conference Modes: Some hardware automatically adapts processing in response to the detected application or audio path. For instance, a microphone array might disable its own noise suppression when connected to certain conferencing software, leaving that function to the software for better coordination. However, this approach requires app-level detection and isn’t universally applicable.

  • Pursuing Platform Certification: Official Zoom or Teams certification is the surest path to robust compatibility. Certification processes validate echo cancellation, noise suppression, and software integration to ensure seamless operation. Certified hardware can trigger platforms like Teams or Zoom Rooms to automatically disable software-side DSP [14], relying on the device’s onboard processing. Manufacturers such as Shure now have microphone arrays (like the latest MXA series) certified for both Zoom and Teams [15], promoting a "plug-and-play" experience without manual reconfiguration.

[Photo: Shure Microflex Advance microphone array, ceiling-mounted in a conference room. Once the software platform recognizes it as trusted, in-app processing is deferred, eliminating redundancy and maximizing sound quality.]

Practical Tips for Avoiding Redundant Processing

If you are a product manager or developer seeking to avoid hardware-software audio conflicts and ensure consistent performance across platforms, consider these key strategies:

  • Delegate processing to one stage only: Decide which component (hardware or software) will perform AEC/NS/AGC, and make sure the other side’s corresponding features are off. For example, if employing a microphone array or DSP with AEC, the conferencing software’s AEC should be disabled; if relying on software-based DSP, hardware should output an unprocessed stream. Avoid simultaneous, redundant processing at all costs.

  • Utilize advanced platform settings: For Zoom, enable “Original Sound” and related high-fidelity modes when setting up rooms, and educate presenters or integrators to make sure echo cancellation and AGC are disabled where appropriate. In Teams, use “High-fidelity Music Mode” for scenarios where higher-quality or less-processed audio is needed. While Teams cannot disable AEC globally, these features reduce the risk of processing conflict for musical or professional audio uses.

  • Favor certified and well-supported devices: Select Zoom/Teams certified hardware when possible, as these are explicitly supported for seamless processing integration and minimize incompatibility issues. If using uncertified devices, follow vendor recommendations for configuration (like ensuring the device presents as an echo-cancelling hands-free unit) for maximum compatibility.

  • Provide configurable processing modes: Hardware makers should let users select between processed output (with AEC/NS applied) and a bypassed/raw mode. For platforms like Teams that don’t allow disabling software processing, advise users to select the raw/bypass mode so the platform’s mandatory DSP is the only processing stage; for Zoom with “Original Sound” enabled, hardware-processed output may be preferable. Even without automatic negotiation, empowering users and integrators with mode choices is a significant step forward.

  • Educate users and integrators: Publish clear deployment guides to inform end users and IT on how to coordinate settings for optimum audio. For example: “When using this professional microphone, enable Zoom’s original sound and switch off auto gain.” Or: “With Teams’ default processing, use ‘High-fidelity Music Mode’ or set the device to Teams-compatible mode with AEC.” Proper documentation and user education dramatically reduce misconfiguration risks.

  • Track platform updates: Both Zoom and Teams update their DSP architectures over time. Stay informed via release notes, as future versions may expand options for disabling AEC or change device handling logic, and adapt your recommendations accordingly.

Avoiding redundant processing demands collaboration from both hardware and platform developers. Devices must accurately communicate their capabilities (for instance, through correct USB descriptors), while software platforms should expose appropriate DSP controls—either to users or through smarter device detection routines. Ultimately, the goal is one processor per function per signal chain: otherwise, multiple uncoordinated DSPs degrade the clarity of even the highest-quality microphones.
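The "one processor per function per signal chain" rule lends itself to a simple audit: declare which functions each stage has enabled, then flag any function owned by more than one stage. This is an illustrative sketch of such a check, not a real integration tool; the stage names are invented:

```python
from collections import Counter

def find_redundant_dsp(chain):
    """chain: list of (stage_name, enabled_functions) pairs.

    Returns the set of DSP functions enabled at more than one stage,
    i.e., the conflicts that violate one-processor-per-function.
    """
    counts = Counter(f for _name, funcs in chain for f in funcs)
    return {f for f, n in counts.items() if n > 1}

# Example deployment: the mic array and the meeting client both run AEC,
# and the external DSP and the client both run noise suppression.
chain = [
    ("ceiling_mic_array", {"AEC", "AGC"}),
    ("external_dsp",      {"NS"}),
    ("meeting_client",    {"AEC", "NS"}),
]
find_redundant_dsp(chain)  # {"AEC", "NS"} -- two conflicts to resolve
```

Running such a checklist during commissioning makes the abstract principle concrete: every function in the returned set must be switched off everywhere except at its single designated stage.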

Conclusion: Advancing Cross-Platform Audio Quality

As hybrid work and distance learning proliferate, users often toggle between multiple meeting platforms. Audio product teams strive to deliver the best possible experience, but unresolved conflicts between hardware and software processing risk degrading sound and creating confusion about where the responsibility for quality really lies.

While there’s no perfect solution, things are moving in the right direction: industry standardization is on the rise. Companies like Zoom and Microsoft have launched certification programs and opened some advanced user controls to improve hardware-software collaboration. Audio hardware vendors are updating their products to comply with these standards and platforms’ evolving needs. In the future, we may see truly intelligent protocols allowing hardware and software to negotiate processing roles automatically—eliminating most of these conflicts and delivering consistent, high-quality audio everywhere.

Until then, audio professionals should rigorously apply the “one stage per function” principle during design and deployment, taking full advantage of today’s tools to prevent processing overlap. Only thorough coordination across software and hardware will restore natural, intelligible voice to remote meetings.

© 2025 Pawpaw Technology. All rights reserved.