WebRTC: how to use audio process module?



My current job is researching WebRTC, and my first task is to wrap a WebRTC class that processes audio frames and provides acoustic echo cancellation (AEC), automatic gain control (AGC), noise suppression (NS), a high-pass filter, and so on.

The information below is from WebRTC.org; you can also read it at http://www.webrtc.org or in its source code.

The Audio Processing Module (APM) provides a collection of voice processing components designed for real-time communications software.

APM operates on two audio streams on a frame-by-frame basis. Frames of the primary stream, on which all processing is applied, are passed to |ProcessStream()|. Frames of the reverse direction stream, which are used for analysis by some components, are passed to |AnalyzeReverseStream()|. On the client-side, this will typically be the near-end (capture) and far-end (render) streams, respectively. APM should be placed in the signal chain as close to the audio hardware abstraction layer (HAL) as possible.

On the server-side, the reverse stream will normally not be used, with processing occurring on each incoming stream.

Component interfaces follow a similar pattern and are accessed through corresponding getters in APM. All components are disabled at create-time, with default settings that are recommended for most situations. New settings can be applied without enabling a component. Enabling a component triggers memory allocation and initialization to allow it to start processing the streams.

Thread safety is provided with the following assumptions to reduce locking overhead:
1. The stream getters and setters are called from the same thread as ProcessStream(). More precisely, stream functions are never called concurrently with ProcessStream().
2. Parameter getters are never called concurrently with the corresponding setter.

APM accepts only 16-bit linear PCM audio data in frames of 10 ms. Multiple channels should be interleaved.
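To make the 10 ms frame size and the interleaving requirement concrete, here is a minimal sketch in plain C++. It does not use the WebRTC headers; the helper names `SamplesPer10Ms` and `InterleaveStereo` are my own, introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Samples per channel in one 10 ms frame: 10 ms is 1/100 of a second.
// (Hypothetical helper, not part of WebRTC.)
constexpr std::size_t SamplesPer10Ms(int sample_rate_hz) {
  return static_cast<std::size_t>(sample_rate_hz) / 100;
}

// Interleave two mono 16-bit PCM channels into one L,R,L,R,... buffer.
// If the inputs differ in length, the extra samples of the longer one
// are ignored. (Hypothetical helper, not part of WebRTC.)
std::vector<int16_t> InterleaveStereo(const std::vector<int16_t>& left,
                                      const std::vector<int16_t>& right) {
  const std::size_t n = left.size() < right.size() ? left.size() : right.size();
  std::vector<int16_t> out;
  out.reserve(n * 2);
  for (std::size_t i = 0; i < n; ++i) {
    out.push_back(left[i]);
    out.push_back(right[i]);
  }
  return out;
}
```

At 32000 Hz, for example, one frame holds 320 samples per channel, so an interleaved stereo frame is 640 samples long.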

Usage example, omitting error checking:

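AudioProcessing* apm = AudioProcessing::Create(0);
apm->set_sample_rate_hz(32000);  // Super-wideband processing.

// Mono capture and stereo render.
apm->set_num_channels(1, 1);
apm->set_num_reverse_channels(2);
apm->high_pass_filter()->Enable(true);
apm->echo_cancellation()->Enable(true);
apm->noise_suppression()->Enable(true);
apm->gain_control()->set_analog_level_limits(0, 255);
apm->gain_control()->set_mode(GainControl::kAdaptiveAnalog);
apm->gain_control()->Enable(true);
apm->voice_detection()->Enable(true);

// Start a voice call...
// ... Render frame arrives bound for the audio HAL ...
apm->AnalyzeReverseStream(render_frame);
// ... Capture frame arrives from the audio HAL ...
// Call required set_stream_ functions.
apm->set_stream_delay_ms(delay_ms);
apm->gain_control()->set_stream_analog_level(analog_level);
apm->ProcessStream(capture_frame);
// Call required stream_ functions.
analog_level = apm->gain_control()->stream_analog_level();
has_voice = apm->voice_detection()->stream_has_voice();
// Repeat render and capture processing for the duration of the call...

// Start a new call...
apm->Initialize();
// Close the application...
AudioProcessing::Destroy(apm);
apm = NULL;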
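The per-frame order in the usage example (render frame analyzed, then capture frame processed, repeated for the whole call) can be sketched as a loop over a longer buffer. This is only an illustration under stated assumptions: `FakeApm` is a hypothetical stand-in that records which call was made, `RunFrames` is my own helper, and the real APM methods take frame objects rather than raw pointers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the APM. It does no audio processing and only
// records the order of calls so the loop below can be checked.
struct FakeApm {
  std::vector<std::string> calls;
  void AnalyzeReverseStream(const int16_t* /*frame*/, std::size_t /*n*/) {
    calls.push_back("analyze");
  }
  void ProcessStream(int16_t* /*frame*/, std::size_t /*n*/) {
    calls.push_back("process");
  }
};

// Walks both buffers in whole 10 ms frames: the render (far-end) frame is
// analyzed first, then the matching capture (near-end) frame is processed.
// A trailing partial frame is left untouched. Returns the frame count.
std::size_t RunFrames(FakeApm& apm, const std::vector<int16_t>& render,
                      std::vector<int16_t>& capture,
                      std::size_t samples_per_frame) {
  if (samples_per_frame == 0) return 0;
  const std::size_t usable = std::min(render.size(), capture.size());
  const std::size_t frames = usable / samples_per_frame;
  for (std::size_t f = 0; f < frames; ++f) {
    const std::size_t offset = f * samples_per_frame;
    apm.AnalyzeReverseStream(render.data() + offset, samples_per_frame);
    apm.ProcessStream(capture.data() + offset, samples_per_frame);
  }
  return frames;
}
```

In a real application the frames arrive from the audio device one at a time, so the loop body would live in the render and capture callbacks instead of iterating over a stored buffer.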



13 thoughts on “WebRTC: how to use audio process module?”

  • Rafael

    Thank you so much for this interesting information about WebRTC! I am working on software that needs AEC, NS, AGC… and I am using the Audio Processing Module, and now it is clearer to me how to implement it.

    But I have a question for you. I need to know which algorithms are implemented in each module (AEC, NS and AGC), because I am writing my Master’s thesis and I cannot use something without knowing the theory behind it…

    Please, can you help me?

    Thanks in advance, and congratulations on your blog!

    • Jacky Wei Post author

      Well, I’m not sure exactly what you are asking for.
      a. If you want to study the detailed algorithms of WebRTC’s audio processing modules, every single line of the module’s source code is available, located at $(WEBRTC_ROOT)\trunk\webrtc\modules\audio_processing. $(WEBRTC_ROOT) is the base directory where you downloaded the source code.
      b. If you intend to study the formulas and derivations behind the algorithms and the code, then, sorry, I’m not the right person to ask, and WebRTC doesn’t provide this kind of information either. I believe what you should do is just Google it.

      • Rafael

        Thank you so much for your answer.

        I don’t need to know the algorithms implemented in the WebRTC audio processing modules in depth. I only need to know, for example, whether the AEC module implements an NLMS algorithm or another one, and the same for the Noise Suppression algorithm and the Automatic Gain Control algorithm. The formulas and derivations I can study on my own.

        I would appreciate it very much if you could tell me this information.

        Thank you in advance!

        • Jacky Wei Post author

          If so, then the answer is yes.
          The AEC module of WebRTC implements an NLMS algorithm, including but not limited to NS.
          However, AGC is not associated with NS.

          You can read it from the audio engine –> audio process module directly.

  • Rafael

    Hi, I also have a question regarding the code. It is not clear to me how to pass my audio frames to the WebRTC functions. I will try to explain:
    What does “render_frame” mean in the parameter of the function “apm->AnalyzeReverseStream(render_frame);”? The type of this parameter is “AudioFrame*”, but I don’t know what that type is. If I just pass my audio frames (with a type conversion), the program doesn’t cancel any echo. I have looked into the file “module_common_types.h” and seen the class “AudioFrame”, but I don’t know how to handle it to get my program to work.

    Thank you again!

  • hudson

    Hi, Jacky Wei
    Nice to read your blog about AGC etc. in WebRTC.
    I am a new engineer in audio.
    I created an application with JavaScript and found it has a serious echo, so could you give some suggestions on how to reduce the echo noise, if possible?

    Thank you,

    • Jacky Wei Post author

      Hi Hudson,
      I’m sorry to tell you that I’m not very familiar with the JavaScript-level APIs of WebRTC. I was involved in AGC processing in WebRTC, but that was at its Voice Engine API level.

  • Hao Nguyen

    Hi Jacky Wei,
    I’m facing a lot of echo when streaming audio in full duplex (two-way).
    Do the webrtc-audio-processing modules solve my problem?
    Do you have any guides for using this module to cancel echo when streaming audio between two devices?
    I’m very new to audio processing and WebRTC, so I really need your help.
    Thank you for your support!