WebRTC audio processing, continued: optimizing the AGC effect (parameter adjustment)

After days of work, all of the audio processing functions now seem to be working in the KdvMediaSDK, and the next step will be finding a proper set of parameters for the audio processing algorithms to run well under certain environments and circumstances.

I will start with the AGC module. Here are the parameters of the AGC algorithm. Continue reading “WebRTC audio processing, continued: optimizing the AGC effect (parameter adjustment)”
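For orientation, the AGC knobs the VoiceEngine exposes boil down to an operating mode plus a small configuration struct. The sketch below is a standalone mirror of those parameters as I read them from the headers (names follow WebRTC's AgcModes and AgcConfig; the defaults shown are my assumption, not a quote):

```cpp
#include <cassert>

// Illustrative standalone mirror of the VoiceEngine AGC parameters;
// not the real WebRTC header.
enum AgcMode {
    kAgcAdaptiveAnalog,   // adjusts the OS microphone volume (typical on desktop)
    kAgcAdaptiveDigital,  // digital-only gain, for devices without a volume API
    kAgcFixedDigital      // constant digital gain
};

struct AgcConfig {
    unsigned short targetLeveldBOv;          // target speech level below full scale, 0..31
    unsigned short digitalCompressionGaindB; // maximum digital gain applied, 0..90
    bool limiterEnable;                      // hard limiter to prevent clipping
};

// A typical starting point (the stock defaults, as far as I can tell:
// target 3 dB below overload, up to 9 dB compression gain, limiter on).
inline AgcConfig DefaultAgcConfig() {
    return AgcConfig{3, 9, true};
}
```

Tuning, then, is mostly a matter of picking the mode for the platform and nudging the two numeric fields.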

Reading WebRTC code: Deep into the WebRTC Voice Engine (draft)

Introduction to this document

This module includes software-based acoustic echo cancellation (AEC), automatic gain control (AGC), noise reduction/suppression, and hardware access and control across multiple platforms.

My ultimate goal is to wrap an independent module out of WebRTC’s Voice Engine for our product, and the first task is to get AGC implemented based on the current KdvMediaSDK implementation (whose interfaces are not quite the same as WebRTC’s).

Keywords: WebRTC, audio processing, AEC, AGC, noise reduction, noise suppression.

Overall architecture of WebRTC

The overall architecture looks something like this:

Overall architecture of WebRTC

(Image from http://webrtc.org)

WebRTC Voice Engine – AGC control workflow

WebRTC Voice Engine – AGC control workflow

You can download my original Visio file here:

http://rg4.net/p/webrtc/webrtc.voiceengine.agc.vsd

Feel free to modify and distribute it as you wish, but if you make any improvements to this chart, please send me a copy by mail. That will benefit a lot more people. Thank you.

Target/related source files

Major source files:

- audio_device_wave_win.cc: %WEBRTC%\src\modules\audio_device\main\source\win\audio_device_wave_win.cc

- audio_device_buffer.cc:

- audio_device_utility.cc:

- audio_mixer_manager_win.cc:

- voe_base_impl.cc: %WEBRTC%\src\voice_engine\main\source\voe_base_impl.cc

- transmit_mixer.cc: %WEBRTC%\src\voice_engine\main\source\transmit_mixer.cc

- level_indicator.cc (class AudioLevel): src\voice_engine\main\source\level_indicator.cc

Utility source files:

event_win_wrapper.cc

thread_win_wrapper.cc

Detailed interfaces & implementations

audio_device_wave_win.cc

It is responsible for:

- Audio capture

- Getting/setting the microphone volume (I’m not sure what this volume means: the hardware volume, or a virtual volume after audio processing; I only know because it is get/set through audio_device_mixer_manager.cc)

a. Audio capture.

Step 1: the audio capture loop runs in a thread whose worker function is ThreadProcess().

[cpp]bool AudioDeviceWindowsWave::ThreadProcess()
{
    while ((nRecordedBytes = RecProc(recTime)) > 0)
    {
        // keep pulling captured data until the driver has no full buffer left
    }
}[/cpp]

Step 2: inside the RecProc() function, all capture parameters and the captured buffer are saved into the member variable |AudioDeviceBuffer* _ptrAudioBuffer|:

[cpp]WebRtc_Word32 AudioDeviceWindowsWave::RecProc(LONGLONG& consumedTime)
{
// ...
// store the recorded buffer (no action will be taken if the #recorded samples is not a full buffer)

_ptrAudioBuffer->SetRecordedBuffer(_waveHeaderIn[bufCount].lpData, nSamplesRecorded);

// Check how large the playout and recording buffers are on the sound card.
// This info is needed by the AEC.

msecOnPlaySide = GetPlayoutBufferDelay(writtenSamples, playedSamples);
msecOnRecordSide = GetRecordingBufferDelay(readSamples, recSamples);

// If we use the alternative playout delay method, skip the clock drift compensation
// since it will be an unreliable estimate and might degrade AEC performance.

WebRtc_Word32 drift = (_useHeader > 0) ? 0 : GetClockDrift(playedSamples, recSamples);
_ptrAudioBuffer->SetVQEData(msecOnPlaySide, msecOnRecordSide, drift);


if (_AGC)
{
WebRtc_UWord32 newMicLevel = _ptrAudioBuffer->NewMicLevel();
if (newMicLevel != 0)
{
// The VQE will only deliver non-zero microphone levels when a change is needed.
WEBRTC_TRACE(kTraceStream, kTraceUtility, _id, "AGC change of volume: => new=%u", newMicLevel);

// We store this outside of the audio buffer to avoid
// having it overwritten by the getter thread.
_newMicLevel = newMicLevel;
SetEvent(_hSetCaptureVolumeEvent);
}
}

}[/cpp]

b. Get/Set Microphone Volume

There are two other threads running alongside the main capture thread:

::DoGetCaptureVolumeThread()

::DoSetCaptureVolumeThread()

These threads are always running, waiting for a signal to get or set the capture volume.
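The signal-and-wait structure of those threads can be sketched portably. The class below mimics the set-volume side only; it substitutes std::condition_variable for the Win32 events (SetEvent / WaitForSingleObject) the real code uses, and all names are mine, not WebRTC's:

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Portable sketch of the DoSetCaptureVolumeThread() pattern: the capture
// thread publishes a requested level and signals; a worker thread wakes up
// and applies it, so the (possibly slow) mixer call never blocks capture.
class CaptureVolumeSetter {
 public:
    CaptureVolumeSetter() : worker_([this] { Run(); }) {}
    ~CaptureVolumeSetter() {
        { std::lock_guard<std::mutex> l(m_); stop_ = true; }
        cv_.notify_one();
        worker_.join();
    }
    // Called from the capture thread; mirrors storing _newMicLevel and
    // then SetEvent(_hSetCaptureVolumeEvent).
    void RequestLevel(unsigned level) {
        { std::lock_guard<std::mutex> l(m_); pending_ = level; hasPending_ = true; }
        cv_.notify_one();
    }
    unsigned applied_level() const { return applied_.load(); }

 private:
    void Run() {
        std::unique_lock<std::mutex> l(m_);
        for (;;) {
            cv_.wait(l, [this] { return hasPending_ || stop_; });
            if (stop_) return;
            unsigned level = pending_;
            hasPending_ = false;
            l.unlock();
            applied_.store(level);  // stand-in for AudioMixerManager::SetMicrophoneVolume()
            l.lock();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    unsigned pending_ = 0;
    bool hasPending_ = false;
    bool stop_ = false;
    std::atomic<unsigned> applied_{0};
    std::thread worker_;  // declared last so it starts after the state above is ready
};
```

The design point is the decoupling: RecProc() only records the desired level and signals, while the worker owns the actual mixer call.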

Things I’m still trying to figure out

There are many definitions related to the microphone level, and I’m not sure which volume is which. Here are some volume-related definitions that confused me:

1.    class VoEBaseImpl(voe_base_impl.cc)

What’s the difference between currentVoEMicLevel and currentMicLevel in the code below?

They can also be compared with the member variables _oldVoEMicLevel and _oldMicLevel defined in voe_base_impl.cc:

WebRtc_UWord32 _oldVoEMicLevel

WebRtc_UWord32 _oldMicLevel

Where are these variables set or changed?

This code is located in voe_base_impl.cc:

[cpp]WebRtc_Word32 VoEBaseImpl::RecordedDataIsAvailable(

const WebRtc_UWord32 currentMicLevel,
WebRtc_UWord32& newMicLevel)
{

// Will only deal with the volume in adaptive analog mode
if (isAnalogAGC)
{
// Scale from ADM to VoE level range
if (_audioDevicePtr->MaxMicrophoneVolume(&maxVolume) == 0)
{
if (0 != maxVolume)
{
currentVoEMicLevel = (WebRtc_UWord16) ((currentMicLevel
* kMaxVolumeLevel + (int) (maxVolume / 2))
/ (maxVolume));
}
}
// We learned that on certain systems (e.g Linux) the currentVoEMicLevel
// can be greater than the maxVolumeLevel therefore
// we are going to cap the currentVoEMicLevel to the maxVolumeLevel
// if it turns out that the currentVoEMicLevel is indeed greater
// than the maxVolumeLevel
if (currentVoEMicLevel > kMaxVolumeLevel)
{
currentVoEMicLevel = kMaxVolumeLevel;
}
}
// Keep track if the MicLevel has been changed by the AGC, if not,
// use the old value AGC returns to let AGC continue its trend,
// so eventually the AGC is able to change the mic level. This handles
// issues with truncation introduced by the scaling.
if (_oldMicLevel == currentMicLevel)
{
currentVoEMicLevel = (WebRtc_UWord16) _oldVoEMicLevel;
}
// Perform channel-independent operations
// (APM, mix with file, record to file, mute, etc.)
_transmitMixerPtr->PrepareDemux(audioSamples, nSamples, nChannels,
samplesPerSec,
(WebRtc_UWord16) totalDelayMS, clockDrift,
currentVoEMicLevel);
// Copy the audio frame to each sending channel and perform
// channel-dependent operations (file mixing, mute, etc.) to prepare
// for encoding.
_transmitMixerPtr->DemuxAndMix();
// Do the encoding and packetize+transmit the RTP packet when encoding
// is done.
_transmitMixerPtr->EncodeAndSend();
// Will only deal with the volume in adaptive analog mode
if (isAnalogAGC)
{
// Scale from VoE to ADM level range
newVoEMicLevel = _transmitMixerPtr->CaptureLevel();
if (newVoEMicLevel != currentVoEMicLevel)
{
// Add (kMaxVolumeLevel/2) to round the value
newMicLevel = (WebRtc_UWord32) ((newVoEMicLevel * maxVolume
+ (int) (kMaxVolumeLevel / 2)) / (kMaxVolumeLevel));
}
else
{
// Pass zero if the level is unchanged
newMicLevel = 0;
}
// Keep track of the value AGC returns
_oldVoEMicLevel = newVoEMicLevel;
_oldMicLevel = currentMicLevel;
}
return 0;
}[/cpp]

2.    class AudioDeviceWindowsWave

(audio_device_wave_win.cc):

[cpp]WebRtc_UWord32                          _newMicLevel;
WebRtc_UWord32                          _minMicVolume;
[/cpp]


Where are these variables set or changed?

_newMicLevel gets its value from |_ptrAudioBuffer->NewMicLevel()| in AudioDeviceWindowsWave::RecProc() while processing AGC.

3.    class TransmitMixer(transmit_mixer.cc)

WebRtc_UWord32 _captureLevel;

The code listed below is the key to the microphone level values, covering both the level before processing and the level after processing.

[cpp]WebRtc_Word32 TransmitMixer::APMProcessStream(
const WebRtc_UWord16 totalDelayMS,
const WebRtc_Word32 clockDrift,
const WebRtc_UWord16 currentMicLevel)
{
WebRtc_UWord16 captureLevel(currentMicLevel);

if (_audioProcessingModulePtr->gain_control()->set_stream_analog_level(
captureLevel) == -1)
{
WEBRTC_TRACE(kTraceWarning, kTraceVoice, VoEId(_instanceId, -1),
"AudioProcessing::set_stream_analog_level(%u) => error",
captureLevel);
}

captureLevel =
_audioProcessingModulePtr->gain_control()->stream_analog_level();
// Store new capture level (only updated when analog AGC is enabled)
_captureLevel = captureLevel;

return 0;
}[/cpp]

4.    class AudioDeviceBuffer

WebRtc_UWord32 _currentMicLevel;

WebRtc_UWord32 _newMicLevel;

5.    class

Functional implementation flow:

audio_device_wave_win.cc

Summary

The major code calling the WebRTC APM (Audio Processing Module) is in the APMProcessStream() function of the TransmitMixer class.

For example: the audio processing for the input audio frame (AudioFrame _audioFrame) is done here, the microphone level is calculated at the same time, and then the processed frame and the new microphone level are output.

(This is the same TransmitMixer::APMProcessStream() listing shown above in section 3.)

And here are some customized log outputs I added to the source code, to log the detailed processing and the microphone level changes during AGC processing:

[code]…

CRITICAL  ; ( 3: 7:13:562 |    4) AUDIO DEVICE:    1    99;      4776; TransmitMixer::PrepareDemux, Near-end Voice Quality Enhancement (APM) processing, currentMicLevel=18 before processing

CRITICAL  ; ( 3: 7:13:562 |    0)        VOICE:    1    99;      4776; TransmitMixer::APMProcessStream, AudioProcessing::set_stream_analog_level(18)

CRITICAL  ; ( 3: 7:13:562 |    1)        VOICE:    1    99;      4776; TransmitMixer::APMProcessStream, AudioProcessing::get_stream_analog_level after processed(17)

CRITICAL  ; ( 3: 7:13:562 |    0) AUDIO DEVICE:    1    99;      4776; TransmitMixer::PrepareDemux,Measure audio level of speech after APM processing, currentMicLevel=18, energy=-1

CRITICAL  ; ( 3: 7:13:562 |    0) AUDIO DEVICE:    1    99;      4776; AudioDeviceBuffer: _ptrCbAudioTransport->RecordedDataIsAvailable return newMicLevel=4369

CRITICAL  ; ( 3: 7:13:562 |    0)      UTILITY:    1    99;      4776; AudioDeviceWindowsWave::RecProc AGC change of volume: => new=4369

CRITICAL  ; ( 3: 7:13:562 |    3) AUDIO DEVICE:    1    99;      5672; AudioMixerManager::SetMicrophoneVolume volume=4369

...

[/code]
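The jump from 17 to 4369 in the log is just the VoE-to-ADM rescaling at the end of RecordedDataIsAvailable(). Assuming this run's Windows wave mixer reports maxVolume = 0xFFFF (65535; an assumption about the machine, not something the log states), the arithmetic checks out:

```cpp
#include <cassert>

// VoE level (0..255) -> ADM (device) level, the scaling applied after the
// AGC returns its new analog level.
unsigned VoeToAdmLevel(unsigned voeLevel, unsigned maxVolume) {
    const unsigned kMaxVolumeLevel = 255;  // VoE scale, voe_base_impl.cc
    // Adding kMaxVolumeLevel/2 rounds instead of truncating.
    return (voeLevel * maxVolume + kMaxVolumeLevel / 2) / kMaxVolumeLevel;
}
```

(17 * 65535 + 127) / 255 = 4369, matching the "AudioMixerManager::SetMicrophoneVolume volume=4369" line in the log.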

WebRTC: how to use the audio processing module?

WebRTC

My current job responsibility is researching WebRTC, and the first task is wrapping a class from WebRTC to process audio frames, implementing functions such as AEC, AGC, NS, and a high-pass filter.

The information listed below is from WebRTC.org; you can also view it by visiting http://www.webrtc.org, or in its code.

Continue reading “WebRTC: how to use the audio processing module?”