An example of AAC capability in H.245

I keep getting emails from all over asking about AAC audio in H.323.

So I put this post together as a worked example to go with my previous posts: http://rg4.net/archives/1480.html, http://rg4.net/archives/1126.html, http://rg4.net/archives/1112.html

The pcap file for this example can be downloaded here: HUAWEI_TE600-vs-ZTE_T800.pcapnp

Here it is.

1. Basic knowledge: AAC LD as described in 14496-3

AAC LD operates at sampling rates up to 48 kHz and uses a frame length of 512 or 480 samples, compared to the 1024 or 960 samples used in standard MPEG-2/4 AAC, to enable coding of general audio signals with an algorithmic delay not exceeding 20 ms. The size of the window used in the analysis and synthesis filterbank is also reduced by a factor of 2.

Table 1.3 (Audio Profiles definition) of 14496-3 defines which AAC formats, such as AAC LC or AAC LD, belong to each audio profile.

2. Basic knowledge: how an AAC capability is described in the H.245 TCS

maxBitRate: 640
noncollapsing:
ProfileAndLevel: nonCollapsing item –> parameterIdentifier: standard = 0
AAC format: nonCollapsing item –> parameterIdentifier: standard = 1
AudioObjectType: nonCollapsing item –> parameterIdentifier: standard = 3
Config(Including sample rate and channel parameters): nonCollapsing item –> parameterIdentifier: standard = 4
MuxConfig: nonCollapsing item –> parameterIdentifier: standard = 8
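
The identifiers listed above can be kept in one place when writing a capability parser. This is only a minimal sketch: the enum and function names are made up for illustration, and the numeric values are simply the ones listed in this post, not a complete copy of the H.245 parameter table.

```cpp
#include <string_view>

// nonCollapsing "parameterIdentifier: standard" values as listed in this post.
// The enum name and member names are illustrative, not taken from H.245.
enum class AacParam : unsigned {
    ProfileAndLevel = 0,
    Format = 1,
    AudioObjectType = 3,
    Config = 4,
    MuxConfig = 8,
};

// Returns a display name for a parameterIdentifier, or "unknown" for any
// value that is not in the list above.
constexpr std::string_view aacParamName(unsigned id) {
    switch (static_cast<AacParam>(id)) {
        case AacParam::ProfileAndLevel: return "ProfileAndLevel";
        case AacParam::Format:          return "AAC format";
        case AacParam::AudioObjectType: return "AudioObjectType";
        case AacParam::Config:          return "Config";
        case AacParam::MuxConfig:       return "MuxConfig";
        default:                        return "unknown";
    }
}
```

For example, aacParamName(3) gives "AudioObjectType", which is the parameter that tells AAC LC and AAC LD apart.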

3. H.245 TCS of HUAWEI TE60 and ZTE T800

HUAWEI TE60: 172.16.219.105
————————————————–
There are two AAC capabilities:
Capability 1:
collapsing:
collapsing item –> parameterIdentifier=2, parameterValue=2
collapsing item –> parameterIdentifier=5, parameterValue=1
noncollapsing:
ProfileAndLevel: 24
AAC format: logical (0)
AudioObjectType: 23

Capability 2:
collapsing:
collapsing item –> parameterIdentifier=2, parameterValue=2
collapsing item –> parameterIdentifier=5, parameterValue=1
noncollapsing:
ProfileAndLevel: 24
AudioObjectType: 23

ZTE T800: 172.16.219.103
————————————————–
There are four AAC capabilities:
Capability 1:
Capability 2:
Capability 3:
Capability 4:

4. Detailed parameters in the OLC command

TE60 OLC to T800:
————————————————–
maxBitRate: 1280
collapsing:
item 0 –> parameterIdentifier=2, parameterValue=2
item 1 –> parameterIdentifier=5, parameterValue=1
noncollapsing:
item 0 –> parameterIdentifier=0, value=25
item 1 –> parameterIdentifier=1, value=logical (0)
item 2 –> parameterIdentifier=3, value=23
item 3 –> parameterIdentifier=6, value=logical (0)
item 4 –> parameterIdentifier=8, octetString = 41 01 73 2a 00 11 00
item 5 –> parameterIdentifier=9, octetString = 00 00 00

Explanation:
AOT=23 –> AAC LD
MuxConfig = 41 01 73 2a 00 11 00 –> LATM format
Sample rate index = MuxConfig[2] & 0x0f = 0x73 & 0x0f = 3 –> 48 kHz
Channel configuration = (MuxConfig[3] & 0xf0) >> 4 = (0x2a & 0xf0) >> 4 = 0x20 >> 4 = 2 –> stereo

So HUAWEI opened a logical channel to ZTE with AAC LD stereo.
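
The byte arithmetic in the explanation above can be sketched in code. This is a minimal sketch, not a full LATM parser: it assumes audioMuxVersion is 0 and that there is a single program and layer, so the AudioSpecificConfig starts at bit 15 of the MuxConfig. The struct and function names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Fields recovered from the start of a LATM StreamMuxConfig.
struct AacStreamInfo {
    unsigned audioObjectType;  // 23 = AAC LD
    unsigned sampleRateHz;     // looked up from samplingFrequencyIndex
    unsigned channels;         // channelConfiguration: 1 = mono, 2 = stereo
};

// MPEG-4 samplingFrequencyIndex -> Hz; returns 0 for reserved indices.
inline unsigned sampleRateFromIndex(unsigned index) {
    static constexpr unsigned kRates[] = {96000, 88200, 64000, 48000, 44100,
                                          32000, 24000, 22050, 16000, 12000,
                                          11025, 8000,  7350};
    constexpr std::size_t kCount = sizeof(kRates) / sizeof(kRates[0]);
    return index < kCount ? kRates[index] : 0;
}

inline std::optional<AacStreamInfo> parseMuxConfig(
        const std::vector<std::uint8_t>& mux) {
    if (mux.size() < 4) return std::nullopt;
    // Assumption: audioMuxVersion (top bit of byte 0) is 0, and numProgram and
    // numLayer (top 7 bits of byte 1) are 0. Anything else is rejected.
    if ((mux[0] & 0x80) != 0 || (mux[1] & 0xFE) != 0) return std::nullopt;

    // AOT is 5 bits: the last bit of byte 1 plus the top 4 bits of byte 2.
    const unsigned aot = ((mux[1] & 0x01u) << 4) | (mux[2] >> 4);
    const unsigned rateIndex = mux[2] & 0x0Fu;        // MuxConfig[2] & 0x0f
    const unsigned channels = (mux[3] & 0xF0u) >> 4;  // (MuxConfig[3] & 0xf0) >> 4
    // Escape codes (AOT 31, index 15) change the layout; not handled here.
    if (aot == 31 || rateIndex == 15) return std::nullopt;
    return AacStreamInfo{aot, sampleRateFromIndex(rateIndex), channels};
}
```

Feeding it the TE60 octet string 41 01 73 2a 00 11 00 gives AOT 23, 48000 Hz and 2 channels, which matches the manual calculation.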

T800 OLC to TE60:
————————————————–
maxBitRate: 1280
collapsing:
item 0 –> parameterIdentifier=2, parameterValue=2
item 1 –> parameterIdentifier=5, parameterValue=1
noncollapsing:
item 0 –> parameterIdentifier=0, value=25
item 1 –> parameterIdentifier=1, value=logical (0)
item 2 –> parameterIdentifier=3, value=23
item 3 –> parameterIdentifier=6, value=logical (0)
item 4 –> parameterIdentifier=8, octetString = 41 01 73 1a 00 11 00
item 5 –> parameterIdentifier=9, octetString = 00 00 00

Explanation:
AOT=23 –> AAC LD
MuxConfig = 41 01 73 1a 00 11 00 –> LATM format
Sample rate index = MuxConfig[2] & 0x0f = 0x73 & 0x0f = 3 –> 48 kHz
Channel configuration = (MuxConfig[3] & 0xf0) >> 4 = (0x1a & 0xf0) >> 4 = 0x10 >> 4 = 1 –> mono

So ZTE opened a logical channel to HUAWEI with AAC LD mono.
Any further questions?

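AAC audio capability negotiation issues

In video conferencing, comparing audio capabilities is usually simple: normally you only need to compare the formats. The AAC family is an exception. It has a complex way of expressing capabilities, and during the exchange it does not state the exact sample rate or channel count. Instead, like H.264, it gives an upper bound in the form of a capability level, which we have to match and compare. Here is a brief introduction to AAC capabilities and a summary of the problems I ran into at work.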

Does your H323 terminal support AAC LC or AAC LD?

This is a guide to analyzing the H.245 terminalCapabilitySet to figure out whether a terminal supports AAC LC or AAC LD.

  • First, capture the call and its packets as KDV1000-HUAWEI.pcap.
  • Open KDV1000-HUAWEI.pcap with Wireshark.
  • Filter KDV1000-HUAWEI.pcap with h225 || h245
  • Locate and expand the terminalCapabilitySet.
  • In the capabilityTable, find the items whose capabilityIdentifier is 0.0.8.245.1.1.0 (ISO/IEC 14496-3 MPEG-4 audio).
  • There will be two items if your terminal supports both AAC LC and AAC LD.
  • In each item, find the parameter with parameterIdentifier: standard = 3 and check its parameterValue: unsignedMax.

If the value is 2, it is an AAC LC capability. If the value is 23, it is an AAC LD capability.
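
The last step of the guide above can be sketched as a small check. This assumes you have already collected the value of parameter 3 (the audio object type) from each MPEG-4 audio item in the capabilityTable; the struct and function names are hypothetical.

```cpp
#include <vector>

// Which AAC flavours a terminal advertises, judged only by the
// audio object type values found in its terminalCapabilitySet.
struct AacSupport {
    bool lc = false;  // some capability has audio object type 2
    bool ld = false;  // some capability has audio object type 23
};

inline AacSupport summarizeAacSupport(const std::vector<unsigned>& audioObjectTypes) {
    AacSupport result;
    for (const unsigned aot : audioObjectTypes) {
        if (aot == 2) result.lc = true;    // AAC LC
        if (aot == 23) result.ld = true;   // AAC LD
    }
    return result;
}
```

For a terminal that advertises two items with values 2 and 23, both flags come back true. Other object types are ignored in this sketch.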

MP4 media file container with H.264 video and AAC audio

Once all the audio processing modules were fully supported, I migrated WebRTC to the latest version (update date: 5/9, 2013, svn revision 3988). Today I got a new assignment: implement an MP4 file container for the SDK.

While discussing the assignment, I suggested importing ffmpeg into the SDK, since I have been working with ffmpeg for over 8 years. ffmpeg helps a lot when programming multimedia software: codecs, containers, scaling, streaming, and so on.

However, the answer was no, because “ffmpeg is too big for us”. Fair enough; size is something we have to care about.

So my only option was to turn to mpeg4-ip/mp4v2.

One more issue must be mentioned here: the main audio codec of the SDK is G.722.1c. If we want a standard media file container for the SDK, the audio has to be transcoded. Otherwise this version of the MP4 container will only be supported on Windows, and even there we would need to add a DirectX filter.

Mission launched.

Preparations:

FAAC source code: http://sourceforge.net/projects/faac/ or

AAC implementation in libstagefright of android: http://ffmpeg.zeranoe.com/builds/source/external_libraries/vo-aacenc-0.1.2.tar.xz

libmp4v2 source code: http://code.google.com/p/mp4v2/

Straight to the code

I must say some of the code I got from my employer is really ugly:

  • No comments on interfaces or variables
  • No sample programs
  • I only have access to part of the source code.

So I dug into it myself for a first round of code study, trying to understand what these bunches of code do and how they work.

Status after the past two days

There are lots of stories from the code study. After a full day of trial and error, I finally managed to write a simple program that runs, but I am still not sure whether I am calling the right interfaces in the right sequence among the dozens that are exported.

And when I tried to add libmp4v2 to it, I found an even bigger issue that may sink the whole solution:

We use mono audio (encoded as G.722.1c) in our products. However, when I transcoded it to AAC and recorded it into an MP4 container together with H.264 video, there was tremendous noise in the audio.

I don’t know the exact reason.

So I ran a test: I wrote a test program that records stereo audio to MP4 using libfaac and libmp4v2, and everything worked just fine.

Why???

Can someone tell me?

One suspicious phenomenon: audio encoded by FAAC always shows up as stereo when played in VLC, whatever format you feed to FAAC, mono or stereo.
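
One way to narrow this down is to look at what the encoder itself declares. The sketch below decodes the first two bytes of an MPEG-4 AudioSpecificConfig, where the channel configuration lives. It assumes you can get those bytes from the encoder (FAAC exposes them through faacEncGetDecoderSpecificInfo, as far as I know) or from the MP4 track's ES configuration. The struct and function names are made up, and this is a diagnostic aid, not an explanation of the noise.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// First fields of an MPEG-4 AudioSpecificConfig (5 + 4 + 4 bits).
struct AscHeader {
    unsigned audioObjectType;         // 2 = AAC LC
    unsigned samplingFrequencyIndex;  // 3 = 48 kHz, 5 = 32 kHz
    unsigned channelConfiguration;    // 1 = mono, 2 = stereo
};

inline std::optional<AscHeader> parseAscHeader(const std::uint8_t* data,
                                               std::size_t size) {
    if (data == nullptr || size < 2) return std::nullopt;
    const unsigned aot = data[0] >> 3;                                 // top 5 bits
    const unsigned rateIndex = ((data[0] & 0x07u) << 1) | (data[1] >> 7);
    const unsigned channels = (data[1] >> 3) & 0x0Fu;
    // Escape codes (AOT 31, index 15) shift the later fields; not handled here.
    if (aot == 31 || rateIndex == 15) return std::nullopt;
    return AscHeader{aot, rateIndex, channels};
}
```

For AAC LC at 32 kHz mono, the two bytes would be 12 88, which decodes to object type 2, frequency index 5 and channel configuration 1. If the encoder was opened as mono but these bytes say 2 channels, the mismatch is on the encoding side rather than in the player.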

Code segment of encoding mono PCM audio to AAC

[cpp]
// Open FAAC for 32 kHz mono PCM; FAAC fills in samplesInput and maxBytesOutput.
hEncoder = faacEncOpen(32000, 1, &samplesInput, &maxBytesOutput);

[/cpp]

Next step (alternative solution for the problem)

After discussing it with my supervisor, we decided not to transcode the audio to AAC but to G.711 as a temporary solution. This means:

1. The audio quality will be poorer.

2. But the processing load of recording to an MP4 container will be reduced.

I will do this next week.