RTCP and AVPF related missing features

Most of the missing features are AVPF related, which is defined in RFC4585 and RFC5104.

RFC4585: Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)
https://www.rfc-editor.org/rfc/rfc4585.txt
RFC5104:  Codec Control Messages in the RTP Audio-Visual Profile with Feedback (AVPF)
https://www.rfc-editor.org/rfc/rfc5104.txt

AVPF contains a mechanism for conveying such a message, but did not specify for which codec and according to which syntax the message should conform.  Recently, the ITU-T finalized Rec.H.271, which (among other message types) also includes a feedback message.  It is expected that this feedback message will fairly quickly enjoy wide support.  Therefore, a mechanism to convey feedback messages according to H.271 appears to be desirable.

RTCP Receiver Report Extensions
1. CCM – Codec Control Message
2. FIR – Full Intra Request Command
A Full Intra Request (FIR) Command, when received by the designated
media sender, requires that the media sender sends a Decoder Refresh
Point (see section 2.2) at the earliest opportunity.  The evaluation
of such an opportunity includes the current encoder coding strategy
and the current available network resources.

FIR is also known as an “instantaneous decoder refresh request”,
“fast video update request” or “video fast update request”.

3. TMMBR – Temporary Maximum Media Stream Bit Rate Request
4. TMMBN – Temporary Maximum Media Stream Bit Rate Notification

Example from RFC5104:

Receiver A: TMMBR_max total BR = 35 kbps, TMMBR_OH = 40 bytes
Receiver B: TMMBR_max total BR = 40 kbps, TMMBR_OH = 60 bytes

For a given packet rate (PR), the bit rate available for media
payloads in RTP will be:

Max_net media_BR_A =
TMMBR_max total BR_A – PR * TMMBR_OH_A * 8 … (1)

Max_net media_BR_B =
TMMBR_max total BR_B – PR * TMMBR_OH_B * 8 … (2)

For a PR = 20, these calculations will yield a Max_net media_BR_A =
28600 bps and Max_net media_BR_B = 30400 bps, which suggests that
receiver A is the limiting one for this packet rate.  However, at a
certain PR there is a switchover point at which receiver B becomes
the limiting one.  The switchover point can be identified by setting
Max_media_BR_A equal to Max_media_BR_B and breaking out PR:

TMMBR_max total BR_A – TMMBR_max total BR_B
PR =  ——————————————- … (3)
8*(TMMBR_OH_A – TMMBR_OH_B)

5. TSTR – Temporal-Spatial Trade-off Request

5. TSTN – Temporal-Spatial Trade-off Request

6. VBCM – H.271 Video Back Channel Message

7. RTT – Round Trip Time
A receiver that receives a request closely after
sending a decoder refresh point — within 2 times the longest round
trip time (RTT) known, plus any AVPF-induced RTCP packet sending
delays — should await a second request message to ensure that the
media receiver has not been served by the previously delivered
decoder refresh point.  The reason for the specified delay is to
avoid sending unnecessary decoder refresh points.

8. NACK
8a. PLI – Picture Loss Indication
8b. SLI – Slice Loss Indication
8c. RPSI – Reference Picture Selection Indication

Here’s a sample INVITE command relaying from FreeSWITCH:

INVITE sip:1009@10.21.1.80:5060;transport=tcp SIP/2.0
Via: SIP/2.0/TCP 192.168.100.96;branch=z9hG4bK6p37yQX86QXar
Route: <sip:1009@10.21.1.80:5060>;transport=tcp
Max-Forwards: 69
From: "Extension 1008" <sip:1008@192.168.100.96>;tag=DKK4FpBB3ptSS
To: <sip:1009@10.21.1.80:5060;transport=tcp>
Call-ID: 0199ec1f-9e53-1233-8583-000c29f7d152
CSeq: 77747697 INVITE
Contact: <sip:mod_sofia@192.168.100.96:5060;transport=tcp>
User-Agent: FreeSWITCH-mod_sofia/1.7.0+git~20150614T062551Z~a647b42910~64bit
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, MESSAGE, INFO, UPDATE, REGISTER, REFER, NOTIFY, PUBLISH, SUBSCRIBE
Supported: timer, path, replaces
Allow-Events: talk, hold, conference, presence, as-feature-event, dialog, line-seize, call-info, sla, include-session-description, presence.winfo, message-summary, refer
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 495
X-FS-Support: update_display,send_info
Remote-Party-ID: "Extension 1008" <sip:1008@192.168.100.96>;party=calling;screen=yes;privacy=off

v=0
o=FreeSWITCH 1436150633 1436150634 IN IP4 192.168.100.96
s=FreeSWITCH
c=IN IP4 192.168.100.96
t=0 0
m=audio 16890 RTP/AVP 96 0 8 101
a=rtpmap:96 opus/48000/2
a=fmtp:96 useinbandfec=1; stereo=0; sprop-stereo=0
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
m=video 22404 RTP/AVP 96
b=AS:1024
a=rtpmap:96 H264/90000
a=fmtp:96 profile-level-id=42801F
a=rtcp-fb:96 ccm fir tmmbr
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli

Sample SDP of WebRTC for Firefox

GET /socket.io/1/websocket/GgKg1qt9TCXtfPb6n2g0 HTTP/1.1
Host: 172.16.203.1:2013
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:36.0) Gecko/20100101 Firefox/36.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Sec-WebSocket-Version: 13
Origin: http://172.16.203.1:2013
Sec-WebSocket-Key: pPQe97SI5k09yaPnVLa2RQ==
Connection: keep-alive, Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: Lm5/9dDv4pjyphQQeswS+V+AiKc=

..1::..|.'kI..Q..I
...Q^.U...BK......IIP.F....Q'.A...z..T5:::{"name":"log","args":[[">>> Message from server: ","Room foo has 1 client(s)"]]}.`5:::{"name":"log","args":[[">>> Message from server: ","Request to create or join room","foo"]]}.$5:::{"name":"joined","args":["foo"]}.B5:::{"name":"emit(): client GgKg1qt9TCXtfPb6n2g0 joined room foo"}..$.N...t _. {I.l ..+iW.)...l{V.=8..l}K.noW.<:I.*sE..g.Z5:::{"name":"log","args":[[">>> Message from server: ","Got message: ","got user media"]]}.~.x5:::{"name":"message","args":[{"type":"offer","sdp":"
v=0
o=mozilla...THIS_IS_SDPARTA-38.0 2820695485956467000 0 IN IP4 0.0.0.0
s=-
t=0 0
a=fingerprint:sha-256 7C:7B:AE:C2:AE:ED:14:39:A4:7A:EE:4B:FB:FE:90:90:E8:A1:0B:C1:50:FC:C8:9C:FA:28:68:22:EE:1C:F6:97
a=group:BUNDLE sdparta_0 sdparta_1
a=ice-options:trickle
a=msid-semantic:WMS *
m=audio 9 RTP/AVP 109 9 0 8
c=IN IP4 0.0.0.0
a=sendrecv
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=ice-pwd:db746395f7bf11179b1cddabf3ef6359
a=ice-ufrag:1c858a90
a=mid:sdparta_0
a=msid:{69b2b229-1dc0-4291-a703-aafe505d477b} {ebc6bb1c-8525-4a70-9601-354b53c5c103}
a=rtcp-mux
a=rtpmap:109 opus/48000/2
a=rtpmap:9 G722/8000/1
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=setup:actpass
a=ssrc:4051396866 cname:{f0f8a3ab-8c54-4694-872a-98dd14f0c821}
m=video 9 RTP/AVP 126 97
c=IN IP4 0.0.0.0
a=sendrecv
a=fmtp:120 max-fs=12288;max-fr=60
a=fmtp:126 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1
a=fmtp:97 profile-level-id=42e01f;level-asymmetry-allowed=1
a=ice-pwd:db746395f7bf11179b1cddabf3ef6359
a=ice-ufrag:1c858a90
a=mid:sdparta_1
a=msid:{69b2b229-1dc0-4291-a703-aafe505d477b} {34f61d33-9fe4-42ff-8e2b-ef9c465c6f67}
a=rtcp-fb:120 nack
a=rtcp-fb:120 nack pli
a=rtcp-fb:120 ccm fir
a=rtcp-fb:126 nack
a=rtcp-fb:126 nack pli
a=rtcp-fb:126 ccm fir
a=rtcp-fb:97 nack
a=rtcp-fb:97 nack pli
a=rtcp-fb:97 ccm fir
a=rtcp-mux
a=rtpmap:126 H264/90000
a=rtpmap:97 H264/90000
a=setup:actpass
a=ssrc:3993721606 cname:{f0f8a3ab-8c54-4694-872a-98dd14f0c821}

 

 

Some key notes for RFC3264

Facing a new task of standardizing SIP protocols for the Kedacom conference Endpoints.

So I digged into some RFC document recently. Here are some key notes for RFC3264: An OfferAnswer Model with the Session Description Protocol (SDP)

1. Capatibility comparison – Direction

—————————————–

If “a=sendrecv” attribute does not exist, or been omitted, that means the direction is sendrecv, since sendrecv is the default.

2. [Offering] RTP Port in m

—————————————–

RTP port in m line for recvonly and sendrecv streams, while RTCP port for sendonly streams.

For recvonly and sendrecv streams, the port number and address in the
offer indicate where the offerer would like to receive the media
stream.  For sendonly RTP streams, the address and port number
indirectly indicate where the offerer wants to receive RTCP reports.
Unless there is an explicit indication otherwise, reports are sent to
the port number one higher than the number indicated.  The IP address
and port present in the offer indicate nothing about the source IP
address and source port of RTP and RTCP packets that will be sent by
the offerer.

3. [Offering] Switching media format if multiple formats supported**

—————————————–

If multiple formats are listed, it
means that the offerer is capable of making use of any of those
formats during the session.  In other words, the answerer MAY change
formats in the middle of the session, making use of any of the
formats listed, without sending a new offer.

[Offering] Capatibility comparison – Preference

——————————————

In all cases, the formats in the “m=” line MUST be listed in order of
preference, with the first format listed being preferred.  In this
case, preferred means that the recipient of the offer SHOULD use the
format with the highest preference that is acceptable to it.

[Offering] Capatibility comparison – Preference 2

—————————————–

For sendrecv RTP
streams, the payload type numbers indicate the value of the payload
type field the offerer expects to receive, and would prefer to send.
However, for sendonly and sendrecv streams, the answer might indicate
different payload type numbers for the same codecs, in which case,
the offerer MUST send with the payload type numbers from the answer.

* This is what SIP different with H.323, the payload type value in H.245 OLC mean’s the request will send the payload with this payload type value, while in SDP it means you should send your payload with the value I specified in my SDP.

[Offering] Bandwidth description

—————————————–

If the bandwidth attribute is present for a stream, it indicates the
desired bandwidth that the offerer would like to receive.  A value of
zero is allowed, but discouraged.  It indicates that no media should
be sent.  In the case of RTP, it would also disable all RTCP.

[Offering] A typical usage example for multiple media streams

—————————————–

A typical usage example for multiple media streams of the same type
is a pre-paid calling card application, where the user can press and
hold the pound (“#”) key at any time during a call to hangup and make
a new call on the same card.  This requires media from the user to
two destinations – the remote gateway, and the DTMF processing
application which looks for the pound.  This could be accomplished
with two media streams, one sendrecv to the gateway, and the other
sendonly (from the perspective of the user) to the DTMF application.

[Answering] Answering SDP must have at least one media format while rejecting

—————————————–

To reject an offered
stream, the port number in the corresponding stream in the answer
MUST be set to zero.  Any media formats listed are ignored.  At least
one MUST be present, as specified by SDP.

[Answering] Stream marked as recvonly can suggest a new format for the offerer

—————————————–

For streams marked as recvonly in the answer, the “m=” line MUST
contain at least one media format the answerer is willing to receive
with from amongst those listed in the offer.  The stream MAY indicate
additional media formats, not listed in the corresponding stream in
the offer, that the answerer is willing to receive.

Similarly, just like recvonly streams, sendrecv streams can suggest new formats for the offerer (of course, it will not be able to send them at this time, since it was not listed in the offer).

[Answering]Capability comparison – RECOMMENDED Order for answerer

—————————————–

If a stream in the offer lists
audio codecs 8, 22 and 48, in that order, and the answerer only
supports codecs 8 and 48, it is RECOMMENDED that, if the answerer has
no reason to change it, the ordering of codecs in the answer be 8,
48, and not 48, 8.

[Answering]Prefer RTP payload type with the value in the offer rather than in the answer

—————————————–

In the case of RTP, it MUST use the payload type numbers
from the offer, even if they differ from those in the answer.

[multicast] ??? How to handle participants in a some conference but with different format support ???

—————————————–

The set of media formats in the answer MUST be equal to or be a
subset of those in the offer.  Removing a format is a way for the
answerer to indicate that the format is not supported.

[Processing Answer]

—————————————–

It(Call offerer) MUST send using a media format listed in the answer,
and it SHOULD use the first media format listed in the answer when it
does send.

The reason this is a SHOULD, and not a MUST (its also a SHOULD,
and not a MUST, for the answerer), is because there will
oftentimes be a need to change codecs on the fly.  For example,
during silence periods, an agent might like to switch to a comfort
noise codec.  Or, if the user presses a number on the keypad, the
agent might like to send that using RFC 2833 [9].  Congestion
control might necessitate changing to a lower rate codec based on
feedback.

[Modifying the Session] Session version of REINVITE

—————————————–

When issuing an offer that modifies the session,
the “o=” line of the new SDP MUST be identical to that in the
previous SDP, except that the version in the origin field MUST
increment by one from the previous SDP.  If the version in the origin
line does not increment, the SDP MUST be identical to the SDP with
that version number.

[Modifying the Session] Rules for m(Media stream) line of REINVITE

—————————————–

If an SDP is offered, which is different from the previous SDP, the
new SDP MUST have a matching media stream for each media stream in
the previous SDP.

[Modifying the Session] Adding a Media Stream in REINVITE

—————————————–

New media streams are created by new additional media descriptions
below the existing ones, or by reusing the “slot” used by an old
media stream which had been disabled by setting its port to zero.

[Modifying the Session] Changing the Set of Media Formats in REINVITE

—————————————–

For example, if A generates an offer
with G.711 assigned to dynamic payload type number 46, payload type
number 46 MUST refer to G.711 from that point forward in any offers
or answers for that media stream within the session.  However, it is
acceptable for multiple payload type numbers to be mapped to the same
codec, so that an updated offer could also use payload type number 72
for G.711.

     The mappings need to remain fixed for the duration of the session
      because of the loose synchronization between signaling exchanges
      of SDP and the media stream.

[Modifying the Session] Sending Media Stream after REINVITE is done

—————————————–

Similarly, as described in Section 6, as soon as it sends
its answer, the answerer MUST begin sending media using any formats
in the offer that were also present in the answer
Similarly, when the offerer receives the
answer, it MUST begin sending media using any formats in the answer

[Modifying the Session] Rules of ceasing use of an old media format for Agents

—————————————–

When an agent ceases using a media format (by not listing that format
in an offer or answer, even though it was in a previous SDP) the
agent will still need to be prepared to receive media with that
format for a brief time.

[Modifying the Session] Hold on a call

—————————————–

Hold on a call means request the other participant(s) stop sending streams to it.

   If the stream to be placed on hold was previously a sendrecv media
stream, it is placed on hold by marking it as sendonly.  If the
stream to be placed on hold was previously a recvonly media stream,
it is placed on hold by marking it inactive.

Certain third party call control scenarios do not work when an
      answerer responds to held SDP with held SDP.

[Modifying the Session] Hold on a call 2

—————————————–
Does it mean the sending of the Hold on requester will continue? The answer is no.

   Typically, when a user “presses” hold, the agent will generate an
offer with all streams in the SDP indicating a direction of sendonly,
and it will also locally mute, so that no media is sent to the far
end, and no media is played out.