strawman text to show how unverified media would work #1026

fluffy · 2017-02-12T15:50:10Z

Fixes #849

fluffy · 2017-02-12T15:52:05Z

More work is needed on this, but I want to check with people that the basic content is correct before we bother to cross the t's and dot the i's.

stefhak · 2017-02-12T18:06:28Z

Thanks Cullen. Let's see what people think. Should we reach out to Ekr for an opinion on whether this is OK from a security perspective?

rshpount · 2017-02-12T18:36:37Z

I do not think a 5 sec limit is needed here. I think all we need to say is that media must be terminated if an SDP with a fingerprint is received and this fingerprint does not match the certificate used in the established DTLS session.
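
A minimal sketch of the mismatch rule proposed in this comment, assuming the fingerprint uses SHA-256 and the peer certificate is available as DER bytes. The function names are hypothetical and do not come from the PR text or any WebRTC API.

```python
import hashlib
import hmac
from typing import Tuple


def certificate_fingerprint(cert_der: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER certificate as colon-separated upper-case hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def parse_fingerprint_attribute(line: str) -> Tuple[str, str]:
    """Split an SDP line such as 'a=fingerprint:sha-256 AB:CD:...' into (hash function, value)."""
    prefix = "a=fingerprint:"
    if not line.startswith(prefix):
        raise ValueError("not a fingerprint attribute")
    hash_func, value = line[len(prefix):].split(None, 1)
    return hash_func.lower(), value.strip().upper()


def must_terminate_media(sdp_fingerprint_line: str, peer_cert_der: bytes) -> bool:
    """True when the SDP fingerprint does not match the certificate used in the DTLS session."""
    hash_func, expected = parse_fingerprint_attribute(sdp_fingerprint_line)
    if hash_func != "sha-256":
        # Assumption: other hash functions are out of scope here and count as a mismatch.
        return True
    actual = certificate_fingerprint(peer_cert_der)
    return not hmac.compare_digest(expected, actual)
```

Under this rule nothing happens until an SDP with a fingerprint actually arrives, which is the point the timeout discussion below turns on.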

stefhak · 2017-02-14T10:30:07Z

@rshpount to not have a timeout seems very strange. Would not an app that wants to do something bad simply just skip applying the answer, and the unverified media could flow for a long time?

rshpount · 2017-02-14T17:33:11Z

@stefhak I understand your concern about the timeout, but then we need to answer two questions:

1. What happens if the answer arrives after 5 sec and it matches the certificate?
2. Why 5 seconds? Why not 32 seconds or some other number? I can certainly see how an answer can be delayed for more than 5 seconds, especially when signaling servers are recovering after fail-over.

stefhak · 2017-02-14T19:10:36Z

@rshpount as I've understood it the use case is when the media set-up outraces the SDP answer propagation back to the offerer when the answerer has a legacy device (that can't send a pranswer). That to me says the timeout should be something more like 1 second (in what normal scenarios would the answer take more than 1 second more than the media?). The use case is not about handling signaling servers that go down.

A fundamental question to me is if

   NOT assume that the data transmitted over the TLS connection is valid
   until it has received a matching fingerprint in an SDP answer.

from RFC 4572 allows playing the media at all before the answer is received.

(BTW the PR looks fine to me, except for perhaps the timeout should be more 1 sec than 5 IMO, given this is allowed by 4572)

rshpount · 2017-02-14T19:24:17Z

I do not think RFC 4572 allows media playback before the SDP answer. Ideal behavior is to process and decode media, but not to pass it for playback, so that media can be played as soon as the answer is received without waiting for a new I-frame. From what I understand this is the proposed behavior when the allowUnverifiedMedia flag is not set.

This being said, if you set the allowUnverifiedMedia flag and do allow media playback before the answer is received, then you need to take into account that signaling normally goes over some sort of centralized infrastructure while media goes directly between two end points. Signaling often traverses multiple servers, which are non-local and can become temporarily overloaded or fail over. Things which result in user-detectable issues (and starting playback, then stopping, then starting again is such an issue) should be avoided. In the services that we operate, in case of fail-over, signaling can be delayed up to about 16 sec. The 5 sec delay will not be sufficient for me, since it will require much more frequent heartbeat monitoring between signaling servers. Even in the case of two end points on the same network communicating with a remote signaling server, a simple internet connection disruption can cause a TCP packet delay higher than 5 sec, which will in turn delay the answer.

stefhak · 2017-02-15T09:29:56Z

@rshpount what makes me feel uneasy is adding an API that allows skipping one part of the security solution we've agreed on without much discussion (and no input from security experts like @ekr so far). This may be totally fine, but I am personally not able to judge.

fluffy · 2017-02-15T12:52:03Z

I will ping EKR - be good to have his input.

The 5 seconds was very arbitrary. I was trying to pick a number that was high enough that a reasonable signaling system could deliver the answer in that time. I don't really think a timer is needed at all but I don't mind adding it. The reason I don't think a timer is needed is that the app can implement whatever timer it wants in the app. Keep in mind that if the app is "bad" it can do whatever it wants with this media.

The key thing about data received before a fingerprint is not that "it's not valid". It is valid data; it's that you don't know who it is from. Some applications clearly won't want to play media when they don't know who it is from; some are fine with that, and they will not display who the media is from until they get the fingerprint. Keep in mind the default behavior would be to not use this data; we are just adding the option for applications that want to play it. One of the reasons we need this is for the 1-800-gofedex case to work, which is in our original requirements.
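
As a rough illustration of "the app can implement whatever timer it wants", here is a sketch of an application-side limit on unverified playback. The class name and methods are hypothetical, the clock is passed in explicitly so the logic can be tested, and the limit is whatever the application chooses (5 and 16 seconds are just the figures mentioned in this thread).

```python
from typing import Optional


class UnverifiedMediaTimer:
    """Tracks how long media has been played without a verified fingerprint."""

    def __init__(self, limit_seconds: float) -> None:
        self.limit_seconds = limit_seconds
        self.first_media_at: Optional[float] = None
        self.verified = False

    def on_media(self, now: float) -> None:
        # Remember when the first unverified media arrived.
        if self.first_media_at is None:
            self.first_media_at = now

    def on_fingerprint_verified(self) -> None:
        self.verified = True

    def may_play(self, now: float) -> bool:
        if self.verified:
            return True
        if self.first_media_at is None:
            return False  # nothing has arrived yet, so there is nothing to play
        return (now - self.first_media_at) <= self.limit_seconds
```

An application would call `may_play` before rendering each chunk and mute or stop playback once it returns `False`.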

stefhak · 2017-02-15T13:58:25Z

Thanks @fluffy. Just to be sure: is there no way that this can be exploited e.g. by some MITM for the period up to when the fingerprint is received?

martinthomson · 2017-02-15T17:13:37Z

I was initially concerned that this could be incompatible with media isolation and could not be implemented in Firefox without also removing or significantly altering the semantics of identity. However, I think that it's OK from that angle - the requirement here is that the DTLS handshake is complete, which should make that available.

That said, this definitely could be exploited by an attacker. This would end-run any protections we might gain through something like draft-thomson-avtcore-sdp-uks. FWIW, that's a pretty lame attack, so you might reasonably conclude that the risks are acceptable.

aboba · 2017-02-15T23:02:35Z

@fluffy Some clarifying questions:

"The RTCRtpReceiver MAY discard this media or MAY buffer this media so that video key frames are not lost. Once the SDP fingerprint is received, and the DTLS connection verified, any buffered media and media received after is made available to the application."

[BA] How does the application developer know which of the above approaches an implementation has taken (e.g. whether it discards unverified media or buffers it)?

"If the allowUnverifiedMedia attribute on the RTCRtpReceiver has been set to true, then up to 5 seconds worth of media is made available to the applications even if that media is received before the SDP fingerprint."

[BA] Is the above trying to say that unverified media is buffered for up to 5 seconds and then made available once verification is completed? Or is it trying to say that up to 5 seconds of unverified media is made available as it comes in?

s/applications/application/
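
The two readings of the "up to 5 seconds" sentence asked about above can be contrasted with a small sketch. Everything here is an assumption for illustration: frames are `(arrival time, label)` pairs sorted by arrival, reading A keeps the most recent window of unverified media (the text does not say which part of the buffer would be kept), and reading B counts the window from the first unverified frame.

```python
from typing import List, Optional, Tuple

Frame = Tuple[float, str]     # (arrival time in seconds, payload label)
Delivery = Tuple[float, str]  # (time handed to the application, payload label)


def buffer_then_release(frames: List[Frame], verified_at: float,
                        window: float = 5.0) -> List[Delivery]:
    """Reading A: hold unverified frames and release the last `window` seconds at verification."""
    out: List[Delivery] = []
    for arrival, payload in frames:
        if arrival >= verified_at:
            out.append((arrival, payload))      # verified: deliver immediately
        elif arrival >= verified_at - window:
            out.append((verified_at, payload))  # buffered: delivered once verified
        # older unverified frames fall out of the buffer
    return out


def pass_through_with_cap(frames: List[Frame], verified_at: float,
                          window: float = 5.0) -> List[Delivery]:
    """Reading B: deliver unverified frames as they arrive, for at most `window` seconds."""
    out: List[Delivery] = []
    first_unverified: Optional[float] = None
    for arrival, payload in frames:
        if arrival >= verified_at:
            out.append((arrival, payload))
            continue
        if first_unverified is None:
            first_unverified = arrival
        if arrival - first_unverified <= window:
            out.append((arrival, payload))      # played while still unverified
    return out
```

With frames at 0, 3, 7 and 9 seconds and verification at 8 seconds, reading A delivers nothing before 8 seconds, while reading B plays the first two frames unverified and drops the third.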

pthatcherg · 2017-03-30T19:12:05Z

As I pointed out in #849 just now, I think it's actually impossible for this to work with ICE+DTLS. Here's my reasoning, copied from #849:

1. You can receive DTLS from the remote side before receiving the remote description (and thus fingerprint). This happens if the remote side sends an ICE connectivity check and the local side sends a response and then the remote side sends a DTLS packet.
2. You cannot send DTLS from the local side before receiving the remote description (and thus fingerprint). This is because you can't send an ICE connectivity check until you have the remote ICE ufrag and pwd, and thus can't get an ICE connectivity check response, and thus can't send DTLS. This is because you can't send anything other than ICE until you get an ICE connectivity check response.
3. Since you can't send DTLS, you can't complete the handshake, and thus can't extract the SRTP key.
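
The chain of dependencies in this argument can be written down as a toy state model. This is only a sketch of the reasoning, not of any real ICE or DTLS stack; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalEndpoint:
    """Models what a full ICE endpoint may send before and after the remote description."""
    remote_ufrag: Optional[str] = None
    remote_pwd: Optional[str] = None
    check_response_received: bool = False

    def apply_remote_description(self, ufrag: str, pwd: str) -> None:
        self.remote_ufrag, self.remote_pwd = ufrag, pwd

    def can_send_connectivity_check(self) -> bool:
        # A check needs the remote ICE ufrag and pwd.
        return self.remote_ufrag is not None and self.remote_pwd is not None

    def on_connectivity_check_response(self) -> None:
        if not self.can_send_connectivity_check():
            raise RuntimeError("no check could have been sent, so no response can arrive")
        self.check_response_received = True

    def can_receive_dtls(self) -> bool:
        # Inbound DTLS only depends on the remote side's own check succeeding.
        return True

    def can_send_dtls(self) -> bool:
        # Nothing other than ICE may be sent until a check response arrives.
        return self.check_response_received

    def can_complete_handshake(self) -> bool:
        # The handshake needs flights in both directions.
        return self.can_send_dtls()
```

In this model `can_complete_handshake()` stays `False` until `apply_remote_description` has been called, which is the claim that no SRTP key can exist before the fingerprint arrives.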

adamroach · 2017-04-03T16:07:03Z

On their face, @pthatcherg's assertions appear to be true. Can someone who thinks this can happen draw a ladder diagram demonstrating how the situation under discussion arises? @fluffy?

adamroach · 2017-04-03T16:21:42Z

The MMUSIC discussion seems, at the moment, to conclude that the only way the situation under discussion can arise is:

1. When using ORTC rather than WebRTC -- which clearly requires no text in the WebRTC document, or
2. When an ICE lite endpoint is in use, the ICE lite endpoint itself (but not the full ICE endpoint it is talking to) can get early media. Since browsers cannot be ICE lite endpoints, this situation also requires no text in the WebRTC document.

The only thing that I've seen mentioned is some interaction between PRANSWER and trickle ICE in which a fingerprint has been received by the offerer, but that fingerprint is incorrect. Leaving aside for a moment that this sounds like it goes beyond "unverified media" and clear into "indistinguishable from forged media", it's still not clear how this can happen.

Minimally, I think the working group needs to understand the circumstances leading to such a situation (hence my request for a ladder diagram); and, ideally, such a situation should be clearly described in the text around unverified media. It does no good to give webdevs an affordance to control the behavior in this situation if they don't have any way to understand what the situation actually is. And if it's not obvious to us, then they have no hope whatsoever.

rshpount · 2017-04-03T17:07:01Z

I see 3 cases here:

1. Two full ICE end points communicating. This issue is impossible, since the answering end point will not send ServerHello until it runs the connectivity check, which cannot complete before the answer SDP with ICE ufrag and fingerprints is received.
2. An ICE-lite end point sends an offer to a full ICE WebRTC end point. The WebRTC end point runs the ICE connectivity check and sends ClientHello. The ICE-lite end point can receive ClientHello, send back ServerHello, and establish the DTLS association before the answer is received from the WebRTC end point. This means the ICE-lite (non-WebRTC) end point can receive data before it receives the answer SDP and fingerprints. I think the right solution here is for the ICE-lite end point to cache ClientHello but not send ServerHello until the answer is received. This will prevent the DTLS association from being established before fingerprints are received.
3. The infamous 1-800 FedEx problem. This has nothing to do with the question, but does create a real problem. Imagine an SBC which is a full ICE/DTLS end point on one side, communicating with WebRTC end points, and SIP/AVP (no DTLS or ICE) on the other side. Imagine, then, that a WebRTC end point sends an offer to the SBC. The SBC strips ICE and DTLS information from that offer and sends it to SIP. The SIP end point immediately starts sending data to the SBC before even sending the answer. The SBC has not received the answer from the SIP end point and has not sent the answer to the WebRTC end point. There is no ICE session or DTLS association established between the WebRTC end point and the SBC, but the SBC is receiving data. The question is what to do with this data. Right now the only options are: (a) discard it, or (b) establish a media session between the SBC and the WebRTC end point before sending the offer to SIP, and run transcoding during the early media session. Once the session is fully established, do a 3pcc session update to remove transcoding. I believe problem 3 cannot be fixed until ortc/webrtc 2.0 is ready.

In none of those cases do I see how media can flow to the WebRTC end point before fingerprints are available.
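
The suggestion in case 2 (cache ClientHello, send ServerHello only after the answer) could look roughly like the sketch below. It is a toy model with hypothetical names; a real DTLS stack would also have to deal with retransmissions and handshake timers, which are not modeled here.

```python
from typing import List, Optional


class DeferredHandshake:
    """Holds an early ClientHello until the SDP answer (and its fingerprint) has arrived."""

    def __init__(self) -> None:
        self.answer_received = False
        self.cached_client_hello: Optional[bytes] = None
        self.sent: List[str] = []  # record of what this endpoint has sent

    def on_client_hello(self, packet: bytes) -> None:
        if self.answer_received:
            self._send_server_hello(packet)
        else:
            # Keep only the latest copy; the peer retransmits while it waits.
            self.cached_client_hello = packet

    def on_answer(self) -> None:
        self.answer_received = True
        if self.cached_client_hello is not None:
            packet, self.cached_client_hello = self.cached_client_hello, None
            self._send_server_hello(packet)

    def _send_server_hello(self, client_hello: bytes) -> None:
        # Stand-in for handing the ClientHello to the DTLS stack.
        self.sent.append("ServerHello")
```

Because no ServerHello leaves the endpoint before `on_answer`, the DTLS association cannot be established ahead of the fingerprints.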


adamroach · 2017-04-03T17:14:19Z

The executive summary of @rshpount's comment as I read it is "this can't happen to a browser." Any counterpoints? @fluffy?

fluffy · 2017-04-04T14:51:37Z

Imagine a browser sends an offer to an SBC-like thing. The SBC sends a PR answer that sets up just a data connection. At this point ICE comes up and a TLS session comes up. Now the SBC forwards the offer as a SIP invite to a PSTN GW calling 1-800-go-fedex, which sets up a second TLS connection, but the media packets for this are relayed via the SBC. So the first PR answer set up the ICE. The second TLS connection is happening within that TLS context. The PSTN GW completes the ICE handshake, and then starts sending the one-way media with IVR prompts from fedex as ringback tone. At the point that it needs to go two-way, the PSTN GW sends a 200 with an answer, which the SBC translates into an answer with the fingerprint of the PSTN GW to send to the browser. If the browser discards this media, it loses the initial prompt. Note that the browser gets the fingerprint and knows who it is talking to before it needs to send any media or DTMF. This case would likely work if you buffered all the media and only played it once the fingerprint arrived, because the IVR would just wait at the prompt till the person had listened to the buffered media and pressed 1. But the browser would need to speed up the playback or it could never catch up.
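
The catch-up remark at the end of this comment is simple arithmetic: if playback starts D seconds behind live and runs at rate r > 1 while the far end keeps sending in real time, the backlog shrinks by (r - 1) seconds per second, so it takes D / (r - 1) seconds to drain. A small sketch, with a hypothetical function name:

```python
def catch_up_seconds(backlog_seconds: float, playback_rate: float) -> float:
    """Wall-clock time needed to drain a playback backlog at a given speed-up factor.

    Assumes the sender keeps producing media in real time while we play faster.
    """
    if backlog_seconds < 0:
        raise ValueError("backlog cannot be negative")
    if playback_rate <= 1.0:
        raise ValueError("playback at or below real time never catches up")
    return backlog_seconds / (playback_rate - 1.0)
```

For example, a 4 second backlog played at 1.25x takes 16 seconds to drain.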

rshpount · 2017-04-04T17:43:47Z

Cullen, can you please explain what "second TLS connection is happening within that TLS context" means? As far as I know, to establish a second or any new DTLS association you need to do an ICE restart. You cannot have two DTLS associations over the same underlying transport (5-tuple), since you cannot de-mux DTLS packets. If you do the ICE restart and get consent for a new 5-tuple, you need a complete offer/answer exchange before you can send data in both directions, which means fingerprints for both end points are available before the DTLS association is established. I do think your problem is real, but the early media gets stuck on the SBC with no way to send it to the WebRTC end point, since the DTLS session is not running yet. So, this is an SBC problem, not a WebRTC end point problem.


pthatcherg · 2017-04-04T20:09:15Z

> Imagine a browser sends offer to a SBC like thing. The SBC sends PR answer that sets up just a data connection. At this point ICE comes up and a TLS session comes up. Now the SBC forwards the offer as a SIP invite to a PSTN GW calling 1-800-go-fedex which sets up a second TLS connection but that media packets for this are relayed via the SBC. So the first PR answer set up the ICE. The second TLS connection is happening within that TLS context.

It's not clear to me what DTLS handshakes are taking place and what fingerprints the browser sees. Perhaps it would help if you were more specific about what remote descriptions come into PeerConnection and what DTLS fingerprints they have, along with when the DTLS handshakes take place.


taylor-b · 2017-04-05T00:35:57Z

I think I understand Cullen's use case. I just didn't consider there being a second DTLS handshake in the picture. Let me break down the important steps that occur here (if I understand correctly):

1. Offer and provisional answer exchanged between browser and gateway.
2. DTLS handshake completes using the fingerprint in the provisional answer.
3. Before receiving a final answer, the WebRTC endpoint receives a Client Hello on a new candidate pair with a new ufrag (addressing @rshpount's point about needing a new 5-tuple).
4. ~~The WebRTC endpoint completes this second handshake, maintaining the two DTLS associations in parallel?~~
5. ~~At this point, early media can be received on the second DTLS association.~~
6. ~~Eventually the answer is received, the first DTLS association is discarded and the fingerprint of the second one can be verified.~~

However, this goes into extreme pranswer edge case territory, which JSEP doesn't currently define. This is what a really robust implementation might do, but I don't see anything preventing it from just ignoring the second Client Hello until it gets the final answer and discards the first DTLS association.

A very related issue is rtcweb-wg/jsep#600 (scroll down to find stuff about maintaining N DTLS associations in parallel...). In the PR I wrote to address this issue, I didn't end up adding any requirements related to early media, since @juberti said "implementations should be allowed to only handle one remote username at a time", which seemed reasonable to me.

EDIT: Nevermind. At step 4, you still can't complete the second DTLS handshake, because you don't have the second remote ICE password yet, so you can't get ICE connected. Which is back to the original problem. The best an implementation could do is cache the second Client Hello.

rshpount · 2017-04-05T01:43:48Z

At step 3, the WebRTC end point should not send Server Hello on this candidate pair until it sends its own consent check on this pair. The WebRTC end point cannot send this check until the new remote ice-pwd is received in the final answer. Because of this, the second DTLS association is not established until the final answer is received. Am I missing something here?


taylor-b · 2017-04-05T03:18:57Z

Right, that's what I'm saying. See edit.

fluffy · 2017-04-05T04:19:14Z

Sorry - I messed up the original explanation because I had the data and audio and video muxed ... but to do it in the muxed case here ... take the steps Taylor has above, but with the SBC never setting up any DTLS session. It only does the ICE and never initiates the TLS. Later the PSTN GW initiates the DTLS. So in step 2, only ICE would be completed, not DTLS.

I have a vague memory there was some flow that used a secondary PR answer with a=dtls-connection:new too.

taylor-b · 2017-04-05T04:41:29Z

Ok; so what fingerprint does the SBC put in the PR answer? If the answer is "a bogus fingerprint", then the WebRTC implementation will fail the DTLS handshake with "bad certificate". The alternative would be that the SBC creates its own certificate, but shares it with the PSTN GW?

rshpount · 2017-04-05T15:32:36Z

First of all, there is no longer a=dtls-connection:new. This got edited out of dtls-sdp draft.

Second, SBC cannot just establish ICE without starting DTLS association with WebRTC end point. Any session description sent to WebRTC end point must have both ICE and DTLS attributes. This means DTLS association is always started after step 2. It is possible to start new DTLS association in the final answer from the same original offer, but this will also require a new ICE session with new ufrag and candidates allocated by SBC. Until ICE password used for these new candidates is delivered to WebRTC end point, WebRTC end point cannot send the consent check to SBC, which prevents ServerHello from being sent to SBC and DTLS association from being established.

All of this being said, there is a problem on the SBC side. There are plenty of scenarios where it will receive media from a SIP gateway with no way to complete the ICE session and DTLS association setup with the WebRTC end point. The only way early media works right now is if the SBC sits on the media path, doing transcoding if necessary, until the final answer is received. To make the SBC work with early media without transcoding, you need functionality to set up ICE/DTLS before doing codec negotiation, which is not coming until 2.0.

fluffy · 2017-04-26T15:43:59Z

There does not need to be a dtls-connection:new; it just comes in an answer that in SIP terms would have a different ufrag. In WebRTC terms it would just be a different answer that arrived after the first PR answer. To answer Taylor's question, it can put in a fingerprint the SBC can terminate, or it can put in a bogus fingerprint but make sure to never negotiate DTLS. It does not need to share a fingerprint with the SBC.

rshpount · 2017-04-26T16:06:35Z

Once again, dtls-connection:new no longer exists at all. It was replaced by tls-id.

Second, from what everyone else sees, it is impossible to receive unverified media with full ICE end points and consent to send. If you think otherwise, please provide a complete scenario.

taylor-b · 2017-04-27T05:09:00Z

> it can put in a fingerprint the SBC can terminate or it can put in a bogus fingerprint but make sure to never negotiate DTLS. It does not need to share a fingerprint with the SBC.

So, it can use either a fingerprint for an association terminated by the SBC, or it can use a bogus fingerprint? I thought we already talked about why this wouldn't work. Maybe I'm still not understanding the network topology. Here's what I think you're describing; can you point out where I went wrong?

stefhak · 2017-05-02T05:53:25Z

Closing, see comment #849 (comment).

strawman text to show how unverfieid media would work

fde755b

aboba changed the title ~~strawman text to show how unverfieid media would work~~ strawman text to show how unverified media would work Feb 15, 2017

stefhak added the February 2017 interim topic label Feb 16, 2017

alvestrand added the Pending RTCWEB WG action label Mar 23, 2017

stefhak mentioned this pull request May 2, 2017

Specify an AllowUnverifiedMedia RTCConfiguration property #849

Closed

stefhak closed this May 2, 2017
