ICE fails - thoughts

Hi guys,
today I ran into a strange issue with a friend.

We can both see each other, but every 15 seconds (precisely 15 seconds) he loses me.
I added a routine in script.js that resubscribes on ICE failure, so I am immediately subscribed again.
This goes on and on.
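
For reference, a minimal sketch of what such a resubscribe routine could look like. It is not the actual script.js code: `watchIceFailures`, `resubscribe` and the injectable `now` clock are hypothetical names, and `pc` is assumed to expose the standard `iceConnectionState` and `oniceconnectionstatechange` members of RTCPeerConnection. It also records how long each connection lasted, which makes a regular 15-second pattern easy to confirm.

```javascript
// Sketch only: `pc` is any object with the standard RTCPeerConnection members
// iceConnectionState and oniceconnectionstatechange.
// `resubscribe` is a hypothetical callback that subscribes to the stream again.
function watchIceFailures(pc, resubscribe, now = Date.now) {
  let connectedAt = null;   // timestamp of the last successful connection
  const uptimes = [];       // how long each connection lasted before failing (ms)

  pc.oniceconnectionstatechange = () => {
    const state = pc.iceConnectionState;
    if (state === 'connected' || state === 'completed') {
      if (connectedAt === null) connectedAt = now();
    } else if (state === 'failed') {
      const uptimeMs = connectedAt === null ? null : now() - connectedAt;
      uptimes.push(uptimeMs);
      connectedAt = null;
      resubscribe(uptimeMs); // try again right away
    }
  };

  return uptimes;
}
```

If the recorded uptimes are all close to 15000 ms, that points to a timer somewhere on the path (for example a firewall rule) rather than to random packet loss.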

Strange things:

  • In p2p mode, this doesn’t happen
  • No errors or anything else in the Licode logs
  • Another PC on the same connection in his office -> same result
  • The connection is blazing fast and stable while connected
  • My subscription to his video never failed
  • At some point during a test, he stopped losing me and the connection became stable

Hints:

  • I am using the Twilio service for TURN/STUN servers
  • It never happened before; it only happens with his office connection
  • He uses Skype and Zoom regularly
  • I tried both nICEr and libnice -> same result

It looks like an inbound firewall issue.
What I can’t explain is why this doesn’t happen in p2p.
Am I configuring something wrong?
I have:
config.erizo.minport = 0; // default value: 0
config.erizo.maxport = 0; // default value: 0

Is this OK, or is there something I can tune?
Maybe an IPv6 issue?
Do you need anything specific to debug this? I took some screenshots of chrome://webrtc-internals.
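
For comparison, a sketch of what a restricted port range might look like. The assumption here is that these two settings bound the UDP ports Erizo uses for media and that 0 means no restriction; the values 30000 and 30050 are purely illustrative. The first line is a stub so the fragment runs on its own, since the real config file already defines `config`.

```javascript
// Stub for illustration only; the Licode config file already provides `config`.
const config = { erizo: {} };

// Hypothetical range: pick one that matches the ports opened on the server
// (e.g. the AWS security group) and allowed by the customer firewall.
config.erizo.minport = 30000; // assumed lower bound of the UDP range
config.erizo.maxport = 30050; // assumed upper bound of the UDP range
```

A fixed range does not fix a firewall by itself, but it gives the network admin a concrete set of UDP ports to allow.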

This is so weird

Is nICEr enabled in your configuration?

I tried with and without nICEr: same result. I use nICEr by default, though.

What do you think about ICE restart? I think it would be very helpful in this case. What I noticed is that the connection goes to disconnected very quickly, and the ICE failure comes about 10 seconds later. According to a nice article by Philipp, ICE restart is meant for this kind of situation. But I still don’t understand why this happens.
Do you think it would be difficult to implement this server side?

I don’t think ICE Restart should be the solution for that since you’d have micro cuts from time to time anyway. I don’t think we’ll implement it in the short/medium term, we’re finding more issues with the websockets for instance, since we’ve added nICEr to Erizo, so we’ll first try to reconnect websockets automatically. But any contribution would be more than welcome! :wink:

@Javier @pedro what would you do in this case?
I put a computer in my customer’s environment and I’m running some tests.
When it fails, Chrome usually gathers only host and srflx candidates.
Any connection that does not start on a relay candidate connects and then fails after 15 seconds.
Only sometimes do I also get relay candidates, and then it works with no problem.
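
To make that check repeatable, here is a small sketch that counts candidate types from the candidate lines Chrome produces. The helper names (`candidateType`, `summarizeCandidates`) are made up for this example; the ` typ host|srflx|prflx|relay` token is part of the standard candidate attribute syntax.

```javascript
// Extract the candidate type from one ICE candidate line, or null if absent.
function candidateType(candidateLine) {
  const match = / typ (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Count how many candidates of each type were gathered.
function summarizeCandidates(candidateLines) {
  const counts = { host: 0, srflx: 0, prflx: 0, relay: 0 };
  for (const line of candidateLines) {
    const type = candidateType(line);
    if (type !== null) counts[type] += 1;
  }
  return counts;
}

// Example input (addresses are documentation placeholders, not real ones).
const sampleLines = [
  'candidate:1 1 udp 2122260223 192.168.1.10 50000 typ host generation 0',
  'candidate:2 1 udp 1686052607 203.0.113.7 50000 typ srflx raddr 192.168.1.10 rport 50000 generation 0',
];
const sampleSummary = summarizeCandidates(sampleLines);
// sampleSummary.relay === 0 here, which matches the failing case described above
```

In the browser the lines can be collected with `pc.onicecandidate = (e) => { if (e.candidate) lines.push(e.candidate.candidate); }`.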

I can give you access to this machine if you need/want to do some tests.

I also tried to experiment with ICE restart, but client side I was only able to fire a createOffer with iceRestart: true.
Server side, I updated the ufrag and password and then set the newly discovered ICE candidates from Chrome, but I wasn’t able to produce an answer.
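
One way to check that the offer really requests a restart is to compare the ICE credentials in the SDP before and after: an ICE restart is signalled by new `a=ice-ufrag` and `a=ice-pwd` values. The sketch below does only that comparison; `iceCredentials` and `isIceRestart` are hypothetical helper names, and it does not cover the server-side answer, which is the part that was not working.

```javascript
// Read the first ice-ufrag / ice-pwd pair from an SDP string, or null if missing.
function iceCredentials(sdp) {
  const ufrag = /^a=ice-ufrag:(\S+)/m.exec(sdp);
  const pwd = /^a=ice-pwd:(\S+)/m.exec(sdp);
  if (!ufrag || !pwd) return null;
  return { ufrag: ufrag[1], pwd: pwd[1] };
}

// True when both SDPs carry credentials and both values have changed.
function isIceRestart(previousSdp, nextSdp) {
  const before = iceCredentials(previousSdp);
  const after = iceCredentials(nextSdp);
  if (!before || !after) return false;
  return before.ufrag !== after.ufrag && before.pwd !== after.pwd;
}
```

Client side, `previousSdp` would be the current local description and `nextSdp` the result of `createOffer({ iceRestart: true })`.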

Here’s a demonstration: as you can see, the state goes to disconnected after a few seconds and then fails.

Youtube

here’s my mail if you need access: francesco.durighetto@bandyer.com

Take a look at this: https://developer.mozilla.org/en-US/docs/Web/API/RTCConfiguration#RTCIceTransportPolicy_enum

Apparently you could set iceTransportPolicy: "relay" in the PeerConnection properties.

We’d love to test it, but haven’t got the time yet. It might help you by using only TURN servers in those cases. Please, try it and let us know if it works!
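
A minimal sketch of what that could look like with the standard RTCPeerConnection configuration. How the option gets passed through Licode’s client is not covered here; `buildRtcConfig` is a made-up helper and the TURN URL and credentials are placeholders, not real Twilio values.

```javascript
// Build an RTCPeerConnection configuration; forceRelay restricts ICE to TURN
// (relay) candidates, otherwise all candidate types are allowed.
function buildRtcConfig(iceServers, forceRelay) {
  return {
    iceServers,
    iceTransportPolicy: forceRelay ? 'relay' : 'all',
  };
}

// Placeholder TURN server entry (hypothetical host and credentials).
const relayOnlyConfig = buildRtcConfig(
  [{ urls: 'turn:turn.example.org:3478', username: 'user', credential: 'secret' }],
  true
);

// In a browser: const pc = new RTCPeerConnection(relayOnlyConfig);
```

With the relay policy the browser only uses relay candidates, so a working TURN server is required, and all media goes through it.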

Done this morning, and forcing relay does indeed work.
But the strange thing is that p2p works without a relay: I can make a direct connection from me to the remote peer.
This doesn’t make sense to me. It is probably a server-side port issue, but currently all ports are set to open.
I’m on an AWS instance.
Any other thoughts?

this is a pcap of the connected → failed subscribe stream: Google drive

Seems like there’s a local (not AWS, since 10.128.128.128 is a private IP) firewall blocking the access: “Code: 13 (Communication administratively filtered)”.

We don’t know the rules of that firewall. P2p would work if both clients are in the same network and don’t pass through that firewall, or if the firewall blocks only some addresses, for example because Erizo’s IP is on a blacklist. You could also test direct UDP connectivity using tools like nc.

Hi! We have a similar problem with one of our customers.
We also tried forcing relay and it helped, but we are hoping to find a better solution…
@Francesco_Durighetto did you fix or work around this problem?