
I've been looking into WebRTC and used the "WebRTC samples", which are good in many ways. It is fairly easy to get something up and running, but I found several areas difficult.

* Debugging. One user's sound just doesn't work, while it works perfectly for me on different machines. I have no idea how to debug it.

* ICE. While it works, I had a hard time understanding, tracking, and debugging what was going on.

* Closing and restarting connections.

* Multiple clients in one room?

* Echo cancellation. This was frustrating for users.

* TURN. Is there a tool or way to know which clients need a TURN server, and which are actually using one?

I ended up guessing that turning it into a product would actually be fairly time-consuming.
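On the "which clients are using a TURN server?" question: for ad hoc inspection, Chrome has chrome://webrtc-internals and Firefox has about:webrtc. Programmatically, one way is to read the stats from RTCPeerConnection.getStats() and check whether the selected candidate pair involves a candidate of type "relay". The following is a minimal sketch, not a definitive implementation: StatsEntry and usesTurnRelay are hypothetical names, and the interface lists only the stats fields the sketch reads.

```typescript
// Subset of the fields found on entries of RTCPeerConnection.getStats().
// StatsEntry is a hypothetical local type so the sketch runs without DOM typings.
interface StatsEntry {
  id: string;
  type: string; // "transport" | "candidate-pair" | "local-candidate" | "remote-candidate" | ...
  selectedCandidatePairId?: string; // on "transport" entries
  state?: string; // on "candidate-pair" entries
  nominated?: boolean; // on "candidate-pair" entries
  localCandidateId?: string; // on "candidate-pair" entries
  remoteCandidateId?: string; // on "candidate-pair" entries
  candidateType?: string; // on candidate entries: host | srflx | prflx | relay
}

// Returns true if the active pair goes through a relay, false if it does not,
// and undefined if no active pair can be found (for example, not yet connected).
function usesTurnRelay(entries: StatsEntry[]): boolean | undefined {
  const byId = new Map<string, StatsEntry>();
  for (const entry of entries) byId.set(entry.id, entry);

  // Prefer the pair the transport reports as selected; otherwise fall back
  // to a nominated pair whose connectivity check succeeded.
  const selectedId = entries.find((e) => e.type === "transport")?.selectedCandidatePairId;
  const pair =
    (selectedId !== undefined ? byId.get(selectedId) : undefined) ??
    entries.find((e) => e.type === "candidate-pair" && e.state === "succeeded" && e.nominated === true);
  if (pair === undefined) return undefined;

  const local = pair.localCandidateId !== undefined ? byId.get(pair.localCandidateId) : undefined;
  const remote = pair.remoteCandidateId !== undefined ? byId.get(pair.remoteCandidateId) : undefined;
  return local?.candidateType === "relay" || remote?.candidateType === "relay";
}
```

In a browser you would collect the entries with something like `const entries: StatsEntry[] = []; (await pc.getStats()).forEach((s) => entries.push(s));` and pass them in. A result of true means media is being relayed, which in practice means that client could not connect directly.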



WebRTC doesn't do everything for you; it's really just responsible for tying together ICE with media streams. Signaling is up to you to figure out. For instance, multiple clients in one room: this is part of the signaling layer and is not WebRTC's responsibility (I built this into zonko.chat if you want to see how it works though).

Closing and restarting connections is signaling layer stuff, i.e., your responsibility.
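To make the "rooms are a signaling concern" point concrete, here is a minimal sketch of the bookkeeping a signaling server might keep. It is an illustration under my own assumptions, not how zonko.chat or any particular service does it: RoomRegistry, join, and leave are hypothetical names, and the transport (WebSocket or otherwise) that would carry offers, answers, and ICE candidates is left out entirely.

```typescript
type PeerId = string;

// In-memory room membership. The signaling server decides who should connect
// to whom; WebRTC itself knows nothing about rooms.
class RoomRegistry {
  private rooms = new Map<string, Set<PeerId>>();

  // Adds a peer and returns the peers already present. The newcomer would
  // then create one RTCPeerConnection (and send one offer) per returned peer.
  join(roomId: string, peerId: PeerId): PeerId[] {
    const members = this.rooms.get(roomId) ?? new Set<PeerId>();
    const existing = Array.from(members).filter((p) => p !== peerId);
    members.add(peerId);
    this.rooms.set(roomId, members);
    return existing;
  }

  // Removes a peer and returns the peers that should be told to close
  // their connection to it.
  leave(roomId: string, peerId: PeerId): PeerId[] {
    const members = this.rooms.get(roomId);
    if (members === undefined || !members.delete(peerId)) return [];
    if (members.size === 0) this.rooms.delete(roomId);
    return Array.from(members);
  }
}
```

With this shape, "restarting" a connection is just a leave followed by a join: the remaining peers tear down the old connection, and the rejoining peer offers to each of them again.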

Echo cancellation is really supposed to be application layer and up to you as well, but I think this will probably shift to be the browser's/WebRTC's/getUserMedia's responsibility at some point.

Re. TURN: ICE is the process that works out whether a specific client needs to relay through a TURN server. The question is: do you need to run a TURN server? The answer is: yes, you do. If you build a P2P app that you want to work for all users, you will always need a TURN server. You can run coturn on the same box that you serve your app from. Most likely a side project will never hit the scale requiring more than a $5 DigitalOcean box for TURN.
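On the client side, pointing ICE at your TURN server is a matter of the configuration object passed to `new RTCPeerConnection(...)`. A minimal sketch follows, assuming coturn is listening on its default port 3478 and also answering STUN requests there. The function name buildPeerConfig, the local type names, and the host and credentials in the usage note are all placeholders of mine.

```typescript
// Local stand-ins for the RTCIceServer / RTCConfiguration dictionaries, so the
// sketch compiles without DOM typings. Field names match the browser API.
interface IceServerEntry {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServerEntry[];
  iceTransportPolicy: "all" | "relay";
}

// Builds a configuration that lists one STUN and one TURN server on the same host.
// forceRelay = true restricts ICE to relay candidates only, which is a handy
// way to check that the TURN server actually works.
function buildPeerConfig(turnHost: string, username: string, credential: string, forceRelay = false): PeerConfig {
  return {
    iceServers: [
      { urls: `stun:${turnHost}:3478` },
      {
        urls: [`turn:${turnHost}:3478?transport=udp`, `turn:${turnHost}:3478?transport=tcp`],
        username,
        credential,
      },
    ],
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}
```

Usage would look like `new RTCPeerConnection(buildPeerConfig("turn.example.com", "user", "secret"))`. With the default policy "all", ICE still tries direct paths first and only falls back to the relay when it has to.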

And yes, it should not be a surprise that products are time-consuming to build :) WebRTC is plumbing; you were probably expecting something more like Jitsi.


FYI, echo cancellation actually does work (in Chrome, definitely); just make sure you specify an audio constraint for a 16 kHz sample rate (AEC does not work in the default 44/48 kHz modes).


Good suggestion, I will have to try that out!


> Echo cancellation is really supposed to be application layer and up to you as well, but I think this will probably shift to be the browser's/WebRTC's/getUserMedia's responsibility at some point.

Echo cancellation typically can't be application layer. The APIs I've seen (Android, iOS, WebRTC) require low latency and work best as close to the hardware as possible.

{ echoCancellation: true } as a track constraint in getUserMedia works.
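For reference, a minimal sketch of what that constraints object could look like. The helper names and local types are hypothetical; `echoCancellation` and `sampleRate` are standard track constraint names. The optional sample rate is there only so the 16 kHz suggestion from elsewhere in this thread can be tried, not because it is known to be required.

```typescript
// Local stand-ins for MediaTrackConstraints / MediaStreamConstraints so the
// sketch compiles without DOM typings. Field names match the browser API.
interface AudioTrackOptions {
  echoCancellation: boolean;
  sampleRate?: number;
}

interface CaptureConstraints {
  audio: AudioTrackOptions;
  video: boolean;
}

// Builds audio-only constraints that request echo cancellation and,
// optionally, a specific sample rate in Hz.
function echoCancelledAudio(sampleRateHz?: number): CaptureConstraints {
  const audio: AudioTrackOptions = { echoCancellation: true };
  if (sampleRateHz !== undefined) audio.sampleRate = sampleRateHz;
  return { audio, video: false };
}

// Constraints are requests, not guarantees. After capture, the track's
// getSettings() result reports what the browser actually applied.
function echoCancellationApplied(settings: { echoCancellation?: boolean }): boolean {
  return settings.echoCancellation === true;
}
```

In a browser: `const stream = await navigator.mediaDevices.getUserMedia(echoCancelledAudio(16000));` followed by `echoCancellationApplied(stream.getAudioTracks()[0].getSettings())` to see whether the request was honored.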


I've never actually gotten { echoCancellation: true } to work for me, but your sibling comment does have a suggestion I need to try out!

Echo cancellation is a pretty lightweight DSP/FIR task. Whether you do it close to the hardware (I suspect this is not actually the case with getUserMedia, though; it is still an audio stream algorithm) or in the application layer, echo cancellation requires the same amount of added latency.

But in any case, I did say I suspected echo cancellation would shift to getUserMedia. It's not fully there yet, but it will be.


It depends on what the product is. If you're trying to build another Zoom (which I gather from the "rooms" question), yes, it will take quite some time. For one thing, the mesh topology of P2P won't scale beyond a handful of users, so you'll need to make it client/server. And besides being time-consuming, that starts to get operationally expensive. Decoding, compositing, and encoding high-resolution video streams in real time takes some processing power.
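The scaling problem with a full mesh is simple arithmetic: every participant sends a stream to, and receives one from, every other participant. A small sketch of the numbers, where meshCost and compositedCost are hypothetical names and the server case assumes the compositing setup described above (one stream up, one mixed stream down per client):

```typescript
interface TopologyCost {
  totalConnections: number; // connections across the whole call
  uploadsPerClient: number; // streams each client encodes and sends
  downloadsPerClient: number; // streams each client receives and decodes
}

// Full mesh: every pair of participants has its own peer connection.
function meshCost(participants: number): TopologyCost {
  const others = Math.max(participants - 1, 0);
  return {
    totalConnections: (participants * others) / 2,
    uploadsPerClient: others,
    downloadsPerClient: others,
  };
}

// Client/server with server-side compositing: each client talks only to the server.
function compositedCost(participants: number): TopologyCost {
  const active = participants > 0 ? 1 : 0;
  return {
    totalConnections: participants,
    uploadsPerClient: active,
    downloadsPerClient: active,
  };
}
```

For 4 participants the mesh needs 6 connections with 3 uploads per client; for 10 it needs 45 connections with 9 uploads per client, which is where consumer uplinks and CPUs give out. The server topology keeps each client at one upload, at the price of the server-side processing mentioned above.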


If you want to try a platform that abstracts some parts of it (such as signaling) and aims to provide an all-in-one package (compared with WebRTC, which is a collection of puzzle pieces that you are responsible for putting together), have a look at OpenVidu.

The team behind Kurento is working on this (I am part of it) for people who don't really care about all the intricacies of the standard(s), and just want to build a product on top of it. A single Docker container to deploy, and you're all set to write your app.

Still, this is a complex topic, so there are a thousand ways this technology can be made easier to use and understand. And I agree with the other comments about debugging: there is a real gap in the market for a comprehensive solution that helps with troubleshooting when WebRTC fails.


The debugging bit is so frustrating. I spent almost a whole week trying to find a bug with my PeerConnections, only to find out that the TURN server was misconfigured (although Trickle ICE was successful). And even then, just setting up a TURN server consumed a whole day.
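One cheap check that can shorten this kind of hunt is to tally the candidate types ICE actually gathers. Each candidate string carries a `typ` field (host, srflx, prflx, or relay), and if no relay candidate ever shows up, the TURN server is not handing out allocations. A minimal sketch, with candidateTypeOf and countCandidateTypes as hypothetical names. It only catches the "no relay candidates at all" case, so it would not detect every misconfiguration.

```typescript
type IceCandidateKind = "host" | "srflx" | "prflx" | "relay";

// Extracts the "typ <kind>" field from an ICE candidate line, if present.
function candidateTypeOf(candidateLine: string): IceCandidateKind | undefined {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as IceCandidateKind) : undefined;
}

// Tallies gathered candidates by type. Lines without a recognizable
// "typ" field (such as the empty end-of-candidates string) are ignored.
function countCandidateTypes(candidateLines: string[]): Record<IceCandidateKind, number> {
  const counts: Record<IceCandidateKind, number> = { host: 0, srflx: 0, prflx: 0, relay: 0 };
  for (const line of candidateLines) {
    const kind = candidateTypeOf(line);
    if (kind !== undefined) counts[kind] += 1;
  }
  return counts;
}
```

In a browser, the lines come from the `icecandidate` event: push `event.candidate.candidate` into an array whenever `event.candidate` is non-null, then inspect the counts once gathering completes. A relay count of zero with a TURN server configured is a strong hint to look at the server or its credentials.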



