openRMC Blog

getUserMedia – What happens when there are missing media sources?

This is a cross post of a blog originally released on the website WebRTC Hacks. It was written by the openRMC contributors John McLaughlin, Eamonn Power and Miguel Ponce de Leon and is re-published here with permission. Enjoy!

What is the WebRTC API

WebRTC is an open framework to enable real time communications via the web. The primary usage scenario would be browser based, where a JavaScript API allows developers to implement their own RTC-enabled web app. Alternatively, various toolkits exist to facilitate native app development.

Standardisation is an ongoing process through the W3C WebRTC working group at the API level and the IETF RTCWeb working group at the protocol level.

As a very high-level overview, there are three main API components in the WebRTC framework:

  • getUserMedia – provides access to media input devices such as webcams and microphones
  • RTCPeerConnection – establishes a peer-to-peer media session between two endpoints, including any relaying necessary, allowing two users to communicate directly
  • RTCDataChannel – closely related to and dependent upon RTCPeerConnection, RTCDataChannel allows a web application to send and receive generic application data peer to peer

In this post we’ll be looking more closely at the getUserMedia() API, and how to deal with its outputs in order to give some meaningful feedback to the developer, and ultimately the end user.


WebRTC getUserMedia

The getUserMedia API is used to access media streams from media input devices such as webcams or microphones. The stream obtained can then either be used locally by passing it to an HTML <audio/> or <video/> tag, lending itself to many creative and fun applications such as photo booths, facial recognition, image processing etc. Additionally it can be attached to an RTCPeerConnection object and used to establish communications with another user. Figure 1 shows possible usage scenarios.

Figure 1. Example scenario (source W3C)

As an overview, getUserMedia() is called with up to three parameters:

  • mediaConstraints – this is an object specifying the type and optionally the quality of the media streams required. Full details can be found in the getUserMedia API but for the purposes of this post we’ll concentrate on the boolean attributes audio and video, which specify whether audio and/or video streams are required respectively.
  • successCallback (optional) – this is called with the MediaStream object encapsulating the media streams requested by the developer. This is illustrated in figure 2.
  • errorCallback (optional) – this is called with an error object, which should give an indication as to why the call failed
Figure 2. MediaStream object (source W3C)

Although the callbacks are optional in the spec, in practice at the very least the successCallback is required, as this is the only way to access the MediaStream object. As we shall see, providing the error callback is also beneficial.
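
To make the shape of the call concrete, here is a minimal sketch of the callback form described above. The element lookup and the way the stream is attached to the <video/> tag are illustrative only, and the vendor prefixes shown reflect the browsers of this era (adapter.js, which we use later on, smooths most of this over):

// Resolve the prefixed forms of getUserMedia in use at the time of writing
var getUserMedia = navigator.getUserMedia ||
                   navigator.webkitGetUserMedia ||
                   navigator.mozGetUserMedia;

getUserMedia.call(navigator,
    { audio: true, video: true },           // mediaConstraints
    function (stream) {                     // successCallback
        var video = document.querySelector('video');
        // Chrome of this era attaches the stream via createObjectURL();
        // Firefox used video.mozSrcObject instead
        video.src = window.URL.createObjectURL(stream);
        video.play();
    },
    function (err) {                        // errorCallback
        // err may be a NavigatorUserMediaError object or a plain string,
        // depending on the browser and version (see below)
        console.log('getUserMedia failed: ' + (err.name || err));
    }
);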


What happens when there’s a missing media source

So what happens if you call getUserMedia() and ask for a media stream but there’s no device present to support it? Shouldn’t the error callback get called with a nice informative error telling you what happened so you can pass it on to the end user? Well, yes to the first part, but the informative bit leaves a lot to be desired.

From the official perspective, at the time of writing, this is what the spec says should be passed into the error callback:

[NoInterfaceObject]
interface NavigatorUserMediaError : DOMError {
    readonly    attribute DOMString? constraintName;
};

This is subject to constant change, as can be seen by prior versions of the spec (getUserMedia is generally considered the most stable of the APIs!):

NavigatorUserMediaError 07/04/2013

[NoInterfaceObject]
interface NavigatorUserMediaError {
    readonly    attribute DOMString  name;
    readonly    attribute DOMString? message;
    readonly    attribute DOMString? constraintName;
};

NavigatorUserMediaError 12/12/2012

[NoInterfaceObject]
interface NavigatorUserMediaError {
    const unsigned short PERMISSION_DENIED = 1;
    readonly attribute unsigned short code;
};

So let’s look at what actually happens in various browsers when we remove or disable the audio and video input devices.

Chrome 28 error

err: {
        code: 1,
        PERMISSION_DENIED: 1
}

Chrome 29 error

err: {
        constraintName: "",
        message: "",
        name: "PERMISSION_DENIED"
}

To be fair to Chrome, the samples above do conform to the earlier specs, and would reflect what was current when that version of Chrome was in its Canary incarnation. In order to keep up to date on such changes it’s worth subscribing to the Discuss-WebRTC mailing list as posts like this give you a heads up on API changes.

Firefox 22 error

err: "HARDWARE_UNAVAILABLE"

Firefox 23 error

err: "NO_DEVICES_FOUND"

Informative in a sense, though what device(s) weren’t found? Nor does it bear the slightest resemblance to any version of the spec.


Making sense of it all

Fortunately, despite the lack of granularity in the error responses returned by getUserMedia(), with a little forensic work it is possible to derive a better picture of the WebRTC media capabilities of the host browser.

We make use of the adapter.js shim from the Google WebRTC reference app. We have made a modification to this, as the latest version at the time of writing, r4259, didn’t have Firefox support for the getAudioTracks() and getVideoTracks() methods of MediaStream (Firefox has only gained support for these methods as of version 23).

We have wrapped the native getUserMedia() call in an API openrmc.webrtc.api.getUserMedia(spec) where spec has the following attributes:

{
    audio : true|false, // Default true
    video : true|false, // Default false
    success : function(cxt) {}, // Success callback
    failure : function(cxt) {} // Failure callback
}

The success and failure callbacks both take a parameter of type openrmc.webrtc.Context, of which more later.
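
By way of illustration, a call through the wrapper then looks something like this (the handler bodies below are placeholders for whatever feedback the application wants to give):

openrmc.webrtc.api.getUserMedia({
    audio : true,
    video : true,
    success : function(cxt) {
        // cxt is an openrmc.webrtc.Context, described below
        console.log('getUserMedia succeeded');
    },
    failure : function(cxt) {
        console.log('getUserMedia failed: ' + cxt.getError());
    }
});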

Firstly what sort of errors are we interested in? Since we primarily want to establish the lack of presence of some media device, we came up with the following error set:

Error codes

openrmc.webrtc.Errors = {
    NOT_SUPPORTED : 'NOT_SUPPORTED',
    CONSTRAINTS_REQUIRED : 'CONSTRAINTS_REQUIRED',
    AUDIO_NOT_AVAILABLE : 'AUDIO_NOT_AVAILABLE',
    VIDEO_NOT_AVAILABLE : 'VIDEO_NOT_AVAILABLE',
    AV_NOT_AVAILABLE : 'AV_NOT_AVAILABLE'
} ;

The first two are straightforward. NOT_SUPPORTED is raised when getUserMedia() is called on a browser that doesn’t support WebRTC. CONSTRAINTS_REQUIRED is raised if getUserMedia() is called with both the audio and video attributes set to false; this would result in an error anyway, and isn’t allowed.

if(webrtcDetectedBrowser===null) {
    console.log("[RTCInit] : Browser does not appear to be WebRTC-capable");
    var cxt=openrmc.webrtc.Context({
        error : openrmc.webrtc.Errors.NOT_SUPPORTED
    });
    spec.failure && spec.failure(cxt) ;
    return ;
}
:
:
if(!(spec.audio || spec.video)) {
    spec.failure && spec.failure(openrmc.webrtc.Context({
        error : openrmc.webrtc.Errors.CONSTRAINTS_REQUIRED
    })) ;
    return ;
}

AUDIO_NOT_AVAILABLE, VIDEO_NOT_AVAILABLE, and AV_NOT_AVAILABLE are fairly self-explanatory and are raised if any stream of a certain type has been requested, but isn’t available. Which one is raised is determined by checking the returned error against the resources that were requested. A getError() helper method is defined based on the detected browser.

if(webrtcDetectedBrowser==='chrome') {
    ns.getError= function(err, spec) {
        // Take account of varying forms of this error across Chrome versions
        if(err.name==='PERMISSION_DENIED' || err.PERMISSION_DENIED===1 || err.code===1) {
            if(spec.audio && spec.video) {
                return openrmc.webrtc.Errors.AV_NOT_AVAILABLE ;
            }
            else {
                if(spec.audio && !spec.video) {
                    return openrmc.webrtc.Errors.AUDIO_NOT_AVAILABLE ;
                }
                else {
                    return openrmc.webrtc.Errors.VIDEO_NOT_AVAILABLE ;
                }
            }
        }
    } ;
}
else if(webrtcDetectedBrowser==='firefox') {
    ns.getError = function(err, spec) {
        if(err==='NO_DEVICES_FOUND' || err==='HARDWARE_UNAVAILABLE') {
            if(spec.audio && spec.video) {
                return openrmc.webrtc.Errors.AV_NOT_AVAILABLE ;
            }
            else {
                if(spec.audio && !spec.video) {
                    return openrmc.webrtc.Errors.AUDIO_NOT_AVAILABLE ;
                }
                else {
                    return openrmc.webrtc.Errors.VIDEO_NOT_AVAILABLE ;
                }
            }
        }
    } ;
}
else {
    ns.getError = function() {
        console.log('No error handler set for '  + webrtcDetectedBrowser) ;
        return openrmc.webrtc.Errors.NOT_SUPPORTED ;
    } ;
}

These are hard errors and are raised as a result of the native getUserMedia() failing, i.e. none of the requested media constraints could be satisfied. The native getUserMedia() will succeed if say audio and video are requested, but only audio is available. In this case, the returned openrmc.webrtc.Context can give more information to the application developer which can be used to inform the end user.

The openrmc.webrtc.Context object provides the following methods:

  • isrtcavailable() – Returns true if the native getUserMedia() call resulted in a valid MediaStream, false otherwise.
  • isrtcaudioavailable() – Returns true if at least one audio track is available, false otherwise.
  • isrtcvideoavailable() – Returns true if at least one video track is available, false otherwise.
  • getError() – Returns an error, if any

The presence of audio and video tracks is determined by the getAudioTracks() and getVideoTracks() methods on MediaStream – in both cases true is returned if and only if the MediaStream object is present, and the relevant getXXXTracks() method returns an array with one or more entries.

that.isrtcavailable = function(){
    return spec.localStream !== undefined ;
} ;

var aavail = (spec.localStream &&
        spec.localStream.getAudioTracks() &&
        spec.localStream.getAudioTracks().length>0) || false ;

that.isrtcaudioavailable = function() {
    return aavail ;
} ;

var vavail = (spec.localStream &&
        spec.localStream.getVideoTracks() &&
        spec.localStream.getVideoTracks().length>0) || false ;

that.isrtcvideoavailable = function() {
    return vavail ;
} ;
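
Putting it together, a success handler can use these methods to tell the end user exactly what is, and isn’t, available. The handler name and messages below are purely illustrative:

function onMediaReady(cxt) {
    if(!cxt.isrtcavailable()) {
        // No MediaStream at all - nothing to work with
        alert('No media streams are available') ;
        return ;
    }
    if(!cxt.isrtcaudioavailable()) {
        alert('No microphone was found - the other party will not hear you') ;
    }
    if(!cxt.isrtcvideoavailable()) {
        alert('No webcam was found - the other party will not see you') ;
    }
    // ... continue with attaching the stream / setting up the RTCPeerConnection
}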

Trying it

We have made a simple web application available where you can try it out for yourself at http://www.openrmc.org/files/webrtc/context/index.html. To try it, go to the page and click the "Call getUserMedia" button. Try various combinations with the webcam and/or mic disabled. An error or stream status will be reported when the call returns.

The source can also be viewed and downloaded from https://github.com/openRMC?source=c. It does need to be served by a web server of some sort – simply accessing the page by a file:// URL won’t work.

Note: While on most modern OSs it’s usually quite straightforward to disable a webcam if it’s not built in (just unplug it), it can be quite tricky to disable audio input. Typically some sort of virtual driver is present which will quite happily allow getUserMedia() to return successfully with no supporting hardware. Proper disabling can thus require a dance of device driver disabling and possibly reboots (stand up, Windows). If you’re going down this path to test, make sure you know how to re-enable said drivers again!

Summary

It’s important to give the end user an informative message when things go wrong while initialising a WebRTC session, and hopefully you can see that this approach, along with the code snippets, gives a developer the possibility to derive a better picture of the WebRTC media capabilities of the host browser.

The results here should be taken as a snapshot reflecting the state of play for getUserMedia() at the time of writing. As can be seen above, the API is still fluid, and could change several times before being set in stone. It is also possible that Microsoft and Apple will come to the party and add support of some sort to Internet Explorer and Safari, which will change the picture again.

The snippets presented in the post should therefore be taken as what they are – an illustration. The code on GitHub WebRTC Context and the sample app should be kept up to date though, and should therefore be taken as the reference.

The code presented here is also reasonably simplified, as it doesn’t take into account the more complex forms of mediaConstraints that can be passed to getUserMedia(). Adapting the code to take account of those should be quite straightforward however, and for the purposes of example simpler is often better.

Companies behind WebRTC

There are a number of companies involved in the WebRTC project in its present state. They are Google, Mozilla and Opera on the browser side, with the W3C and IETF on the standards side. Only recently has Microsoft also shown an interest in this project, but we do not expect to see any development from them until 2013. Having said that, they are currently recruiting developers to work on a project combining Skype with WebRTC, so development could occur earlier than expected.

Google and WebRTC

Google have been at the forefront of this project. They want to develop a standards-based real time media engine available in all browsers. In order to drive the development of real time communication, Google have released nearly $70 million worth of open source code to developers. These open source audio and video codecs came about through the acquisition by Google of companies such as Global IP Solutions and On2 Technologies.

In early 2010, Google finalized its acquisition of On2, a video codec company that developed the VP series of codecs, the latest being VP8. On2 had always positioned its codecs as a patent free replacement for the H.26x series of codecs, which were standardized, patented and widely used. Google then went about opening On2’s technologies to the world and open sourced VP8 under the name of WebM. The idea was to replace H.264 for web videos and, by that, reduce patent costs for everyone – especially Google itself.

Google went on during 2010 to acquire Global IP Solutions (GIPS), a company known for their media frameworks – a piece of technology that makes developing VoIP and video calling applications easier. At the time, GIPS had a large market share in VoIP, which caused most of the industry to scurry and search for alternative solutions. As with On2, Google took GIPS’s assets and open sourced them. This time they threw out all voice and video codecs that had patent owners and added an additional layer – a JavaScript API as an integration layer to web browsers. The idea: have bidirectional media processing and media coding technologies available in every browser. Google then went on to push it as a standard at the IETF and W3C, where such standards are set and approved. This real time communication is now called WebRTC. Google have a Google+ page which keeps up with the latest WebRTC developments: Google+ WebRTC.

Mozilla/Firefox and WebRTC

Mozilla has started showing off WebRTC support in Firefox. Nothing is stable enough yet to be released in the main branch of the browser, but this is a positive step as developers can test in multiple browsers.

Mozilla attended IETF 83 in Paris, and showed an early demo of a simple video call between two Browser-ID-authenticated parties in a special build of Firefox with WebRTC support.

Mozilla have been experimenting with integrating social features in the browser, combining them with WebRTC to establish a video call between two users who are signed in using BrowserID (now called Persona). The Social API add-on, once installed, provides a sidebar where web content from the social service provider is rendered. In the demo social service, you can see a “buddy list” of people who are currently signed in using Persona. (Source: Mozilla.org)

Opera

Opera released the latest version of its web browser, Opera 12, on June 14th. The new version includes preliminary support for WebRTC, which will eventually enable standards-based audio and video chat in Web applications. There is also support for the WebRTC media capture APIs, which allow Web content to capture live media streams from the user’s microphone and webcam.

The WebRTC getUserMedia API works out of the box in Opera 12 and can be used by any website. Due to the potential privacy and security implications, the user is automatically prompted by the browser before the feature is allowed to be activated. (Source: Arstechnica.com)

Microsoft and WebRTC

“Microsoft, Internet Explorer has put its weight behind WebRTC, a plugin-free technology for voice and video communications in the browser. However, it proposed a different approach other than the one currently favored by other browser vendors, and warned against implementing the technology before there’s a common standard.” Janko Roettgers

“Customizable, Ubiquitous Real Time Communication over the Web”, or CU-RTC-Web for short, is Microsoft’s contribution to the W3C WebRTC working group. Microsoft have stated that they have been closely involved in the work on the WebRTC standard with both the IETF and W3C since 2010. Unlike other browser companies, their work has been very quiet and not as publicly available as that of other interested parties. This has of course now changed with the release of their version of WebRTC, CU-RTC-Web.

There are a number of reasons Microsoft have taken a different approach, the most notable being the VP8 video codec that has been put forward as the default video codec. Although Microsoft have had, and still have, some issues with this codec, they feel developers should not be tied down to individual codecs. They also have concerns about the predetermined way media is proposed to be sent over the network, preferring a more flexible and customised approach that makes it easier to implement the technology on legacy devices.

Aside from the issues with codecs and media management, Microsoft feel that eventually all parties concerned will agree on common standards. With CU-RTC-Web, Microsoft envisage that the walled garden approach of Skype will come tumbling down, allowing interoperability between Skype and other OTT operators such as Google Talk.

Apple/Safari and WebRTC

At present Apple have no part in the implementation or development of WebRTC in their Safari web browser, as they have a vested interest in their FaceTime walled garden of audio/visual communication and in the H.264 video codec, which isn’t part of WebRTC at present.

However, WebRTC4all, built by the Google development team, is an extension for Safari and other browsers. It allows developers to add WebRTC functions such as audio/video streaming to all browsers now, and they will easily be able to switch to the official implementation when it’s added by Apple and others.


Conclusion

Allowing audio and video codecs to be open sourced under a very lenient open source license has made it very attractive to place WebRTC into commercial products. With Google, Microsoft, Opera and Mozilla backing WebRTC, it would seem that all is well in the browser world. But until all the standards have been set by the W3C and IETF, and adopted by all interested parties, we won’t celebrate the dawn of a new communication phase just yet.

Aside from all the political posturing, WebRTC is here to stay. It will bring great innovation, vast commercial opportunities and most importantly real time communication for everyone.

What is WebRTC

We have a series of blog posts to release on WebRTC: who is driving WebRTC, why do we need WebRTC, when and where will it be seen, and the future benefits and possibilities. However, first up is an attempt to answer the question: what is WebRTC?

WebRTC Logo

“WebRTC is a free, open project that enables web browsers with Real-Time Communications (RTC) capabilities via simple JavaScript APIs. The WebRTC components have been optimized to best serve this purpose.” (Source: WebRTC.org)

WebRTC allows real time peer to peer audio visual communication via an HTML5 compliant browser. Not all browsers have WebRTC capability at present. At the time of writing, over 50% browser support for WebRTC is expected in the coming months; this includes Chrome, Firefox and Opera, with Internet Explorer following with its CU-RTC-Web proposal in early 2013. There are no plugins required for WebRTC to work, and no expensive pieces of hardware either. Just a WebRTC enabled browser, a camera (which is often quite standard on new laptops) and a mic/headset or mic/speakers, and real time communication is available to you.

Having WebRTC integrated in an HTML5 enabled browser means you can now make real-time audio video calls to any other WebRTC enabled device, including devices such as tablets, smart phones, e-readers etc. The quality of these calls is only limited by the quality of the hardware and the network.

With the IETF setting the standards for protocols and signalling, and the W3C setting the standards for the APIs for app developers, millions of JavaScript developers can now deliver and define web based communication. No longer will it be the domain of a small number of SIP developers and VoIP system resellers.

WebRTC has the potential for real change in how we communicate, much the same way the browser did for information. The effect can be that big. What we need is for all invested parties to comply with the standards laid down by the W3C and IETF; whether this comes to pass, only time will tell.

Check out the WebRTC Explained video presentation by Cullen Jennings of Cisco.