getUserMedia – What happens when there’s missing media sources?


This is a cross-post of a blog post originally published on the website WebRTC Hacks. It was written by the openRMC contributors John McLaughlin, Eamonn Power and Miguel Ponce de Leon and is re-published here with permission. Enjoy!

What is the WebRTC API

WebRTC is an open framework to enable real-time communications via the web. The primary usage scenario is browser-based, where a JavaScript API allows developers to implement their own RTC-enabled web app. Alternatively, various toolkits exist to facilitate native app development.

Standardisation is an ongoing process through the W3C WebRTC working group at the API level and the IETF RTCWeb working group at the protocol level.

As a very high-level overview, there are three main API components in the WebRTC framework:

  • getUserMedia – provides access to media input devices such as webcams and microphones
  • RTCPeerConnection – establishes a peer-to-peer media session between two endpoints, including any relaying necessary, allowing two users to communicate directly
  • RTCDataChannel – closely related to and dependent upon RTCPeerConnection, RTCDataChannel allows a web application to send and receive generic application data peer to peer

In this post we’ll be looking more closely at the getUserMedia() API, and how to deal with its outputs in order to give some meaningful feedback to the developer, and ultimately the end user.


WebRTC getUserMedia

The getUserMedia API is used to access media streams from media input devices such as webcams or microphones. The stream obtained can then be used locally by passing it to an HTML <audio/> or <video/> tag, lending itself to many creative and fun applications such as photobooth, facial recognition, image processing etc. Additionally, it can be attached to an RTCPeerConnection object and used to establish communications with another user. Figure 1 shows possible usage scenarios.

Figure 1. Example scenario (source W3C)

As an overview, getUserMedia() is called with up to three parameters:

  • mediaConstraints – this is an object specifying the type and optionally the quality of the media streams required. Full details can be found in the getUserMedia API but for the purposes of this post we’ll concentrate on the boolean attributes audio and video, which specify whether audio and/or video streams are required respectively.
  • successCallback (optional) – this is called with the MediaStream object encapsulating the media streams requested by the developer. This is illustrated in Figure 2.
  • errorCallback (optional) – this is called with an error object, which should give an indication as to why the call failed
Figure 2. MediaStream object (source W3C)

Although the callbacks are optional in the spec, in practice at the very least the successCallback is required, as this is the only way to access the MediaStream object. As we shall see, providing the error callback is also beneficial.
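As a minimal sketch of the call shape described above, assuming the callback-based API and the vendor-prefixed names that browsers exposed at the time of writing (webkitGetUserMedia in Chrome, mozGetUserMedia in Firefox). The helper name requestMedia, its nav parameter and the 'NOT_SUPPORTED' string are illustrative assumptions, not part of any spec:

```javascript
// Hypothetical helper: looks up whichever callback-style getUserMedia variant
// the given navigator-like object exposes and calls it with both callbacks.
function requestMedia(nav, mediaConstraints, successCallback, errorCallback) {
    var gum = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
    if (typeof gum !== 'function') {
        // No getUserMedia variant found: report it through the error callback.
        // 'NOT_SUPPORTED' is an illustrative value chosen for this sketch.
        errorCallback('NOT_SUPPORTED');
        return false;
    }
    // Invoke with nav as "this", as the native method expects.
    gum.call(nav, mediaConstraints, successCallback, errorCallback);
    return true;
}
```

In a browser this would be called as requestMedia(navigator, { audio: true, video: true }, onStream, onError), where onStream and onError are the application's own callbacks.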


What happens when there’s a missing media source

So what happens if you call getUserMedia() and ask for a media stream but there’s no device present to support it? Shouldn’t the error callback get called with a nice informative error telling you what happened so you can pass it on to the end user? Well, yes to the first part, but the informative bit leaves a lot to be desired.

From the official perspective, at the time of writing, this is what the spec says should be passed into the error callback:

[NoInterfaceObject]
interface NavigatorUserMediaError : DOMError {
    readonly    attribute DOMString? constraintName;
};

This is subject to constant change, as can be seen by prior versions of the spec (getUserMedia is generally considered the most stable of the APIs!):

NavigatorUserMediaError 07/04/2013

[NoInterfaceObject]
interface NavigatorUserMediaError {
    readonly    attribute DOMString  name;
    readonly    attribute DOMString? message;
    readonly    attribute DOMString? constraintName;
};

NavigatorUserMediaError 12/12/2012

[NoInterfaceObject]
interface NavigatorUserMediaError {
    const unsigned short PERMISSION_DENIED = 1;
    readonly attribute unsigned short code;
};

So let’s look at what actually happens in various browsers when we remove or disable the audio and video input devices.

Chrome 28 error

err: {
        code: 1,
        PERMISSION_DENIED: 1
}

Chrome 29 error

err: {
        constraintName: "",
        message: "",
        name: "PERMISSION_DENIED"
}

To be fair to Chrome, the samples above do conform to the earlier specs, and would reflect what was current when that version of Chrome was in its Canary incarnation. In order to keep up to date on such changes it’s worth subscribing to the Discuss-WebRTC mailing list as posts like this give you a heads up on API changes.

Firefox 22 error

err: "HARDWARE_UNAVAILABLE"

Firefox 23 error

err: "NO_DEVICES_FOUND"

Informative in a sense, though what device(s) weren’t found? Nor does it bear the slightest resemblance to any version of the spec.
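The four shapes shown above can be reduced to a single string before any further handling. This is only a sketch based on the samples in this post; the function name errorName and the 'UNKNOWN' fallback are hypothetical, and other browser versions may well report errors differently:

```javascript
// Hypothetical helper: reduce the error shapes shown above to one string.
function errorName(err) {
    if (typeof err === 'string') {
        return err;                  // Firefox 22/23 pass a bare string
    }
    if (err && typeof err.name === 'string' && err.name !== '') {
        return err.name;             // Chrome 29 style: { name: "PERMISSION_DENIED", ... }
    }
    if (err && err.code === 1) {
        return 'PERMISSION_DENIED';  // Chrome 28 style: { code: 1, PERMISSION_DENIED: 1 }
    }
    return 'UNKNOWN';                // anything this sketch does not recognise
}
```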


Making sense of it all

Fortunately, despite the lack of granularity in the error responses returned by getUserMedia(), with a little forensic work it is possible to derive a better picture of the WebRTC media capabilities of the host browser.

We make use of the adapter.js shim from the Google WebRTC reference app. We have made a modification to this, as the latest version at the time of writing, r4259, didn’t have Firefox support for the getAudioTracks() and getVideoTracks() methods of MediaStream (Firefox has only gained support for these methods as of version 23).

We have wrapped the native getUserMedia() call in an API openrmc.webrtc.api.getUserMedia(spec) where spec has the following attributes:

{
    audio : true|false, // Default true
    video : true|false, // Default false
    success : function(cxt) {}, // Success callback
    failure : function(cxt) {} // Failure callback
}

The success and failure callbacks both take a parameter of type openrmc.webrtc.Context, of which more later.
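The defaults noted in the comments above could be applied along the following lines. This is a sketch rather than the openRMC implementation; the function name withDefaults is hypothetical, as is the choice to substitute no-op callbacks when none are supplied:

```javascript
// Hypothetical helper: fill in the documented defaults for a spec object.
function withDefaults(spec) {
    spec = spec || {};
    var noop = function () {};
    return {
        // audio defaults to true, video defaults to false
        audio: spec.audio === undefined ? true : !!spec.audio,
        video: spec.video === undefined ? false : !!spec.video,
        // assumption of this sketch: missing callbacks become no-ops
        success: typeof spec.success === 'function' ? spec.success : noop,
        failure: typeof spec.failure === 'function' ? spec.failure : noop
    };
}
```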

Firstly what sort of errors are we interested in? Since we primarily want to establish the lack of presence of some media device, we came up with the following error set:

Error codes

openrmc.webrtc.Errors = {
    NOT_SUPPORTED : 'NOT_SUPPORTED',
    CONSTRAINTS_REQUIRED : 'CONSTRAINTS_REQUIRED',
    AUDIO_NOT_AVAILABLE : 'AUDIO_NOT_AVAILABLE',
    VIDEO_NOT_AVAILABLE : 'VIDEO_NOT_AVAILABLE',
    AV_NOT_AVAILABLE : 'AV_NOT_AVAILABLE'
} ;

The first two are straightforward. NOT_SUPPORTED is raised when getUserMedia() is called on a browser that doesn’t support WebRTC. CONSTRAINTS_REQUIRED is raised if getUserMedia() is called with both the audio and video attributes set to false. The native call would result in an error anyway, as this isn’t allowed.

if(webrtcDetectedBrowser===null) {
    console.log("[RTCInit] : Browser does not appear to be WebRTC-capable");
    var cxt=openrmc.webrtc.Context({
        error : openrmc.webrtc.Errors.NOT_SUPPORTED
    });
    spec.failure && spec.failure(cxt) ;
    return ;
}
:
:
if(!(spec.audio || spec.video)) {
    spec.failure && spec.failure(openrmc.webrtc.Context({
        error : openrmc.webrtc.Errors.CONSTRAINTS_REQUIRED
    })) ;
    return ;
}

AUDIO_NOT_AVAILABLE, VIDEO_NOT_AVAILABLE, and AV_NOT_AVAILABLE are fairly self-explanatory and are raised if a stream of a certain type has been requested but isn’t available. Which of the three is raised is determined by checking the returned error against the resources that were requested. A getError() helper method is defined based on the detected browser.

if(webrtcDetectedBrowser==='chrome') {
    ns.getError= function(err, spec) {
        // Take account of varying forms of this error across Chrome versions
        if(err.name==='PERMISSION_DENIED' || err.PERMISSION_DENIED===1 || err.code===1) {
            if(spec.audio && spec.video) {
                return openrmc.webrtc.Errors.AV_NOT_AVAILABLE ;
            }
            else {
                if(spec.audio && !spec.video) {
                    return openrmc.webrtc.Errors.AUDIO_NOT_AVAILABLE ;
                }
                else {
                    return openrmc.webrtc.Errors.VIDEO_NOT_AVAILABLE ;
                }
            }
        }
    } ;
}
else if(webrtcDetectedBrowser==='firefox') {
    ns.getError = function(err, spec) {
        if(err==='NO_DEVICES_FOUND' || err==='HARDWARE_UNAVAILABLE') {
            if(spec.audio && spec.video) {
                return openrmc.webrtc.Errors.AV_NOT_AVAILABLE ;
            }
            else {
                if(spec.audio && !spec.video) {
                    return openrmc.webrtc.Errors.AUDIO_NOT_AVAILABLE ;
                }
                else {
                    return openrmc.webrtc.Errors.VIDEO_NOT_AVAILABLE ;
                }
            }
        }
    } ;
}
else {
    ns.getError = function() {
        console.log('No error handler set for '  + webrtcDetectedBrowser) ;
        return openrmc.webrtc.Errors.NOT_SUPPORTED ;
    } ;
}
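The Chrome and Firefox branches above differ only in how they recognise the failure; the mapping from the requested media to an error code is the same in both. As a sketch, that shared step could be factored out. The name unavailableErrorFor is hypothetical, and the local Errors object merely mirrors the relevant entries of openrmc.webrtc.Errors so that the snippet stands alone:

```javascript
// Local copy of the relevant error codes, so the sketch is self-contained.
var Errors = {
    AUDIO_NOT_AVAILABLE: 'AUDIO_NOT_AVAILABLE',
    VIDEO_NOT_AVAILABLE: 'VIDEO_NOT_AVAILABLE',
    AV_NOT_AVAILABLE: 'AV_NOT_AVAILABLE'
};

// Hypothetical helper: map what was requested to the matching error code.
// Assumes at least one of audio/video was requested, which the earlier
// CONSTRAINTS_REQUIRED check guarantees.
function unavailableErrorFor(spec) {
    if (spec.audio && spec.video) {
        return Errors.AV_NOT_AVAILABLE;
    }
    if (spec.audio) {
        return Errors.AUDIO_NOT_AVAILABLE;
    }
    return Errors.VIDEO_NOT_AVAILABLE;
}
```

Each browser-specific getError() would then only need to test the error shape and delegate to this helper.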

These are hard errors and are raised as a result of the native getUserMedia() failing, i.e. none of the requested media constraints could be satisfied. The native getUserMedia() will succeed if, say, audio and video are requested but only audio is available. In this case, the returned openrmc.webrtc.Context can give more information to the application developer, which can be used to inform the end user.

The openrmc.webrtc.Context object provides the following methods:

  • isrtcavailable() – Returns true if the native getUserMedia() call resulted in a valid MediaStream, false otherwise.
  • isrtcaudioavailable() – Returns true if at least one audio track is available, false otherwise.
  • isrtcvideoavailable() – Returns true if at least one video track is available, false otherwise.
  • getError() – Returns an error, if any.

The presence of audio and video tracks is determined by the getAudioTracks() and getVideoTracks() methods on MediaStream – in both cases true is returned if and only if the MediaStream object is present, and the relevant getXXXTracks() method returns an array with one or more entries.

that.isrtcavailable = function(){
    return spec.localStream !== undefined ;
} ;

var aavail = (spec.localStream &&
        spec.localStream.getAudioTracks() &&
        spec.localStream.getAudioTracks().length>0) || false ;

that.isrtcaudioavailable = function() {
    return aavail ;
} ;

var vavail = (spec.localStream &&
        spec.localStream.getVideoTracks() &&
        spec.localStream.getVideoTracks().length>0) || false ;

that.isrtcvideoavailable = function() {
    return vavail ;
} ;
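A success or failure callback can then compare what was requested with what the Context reports. The following is a sketch only: describeContext and its message strings are hypothetical, and the stand-in object in the usage lines merely imitates the Context methods listed above:

```javascript
// Hypothetical helper: turn a Context-like object into a user-facing message.
function describeContext(cxt, requested) {
    if (!cxt.isrtcavailable()) {
        return 'No media stream available: ' + cxt.getError();
    }
    var missing = [];
    if (requested.audio && !cxt.isrtcaudioavailable()) {
        missing.push('audio');
    }
    if (requested.video && !cxt.isrtcvideoavailable()) {
        missing.push('video');
    }
    return missing.length === 0
        ? 'All requested media available'
        : 'Stream available, but missing: ' + missing.join(', ');
}

// Usage with a stand-in context where only audio was obtained.
var audioOnly = {
    isrtcavailable: function () { return true; },
    isrtcaudioavailable: function () { return true; },
    isrtcvideoavailable: function () { return false; },
    getError: function () { return undefined; }
};
console.log(describeContext(audioOnly, { audio: true, video: true }));
// prints "Stream available, but missing: video"
```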

Trying it

We have made a simple web application available where you can try it out for yourself at http://www.openrmc.org/files/webrtc/context/index.html. To try it, go to the page and click the "Call getUserMedia" button. Try various combinations with the webcam and/or mic disabled. An error or stream status will be reported when the call returns.

The source can also be viewed and downloaded from https://github.com/openRMC?source=c. It does need to be served by a web server of some sort; simply accessing the page via a file:// URL won’t work.

Note: While on most modern OSs it’s usually quite straightforward to disable a webcam if it’s not built in (just unplug it), it can be quite tricky to disable audio input. Typically there is some sort of virtual driver which will quite happily allow getUserMedia() to return successfully with no supporting hardware. Proper disabling can thus require a dance of device driver disabling and possible reboots (stand up, Windows). If you’re going down this path to test, make sure you know how to re-enable said drivers again!

Summary

It’s important to give the end user an informative message when things go wrong while initialising a WebRTC session. Hopefully you can see that this approach, along with the code snippets, gives a developer the possibility to derive a better picture of the WebRTC media capabilities of the host browser.

The results here should be taken as a snapshot reflecting the state of play for getUserMedia() at the time of writing. As can be seen above, the API is still fluid, and could change several times before being set in stone. It is also possible that Microsoft and Apple will come to the party and add support of some sort to Internet Explorer and Safari, which will change the picture again.

The snippets presented in the post should therefore be taken as what they are – an illustration. The code on GitHub WebRTC Context and the sample app should be kept up to date though, and should therefore be taken as the reference.

The code presented here is also reasonably simplified, as it doesn’t take into account the more complex forms of mediaConstraints that can be passed to getUserMedia(). Adapting the code to take account of those should be quite straightforward, however, and for the purposes of an example simpler is often better.