Day 99 of #100DaysOfSpec: loading the media resource, contd.

I am reading and taking notes on the HTML specifications for 100 days as part of #The100DayProject. Read the initial intent/backstory. I am a Microsoft employee but all opinions, comments, etc on this site are my own. I do not speak on behalf of my employer, and thus no comments should be taken as representative of Microsoft's official opinion of the spec. Subsections not listed below were read without comment.

UAs = user agents = browsers, etc.

Continuing to read in 4.7.10.5 Loading the media resource

Some extra info after the algorithms from the previous days’ notes.

The preload attribute takes one of the following keywords, which can be changed while the media resource is being buffered or played:

  • none: hint that the user might not need this resource, or that the server can save on traffic.
  • metadata: hint that the user might not need the resource, but it might be a good idea to fetch metadata and possibly the first few frames. Metadata includes dimensions, duration, etc, and fetching that will put the readyState attribute at HAVE_METADATA. If some frames are fetched, that attribute will likely be HAVE_CURRENT_DATA or HAVE_FUTURE_DATA. I’m curious as to what the use case might be for setting the preload attribute to metadata…maybe some perf black magic?
  • auto: Hint that the UA can load as much as it sees fit for the user.
  • The empty string: like the auto keyword, it maps to the Automatic state.

HTML authors can dynamically switch this preload attribute value once playback begins (i.e. don’t download the video unless the user clicks play, in which case ZOMG DOWNLOAD NOW).

The UA can choose to ignore the settings in this attribute, as it is just a hint. This is supposed to be in service of UX though: taking user settings or being smart about connectivity health. The autoplay attribute can also override preload. It’s okay, validation-wise, if those two attributes don’t agree.
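
Here’s a rough TypeScript sketch of that “hold off until the user clicks play, then download” switch. It’s my own illustration, not anything from the spec: PreloadTarget and upgradePreloadOnIntent are made-up names, and the plain object stands in for a real media element so the snippet runs outside a browser. Since preload is only a hint, the UA may still do its own thing.

```typescript
// Stand-in for the one property of a media element this sketch touches.
interface PreloadTarget {
  preload: string;
}

// Hypothetical helper: call it when the user shows intent to play (e.g. a click).
// Anything less eager than "auto" gets bumped up to "auto".
function upgradePreloadOnIntent(media: PreloadTarget): string {
  if (media.preload === "none" || media.preload === "metadata") {
    media.preload = "auto";
  }
  return media.preload;
}

// A fake element that starts out asking the UA not to fetch anything.
const fakeVideo: PreloadTarget = { preload: "none" };
upgradePreloadOnIntent(fakeVideo);
console.log(fakeVideo.preload); // "auto"
```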

There’s a buffered attribute that returns an object representing how much of the media resource the UA has buffered.
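
That object is a TimeRanges: a length, plus start(i) and end(i) accessors for each buffered stretch. Below is a small sketch of adding those stretches up. TimeRangesLike, totalBufferedSeconds, and makeRanges are names I made up, and makeRanges is only a fake so the example runs without a browser.

```typescript
// Minimal shape of the object returned by media.buffered.
interface TimeRangesLike {
  length: number;
  start(index: number): number;
  end(index: number): number;
}

// Sum the duration (in seconds) of every buffered range.
function totalBufferedSeconds(ranges: TimeRangesLike): number {
  let total = 0;
  for (let i = 0; i < ranges.length; i++) {
    total += ranges.end(i) - ranges.start(i);
  }
  return total;
}

// Hypothetical stand-in for a real TimeRanges, built from [start, end] pairs.
function makeRanges(pairs: Array<[number, number]>): TimeRangesLike {
  return {
    length: pairs.length,
    start: (i: number) => pairs[i][0],
    end: (i: number) => pairs[i][1],
  };
}

// Two buffered stretches: 0s to 10s and 20s to 25s.
console.log(totalBufferedSeconds(makeRanges([[0, 10], [20, 25]]))); // 15
```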

Day 98 of #100DaysOfSpec: loading the media resource, contd.

Continuing to read in 4.7.10.5 Loading the media resource

The resource fetch algorithm, contd.

Headings below are my own.

Algorithm when resource is available

  • When the algorithm is able to determine the media resource’s duration, dimensions, and “other metadata”, that means the resource is usable. This is when the UA can start doing timeline (playback positions, etc) and duration work. For video elements, this is also when the videoWidth and videoHeight attributes are set, and a resize event is fired. At the end of this work, the readyState attribute is set to HAVE_METADATA (+ loadedmetadata DOM event, for you scripting folks).
  • After this work, the UA enables relevant audio/video tracks specified on the media resource in the audioTracks/videoTracks object. The first one listed in videoTracks is the selected video track, and all the others are disabled. This step can be triggered again by other events.
  • “Once the readyState attribute reaches HAVE_CURRENT_DATA, after the loadeddata event has been fired, set the element's delaying-the-load-event flag to false. This stops delaying the load event.” The UA would also stop any buffering at this point.
  • A UA is required to determine a media resource’s duration before playing the file.

Algorithm after the media resource has been fetched (possibly before decoded)

  • UAs fire progress event, set networkState to NETWORK_IDLE, fire suspend event.
  • Can reset networkState to NETWORK_LOADING if the UA ever needs to get more data.

Algorithm after a connection interruption forces the UA to give up fetching (some data has been received)

  • Fetching process gets canceled.
  • The UA reports an error—and fires an error event—and then the networkState is at NETWORK_IDLE.
  • UA has to stop delaying the load event.
  • The whole resource selection algorithm is aborted.

Algorithm if the media data is corrupted

  • Like above, fetching gets cancelled, error stuff gets set/fired.
  • If the readyState attribute’s value is “equal to” HAVE_NOTHING, the element’s poster frame is shown. Some conditional outcomes here for networkState.
  • Like above, stop delaying load and abort the algorithm.

Algorithm if the user aborted the fetching process

(For example, pressing a “stop” button). This part of the process is not triggered by the load method itself being invoked during the algorithm.

  • The steps are pretty much the same as the last case, except that the error code is MEDIA_ERR_ABORTED, and an abort event is fired instead of an error event.

Algorithm if the media data has “non-fatal errors or uses, in part, codecs that are unsupported”

If the server returns data that is only partially usable, the UA has to render just the parts it can handle. All other data is ignored.

Algorithm when the media resource declares a UA-supported media-resource-specific text track

The algorithm has to go through the steps to expose that text track if the media data is CORS-same-origin. There’s a security issue with sending possibly sensitive info in subtitles across domains.

 

One final note is that it’s quite possible that a media element would never reach the final step of the algorithm, which is aborting the algorithm itself. An example is a live-streaming infinite audio file for an online radio station.

Day 97 of #100DaysOfSpec: media elements

Continuing to read in 4.7.10 Media elements

The resource fetch algorithm

Assumes an absolute URL has been provided by the resource selection algorithm.

Like the resource selection algorithm, this is also a long one, so here’s just some brief hmm-interesting notes:

  • There’s a set of substeps to the algorithm to allow UAs to implement preload=“none” (wait until user explicitly requests media resource).
  • UAs can throttle data downloads, as well as allow users to block or slow those downloads. If the user blocks the download, the UA has to treat it as stalled, as opposed to a scorched-earth closed connection.
  • UAs can choose whether or not to download more content at any given time. Use cases include long buffering times, waiting on user input, or when the user navigates away from the page containing the media element. If the UA chooses to suspend the download they fire an event called, well, suspend. The spec reminds us that the preload attribute the author set on the media element is a hint as to what might be an appropriate buffering time for the file; that can help inform the UA’s decision.
  • “The user agent may use whatever means necessary to fetch the resource”. Now THERE’S a thrilling action flick.
  • “This specification does not currently say whether or how to check the MIME types of the media resources, or whether or how to perform file type sniffing using the actual file data.” It’s interesting to see a spot in the spec where no guidance is given because everyone is doing it differently and can’t agree on how to proceed. I assume there are more areas like this than are called out, but this one is a big one.

More on this algorithm tomorrow.

Day 96 of #100DaysOfSpec: media elements, contd.

Continuing to read in 4.7.10 Media elements

4.7.10.5 Loading the media resource

(Further headings in this post are my own)

The media.load() method “causes the element to reset and start selecting and loading a new media resource from scratch”. This means any “pending events and callbacks” the media element may have are scrapped.

The media element load algorithm

“All media elements have an autoplaying flag, which must begin in the true state.” I wonder why?

Interesting that the media element load algorithm directs UAs to initially set particular attributes to Not-a-Number (NaN). I would have assumed that unset attributes would just be…empty.

The resource selection algorithm

This algorithm is invoked synchronously, but after a couple of steps it runs asynchronously (at the same time as other scripts and tasks). I suppose this is why the delaying-the-load-event flag gets set to true and later unset.

These media algorithms are pretty involved. I want to find the person(s) at work who implemented them and shake their hand(s).

Day 95 of #100DaysOfSpec: media elements, contd.

Currently reading in 4.7.10 Media Elements

4.7.10.3 MIME types

With some MIME types, you can describe the media resource with a codecs parameter (video/mp4; codecs=“stuffgoeshere”).

MIME types are not necessarily a silver bullet, a guarantee that the UA can definitely play a media element with that resource. They’re merely suggestions that a resource is playable, and they can help rule out the usability of some files (like if a resource’s MIME type is specified as “pineapple”, and I can’t play a pineapple file, that is helpful).

Accordingly, the canPlayType() method returns very non-committal messages: an empty string (can’t play), maybe, or probably.
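
To make those three answers concrete, here is a sketch of picking a type from a list of candidates: take the first “probably”, fall back to the first “maybe”, and give up on empty strings. The names (CanPlayAnswer, pickBestType, fakeCanPlay) are mine, and fakeCanPlay is a pretend UA whose answers I invented; a real page would ask the media element’s canPlayType() instead.

```typescript
// The three possible answers from canPlayType().
type CanPlayAnswer = "" | "maybe" | "probably";

// Prefer the first "probably"; otherwise settle for the first "maybe".
function pickBestType(
  types: string[],
  canPlay: (type: string) => CanPlayAnswer
): string | null {
  let firstMaybe: string | null = null;
  for (let i = 0; i < types.length; i++) {
    const answer = canPlay(types[i]);
    if (answer === "probably") {
      return types[i];
    }
    if (answer === "maybe" && firstMaybe === null) {
      firstMaybe = types[i];
    }
  }
  return firstMaybe;
}

// Pretend UA (answers are invented for this example).
function fakeCanPlay(type: string): CanPlayAnswer {
  if (type.indexOf("video/webm") === 0) return "probably";
  if (type.indexOf("video/mp4") === 0) return "maybe";
  return "";
}

const candidateTypes = [
  "video/pineapple",
  'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
  "video/webm",
];
console.log(pickBestType(candidateTypes, fakeCanPlay)); // "video/webm"
```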

“The MIME type application/octet-stream with no parameters is never a type that the user agent knows it cannot render.” It has to treat this type as if there’s not really any type information at all. The “no parameters” bit is important; if a resource of this type has parameters, then it can be treated as a specific MIME type.

4.7.10.4 Network states

The networkState attribute stores the media element’s current network activity. Values:

  • NETWORK_EMPTY: nothing has happened yet
  • NETWORK_IDLE: resource has been selected by the algorithm, but nothing currently happening
  • NETWORK_LOADING: UA actively trying to download the resource’s data
  • NETWORK_NO_SOURCE: algorithm is active, but hasn’t found a resource to use yet
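
In script, networkState comes back as a number: 0 through 3, in the order listed above. A tiny lookup like the following (the names are mine, purely for illustration) can make log output easier to read.

```typescript
// Index in this array = numeric networkState value.
const NETWORK_STATE_NAMES = [
  "NETWORK_EMPTY",     // 0
  "NETWORK_IDLE",      // 1
  "NETWORK_LOADING",   // 2
  "NETWORK_NO_SOURCE", // 3
];

// Hypothetical helper: turn the numeric state into its constant name.
function networkStateName(state: number): string {
  const name = NETWORK_STATE_NAMES[state];
  return name === undefined ? "unknown" : name;
}

console.log(networkStateName(2)); // "NETWORK_LOADING"
```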

Day 94 of #100DaysOfSpec: media elements

4.7.10 Media elements

This section presents the media elements’ (audio and video) IDL interface and goes more into depth about commonalities across media elements. Those shared attributes include src, crossorigin, preload, autoplay, mediagroup, loop, muted, and controls.

4.7.10.1 Error codes

Media elements have an error status associated with them: a MediaError object stored in the error attribute. It would appear that this status only tracks the single most recent error. You can read the error code from media.error.code.

Error codes:

  • MEDIA_ERR_ABORTED: fetching stopped at user’s request (stop loading a slow web page, I think might be an example)
  • MEDIA_ERR_NETWORK: a network error stopped the fetch. This error would be reported only if the UA already determined that the media resource was a usable file; same with MEDIA_ERR_DECODE.
  • MEDIA_ERR_DECODE: now here’s the BS-y error! “An error of some description occurred”. Ok?
  • MEDIA_ERR_SRC_NOT_SUPPORTED: “The media resource indicated by the src attribute was not suitable.”
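
The code property is numeric: 1 through 4, in the order listed above. Here is a sketch of translating it into something readable. explainMediaError is a made-up helper, and the message text is just my paraphrase of the notes above, not spec wording.

```typescript
// Hypothetical helper: map a MediaError code (1 to 4) to a readable summary.
function explainMediaError(code: number): string {
  switch (code) {
    case 1:
      return "MEDIA_ERR_ABORTED: fetching stopped at the user's request";
    case 2:
      return "MEDIA_ERR_NETWORK: a network error stopped the fetch";
    case 3:
      return "MEDIA_ERR_DECODE: decoding failed on a resource that looked usable";
    case 4:
      return "MEDIA_ERR_SRC_NOT_SUPPORTED: the resource was not suitable";
    default:
      return "unknown error code " + code;
  }
}

console.log(explainMediaError(4));
```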

4.7.10.2 Location of the media resource

The only info I think might be new here since a previous section is that there’s a currentSrc IDL attribute on the media element that gives…the current source. It starts out as an empty string, and could be accessed in scripting languages via media.currentSrc.

Day 93 of #100DaysOfSpec: the track element

4.7.9 The track element

“The track element allows authors to specify explicit external timed text tracks for media elements.”

  • kind: lol why is this not called “type”
  • src: required
  • srclang: BCP 47 language tag
  • label: user-visible title for track
  • default: enable this track in absence of user preference overrides

Keywords that can be used for the kind attribute include:

  • subtitles: overlaid on video when it’s hard to understand the audio
  • captions: overlaid on video; more complete transcription of dialog and sounds, that could be considered complete enough for the hard-of-hearing to rely on
  • descriptions: “synthesized as audio”; descriptions of the visual video portion, for when those visuals are unavailable or unusable for whatever reason.
  • chapters: chapter titles for navigating through the video, which the UA interface displays as an interactive list.
  • metadata: not displayed; “tracks intended for use from script”

Default value, if the kind attribute is missing, is subtitles.

Other notes:

  • Doesn’t have an end tag or ARIA roles.
  • A media element (video or audio) can have at most one track element with the default attribute set in each of these groups: kind determined to be subtitles or captions; kind set to descriptions; kind set to chapters. I interpret the spec to mean that a track whose kind attribute defaults to subtitles counts toward the first group, without kind having to be explicitly set, but that is just an interpretation.
  • Can have as many track elements with metadata in the kind attribute as you want!
  • One use case for multiple track elements is the option to have subtitles/captions in different languages.
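
The “only one default per group” rule can be sketched as a small check. This bakes in my interpretation from above (a missing kind counts as subtitles), and TrackSpec and findDefaultConflicts are made-up names. It also naively treats any other kind string as its own group, which a real UA wouldn’t necessarily do.

```typescript
// Simplified description of a track element's relevant attributes.
interface TrackSpec {
  kind?: string;
  isDefault?: boolean;
}

// Return the groups that have more than one track marked as default.
function findDefaultConflicts(tracks: TrackSpec[]): string[] {
  const counts: { [group: string]: number } = {};
  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];
    if (!track.isDefault) continue;
    // Assumption: a missing kind attribute means subtitles.
    const kind = track.kind === undefined ? "subtitles" : track.kind;
    // Metadata tracks are unlimited, so they never conflict.
    if (kind === "metadata") continue;
    // Subtitles and captions share one group.
    const group =
      kind === "subtitles" || kind === "captions" ? "subtitles/captions" : kind;
    counts[group] = (counts[group] || 0) + 1;
  }
  return Object.keys(counts).filter((group) => counts[group] > 1);
}

const exampleTracks: TrackSpec[] = [
  { isDefault: true },                    // kind defaults to subtitles
  { kind: "captions", isDefault: true },  // same group as the one above
  { kind: "chapters", isDefault: true },
  { kind: "metadata", isDefault: true },
  { kind: "metadata", isDefault: true },
];
console.log(findDefaultConflicts(exampleTracks)); // ["subtitles/captions"]
```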

Day 92 of #100DaysOfSpec: the source element

4.7.8 The source element

“The source element allows authors to specify multiple alternative media resources for media elements.” That is, audio and video elements.

  • Can’t stuff any content inside this element.
  • Attributes besides globals are src and type (of resource). src is required.
  • No end tag or ARIA roles.
  • Probably relevant to some folks: “Dynamically modifying a source element and its attribute when the element is already inserted in a video or audio element will have no effect.” You want to instead mess with the src attribute on the video or audio element.
  • Hey, this area of the spec actually gives some example MIME types for the type attribute! *celebratory trumpet sound*
  • That type attribute can help the UA determine if the file is a useful media type worth fetching.

Day 91 of #100DaysOfSpec: the video element, contd. and the audio element

Currently reading in 4.7.6 The video element

4.7.6 The video element, contd.

  • UAs have to provide controls for closed captions, audio description tracks, etc. without “interfering” with the way a page would usually render. So, no thinking outside the box. ;]
  • Can display the video content fullscreen or in a new window. Controls to enable alternate views also can’t mess with expected page rendering, but the UA can be freer with control placement when the video is in these independent views.
  • The UA can “allow video playback to affect system features that could interfere with the user's experience”. The example they give is disabling screensaver that otherwise would have popped up while watching a longish video.

4.7.7 The audio element

“An audio element represents a sound or audio stream.”

Attributes, besides globals:

  • src
  • crossorigin: how these requests are handled
  • preload: hint at how much buffering needed
  • autoplay: hint that the UA can autoplay the audio
  • mediagroup: “Groups media elements together with an implicit MediaController”
  • loop
  • muted
  • controls: show the UA controls

More notes:

  • If it has a src attribute set, can contain 0+ track elements.
  • If no src attribute, 0+ source elements, then 0+ track elements.
  • After these elements, the audio element’s content model is transparent, which means whatever you can put in the element’s parent (or closest ancestor that isn’t also transparent…), you can put inside the element. In this case there is an exception against media elements.
  • Also this element’s content is not exposed to the user, it’s fallback content for older browsers, not meant for accessibility (just like the fallback content in video).
  • Only allowed ARIA role is application.

Day 90 of #100DaysOfSpec: the video element, contd.

Currently reading in 4.7.6 The video element

These notes extend the previous day’s post.

  • If the UA can’t fetch a poster image by the URL stored in the poster attribute on video, there is no poster image. I assume the first frame of the video just gets used for preview?
  • The video’s format has an impact on which frame is associated with a playback position.
  • In the UA’s implementation, the audio has to be synced up with video, “at the element's effective media volume”.
  • The UA can choose to give visual cues on the video element as to the element’s status.
  • If the UA can’t render the video, it can link out to another method of playback or the raw video data.
  • Unless overridden by styling, the UA is beholden to rendering video content at its original aspect ratio, centered in the playback area, as large as possible without getting clipped in one direction or another.
  • Intrinsic height and width of the playback area are determined from the poster frame, or the video resource as a backup, if available. Default “object size” is 300px x 150px (CSS pixels, since a device’s “effective” pixels is not a fixed value).
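
The “original aspect ratio, centered, as large as possible without clipping” rule from the list above boils down to a bit of arithmetic. Here is a sketch of it; fitVideo and FittedBox are names I made up, and this is just the math as I understand it, not how any particular UA implements it.

```typescript
// Rendered size of the video plus its offset inside the playback area.
interface FittedBox {
  width: number;
  height: number;
  offsetX: number;
  offsetY: number;
}

// Scale the video to fit the playback area without changing its aspect
// ratio, then center it along whichever axis has room left over.
function fitVideo(
  videoW: number,
  videoH: number,
  boxW: number,
  boxH: number
): FittedBox {
  // The smaller ratio is the largest scale that avoids clipping.
  const scale = Math.min(boxW / videoW, boxH / videoH);
  const width = videoW * scale;
  const height = videoH * scale;
  return {
    width: width,
    height: height,
    offsetX: (boxW - width) / 2,
    offsetY: (boxH - height) / 2,
  };
}

// A 640x480 video in the default 300x150 playback area.
console.log(fitVideo(640, 480, 300, 150));
// { width: 200, height: 150, offsetX: 50, offsetY: 0 }
```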

Didn’t get too far in today’s reading because some days you just. aren’t. feelin’ it.