Day 44 of #100DaysOfSpec: The translate attribute

I am reading and taking notes on the HTML specifications for 100 days as part of #The100DayProject. Read the initial intent/backstory. I am a Microsoft employee but all opinions, comments, etc on this site are my own. I do not speak on behalf of my employer, and thus no comments should be taken as representative of Microsoft's official opinion of the spec. Subsections not listed below were read without comment.

Currently reading in 3.2.5 Global Attributes

3.2.5.4 The translate attribute

The translate attribute defines whether or not to translate the element's text when the page is localized. Values are easy: “yes” or “no”. If not set, the element inherits the setting from its parent.

What this attribute’s value does in practice is set a translation mode on the element (“no-translate” or “translate-enabled”). If the element doesn’t have the attribute set, it also inherits translation mode. If the element is the root element, its translation mode is “translate-enabled”.
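
A minimal sketch of that inheritance in TypeScript, to make the rules concrete. The TranslatableEl shape and the translationModeOf name are made up for illustration (this is not a DOM API), and the tree is a plain-object stand-in for real elements.

```typescript
// Hypothetical stand-in for a DOM element: just attributes and a parent link.
interface TranslatableEl {
  tag: string;
  attrs: { [attrName: string]: string };
  parentEl: TranslatableEl | null;
}

type TranslationMode = "translate-enabled" | "no-translate";

function translationModeOf(el: TranslatableEl): TranslationMode {
  const raw = el.attrs["translate"];
  if (raw !== undefined) {
    const value = raw.toLowerCase();
    // "yes" (and the empty string) enable translation; "no" disables it.
    if (value === "yes" || value === "") return "translate-enabled";
    if (value === "no") return "no-translate";
    // Any other value is invalid and falls through to inheritance.
  }
  // No usable attribute: inherit from the parent, or default at the root.
  return el.parentEl ? translationModeOf(el.parentEl) : "translate-enabled";
}

// Example tree: <html><p translate="no"><code>text</code></p><h1>text</h1></html>
const htmlRoot: TranslatableEl = { tag: "html", attrs: {}, parentEl: null };
const brandPara: TranslatableEl = { tag: "p", attrs: { translate: "no" }, parentEl: htmlRoot };
const codeInPara: TranslatableEl = { tag: "code", attrs: {}, parentEl: brandPara };
const pageHeading: TranslatableEl = { tag: "h1", attrs: {}, parentEl: htmlRoot };

console.log(translationModeOf(codeInPara));  // inherits from the <p>
console.log(translationModeOf(pageHeading)); // inherits the root default
```

The code element never mentions translate, yet it ends up in the same mode as the paragraph that wraps it.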

There is a list of an element’s attributes that can and should be translated, if the translate attribute was set to “yes”. Good to know it’s not just the text contents of the element; this is actually something I was wondering the other day, for practical application on a project.

Day 43 of #100DaysOfSpec: 3.2.5 Global Attributes, contd.

Currently reading in 3.2.5 Global Attributes

3.2.5.3 The lang and xml:lang attributes

The lang attribute sets the “primary” language of an element (in a BCP 47 value). I don't know that I've ever noticed it on a tag other than html. Child elements inherit lang.

lang is used on any HTML element; xml:lang is used on an HTML element in an XML document, and can also be used within other namespaces if specifically allowed (like in MathML or SVG).

To return the language of a node, the UA needs to find the closest ancestor with a lang attribute. If this goes bottom up instead of top down, that would be pretty inefficient if the html element is the only one with a language set.

xml:lang takes precedence over lang in this parsing, if both are set on the same element.

If the attribute is nowhere to be found, the UA looks at the “pragma-set default language”, which appears to be the language set by a meta tag in the head of the HTML file. If that's not found, it looks to HTTP to find any language info.

The language never defaults to a particular language (that would be biased and very easily incorrect); if there is no information that matches known values, the language of the document will be treated as “unknown”.
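
The lookup order described above can be sketched roughly like this in TypeScript. LangEl, LangFallbacks, and languageOf are hypothetical names, the elements are plain objects rather than DOM nodes, and null is just this sketch's way of saying "unknown".

```typescript
// Hypothetical stand-in for an element: attributes plus a parent link.
interface LangEl {
  attrs: { [attrName: string]: string };
  parentEl: LangEl | null;
}

// Document-level fallbacks; both are optional.
interface LangFallbacks {
  pragmaSetDefault?: string;    // e.g. from a content-language meta tag
  httpContentLanguage?: string; // e.g. from the Content-Language HTTP header
}

function languageOf(el: LangEl, fallbacks: LangFallbacks): string | null {
  // Walk from the element itself up through its ancestors.
  for (let cur: LangEl | null = el; cur !== null; cur = cur.parentEl) {
    // xml:lang wins when both attributes sit on the same element.
    const xmlLang = cur.attrs["xml:lang"];
    if (xmlLang !== undefined) return xmlLang;
    const lang = cur.attrs["lang"];
    if (lang !== undefined) return lang;
  }
  // Nothing in the tree: try the meta-tag default, then HTTP.
  if (fallbacks.pragmaSetDefault !== undefined) return fallbacks.pragmaSetDefault;
  if (fallbacks.httpContentLanguage !== undefined) return fallbacks.httpContentLanguage;
  return null; // unknown; never a guessed default
}

const docRoot: LangEl = { attrs: { lang: "en" }, parentEl: null };
const plainPara: LangEl = { attrs: {}, parentEl: docRoot };
const quoteSpan: LangEl = { attrs: { lang: "fr", "xml:lang": "fr-CA" }, parentEl: plainPara };
const bareRoot: LangEl = { attrs: {}, parentEl: null };

console.log(languageOf(plainPara, {}));
console.log(languageOf(quoteSpan, {}));
console.log(languageOf(bareRoot, { httpContentLanguage: "ja" }));
```

In this sketch the walk is bottom-up, so the cost for any one node is bounded by how deeply it is nested.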

“User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, for dictionary selection, or for the user interfaces of form controls such as date pickers).”

Day 42 of #100DaysOfSpec: 3.2.5 Global Attributes, contd.

Currently reading in 3.2.5 Global Attributes

3.2.5.1 The id attribute

Other than being unique, at least one character, and devoid of spaces, "there are no other restrictions on what form an ID can take; in particular, IDs can consist of just digits, start with a digit, start with an underscore, consist of just punctuation, etc." A couple of surprises there. I've had a couple of cases in which I've had to prefix a classname with text so that styling wouldn't choke on a classname starting with a digit. Bless the poor souls who have to inherit IDs made entirely of punctuation.
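
Those three rules are small enough to sketch as a checker. This is an illustration only: idProblems is a made-up helper, and the list of existing IDs is passed in by hand instead of being read from a real document.

```typescript
// Space characters as the HTML spec defines them: space, tab, LF, FF, CR.
const ID_SPACE_CHARS = /[ \t\n\f\r]/;

// Returns a list of rule violations; an empty list means the id is acceptable.
function idProblems(id: string, idsAlreadyInDocument: string[]): string[] {
  const problems: string[] = [];
  if (id.length === 0) {
    problems.push("must contain at least one character");
  }
  if (ID_SPACE_CHARS.test(id)) {
    problems.push("must not contain space characters");
  }
  if (idsAlreadyInDocument.indexOf(id) !== -1) {
    problems.push("must be unique in the document");
  }
  return problems;
}

console.log(idProblems("123", []));          // all digits is fine
console.log(idProblems("?!", []));           // all punctuation is fine too
console.log(idProblems("main nav", []));     // contains a space
console.log(idProblems("hero", ["hero"]));   // duplicate
```

The styling trouble with leading digits comes from CSS, not HTML: a CSS identifier can't start with a digit, so a selector for such an id or class needs escaping even though the markup itself is valid.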

"Particular meanings should not be derived from the value of the id attribute." Unsure what is meant here.

3.2.5.2 The title attribute

To be honest, I don't think I've ever used a title attribute on anything other than an image or anchor link. But as mentioned in the last post, it's chill to use it on any HTML element.

The spec warns not to rely on the title for anything important, due to accessibility problems. Chiefly not being able to trigger the title when interacting with keyboard or touch. But if we don't have finger-hover technology on consumer mobile devices within the next 5 years, I'll be shocked.

If an element doesn't have a title attribute set, the implication is that the nearest ancestral title also applies to this element. You would need to override with a title attribute on the item itself, even an empty string, in order to clear that implication. The title algorithm should actually return this, not just someone on an HTML working group being fussy about ~~meaning~~
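
A minimal sketch of that lookup, assuming a simplified element shape. TitledEl and advisoryInformation are illustrative names; the spec's version also special-cases a few elements, which this ignores.

```typescript
// Hypothetical stand-in for an element. An undefined titleAttr means the
// attribute is absent; an empty string means it is present but empty.
interface TitledEl {
  titleAttr?: string;
  parentEl: TitledEl | null;
}

function advisoryInformation(el: TitledEl): string {
  // An explicit title always wins, even when it is the empty string.
  if (el.titleAttr !== undefined) return el.titleAttr;
  // Otherwise defer to the nearest ancestor that has one.
  if (el.parentEl !== null) return advisoryInformation(el.parentEl);
  // Nothing anywhere up the tree.
  return "";
}

const notesSection: TitledEl = { titleAttr: "Release notes", parentEl: null };
const inheritingPara: TitledEl = { parentEl: notesSection };
const clearedPara: TitledEl = { titleAttr: "", parentEl: notesSection };

console.log(advisoryInformation(inheritingPara)); // picks up the section's title
console.log(advisoryInformation(clearedPara));    // empty string clears it
```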

You can force line breaks in titles, but "caution is advised". Well...why? Mileage may vary or it will actually break something? Who knows!

Day 41 of #100DaysOfSpec: 3.2.5 Global Attributes

Currently reading in 3.2.5 Global Attributes

These attributes can be added to any HTML* element:

  • accesskey
  • class
  • contenteditable
  • dir
  • hidden
  • id
  • lang
  • spellcheck
  • style
  • tabindex
  • title
  • translate

*Specifically HTML. An element in another namespace, like XML, doesn't get these attributes, but inherits from an ancestral HTML element.

There's also a long list of event handler content attributes that can be set on any HTML element. It spans things like events from user input, changes to the document, changes to audio/video status, etc. The list is interesting because you could set onvolumechange on a div, but that would be ineffectual.

As mentioned earlier in the spec, you can also place a custom data attribute ("data-[whateveryouwantgurl]") on any HTML element.

Day 40 of #100DaysOfSpec: 3.2.4.3 Paragraphs, in Content Models

Currently reading in 3.2.4.3 Paragraphs.

3.2.4.3 Paragraphs

A "paragraph" in the spec connotes more than the <p> tag. In fewer words, it is a chunk of text (phrasing content) discussing a discrete idea. A paragraph could also be, for the spec's purposes, an address, a part of a form, a byline, or a poem stanza.

Some elements (a, ins, del, map) can "straddle" paragraphs, and so the UA needs to ignore these elements when interpreting which portions of phrasing content are individual "paragraphs". Judging from the first example given, this seems like a protection against messy markup.

Reading the other examples starts to get like one of those puzzles where you have to guess how many triangles there are in an abstract design. Bananas!

Day 39 of #100DaysOfSpec: 3.2.4 Content Models, contd.

Currently reading in 3.2.4 Content Models.

3.2.4.1.8 Palpable content

Elements with a content model that allows flow or phrasing content should contain at least one "node" of palpable content. Possibly how you've heard this manifest is "you shouldn't have an empty tag". Likely the most common use case of people neglecting to include palpable content is when they add an empty container to a page, to be filled in with JavaScript.

The spec "encourages" conformance checkers to flag an error/warning/notice when they find elements empty of palpable content.

Check out the spec for the full list of palpable content elements, which also includes non-inter-element-whitespace text. Some of these elements have conditions on them; for example, a <ul> counts as palpable content if it has at least one <li> child.
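
Here is a rough sketch of the kind of check a conformance checker might run. Everything here is an assumption for illustration: the ContentNode shape, the function names, and especially the tag list, which is a tiny hand-picked subset of the spec's full list.

```typescript
// Hypothetical, simplified node model: text or element.
type ContentNode =
  | { kind: "text"; data: string }
  | { kind: "element"; tag: string; children: ContentNode[] };

// Assumption: a small subset of the elements that always count as palpable.
const ALWAYS_PALPABLE_TAGS = ["a", "div", "em", "h1", "img", "p", "span", "strong"];

function isPalpable(node: ContentNode): boolean {
  if (node.kind === "text") {
    // Text counts unless it is only inter-element whitespace.
    return !/^[ \t\n\f\r]*$/.test(node.data);
  }
  if (node.tag === "ul" || node.tag === "ol") {
    // Lists are conditional: they need at least one <li> child.
    for (const child of node.children) {
      if (child.kind === "element" && child.tag === "li") return true;
    }
    return false;
  }
  return ALWAYS_PALPABLE_TAGS.indexOf(node.tag) !== -1;
}

// True when at least one direct child is palpable content.
function hasPalpableContent(children: ContentNode[]): boolean {
  for (const child of children) {
    if (isPalpable(child)) return true;
  }
  return false;
}

const emptyList: ContentNode = { kind: "element", tag: "ul", children: [] };
const filledList: ContentNode = {
  kind: "element",
  tag: "ul",
  children: [{ kind: "element", tag: "li", children: [] }],
};

console.log(hasPalpableContent([]));                               // the empty container case
console.log(hasPalpableContent([{ kind: "text", data: "\n  " }])); // whitespace only
console.log(hasPalpableContent([emptyList]));
console.log(hasPalpableContent([filledList]));
```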

3.2.4.1.9 Script-supporting elements

script, template.

3.2.4.2 Transparent content models

Elements with a "transparent" content model inherit the content model of the parent, and have the same descendant requirements as this parent.

If a transparent element doesn't have a parent, its content model is treated as flow content.
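
One way to picture this is as a walk up the tree until a non-transparent ancestor turns up. The sketch below assumes a simplified world with only "flow" and "phrasing" models; ModelEl and effectiveContentModel are hypothetical names.

```typescript
type DeclaredModel = "flow" | "phrasing" | "transparent";

// Hypothetical stand-in for an element and the content model its definition declares.
interface ModelEl {
  tag: string;
  declaredModel: DeclaredModel;
  parentEl: ModelEl | null;
}

function effectiveContentModel(el: ModelEl): "flow" | "phrasing" {
  // Non-transparent elements simply use their own model.
  if (el.declaredModel !== "transparent") return el.declaredModel;
  // A transparent element with no parent is treated as accepting flow content.
  if (el.parentEl === null) return "flow";
  // Otherwise defer to the parent (which may itself be transparent).
  return effectiveContentModel(el.parentEl);
}

// <p><a>text</a></p>
const paraEl: ModelEl = { tag: "p", declaredModel: "phrasing", parentEl: null };
const linkInPara: ModelEl = { tag: "a", declaredModel: "transparent", parentEl: paraEl };

// <div><ins><a>text</a></ins></div>
const divEl: ModelEl = { tag: "div", declaredModel: "flow", parentEl: null };
const insInDiv: ModelEl = { tag: "ins", declaredModel: "transparent", parentEl: divEl };
const linkInIns: ModelEl = { tag: "a", declaredModel: "transparent", parentEl: insInDiv };

// An orphan <del> with no parent at all.
const orphanDel: ModelEl = { tag: "del", declaredModel: "transparent", parentEl: null };

console.log(effectiveContentModel(linkInPara));
console.log(effectiveContentModel(linkInIns));
console.log(effectiveContentModel(orphanDel));
```

So the same a element accepts different children depending on where it sits.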

Day 38 of #100DaysOfSpec: 3.2.4 Content Models, contd.

Currently reading in 3.2.4 Content Models.

3.2.4.1.6 Embedded content

"Embedded content is content that imports another resource into the document, or content from another vocabulary that is inserted into the document."

Includes audio, canvas, embed, iframe, img, math, object, svg, and video. On occasion (spoiler alert!) the spec will define fallback content for these elements.

3.2.4.1.7 Interactive content

These elements have an activation behavior, "normally culminating in a click event". The user can trigger a synthetic "click" by other means, such as voice input or keyboard navigation.

If a click() method triggered the synthetic click, "the isTrusted attribute must be initialized to false". Seems like a security measure.

If you ever wondered how fussy this stuff is, here's a good example: "Click-focusing behavior (e.g. the focusing of a text field when user clicks in one) typically happens before the click, when the mouse button is first depressed, and is therefore not discussed here."

Day 37 of #100DaysOfSpec: 3.2.4 Content Models, contd.

Currently reading in 3.2.4 Content models.

3.2.4.1.3 Sectioning content

Defines "scope of headings and footers". A lot of people only use header and footer elements for the universal, presentational header and footer of a website. However, these elements add meaning to sectioning content.

Elements in this category:

  • article
  • aside
  • nav
  • section

Each of these might have a heading and an outline.

There are other elements that are "sectioning roots"—not sectioning content—but they can have an outline, as well.

3.2.4.1.4 Heading content

Heading elements: h1, h2, … h6. These are headers of sections that can be marked up inside a section or stand alone.

3.2.4.1.5 Phrasing content

Text of the document, and elements that mark up text within the "intra-paragraph level".

Check out the full list in the spec, because there are a lot of surprises there (to me, at least): form elements, canvas, object, select, video…I was expecting elements that seemed strictly text-oriented.

Text can apparently mean anything, re: content models:

  • nothing
  • Text nodes
  • sometimes its own content model
  • also a phrasing model
  • can be inter-element whitespace (like that moment when you're first starting out when you realize carriage returns in your code are ignored and collapsed).

"Text nodes and attribute values must consist of Unicode characters, must not contain U+0000 characters, must not contain permanently undefined Unicode characters (noncharacters), and must not contain control characters other than space characters."

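That quoted rule about text nodes and attribute values translates fairly directly into a validator. This is a sketch under a few assumptions: the function names are made up, "control characters" is taken to mean U+0000 to U+001F and U+007F to U+009F, and a lone surrogate is treated as not being a Unicode character.

```typescript
// Space characters per the HTML spec: space, tab, line feed, form feed, carriage return.
const ALLOWED_SPACE_CODES: number[] = [0x20, 0x09, 0x0a, 0x0c, 0x0d];

// Decode a UTF-16 string into code points; lone surrogates pass through unchanged.
function toCodePoints(text: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const hi = text.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < text.length) {
      const lo = text.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
        i++; // consumed the low surrogate as well
        continue;
      }
    }
    out.push(hi);
  }
  return out;
}

function isAllowedTextCodePoint(cp: number): boolean {
  if (cp === 0x0000) return false;                 // U+0000
  if (cp >= 0xd800 && cp <= 0xdfff) return false;  // lone surrogate (assumption, see above)
  if (cp >= 0xfdd0 && cp <= 0xfdef) return false;  // noncharacter block
  if ((cp & 0xfffe) === 0xfffe) return false;      // code points ending in FFFE or FFFF
  const isControl = cp <= 0x1f || (cp >= 0x7f && cp <= 0x9f);
  if (isControl && ALLOWED_SPACE_CODES.indexOf(cp) === -1) return false;
  return true;
}

function isValidHtmlText(text: string): boolean {
  for (const cp of toCodePoints(text)) {
    if (!isAllowedTextCodePoint(cp)) return false;
  }
  return true;
}

console.log(isValidHtmlText("plain text\twith a tab\n"));
console.log(isValidHtmlText("bell\u0007"));
```
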
Day 36 of #100DaysOfSpec: 3.2.4 Content Models

Currently reading in 3.2.4 Content models.

Just gonna leave this here: "an HTML element must have contents that match the requirements described in the element's content model. The contents of an element are its children in the DOM, except for template elements, where the children are those in the template contents (a separate DocumentFragment assigned to the element when the element is created)."

Space characters are always allowed between elements. This is something that sort of happens naturally when authoring documents, but I feel like I've heard similar questions ("can I/should I put spaces here?") from beginners. This space is called inter-element whitespace.

Inter-element whitespace is ignored when processing the document's semantics and when determining whether or not the contents of an element match its content model.

HTML elements can be "orphan" nodes. Interestingly: "creating a td element and storing it in a global variable in a script is conforming, even though td elements are otherwise only supposed to be used inside tr elements".

3.2.4.1 Kinds of content

Elements can (and nearly always do) belong to more than one category:

  • Metadata
  • Flow (encompasses almost all other categories, except for some of "metadata")
  • Sectioning
  • Heading
  • Phrasing
  • Embedded
  • Interactive

There's a nice little chart, too.
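
To show the overlap without the chart, here's a small lookup-table sketch. The table is a hand-picked sample, not the spec's full data, and it leaves out categories that only apply conditionally (for example, ones that depend on an attribute being present). SAMPLE_CATEGORIES, categoriesOf, and tagsIn are illustrative names.

```typescript
type ContentCategory =
  | "metadata" | "flow" | "sectioning" | "heading"
  | "phrasing" | "embedded" | "interactive";

// Assumption: a small sample of elements with their unconditional categories.
const SAMPLE_CATEGORIES: { [tag: string]: ContentCategory[] } = {
  title: ["metadata"],
  script: ["metadata", "flow", "phrasing"],
  section: ["flow", "sectioning"],
  h1: ["flow", "heading"],
  em: ["flow", "phrasing"],
  img: ["flow", "phrasing", "embedded"],
  iframe: ["flow", "phrasing", "embedded", "interactive"],
  button: ["flow", "phrasing", "interactive"],
};

// Categories for one tag; an empty list if the tag is not in the sample.
function categoriesOf(tag: string): ContentCategory[] {
  return Object.prototype.hasOwnProperty.call(SAMPLE_CATEGORIES, tag)
    ? SAMPLE_CATEGORIES[tag]
    : [];
}

// All sample tags that belong to a given category.
function tagsIn(category: ContentCategory): string[] {
  const matches: string[] = [];
  for (const tag of Object.keys(SAMPLE_CATEGORIES)) {
    if (SAMPLE_CATEGORIES[tag].indexOf(category) !== -1) matches.push(tag);
  }
  return matches;
}

console.log(categoriesOf("iframe"));
console.log(tagsIn("embedded"));
console.log(tagsIn("flow").length + " of " + Object.keys(SAMPLE_CATEGORIES).length + " sample tags are flow content");
```

Even in this small sample, title is the only element that isn't flow content, which matches the "flow encompasses almost everything" note above.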

3.2.4.1.1 Metadata content

Metadata content:

  • Defines presentation/behavior of the rest of the content (like CSS styling) or
  • Sets relationship of doc to other documents (like next and previous links) or
  • "Conveys other 'out of band' information."

Not sure what falls under that last miscellaneous item. I feel like "title" might be a good example, unless that falls under one of the other categories for whatever reason.

Short list of example elements: base, link, meta, noscript, script, style, template, title

3.2.4.1.2 Flow content

Most elements in the <body>, see spec for full list.

Day 35 of #100DaysOfSpec: 3.2 Elements

Currently reading in 3.2 Elements.

3.2.1 Semantics

Elements, attributes, and attribute values have meaning! Allows browsers, search engines, and other HTML processors to know how to present HTML documents. I like how the example lingers on how semantic markup is meaningful for accessibility (speech-based screen readers).

The semantics of a page can change over time (for example, through modifications made to the markup via JavaScript), so a user agent needs to update the document's presentation to reflect these changes.

3.2.2 Elements in the DOM

The spec requires implementation of the HTML elements listed in the spec, and all these elements need to be accessible to scripting languages.

All element interfaces (I think this is being used as a synonym for API?) inherit from the HTMLElement interface. Some elements don't have any other requirements for implementation outside of this root interface. Not quite sure of an example there, as they are not given, but I wonder if the span element has no further requirements, being a pretty semantically-vague element. This being the spec, though, I wouldn't be surprised if there were another 8 different rules about how to process a span.

The HTMLUnknownElement interface is used for non-spec-defined elements.

Attributes in the main interface include:

  • title
  • lang
  • translate (boolean value)
  • dir
  • dataset (read-only)
  • hidden (boolean)
  • tabIndex
  • accessKey
  • accessKeyLabel
  • contentEditable
  • isContentEditable (boolean)
  • spellcheck (boolean)

3.2.3 Element definitions

This section details what-to-expect from element definitions, which include:

  • The element's categories: used for defining the element's content models.
  • Contexts in which this element can be used: can be redundant info to content model.
  • Content model: rules on which elements need to be included as children/descendants.
  • Tag omission in text/html: defines when an element's start or end tag can be left out (like the closing tag of an <li>).
  • Content attributes: attributes that can be set on the element; some are normative (required by spec, listed first), and some are not.
  • DOM interface

Unless restrictions are listed, attributes can contain any text value.