I am reading and taking notes on the HTML specifications for 100 days as part of #The100DayProject. Read the initial intent/backstory. I am a Microsoft employee but all opinions, comments, etc on this site are my own. I do not speak on behalf of my employer, and thus no comments should be taken as representative of Microsoft’s official opinion of the spec. Subsections not listed below were read without comment.
Currently reading in 3.2.5 Global Attributes
220.127.116.11 The lang and xml:lang attributes
lang attribute sets the “primary” language of an element (in a BCP 47 value). I don’t know that I’ve ever noticed it on a tag other than
html. Child elements inherit
lang is used on any HTML element,
xml:lang is used on an HTML element in an XML document, and can also be used within other namespaces if specifically allowed (like in MathGL or SVG).
To return the language of a node, the UA needs to find the closest ancestor with a
lang attribute. If this goes bottom up instead of top down, that would be pretty inefficient if the html element is the only one with a language set.
xml:lang takes precedence over
lang in this parsing, if both are set on the same element.
If the attribute is nowhere to be found, the UA looks at the “pragma-set default language”, which appears to be the language set by a meta tag in the head of the HTML file. If that’s not found, it looks to HTTP to find any language info.
The language never defaults to a particular language (that would be biased and very easily incorrect); if there is no information that matches known values, the language of the document will be treated as “unknown”.
“User agents may use the element’s language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, for dictionary selection, or for the user interfaces of form controls such as date pickers).”