Day 8 of 100DaysOfSpec, 2.4 Common Microsyntaxes

I am reading and taking notes on the HTML specifications for 100 days as part of #The100DayProject. Read the initial intent/backstory. I am a Microsoft employee but all opinions, comments, etc on this site are my own. I do not speak on behalf of my employer, and thus no comments should be taken as representative of Microsoft’s official opinion of the spec. Subsections not listed below were read without comment.

Today’s reading covers how user agents (UAs) should parse and format different types of data.

2.4.1 Common parser idioms

So many space characters: U+0020 SPACE, “tab” (U+0009), “LF” (U+000A), “FF” (U+000C), and “CR” (U+000D). Having trouble figuring out what these different spaces actually are…
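To make the list above concrete: in everyday terms these are the plain space, horizontal tab, line feed (newline), form feed (an old page-break control), and carriage return. Here is a minimal Python sketch of trimming and splitting on exactly those five characters. The function names are my own, not the spec's, and this is only an illustration of the idea, not the spec's parsing algorithm.

```python
# The five characters listed in the notes: space, tab, LF, FF, CR.
SPACE_CHARACTERS = "\u0020\u0009\u000A\u000C\u000D"


def strip_space_characters(value: str) -> str:
    """Remove leading and trailing space characters (hypothetical helper)."""
    # Passing the set explicitly matters: a bare str.strip() would also
    # remove other Unicode whitespace such as U+00A0 NO-BREAK SPACE.
    return value.strip(SPACE_CHARACTERS)


def split_on_space_characters(value: str) -> list:
    """Split a string into tokens separated by runs of space characters."""
    tokens = []
    current = ""
    for ch in value:
        if ch in SPACE_CHARACTERS:
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens
```

For example, splitting "a\tb\n\fc  d" with this helper gives four tokens, while a no-break space is left alone because it is not in the list.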

2.4.2 Boolean attributes

You can’t have “true” and “false” on a boolean attribute. It is simply whether the attribute exists or not: present means true, absent means false. If the attribute does exist, its value must be either the empty string or an ASCII case-insensitive match for the attribute’s own name, with no leading or trailing whitespace.
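A rough Python sketch of those two rules, assuming attributes are held in a plain dict of name to value (my own simplification, not how a UA stores them). As the spec words it, the allowed values are the empty string or an ASCII case-insensitive match for the attribute's name. The helper names are hypothetical.

```python
def ascii_lower(text: str) -> str:
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


def represents_true(attributes: dict, name: str) -> bool:
    """A boolean attribute is true simply because it is present."""
    return name in attributes


def is_valid_boolean_value(name: str, value: str) -> bool:
    """Check the authoring rule: empty string, or the attribute's own name."""
    if value == "":
        return True
    # No trimming here: leading or trailing whitespace makes the value invalid.
    return ascii_lower(value) == ascii_lower(name)
```

Note that `represents_true({"disabled": "false"}, "disabled")` is still true in this sketch: the value "false" is invalid, but the attribute is present, which is all that counts.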

2.4.3 Keywords and enumerated attributes

Enumerated attributes take as their value a keyword from a specified list. Each of those keywords maps to a “state”. Several keywords can map to the same state; some keywords are non-conforming and kept only so old sites don’t break; and there can be two default states: an “invalid value” default, used when the value matches no keyword, and a “missing value” default, used when the attribute is absent.

Again, in order to match an expected value, the keyword is matched ASCII case-insensitively and can’t have extraneous whitespace.

“The empty string can be a valid keyword.”
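Here is how I picture the keyword-to-state lookup, as a small Python sketch. The keywords and states in `EXAMPLE_KEYWORDS` are made up for illustration and do not belong to any real HTML attribute; `resolve_state` and its parameters are likewise my own names. I am assuming `None` stands for "attribute absent" and that the keyword match is ASCII case-insensitive, as the spec words it.

```python
def ascii_lower(text: str) -> str:
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


# Hypothetical keyword table: two keywords share one state, and the
# empty string is itself a valid keyword. Keys must be lowercase.
EXAMPLE_KEYWORDS = {
    "": "automatic",
    "auto": "automatic",
    "on": "enabled",
    "off": "disabled",
}


def resolve_state(value, keywords, invalid_default=None, missing_default=None):
    """Map an attribute value to a state, falling back to the two defaults."""
    if value is None:
        # Attribute not present at all: use the "missing value" default.
        return missing_default
    state = keywords.get(ascii_lower(value))
    if state is not None:
        return state
    # Present but unrecognised (including stray whitespace): "invalid" default.
    return invalid_default
```

With `invalid_default="automatic"` and `missing_default="disabled"`, a value of "ON" resolves to the enabled state, " on" (with a leading space) falls through to the invalid default, and a missing attribute gets the missing value default.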

2.4.4 Numbers

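The spec is very specific about what a negative number is: a valid integer is one or more ASCII digits (0 to 9), and a U+002D HYPHEN-MINUS (-) prefix means the number is subtracted from zero.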
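A quick Python sketch of the valid-integer syntax as I read it: one or more ASCII digits, optionally prefixed with a single U+002D HYPHEN-MINUS. The function name is hypothetical, and this only checks the authoring-side syntax; it is not the more forgiving set of parsing steps a UA runs on real-world input.

```python
import re

# [0-9] rather than \d, because \d also matches non-ASCII digits in Python.
VALID_INTEGER = re.compile(r"-?[0-9]+")


def parse_valid_integer(value: str):
    """Return the integer if the string is a valid integer, else None."""
    if VALID_INTEGER.fullmatch(value) is None:
        return None
    # A "-" prefix means "subtract from zero", so "-0" is simply 0.
    return int(value)
```

Under this sketch "-42" parses as a negative number, while "+7", " 7", and a lone "-" are all rejected as not matching the syntax.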