Day 17 of 100DaysOfSpec, Common Microsyntaxes forevah

I am reading and taking notes on the HTML specifications for 100 days as part of #The100DayProject. Read the initial intent/backstory. I am a Microsoft employee but all opinions, comments, etc on this site are my own. I do not speak on behalf of my employer, and thus no comments should be taken as representative of Microsoft’s official opinion of the spec. Subsections not listed below were read without comment.

Currently reading 2.4 Common microsyntaxes, from 2.4.6 Colors down through 2.4.10 Media queries.

2.4.6 Colors

A “simple color” has a value from 0 to 255 for each of the red, green, and blue channels of digital color. A “valid simple color” is the hex code for the color, preceded by the “#” symbol. Notably, it’s exactly seven characters long: #000000, not the #000 shorthand. Serializing a simple color always returns lowercase hex digits (#ffffff, not #FFFFFF).
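To make the seven-character rule concrete, here is a minimal sketch of parsing and serializing a simple color. The function names are my own, not the spec's, and returning None stands in for the spec's "return an error".

```python
import string


def parse_simple_color(value):
    """Return (red, green, blue) as ints in 0-255, or None if not a valid simple color."""
    # Must be exactly seven characters: "#" plus six hex digits.
    if len(value) != 7 or value[0] != "#":
        return None
    digits = value[1:]
    if any(c not in string.hexdigits for c in digits):
        return None
    # Each pair of hex digits is one channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def serialize_simple_color(rgb):
    """Serialize to "#rrggbb" using lowercase hex digits."""
    return "#" + "".join(f"{channel:02x}" for channel in rgb)
```

So serialize_simple_color(parse_simple_color("#FFFFFF")) comes back as "#ffffff", and the "#000" shorthand is simply rejected.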

The spec includes an algorithm for parsing legacy color values, like “dodgerblue” (aww, the ol’ days of tables and color names). A cool reason not to use these legacy names: it takes nine steps to parse a hex code, and 20 steps—plus five substeps—to parse a legacy color name.
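To get a feel for why the legacy path is so much longer, here is a rough sketch of the legacy color parsing rules as I understand them. Treat it as an approximation: the step order may not match the spec version being read here, the function name is mine, and NAMED_COLORS is a tiny made-up subset standing in for the full named color table.

```python
import string

HEX_DIGITS = set(string.hexdigits)
ASCII_WHITESPACE = " \t\n\f\r"
# Hypothetical subset; the real named color table is much longer.
NAMED_COLORS = {"dodgerblue": (30, 144, 255), "red": (255, 0, 0)}


def ascii_lower(value):
    # Lowercase only A-Z, leaving non-ASCII characters alone.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in value)


def parse_legacy_color(value):
    """Return (red, green, blue), or None where the rules report an error."""
    if value == "":
        return None
    value = value.strip(ASCII_WHITESPACE)
    keyword = ascii_lower(value)
    if keyword == "transparent":
        return None
    if keyword in NAMED_COLORS:
        return NAMED_COLORS[keyword]
    # "#abc" shorthand: each digit is multiplied by 17 (so "f" becomes 255).
    if len(value) == 4 and value[0] == "#" and all(c in HEX_DIGITS for c in value[1:]):
        return tuple(int(c, 16) * 17 for c in value[1:])
    # From here on, the rules salvage a color out of arbitrary text.
    value = "".join("00" if ord(c) > 0xFFFF else c for c in value)
    value = value[:128]
    if value.startswith("#"):
        value = value[1:]
    value = "".join(c if c in HEX_DIGITS else "0" for c in value)
    while len(value) == 0 or len(value) % 3 != 0:
        value += "0"
    length = len(value) // 3
    parts = [value[i * length:(i + 1) * length] for i in range(3)]
    if length > 8:
        # Keep only the last eight characters of each component.
        parts = [p[-8:] for p in parts]
        length = 8
    # Drop leading zeros shared by all three components.
    while length > 2 and all(p[0] == "0" for p in parts):
        parts = [p[1:] for p in parts]
        length -= 1
    # Keep the first two characters of each component.
    parts = [p[:2] for p in parts]
    return tuple(int(p, 16) for p in parts)
```

Under this sketch, even a string like "chucknorris" parses to a color, which is a decent illustration of how forgiving (and how long) the legacy rules are.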

2.4.7 Space-separated tokens

“A set of space-separated tokens is a string containing zero or more words (known as tokens) separated by one or more space characters, where words consist of any string of one or more characters, none of which are space characters.”

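In both unordered and ordered sets of unique space-separated tokens, none of the tokens are duplicated. In an ordered set, the order of the tokens is also meaningful.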

Some sets of tokens require certain allowed values, and all handling of case sensitivity is done on a case-by-case (heh) basis.

I’m unsure what the application/use for tokens is. In my experience, “token” has a strong correlation with security and authentication. But it seems “token” here is a very general term for “misc piece of data we want to store and use”. I’m going to have to put this on the to-ask-my-colleagues list.
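For what it's worth, one everyday place these show up is the class attribute, whose value is a set of space-separated tokens. A minimal sketch of splitting such a value and checking it for duplicates, with function names of my own and assuming "space characters" means space, tab, line feed, form feed, and carriage return:

```python
import re

# The spec's space characters: space, tab, LF, FF, CR.
SPACE_CHARACTERS = re.compile(r"[ \t\n\f\r]+")


def split_on_spaces(value):
    """Split a string into tokens, ignoring leading, trailing, and repeated spaces."""
    return [token for token in SPACE_CHARACTERS.split(value) if token]


def has_unique_tokens(value):
    """True if no token appears twice (compared case-sensitively in this sketch)."""
    tokens = split_on_spaces(value)
    return len(tokens) == len(set(tokens))
```

For example, split_on_spaces("  btn  btn-primary\tactive ") gives ["btn", "btn-primary", "active"]. The comparison here is case-sensitive, so "a A" counts as two different tokens; a given attribute might define that differently.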

2.4.8 Comma-separated tokens

The same as above, but with commas as the separator. Space characters around each token are optional, though, and get stripped off when the string is split. Why would the spec have two different syntaxes for sets of tokens? My guess is that both were already in production, so they came up with guidelines that fit both syntaxes.
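The accept attribute on file inputs is one example of a comma-separated list. Here is a small sketch of splitting one, based on my reading of the spec's comma-splitting rules. The function name is mine, and the edge-case behavior (empty tokens kept, a trailing comma adding nothing) is my understanding rather than something I have verified against every spec version.

```python
SPACE_CHARACTERS = " \t\n\f\r"


def split_on_commas(value):
    """Split on commas and strip space characters from both ends of each token."""
    tokens = []
    position = 0
    while position < len(value):
        end = value.find(",", position)
        if end == -1:
            end = len(value)
        tokens.append(value[position:end].strip(SPACE_CHARACTERS))
        # Skip past the comma (or past the end of the string).
        position = end + 1
    return tokens
```

So split_on_commas("image/png, .jpg,image/gif") gives ["image/png", ".jpg", "image/gif"], with or without the optional spaces.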

2.4.9 References

Just gonna leave this here: “A valid hash-name reference to an element of type type is a string consisting of a “#” (U+0023) character followed by a string which exactly matches the value of the name attribute of an element with type type in the document.”
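Unpacked a bit: this is the kind of value that goes in usemap="#gallery" to point at a map element with name="gallery". A tiny sketch of the validity check follows. The document is modeled as a plain list of dicts, which is purely an assumption for illustration, as are the function name and the "gallery" example.

```python
def is_valid_hash_name_reference(reference, element_type, elements):
    """True if reference is "#" followed by the exact name of an element of that type."""
    if not reference.startswith("#"):
        return False
    target = reference[1:]
    return any(
        element["type"] == element_type and element.get("name") == target
        for element in elements
    )


# Hypothetical stand-in for a document: one map element and one img element.
document = [
    {"type": "map", "name": "gallery"},
    {"type": "img", "name": "gallery-photo"},
]
```

With that document, "#gallery" is a valid hash-name reference to a map element, while "gallery" (no "#") and "#Gallery" (not an exact match) are not.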

2.4.10 Media queries

A string successfully “matches the environment” if it is empty, consists only of space characters, or is a media query list that matches the user’s environment. So…you could use an empty value as a hack for styling things only in a browser that supports media queries.