Day 77 of #100DaysOfSpec: 4.6 Edits (ins, del elements)

I am reading and taking notes on the HTML specifications for 100 days as part of #The100DayProject. Read the initial intent/backstory. I am a Microsoft employee but all opinions, comments, etc on this site are my own. I do not speak on behalf of my employer, and thus no comments should be taken as representative of Microsoft's official opinion of the spec. Subsections not listed below were read without comment.

Currently reading in 4.6 Edits

4.6.1 The ins element

“The ins element represents an addition to the document.”

  • cite attribute links to the edit source and/or more information.
  • datetime attribute gives the date and optional time for when the change was made.
  • Code examples given in the spec show an ins element wrapped around p. It would be a good idea to mark up an addition like this when there is no other content in the p tag besides the addition. You wouldn’t (or shouldn’t) add an empty p tag to a document, so it also wouldn’t make semantic sense to have an addition to an otherwise empty paragraph.
  • “ins elements should not cross implied paragraph boundaries.” (See spec for an example.)
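
For illustration, here is roughly how the two cases could be marked up. This is a sketch rather than the spec's own example, and the cite URL and the dates are made up:

```html
<!-- The whole paragraph is new, so ins wraps the p element. -->
<ins cite="http://example.com/changelog#entry-12" datetime="2016-03-01">
  <p>This paragraph was added in a later revision.</p>
</ins>

<!-- Only part of the paragraph is new, so ins sits inside the p. -->
<p>The meeting is on Tuesday<ins datetime="2016-03-01"> at 10am</ins>.</p>
```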

4.6.2 The del element

“The del element represents a removal from the document.”

  • Same cite and datetime attributes as the ins element.
  • Same rule about not crossing implied paragraphs.

4.6.3 Attributes common to ins and del elements

cite:

  • When the cite attribute points to a really long document, you’re encouraged to use a “fragment identifier” (http://example.com/documentation#relevant-section for example) for the relevant portion of the cited document.
  • The cite value can be surfaced in a user agent’s UI so that users can follow the link to more info/context, but machines are the main beneficiaries of the attribute.

datetime:

  • If set, has to be a valid date string.
  • If the algorithm that parses this attribute doesn’t return a valid date, the edit has no date associated with it.
  • This is also a “private use” attribute, but it can be shown to the user if the user agent (browser, etc.) wishes.
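
A minimal sketch of both attributes on a paired deletion and insertion. The prices and the timestamp are invented, and the cite URL reuses the placeholder from above:

```html
<p>
  Tickets cost
  <del cite="http://example.com/documentation#relevant-section"
       datetime="2016-03-01T09:30Z">$10</del>
  <ins cite="http://example.com/documentation#relevant-section"
       datetime="2016-03-01T09:30Z">$12</ins>.
</p>
```

The time part of datetime is optional, so datetime="2016-03-01" would work here as well.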

Day 76 of #100DaysOfSpec: bdi, bdo, span, br, wbr elements

Currently reading in 4.5 Text-level semantics.

4.5.26 The bdi element

“The bdi element represents a span of text that is to be isolated from its surroundings for the purposes of bidirectional text formatting.”

  • Essentially to prevent confusing the algorithm that handles direction of text when you have right-to-left text inside left-to-right text (or vice-versa).
  • The dir attribute doesn’t inherit from its parent.
  • Default value for dir is auto.
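
Something along the lines of the spec's own example: user-supplied names of unknown direction dropped into a left-to-right sentence. The user names and post counts are placeholders:

```html
<ul>
  <!-- Without bdi, the Arabic name could pull the ": 3 posts." that follows it into the wrong order. -->
  <li>User <bdi>jcranmer</bdi>: 12 posts.</li>
  <li>User <bdi>إيان</bdi>: 3 posts.</li>
</ul>
```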

4.5.27 The bdo element

“The bdo element represents explicit text directionality formatting control for its children.”

  • Essentially allows you to override the algorithm mentioned in the bdi notes.
  • Obviously, you need to supply the element with directionality information. The dir attribute can be set to ltr or rtl only. You can’t use auto because…that wouldn’t be an override.
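
A tiny sketch of an explicit override (the sentence is just filler text):

```html
<!-- dir is required on bdo, and only ltr or rtl are allowed. -->
<p>Forced direction: <bdo dir="rtl">This text is laid out right to left.</bdo></p>
```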

4.5.28 The span element

“The span element doesn't mean anything on its own…”

Basically just a hook for styling or global attributes. Probably of greatest interest to web devs is how this element relates to others, e.g. what you can put inside it. That would be “phrasing content” elements.

4.5.29 The br element

“The br element represents a line break.”

  • Void element: it has no end tag (<br>; the XHTML-style <br /> is also accepted).
  • Can’t have children: elements or text.
  • Could style fancily if you wanted to.
  • Should only be used for semantically-necessary breaks in text (poetry, new lines in a physical address), as opposed to what a lot of people use them for: a hack for vertical spacing or for “thematic” breaks in text where two p tags would be more appropriate.
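
To make the distinction concrete, here is a sketch with a made-up address where the breaks are part of the content, next to the case where separate paragraphs are the better fit:

```html
<!-- Line breaks that belong to the content itself. -->
<p>
  Example Corp.<br>
  123 Any Street<br>
  Springfield
</p>

<!-- Separate thoughts get separate paragraphs, not one p with doubled-up br tags. -->
<p>First thought.</p>
<p>Second thought.</p>
```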

4.5.30 The wbr element

“The wbr element represents a line break opportunity.”

Never heard of this one before! You use it to tell the user agent that it’s okay to wrap a really long chunk of text that would otherwise be parsed as a single word and possibly overflow its container. You could use CSS to achieve something similar, but this would give you greater control and make the distinction on a semantic level, vs. just visual.

Void element, so no end tag.
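
A sketch of the typical use case, a long URL that would otherwise be treated as one unbreakable word (the URL is a placeholder):

```html
<!-- Each wbr marks a spot where the browser may wrap, but does not have to. -->
<p>
  Download it from
  http://example.com/<wbr>a/<wbr>really/<wbr>long/<wbr>path/<wbr>to/<wbr>the/<wbr>file.zip
</p>
```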

4.5.31 Usage summary

I think it is nice that this section exists, as the “Text-level semantics” chapter is a long one.

Day 75 of #100DaysOfSpec: mark, ruby, rb, rt, rtc, and rp elements

Currently reading in 4.5 Text-level semantics.

Fun fact: I have never used any of the elements in today’s reading.

4.5.20 The mark element

“The mark element represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.”

  • Semantics inside a quotation: the HTML author is highlighting some text to give it more attention than the original source (“emphasis my own” type of deal).
  • Semantics in main body content: highlighted text is relevant to the “user’s current activity”. I figured this would probably be text that changes given some interaction, and similarly the example in the spec highlights text that matches an in-page search term.
  • The strong element is for importance; the mark element is for relevance. A subtle difference.
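
A rough sketch of both meanings, with invented text and an assumed search term of “otter”:

```html
<!-- Inside a quotation: the highlight belongs to the person quoting, not to the original author. -->
<blockquote>
  <p>The report concluded that <mark>no further testing was done</mark> before launch.</p>
</blockquote>

<!-- In body content: highlighting the matches for the reader's search term. -->
<p>The majestic <mark>otter</mark> is a playful animal.</p>
```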

4.5.21 The ruby element

“The ruby element allows one or more spans of phrasing content to be marked with ruby annotations.”

So…I was a bit confused at first because I thought this was referring to Ruby the programming language. Huh? Why does this particular programming language get its own element? But, oh:

“Ruby annotations are short runs of text presented alongside base text, primarily used in East Asian typography as a guide for pronunciation or to include other annotations. In Japanese, this form of typography is also known as furigana.”

Got it.

Which is fun because the programming language was created by Japanese developer Yukihiro "Matz" Matsumoto. I suppose one could infer this is where the name comes from, but it doesn’t necessarily seem that way from a quick Wikipedia peek (https://en.wikipedia.org/wiki/Ruby_(programming_language)).

Anyway. This whole section is completely new to me and I’m not sure how to share notes without just rewriting the spec, so I’m going to take the cop-out method here and link back to this section of the spec.

4.5.22 The rb element

One of the things that melted my brain in that last section on ruby was the omission of closing tags. It is ok to omit the end tag of rb if the element is “immediately followed by an rb, rt, rtc or rp element, or if there is no more content in the parent element.” Same general concept applies to rt, rtc or rp elements.

As for the definition: “the rb element marks the base text component of a ruby annotation.”

This element doesn’t have any semantic meaning on its own: it helps a parent ruby element decide what said ruby element represents. If that parent doesn’t exist, the rb’s representation is the same as its contents. Same general concept applies to rt, rtc or rp elements.

4.5.23 The rt element

“The rt element marks the ruby text component of a ruby annotation.” The bits that give notation about the main rb text.

4.5.24 The rtc element

“The rtc element marks a ruby text container for ruby text components in a ruby annotation.”

The rtc element can be used for processing categorization of a ruby element’s content.

4.5.25 The rp element

“The rp element is used to provide fallback text to be shown by user agents that don't support ruby annotations.”

  • Has to come immediately before or after an rt or rtc, but can’t be jammed between two rt elements.
  • People often use the content of this element to put parentheses around the ruby text (rt elements), that bit which gives notation to the main text of the ruby element.
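
To tie the ruby family together, here is a sketch that annotates the word 漢字 (“kanji”) with its reading. It assumes the rb element as described in the version of the spec I am reading; browser support for rb and rtc may vary:

```html
<!-- Every end tag written out. Where ruby is unsupported, the rp fallback reads: 漢字(kanji) -->
<ruby>
  <rb>漢</rb><rb>字</rb>
  <rp>(</rp><rt>kan</rt><rt>ji</rt><rp>)</rp>
</ruby>

<!-- The same markup with the optional end tags of rb, rt, and rp left out. -->
<ruby><rb>漢<rb>字<rp>(<rt>kan<rt>ji<rp>)</ruby>
```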

Day 74 of #100DaysOfSpec: sup, sub, i, b, and u elements

Currently reading in 4.5 Text-level semantics.

This whole section of elements has such unsatisfyingly short names, ha.

4.5.16 The sub and sup elements

“The sup element represents a superscript and the sub element represents a subscript.”

To be used semantically, not stylistically. Examples are abbreviations in some languages, or mathematical expressions (exponents are what jump to mind). Not mentioned by the spec, but labels corresponding to footnotes are a common and seemingly-appropriate use of sup.
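
A quick sketch of those uses. The footnote pattern is the common convention mentioned above, not something taken from the spec, and the id is made up:

```html
<p>Water is H<sub>2</sub>O, and E = mc<sup>2</sup>.</p>

<!-- Footnote label as a superscript link. -->
<p>This claim needs a source.<sup><a href="#fn1">1</a></sup></p>
<p id="fn1">1. The source for the claim above.</p>
```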

4.5.17 The i element

“The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text…”

Examples they give of “different” text:

  • Taxonomic designation (“the majestic otter, Lutra lutra”)
  • Technical term
  • Idiom from another language
  • Transliteration
  • Internal thought (as in a novel)
  • Ship name in “Western” text

  • If the i element contains text in another language, you should mark that language using the lang attribute.
  • There are some overlaps with element eligibility where you might want to go with another element: em for stressing emphasis, dfn for defining a term.
  • It’s curious that the spec just assumes you know that the i element is associated with italics. It’s not until the end of this section that it mentions the i element can be italicized but doesn’t necessarily need to be. And it doesn’t mention in non-normative text that browsers will typically apply italics.
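
A few of those cases sketched out, with invented sentences:

```html
<!-- Taxonomic designation, with the language marked. -->
<p>The majestic otter, <i lang="la">Lutra lutra</i>, is a strong swimmer.</p>

<!-- Idiom from another language. -->
<p>There is a certain <i lang="fr">je ne sais quoi</i> about this element.</p>

<!-- Internal thought. -->
<p><i>Did I leave the stove on?</i> she wondered.</p>
```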

4.5.18 The b element

“The b element represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood…”

  • As the b element is so semantically vague and basically just a shortcut to styling, the spec encourages authors to find a more appropriately-semantic element where possible.
  • Same note as i, that b is not necessarily bold.

4.5.19 The u element

“The u element represents a span of text with an unarticulated, though explicitly rendered, non-textual annotation, such as labeling the text as being a proper name in Chinese text (a Chinese proper name mark), or labeling the text as being misspelt.”

That was not really the description I was expecting. The spec suggests different elements you can use in lieu of the u element in different cases because “authors are encouraged to avoid using the u element where it could be confused for a hyperlink”.

Day 73 of #100DaysOfSpec: code, var, samp, and kbd elements

Currently reading in 4.5 Text-level semantics.

4.5.12 The code element

“The code element represents a fragment of computer code.”

  • There’s no semantic way to designate the coding language. Classes are okay to use for syntax highlighting.
  • You don’t HAVE to use code inside pre, just if you want to display this as a block of pre-formatted text. Throughout these posts I’ve been using the code element inline.
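
Both usages side by side, as a sketch. The language- class prefix is only a convention that syntax highlighters tend to look for, and init is a made-up function name:

```html
<!-- Inline code. -->
<p>Call <code>init()</code> before anything else.</p>

<!-- A block of pre-formatted code, with a class as a hook for syntax highlighting. -->
<pre><code class="language-javascript">function init() {
  return true;
}</code></pre>
```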

4.5.13 The var element

“The var element represents a variable. This could be an actual variable in a mathematical expression or programming context, an identifier representing a constant, a symbol identifying a physical quantity, a function parameter, or just be a term used as a placeholder in prose.”

There are many use cases where MathML would be a better choice for your markup (math equations more complex than your second-grade stuff). But MathML support is pretty much worthless, so.

4.5.14 The samp element

“The samp element represents (sample) output from a program or computing system.”

Wait, so…could this be used to show output from web languages? Probably not, as the element is only allowed to contain phrasing content. BUT THAT WOULD HAVE BEEN COOL.

4.5.15 The kbd element

“The kbd element represents user input (typically keyboard input, although it may also be used to represent other input, such as voice commands).”

Where you place the kbd element changes its semantic meaning, which I don’t think I’ve come across yet. I don’t think I can adequately explain that without re-stating the spec, so maybe just read the original.
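
Here is my attempt at a sketch of how nesting shifts the meaning, based on my reading of the spec (the commands and menu names are made up):

```html
<!-- Plain kbd: user input. -->
<p>Type <kbd>help</kbd> to see a list of commands.</p>

<!-- kbd nested inside kbd: an actual key, as one unit of a larger input. -->
<p>Press <kbd><kbd>Ctrl</kbd>+<kbd>S</kbd></kbd> to save.</p>

<!-- samp inside kbd: input based on system output, such as picking a menu item. -->
<p>Choose <kbd><samp>File</samp></kbd>, then <kbd><samp>Save</samp></kbd>.</p>

<!-- kbd inside samp: the input as echoed back by the system. -->
<p><samp>You typed: <kbd>help</kbd></samp></p>
```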

Day 72 of #100DaysOfSpec: abbr, data, and time elements

Currently reading in 4.5 Text-level semantics.

4.5.9 The abbr element

Probably I should have tried to get through this section yesterday, as it can be tied in semantic meaning to the dfn element.

“The abbr element represents an abbreviation or acronym, optionally with its expansion.”

  • The title attribute on abbr includes only its expansion: <abbr title="World Wide Web Consortium">W3C</abbr>.
  • It’s okay to use the element w/o a title attribute to hook into some CSS styles. Only if it makes semantic sense (is actually an abbreviation/acronym), of course.
  • A title attribute on one abbr element does not cascade to other abbr elements in the document containing the same text value.

4.5.10 The data element

“The data element represents its contents, along with a machine-readable form of those contents in the value attribute.”

  • The value attribute is required.
  • Use cases:
    • Provide both a human-readable and a machine-readable format for information in one element. To be honest the only thing I could think of is a date with the ISO format on a value attribute, but in that case you’d be better off using the time element. Can anyone else think of an example?
    • As another way to provide info to scripts, similar to how developers use data-* attributes (those attributes feel more natural to me than using a data element, but ya never know, this could be more semantic in some contexts).
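
One possible answer, as a sketch: numbers written out as words, or product names paired with an ID that a script can read. The values here are invented:

```html
<ul>
  <!-- Human-readable text, machine-readable number. -->
  <li><data value="8">Eight</data> items in stock</li>
  <!-- Human-readable name, machine-readable product ID. -->
  <li>Product: <data value="sku-1042">Otter plush toy</data></li>
</ul>
```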

4.5.11 The time element

“The time element represents its contents, along with a machine-readable form of those contents in the datetime attribute.”

  • If you don’t set a datetime attribute, the time element can’t have any element descendants (loosey-goosey text ok).
  • The datetime value has to match one of the syntaxes listed in the spec.
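
A short sketch of both forms, with a made-up date and time:

```html
<!-- With datetime: the content can be anything readable. -->
<p>Posted on <time datetime="2016-03-01">March 1st</time>.</p>

<!-- Without datetime: the text content itself has to be in a valid syntax. -->
<p>Doors open at <time>19:30</time>.</p>
```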

Day 71 of #100DaysOfSpec: cite, q, and dfn elements

Currently reading in 4.5 Text-level semantics.

UAs = user agents = browsers and other HTML document parsers/renderers

4.5.6 The cite element

“The cite element represents a reference to a creative work. It must include the title of the work or the name of the author (person, people or organization) or an URL reference…”

  • No need to take such a literal tack with “creative”: pretty much any body of text created by a human being will do.
  • What might be unclear from the definition of cite is that it doesn’t contain the text content of the reference itself, just the title/name of where you got it from.

4.5.7 The q element

“The q element represents some phrasing content quoted from another source.”

  • The UA inserts the quotation marks, so you shouldn’t add them yourself.
  • Attributes include your standard global attributes, as well as cite, which is a link out (valid URL) to the original source or more information.
  • The q element and plain text contained in quotation marks are equally semantic/valid ways of marking up a quotation from another source.

4.5.8 The dfn element

“The dfn element represents the defining instance of a term.”

  • The definition that matches the term in the dfn needs to be included in the nearest paragraph / description list group / section ancestor.
  • What term exactly is being defined:
    1. title attribute, if set on the dfn element, otherwise…
    2. …if the dfn has exactly one child element and no loose text, and that child is an abbr with a title attribute, the term is that abbr’s title, otherwise…
    3. …it’s the text content of the dfn element.
  • The title attribute on the dfn contains ONLY the term being defined.
  • dfn does not inherit title for use in this way.
  • “An a element that links to a dfn element represents an instance of the term defined by the dfn element.”
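
A sketch of the abbr case plus a link back to the definition. The id is made up, and the definition reuses the UA shorthand from the top of this post:

```html
<!-- The term being defined is "user agent", taken from the abbr's title. -->
<p>
  A <dfn id="def-ua"><abbr title="user agent">UA</abbr></dfn> is a browser or
  other program that parses and renders HTML documents.
</p>

<!-- This link represents another instance of the term defined above. -->
<p>Every <a href="#def-ua">UA</a> inserts the quotation marks around a q element.</p>
```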

Day 70 of #100DaysOfSpec: strong, small, and s elements

Currently reading in 4.5 Text-level semantics.

4.5.3 The strong element

“The strong element represents strong importance, seriousness, or urgency for its contents.”

  • Nesting strong elements increases “importance”.
  • Using strong does not change a sentence’s meaning.

4.5.4 The small element

“The small element represents side comments such as small print.”

  • Use cases: “disclaimers, caveats, legal restrictions, or copyrights”, also licensing requirements or attribution
  • Not to be used just to “de-emphasize” something: a very common mistake, I’d think.
  • Only for short amounts of text.
  • It could make semantic sense to wrap a small element in a strong element, which would be important fine print (“read the fine print!”).

4.5.5 The s element

“The s element represents contents that are no longer accurate or no longer relevant.”

If you wanted to mark up a document edit that is a deletion, you’d use the del element instead.

Day 69 of #100DaysOfSpec: a, em elements

Currently reading in 4.5 Text-level semantics, a new section!

UAs = user agents = browsers and other HTML document parsers/renderers

4.5.1 The a element

  • Whoa: the a element is categorized as 4 different types of content: flow, phrasing, interactive, and palpable.
  • The content model is “Transparent, but there must be no interactive content descendant.” It makes sense that an anchor link can’t contain actionable elements, as its default behavior is to trigger a click event. “Transparent” is a little confusing: it means that the elements required/allowed inside an a element are the same ones as allowed in its parent element—specifically, from the category of content that allowed the a tag to be in the parent in the first place.
  • Besides the default ARIA role (link), you can set button, checkbox, menuitem, menuitemcheckbox, menuitemradio, tab or treeitem
  • If an a element doesn’t have an href attribute, it’s no longer considered a hyperlink, but a placeholder for a hyperlink. In this case, the target, download, rel, hreflang, and type attributes need to be omitted.
  • Dang, an a element could contain a section element and, obeying the “transparent” content model rules, still conform to the spec.
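
As a sketch of what “transparent” allows and what the interactive-content rule forbids (the URL and text are placeholders):

```html
<!-- The parent div allows flow content, so the a element may contain flow content too. -->
<div>
  <a href="/stories/otters">
    <section>
      <h2>Otters</h2>
      <p>Read the full story.</p>
    </section>
  </a>
</div>

<!-- Not conforming: interactive content (a button) inside an a element. -->
<!-- <a href="/stories/otters"><button>Read</button></a> -->
```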

Available attributes:

  • href
  • target: spec mentions this is the browsing context for “hyperlink navigation and form submission”. Funny, because the general wisdom currently is that “buttons should be buttons”, i.e. don’t use an anchor link for a form submission.
  • download: “Whether to download the resource instead of navigating to it, and its file name if so”
  • rel: relationship between the document and linked resource
  • hreflang: language of the linked resource
  • type: Hint for the type of the referenced resource. Intentionally en-vagueing with “hint for”. I believe I remember in a different section there being complicated instructions for parsing link type and this attribute being non-binding in some way…
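
A sketch that puts most of these attributes to work. All URLs and file names are invented:

```html
<!-- download with a suggested file name, plus a type hint. -->
<a href="/files/report.pdf" download="annual-report.pdf" type="application/pdf">Download the report</a>

<!-- rel, hreflang, and target on a link to a translation. -->
<a href="http://example.com/fr/" rel="alternate" hreflang="fr" target="_blank">Version française</a>

<!-- No href: a placeholder for a link, e.g. the current page in a nav list. -->
<a>Current page</a>
```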

4.5.2 The em element

“The em element represents stress emphasis of its contents.”

  • Nesting em elements increases the “level of stress”.
  • Stress is not stylistic; it changes the semantic meaning of a sentence.

Some splitting-hairs stuff:

“The em element isn't a generic "italics" element. Sometimes, text is intended to stand out from the rest of the paragraph, as if it was in a different mood or voice. For this, the i element is more appropriate.”

“The em element also isn't intended to convey importance; for that purpose, the strong element is more appropriate.”

See, this is a bit weird to me. Semantically changing the stress on a sentence often does imply a different mood or voice. Stress can also convey importance. I think this is a judgement call situation.
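
To see how moving the stress changes the meaning, here is a sketch along the lines of the spec's cat examples, with i and strong included for contrast:

```html
<p><em>Cats</em> are cute animals.</p>  <!-- cats, as opposed to some other animal -->
<p>Cats are <em>cute</em> animals.</p>  <!-- cute, as opposed to some other quality -->

<!-- Nested em: a higher level of stress on the inner word. -->
<p><em>Cats are <em>cute</em> animals!</em></p>

<!-- Different voice (i) and importance (strong), for comparison. -->
<p>He muttered something about <i lang="de">Schadenfreude</i>.</p>
<p><strong>Warning.</strong> Cats may scratch.</p>
```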

Day 68 of #100DaysOfSpec: figure, figcaption, div, and main elements

Currently reading in 4.4 Grouping content

UAs = user agents = browsers and other HTML document parsers/renderers

4.4.11 The figure element

“The figure element represents some flow content, optionally with a caption, that is self-contained (like a complete sentence) and is typically referenced as a single unit from the main flow of the document.”

  • Self-contained in this context simply means a complete thought, as opposed to an article that can be published independently, outside of the original content.
  • Can contain flow content, preceded or followed by one optional figcaption element.
  • Use cases: marking up illustrations, diagrams, photos, code listings, etc.
  • When writing content for the figcaption, provide identifying info that would allow you to move the figure element to any position or order in the document, rather than using relative labelling (“below”, for ex).
  • “A figure element's contents are part of the surrounding flow.” So if the contents of the figure are only “tangentially related” to the content around it, you’d want to use an aside element, which could in turn wrap a figure.
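
A sketch of a figure that is referenced by label rather than by position, so it could move anywhere in the document. The id and the listing number are made up:

```html
<p>The loop in <a href="#listing-4">listing 4</a> prints three numbers.</p>

<figure id="listing-4">
  <figcaption>Listing 4. A minimal counting loop.</figcaption>
  <pre><code>for (var i = 0; i &lt; 3; i++) { console.log(i); }</code></pre>
</figure>
```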

4.4.12 The figcaption element

No additional comments here.

4.4.13 The div element

Funny that one of the most common HTML elements has such a short stub of a writeup. I suppose that matches its semantic vagueness: “The div element has no special meaning at all.”

  • “Authors are strongly encouraged to view the div element as an element of last resort, for when no other element is suitable.”
  • Where divs come in handy is with the class, lang, and title attributes, which lend common styles and semantics to groupings of elements.

4.4.14 The main element

“The main element represents the main content of the body of a document or application. The main content area consists of content that is directly related to or expands upon the central topic of a document or central functionality of an application.”

  • No article, aside, footer, header or nav element ancestors allowed!
  • Has no effect on document outline (which is not implemented in browsers anyway).
  • Doesn’t contain content that appears site-wide; should just be content unique to this page.
  • Only one main per page.
  • UAs that support keyboard navigation are encouraged to:
    • provide a way to navigate to the main element.
    • make the first focusable element inside main the next element to gain focus.
  • Not intended to denote the main section of a subsection in the document.
  • Should use the ARIA role="main" attribute on the main while we wait on UAs to implement the default role.
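
Putting those rules together, a minimal page skeleton might look like this (a sketch with placeholder content):

```html
<body>
  <!-- Site-wide content stays outside main. -->
  <header>Site name and navigation, repeated on every page</header>

  <!-- One main per page, not nested inside article, aside, footer, header, or nav. -->
  <main role="main">
    <h1>Otters</h1>
    <p>Content unique to this page.</p>
  </main>

  <footer>Site-wide footer</footer>
</body>
```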

And that’s the end of 4.4 Grouping content!