MAMA: HEAD structure

By Brian Wilson


  1. Introduction
  2. HEAD and its sub-elements
  3. TITLE element
  4. META element
  5. SCRIPT element
  6. LINK element
  7. STYLE element
  8. BASE element


The elements that constitute the Head section of a page contain meta information about the document, providing data used to access crucial external resources or other important information. The importance of these elements is highlighted by how heavily they are used compared with other elements.

The elements listed in this section usually exist within the HEAD element, but the statistics presented for these elements/attributes are for occurrences anywhere in the document.

The HEAD element is the most popular of all markup elements, and its top 5 sub-elements are also in the top 20 of all markup elements. Not all of the sub-elements are so popular. The obsolete ISINDEX element was found in only 63 URLs; it seems well and truly dead, having long since been replaced by the much richer interactive forms we know and love today.

Fig 2-1: Elements in the HEAD block
Element    Frequency

The Profile attribute of the HEAD element

This attribute was detected 19,030 times in MAMA's URLs. It is used to specify the URL of a metadata profile. Roughly 90% or more of the values point to the XFN metadata system, which is used by the Rel attribute of hyperlinks to indicate relationships between authors and other people. Other types occurring with some frequency were Dublin Core and RSS syndication metadata.

Fig 2-2: Popular values for the HEAD Profile attribute
(See also the complete frequency table.)
Profile attribute value    Frequency

TITLE element

With 3,459,207 occurrences, this is the second most popular element overall. When I was planning MAMA, storing the contents of the TITLE element did not seem all that compelling, and I did not know how long those contents might be. So I settled for clamping the stored value to 255 characters; any content after the first 255 characters was not stored. This makes some statistics about TITLE length meaningless, but others can still be useful: 128,874 URLs had empty TITLE elements, and 23,322 URLs had the maximum TITLE length (255 characters).
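
The clamping described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not MAMA's actual storage code: the names `MAX_TITLE_LEN`, `clamp_title`, and `title_stats` are hypothetical, and only the 255-character limit comes from the text.

```python
MAX_TITLE_LEN = 255  # storage limit stated in the text


def clamp_title(title: str) -> str:
    """Keep only the first MAX_TITLE_LEN characters of a TITLE value."""
    return title[:MAX_TITLE_LEN]


def title_stats(titles):
    """Count empty titles and titles that hit the storage limit.

    Hypothetical helper: a title counted under "at_max" may have been
    longer before clamping, so its true length is unknown.
    """
    stored = [clamp_title(t) for t in titles]
    return {
        "empty": sum(1 for t in stored if t == ""),
        "at_max": sum(1 for t in stored if len(t) == MAX_TITLE_LEN),
    }
```

A title of exactly 255 characters and one of 400 characters both land in the "at_max" bucket, which is why length statistics at the limit cannot distinguish them.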

META element

This element ranks 6th overall in usage. Because this level of popularity was expected, MAMA stored additional details about some of the attribute values that were expected to be interesting. Values for the HTTP-Equiv and Name attributes were saved, as well as specific values for the Content attribute for the Content-Encoding and Generator usages.

Fig 4-1: META element/attribute frequency

META Name attribute values

The numbers here show that "keywords" and "description" are about equally popular. Perhaps they are two great tastes that go great together on the Web? Yes. Over 90% of the time, these two types of META are indeed used together.

Fig 4-2: Top META Name attribute values
(See also the complete frequency table.)
META Name value    Frequency
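
One way to check how often "keywords" and "description" appear together is sketched below. The text does not say which denominator was used for the "over 90%" figure, so this sketch assumes it is the set of URLs that use at least one of the two values; the function name `cooccurrence_rate` and the input shape (one set of META Name values per URL) are likewise hypothetical.

```python
def cooccurrence_rate(pages, a="keywords", b="description"):
    """Share of pages using either META Name value that use both.

    `pages` is an iterable of sets of lower-cased META Name values,
    one set per URL. The denominator (pages with at least one of the
    two values) is an assumption, not taken from the article.
    """
    either = both = 0
    for names in pages:
        has_a, has_b = a in names, b in names
        if has_a or has_b:
            either += 1
        if has_a and has_b:
            both += 1
    return both / either if either else 0.0
```

Pages that use neither value are ignored, so they do not dilute the rate.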

META Http-equiv attribute values

The big story here is that most documents declare their MIME type using a META statement (over 75% of all URLs analyzed). All other usages are dwarfed by the "Content-type" value.

Fig 4-3: Top META Http-equiv attribute values
(See also the complete frequency table.)
META Http-equiv value    Frequency

META Http-equiv="content-type" charset values

Of the 2,679,505 URLs that used META Http-equiv="content-type", 2,363,865 (88.22%) also specified a "charset" parameter to provide encoding details. The top value was the Western encoding "iso-8859-1", which was 4 times as likely as any other detected value. Encoding names can be a bit cryptic, so the following guide to languages and their encoding values may help when reading the short summary table below (Fig 4-4) and the full frequency table for the "charset" value:

  • Cyrillic (includes Russian): windows-1251, koi8-r, iso-8859-5
  • Japanese: shift_jis, euc_jp, x-sjis, iso-2022-jp, shift-jis
  • Chinese: Traditional: big5, x-x-big5; Simplified: gb2312, gbk
  • Korean: euc-kr, ks_c_5601-1987
Fig 4-4: Top META content-type/charset component values
META charset value    Frequency
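
To make the "charset" parameter concrete, here is a minimal Python sketch of pulling it out of a META Content value such as "text/html; charset=iso-8859-1". The regular expression and the name `charset_from_content` are assumptions for illustration; MAMA's own parsing rules are not described here. The last lines simply recompute the 88.22% share from the two totals quoted above.

```python
import re

# Assumed pattern: "charset", "=", optional quote, then the value.
CHARSET_RE = re.compile(r'''charset\s*=\s*["']?([^\s;"']+)''', re.IGNORECASE)


def charset_from_content(content: str):
    """Return the lower-cased charset parameter, or None if absent."""
    match = CHARSET_RE.search(content)
    return match.group(1).lower() if match else None


# Totals quoted in the text.
content_type_urls = 2_679_505
with_charset = 2_363_865
share = round(100 * with_charset / content_type_urls, 2)
```

Lower-casing the result folds case-only variants of a value together; variants that differ in punctuation, such as "shift_jis" and "shift-jis", still count separately.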

META Name="generator": Editors and Content Management Systems (CMS) used

MAMA also looked at these values in the section on markup validation. The most noticeable nugget here is that the many incarnations of Microsoft FrontPage are the definite leaders for this value. FrontPage occurs more than 8 times as often as any other META Name="generator" value. The following two tables give summary totals for the individual values found in the full per-URL frequency table.

Fig 4-5: Editor usage tracked via META Name="generator"
Editor substring           Frequency
Microsoft FrontPage          347,095
Adobe GoLive                  41,865
Microsoft MSHTML              40,030
IBM WebSphere                 32,218
NetObjects Fusion             26,355
Microsoft Word                24,892
Microsoft Visual Studio       22,936
Adobe PageMill                15,148
Claris Home Page               6,259
Adobe Dreamweaver              5,954
Apple iWeb                     2,504
Fig 4-6: CMS usage tracked via META Name="generator"
CMS substring    Frequency
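
Summary totals of this kind can be produced by matching a substring against each per-URL generator value. The Python sketch below shows the idea only. The lower-case patterns in `EDITOR_PATTERNS` are hypothetical, since the exact substrings MAMA matched are not given, and `tally_editors` is an illustrative name.

```python
from collections import Counter

# Hypothetical patterns: editor label -> lower-case substring to look for.
EDITOR_PATTERNS = {
    "Microsoft FrontPage": "frontpage",
    "Adobe GoLive": "golive",
    "Microsoft MSHTML": "mshtml",
}


def tally_editors(generator_values):
    """Sum per-URL META generator values into per-editor totals."""
    totals = Counter()
    for value in generator_values:
        lowered = value.lower()
        for editor, needle in EDITOR_PATTERNS.items():
            if needle in lowered:
                totals[editor] += 1
                break  # count each value for at most one editor
    return totals
```

Matching on a substring rather than the whole value is what lets the many versioned incarnations of one editor collapse into a single row.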

SCRIPT element

We will be looking at scripting in much greater depth in a future section on Script, so for now we will just take a quick look at the element and its attributes.

Note: Many people seem to have problems spelling "language"; a number of misspellings of it occur fairly frequently.

Fig 5-1: SCRIPT element/attribute frequency

LINK element

We will be looking at the dominant use of the LINK element for CSS in much greater depth in a future section on CSS, so for now we'll just take a quick look at the element and its attributes, plus a detailed view of the values for the Rel and Type attributes. Although the Href and Rel attributes are not required by HTML 4.0, authors appear to treat them that way: both are used in over 99% of all LINK usages. The frequency of the Rev attribute is higher than expected, but a random sampling reveals that only the "made" value is in wide use.

Fig 6-1: LINK element/attribute frequency

LINK Rel attribute values

This attribute was tracked in MAMA by breaking the value down into space-separated components. External style sheet statements are present in 90% of LINK instances, over 20% use the shortcut icon syntax, and ~8.5% of LINK elements specify an alternate form (most likely RSS or similar based on the Type attribute values in the next section).

Fig 6-2: Top LINK Rel attribute values
(See also the complete frequency table.)
LINK Rel value    Frequency
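
Breaking a Rel value into space-separated components, as described above, might look like the following Python sketch. Lower-casing the components and counting each component at most once per LINK element are assumptions made for this illustration, and the function names are hypothetical.

```python
from collections import Counter


def rel_components(rel_value: str):
    """Split a Rel value into lower-cased, space-separated components."""
    return rel_value.lower().split()


def count_rel_components(rel_values):
    """Count how many LINK elements carry each Rel component."""
    counts = Counter()
    for value in rel_values:
        # A set ensures a repeated component counts once per element.
        counts.update(set(rel_components(value)))
    return counts
```

With this breakdown, Rel="shortcut icon" contributes to both "shortcut" and "icon", and Rel="alternate stylesheet" contributes to both "alternate" and "stylesheet".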

LINK Type attribute values

The Type attribute is not required, but most authors seem to include it for style sheet and RSS uses. Its usage appears to fall off considerably for shortcut icons, though: there are ~450,000 uses of the Rel="icon" syntax, but an "image/*" MIME type is given only about 1/3 of the time.

Fig 6-3: Top LINK Type attribute values
(See also the complete frequency table.)
LINK Type value    Frequency

STYLE element

As was already mentioned, we will be looking at style sheets in much greater depth in a future section on CSS, so for now we will just take a quick look at the element and its attributes.

Fig 7-1: STYLE element/attribute frequency

BASE element

This element's original purpose was to declare a common root URL for relative URLs in a document, so it is a bit surprising to find that the original usage has been usurped in popularity by frame target control.

Fig 8-1: BASE element/attribute frequency
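
The two uses of BASE mentioned above, a root URL via Href and frame control via Target, can be told apart by looking at which attribute is present. The Python sketch below uses the standard library's `html.parser` to do that for a single document. It is an illustration only, not MAMA's detection code, and the names `BaseUsage` and `base_usage` are hypothetical.

```python
from html.parser import HTMLParser


class BaseUsage(HTMLParser):
    """Record which attributes appear on BASE elements."""

    def __init__(self):
        super().__init__()
        self.href = 0
        self.target = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "base":
            return
        names = {name for name, _ in attrs}
        self.href += int("href" in names)
        self.target += int("target" in names)


def base_usage(html: str):
    """Return (href_count, target_count) for BASE elements in `html`."""
    parser = BaseUsage()
    parser.feed(html)
    parser.close()
    return parser.href, parser.target
```

Because the parser sees every start tag, BASE elements are counted wherever they occur in the document, not only inside HEAD.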

This article is licensed under a Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.

