MAMA: HEAD structure
- HEAD and its sub-elements
- TITLE element
- META element
- SCRIPT element
- LINK element
- STYLE element
- BASE element
The elements that make up the HEAD section of a page carry meta information about the document, including the data needed to access crucial external resources and other important information. Their importance shows in how much more heavily they are used than other elements.
The elements listed in this section usually exist within the
HEAD element, but the statistics
presented for these elements/attributes are for occurrences anywhere in
the document.
HEAD and its sub-elements
The HEAD element is the most popular of any markup
element, and its top 5 sub-elements are also in the top 20 of ALL
markup elements. Not all of the sub-elements are so popular. The obsolete
ISINDEX element was only found in 63 URLs—it
seems well and truly dead, having long-since been replaced by the much richer
interactive Forms we know and love today.
Profile attribute of the HEAD element
This attribute was detected 19,030 times in MAMA's URLs. It specifies the URL
of a metadata profile. Roughly 90% or more of the values point to the
XFN metadata system, which is
used by the
Rel attribute of hyperlinks to indicate
relationships between authors and other people. Other profile types occurring with some
frequency were Dublin Core and RSS syndication metadata.
|Profile attribute value||Frequency|
TITLE element
With 3,459,207 occurrences, TITLE is the second most popular element overall. When
planning MAMA, storing the full contents of the TITLE
element did not seem all that compelling, and I did not know how long the contents might
be. So I settled for clamping the stored value to 255 characters;
any content after the first 255 characters was not stored. This makes some
TITLE length measurements meaningless, but others
can still be useful: 128,874 URLs had an empty TITLE
and 23,322 URLs had the maximum
TITLE length (255 characters).
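The clamping described above can be sketched in a few lines. This is only an illustration, not MAMA's actual code: the function names are hypothetical, and treating only the zero-length string as "empty" is an assumption.

```python
MAX_TITLE_LENGTH = 255  # the clamp size mentioned in the text


def clamp_title(title: str, limit: int = MAX_TITLE_LENGTH) -> str:
    """Keep only the first `limit` characters of a TITLE value."""
    return title[:limit]


def classify_title(stored: str, limit: int = MAX_TITLE_LENGTH) -> str:
    """Label a stored value as 'empty', 'at_limit' or 'exact'.

    A value at the limit may have been truncated, so its true length
    is unknown; shorter values are stored exactly.
    """
    if stored == "":
        return "empty"
    if len(stored) >= limit:
        return "at_limit"
    return "exact"


titles = ["", "Home page", "x" * 300]
stored = [clamp_title(t) for t in titles]
print([classify_title(s) for s in stored])  # ['empty', 'exact', 'at_limit']
```

Only the "at_limit" bucket is ambiguous: a stored value of exactly 255 characters could be a complete title or a truncated one.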
META element
The META element ranks 6th overall in usage. Because this level of popularity was
expected, MAMA stored additional details about some of the attribute values
that were likely to be interesting. Values for the
Name and Http-equiv attributes were saved, as well as specific
values for the
Content attribute for the Content-Encoding
and Generator usages.
Name attribute values
The numbers here show that "keywords" and
"description" are about equally popular. Perhaps
they are two great tastes that go great together on the Web? Yes. Over 90% of
the time, these two types of
META are indeed used together.
|META Name value||frequency||META Name value||frequency|
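As a rough illustration of the co-occurrence check behind the "used together" figure, the sketch below collects the META Name values of one page and tests whether both "keywords" and "description" are present. It uses Python's standard `html.parser`; the class and function names are hypothetical, and lower-casing the values before comparing is an assumption about how the values were normalized.

```python
from html.parser import HTMLParser


class MetaNameParser(HTMLParser):
    """Collect the lower-cased Name values of every META element."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        for attr, value in attrs:
            if attr == "name" and value:
                self.names.add(value.strip().lower())


def meta_names(markup: str) -> set:
    parser = MetaNameParser()
    parser.feed(markup)
    parser.close()
    return parser.names


def uses_keywords_and_description(markup: str) -> bool:
    """True when a page declares both META types."""
    return {"keywords", "description"} <= meta_names(markup)


page = (
    '<head><meta name="Keywords" content="a, b">'
    '<meta name="description" content="A page"></head>'
)
print(uses_keywords_and_description(page))  # True
```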
Http-equiv attribute values
The big story here is that most documents declare their MIME type using a
META statement (over 75% of all URLs analyzed).
All other usages are dwarfed by the "Content-type" value.
|META Http-equiv value||frequency||META Http-equiv value||frequency|
Of the 2,679,505 URLs that used the "Content-type"
value, 2,363,865 of them (88.22%) also specified a "charset"
parameter to provide encoding details. The top value was the western encoding
"iso-8859-1", which was 4 times as likely as any
other detected value. Encoding names can be a bit cryptic, so the following
guide to languages and their encoding values may help when reading the short summary table
below (Fig 4-4) and the full frequency table
for the "charset" value:
- Cyrillic (includes Russian): windows-1251, koi8-r, iso-8859-5
- Japanese: shift_jis, euc_jp, x-sjis, iso-2022-jp, shift-jis
- Chinese: Traditional: big5, x-x-big5; Simplified: gb2312, gbk
- Korean: euc-kr, ks_c_5601-1987
|META charset value||frequency||META charset value||frequency|
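The language guide above can be turned into a small lookup. The sketch below pulls the "charset" parameter out of a META Content value and maps it to a language group. The function names are hypothetical, and the parsing rules (split on ";", case-insensitive names, surrounding quotes stripped) are assumptions, since the text does not describe how MAMA extracted the parameter.

```python
from typing import Optional

# Encoding-to-language guide taken from the list in the text (lower-cased).
CHARSET_GROUPS = {
    "iso-8859-1": "Western",
    "windows-1251": "Cyrillic", "koi8-r": "Cyrillic", "iso-8859-5": "Cyrillic",
    "shift_jis": "Japanese", "euc_jp": "Japanese", "x-sjis": "Japanese",
    "iso-2022-jp": "Japanese", "shift-jis": "Japanese",
    "big5": "Traditional Chinese", "x-x-big5": "Traditional Chinese",
    "gb2312": "Simplified Chinese", "gbk": "Simplified Chinese",
    "euc-kr": "Korean", "ks_c_5601-1987": "Korean",
}


def extract_charset(content: str) -> Optional[str]:
    """Return the lower-cased charset parameter of a Content value, if any."""
    for part in content.split(";")[1:]:  # skip the MIME type itself
        name, sep, value = part.partition("=")
        if sep and name.strip().lower() == "charset":
            cleaned = value.strip().strip("\"'").lower()
            return cleaned or None
    return None


def language_group(charset: Optional[str]) -> str:
    """Map a charset to a group from the guide, or 'other'/'unspecified'."""
    if charset is None:
        return "unspecified"
    return CHARSET_GROUPS.get(charset, "other")


print(language_group(extract_charset("text/html; charset=Shift_JIS")))  # Japanese
```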
Editors and Content Management Systems (CMS) used
MAMA also looked at META Generator values in the section on markup validation. The most
noticeable nugget here is that the many incarnations of Microsoft FrontPage are
the definite leaders for this value. FrontPage occurs more than 8 times as
often as any other editor.
The following two tables are summary totals for the individual values found in
the full per-URL frequency table.
|Editor substring||frequency||Editor substring||frequency|
|Microsoft FrontPage||347,095||Microsoft Visual Studio||22,936|
|Adobe GoLive||41,865||Adobe PageMill||15,148|
|Microsoft MSHTML||40,030||Claris Home Page||6,259|
|IBM WebSphere||32,218||Adobe Dreamweaver||5,954|
|NetObjects Fusion||26,355||Apple iWeb||2,504|
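Summary totals like these can be produced by matching each Generator value against a list of editor substrings. The sketch below shows one way to do that. The substring patterns and the sample Generator strings are illustrative assumptions, not the patterns or data MAMA used.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical label -> substring patterns (compared in lower case).
EDITOR_SUBSTRINGS = {
    "Microsoft FrontPage": "frontpage",
    "Adobe GoLive": "golive",
    "Microsoft MSHTML": "mshtml",
}


def editor_for(generator: str) -> Optional[str]:
    """Return the first editor label whose substring occurs in the value."""
    lowered = generator.lower()
    for label, needle in EDITOR_SUBSTRINGS.items():
        if needle in lowered:
            return label
    return None


def tally_editors(generators: Iterable[str]) -> Counter:
    """Count Generator values per editor label, ignoring unmatched values."""
    counts = Counter()
    for value in generators:
        label = editor_for(value)
        if label is not None:
            counts[label] += 1
    return counts


# Made-up sample values, one per URL.
sample = [
    "Microsoft FrontPage 4.0",
    "Microsoft FrontPage 5.0",
    "Adobe GoLive 6",
    "MSHTML 6.00.2900.2180",
    "WordPress 2.3",
]
totals = tally_editors(sample)
print(totals["Microsoft FrontPage"])  # 2
```

Matching on substrings is what lets the "many incarnations" of one editor, with their differing version strings, roll up into a single total.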
SCRIPT element
We will be looking at scripting in much greater depth in a future section on Script, so for now we will just take a quick look at the SCRIPT element and its attributes.
Note: Many people seem to have problems spelling "language" - a number of misspellings of this occur fairly frequently.
LINK element
We will be looking at the dominant use of the
LINK element, for CSS, in much greater depth in a future section on CSS, so for now we'll just take a quick look at the element and its attributes,
plus a detailed view of the values for
Rel and Type. Although the
Href and Rel attributes are not required by HTML 4.0, authors
appear to treat them that way: they are both used in over 99% of all
LINK usages. The frequency of the Rev
attribute is higher than expected, but a random sampling reveals that only the
"made" value is in wide use.
Rel attribute values
This attribute was tracked in MAMA by breaking the value down into space-separated
components. External style sheet statements are present in 90% of LINK
instances, over 20% use the shortcut icon syntax, and ~8.5% of LINK
elements specify an alternate form (most likely RSS or similar, based on the
Type attribute values in the next section).
|LINK Rel value||frequency||LINK Rel value||frequency|
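Breaking a Rel value into space-separated components is easy to sketch. The function names below are hypothetical. Lower-casing the components, and counting each component at most once per LINK element, are assumptions rather than documented MAMA behavior.

```python
from collections import Counter
from typing import Iterable, List


def rel_components(rel_value: str) -> List[str]:
    """Split a Rel value on whitespace and lower-case each component."""
    return rel_value.lower().split()


def tally_rel(rel_values: Iterable[str]) -> Counter:
    """Count how many Rel values contain each component."""
    counts = Counter()
    for value in rel_values:
        # set() so a repeated component counts once per LINK element
        counts.update(set(rel_components(value)))
    return counts


# Made-up Rel values, one per LINK element.
sample = ["stylesheet", "SHORTCUT ICON", "alternate stylesheet", "alternate"]
totals = tally_rel(sample)
print(totals["stylesheet"], totals["icon"], totals["alternate"])  # 2 1 2
```

Under this scheme "shortcut icon" contributes to two separate components, "shortcut" and "icon", and "alternate stylesheet" is counted under both "alternate" and "stylesheet".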
Type attribute values
The Type attribute is not required, but most authors
seem to use it for stylesheet and RSS links. The
Type usage ratio appears to fall off
considerably when specifying a shortcut icon, though: there are ~450,000 shortcut icon usages,
but "image/*" MIME types only happen 1/3 of the time.
|LINK Type value||frequency||LINK Type value||frequency|
STYLE element
As was already mentioned, we will be looking at style sheets in much greater depth in a future section on CSS, so for now we will just take a quick look at the STYLE element and its attributes.
BASE element
The original purpose of the BASE element was to declare a common root URL for relative URLs in a document, so it is a bit surprising to find that this usage has been overtaken in popularity by frame target control.
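To make the two BASE usages concrete, the sketch below records, for each BASE element in a page, whether it carries an Href attribute (root URL for relative URLs), a Target attribute (frame target control), or both. It relies on Python's standard `html.parser`; the class and function names are hypothetical and this is not how MAMA itself gathered the figures.

```python
from html.parser import HTMLParser


class BaseUsageParser(HTMLParser):
    """Record which attributes each BASE element carries."""

    def __init__(self):
        super().__init__()
        self.usages = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names for us.
        if tag != "base":
            return
        names = {name for name, _ in attrs}
        self.usages.append({"href": "href" in names, "target": "target" in names})


def base_usages(markup: str) -> list:
    parser = BaseUsageParser()
    parser.feed(markup)
    parser.close()
    return parser.usages


page = '<head><base target="_top"></head>'
print(base_usages(page))  # [{'href': False, 'target': True}]
```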
This article is licensed under a Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.