MAMA: Common attributes
The common attributes are those used across a multitude of elements. They are often critically important to the most popular features of Web browsers. They are grouped here under the same umbrella for a single goal: when viewed together, comparing the use of these attributes across their many applicable elements can expand our understanding of each attribute and of how and when authors tend to use it.
The Name attribute has a number of heterogeneous uses, many of them very popular.
It comes as no surprise, then, that it is very widely used; 3,220,308 of MAMA's
URLs (91.77%) carry the attribute in some fashion. Comparing the usage of
Name to that of the Id attribute
(which shares at least part of its functionality) demonstrates clear differences.
Name has especially deep coverage in its uses,
not to mention its often-paired use with Form fields. In many of these cases,
usage runs above 95%. There are also some noticeable uses with vendor-specific
attributes where penetration is almost 100%, a sure sign that a program is
responsible for creating the attribute; humans just are not that reliable!
Name attribute frequency
Pages in MAMA's URL set that used the
Name attribute carried it 13.0 times on average.
This average is a bit lower than that of its sister attribute
Id, and this is also
reflected in the extreme use cases for
Name. As usual,
there is one extreme (some might say absurd, but I will not pass judgment) use
case for the number of Name attributes in one document, with
the next-nearest neighbor falling into a very distant second
place. The full frequency table is begging
for your attention.
|http://www.broekemasierbestrating.nl/Default.htm/ (URL no longer active)||598|
Name attribute values
The list of values for this attribute can be a bit confusing, as it is a
combined value list representing all types of
The top 15 slots are almost all from
and other distinct uses like
Forms, CSIM and
hyperlink anchors stand out with high representation as well.
The Class attribute is used to identify document structures for use in CSS. Multiple
elements can carry the same class name, thereby allowing authors to create
arbitrary groupings to control presentation. In all, 2,139,184 of the URLs in MAMA
(60.96%) had at least one occurrence of the
Class attribute. In addition, 98.52% of URLs that use
Class also use CSS
in some manner. The value for the
Class attribute is
a space-separated list of class names, but the typical usage is a single class
name: only 296,136 of the URLs using the attribute (13.84%) had any
Class value carrying more than one class name.
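As a rough illustration of the "more than one class name" measurement, the sketch below splits Class values on whitespace and reports which URLs carry at least one multi-name value. The function names and the sample data are hypothetical, and this is not MAMA's actual code; splitting with `str.split()` only approximates HTML's definition of whitespace.

```python
def class_names(value):
    """Split a Class attribute value into its individual class names."""
    # str.split() with no argument splits on any run of whitespace,
    # which approximates the space-separated list described above.
    return value.split()


def urls_with_multiple_classes(class_values_by_url):
    """Return the URLs that have any Class value naming more than one class.

    class_values_by_url is a hypothetical mapping of URL -> list of raw
    Class attribute values found in that page.
    """
    return [
        url
        for url, values in class_values_by_url.items()
        if any(len(class_names(v)) > 1 for v in values)
    ]


if __name__ == "__main__":
    sample = {
        "a.html": ["menu", "nav item"],
        "b.html": ["footer"],
        "c.html": ["style1 style2", "header"],
    }
    multi = urls_with_multiple_classes(sample)
    print(multi, f"{100 * len(multi) / len(sample):.2f}%")
```

Applied to the figures above, the same ratio is 296,136 / 2,139,184, which rounds to 13.84%.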
Elements using the Class attribute
By relative percentage, the
Class attribute shows a
strong usage tendency. Block- and Form-related elements have high usage, but
inline/phrasal elements usually have much lower representation.
The only real exceptions to this are hyperlinks and the
SPAN element (and also, inexplicably, the
ABBR element). In fact, the generic block and inline
elements (DIV and SPAN,
respectively) have among the highest relative representation of any of the elements.
On the other hand, basic structural elements like
BODY have surprisingly low relative
Class usage. Adobe GoLive's editor appears to generate a
Class attribute for its custom
element in every page it touches.
Class attribute frequency
Most pages were found to use this attribute, and did they ever! Some of the
extreme cases found employed the
Class attribute in impressive
quantities. Because the attribute is used as an aggregation mechanism with disparate
elements, usage in high quantities is to be expected. The average number of
Class attributes in a page (when they were used) was found
to be 48.4. The highest quantity of the attribute recorded by MAMA was 98,439 times,
but the live version at the time of writing is even higher: 102,627
times! It is a spreadsheet application, a gigantic grid of cells, each with a
Class attribute, which is a sure way to inflate an attribute if ever
there was one. That single case has 4 times as many
Class attributes as any other case found in MAMA. A full
frequency table of
Class attribute quantities is available.
|http://rpo.library.utoronto.ca/poem/19.html/ (URL no longer active)||15,940|
Class name values
In Hickson's Google research, he takes a close look at
Class attribute values. For the main editorial force behind WhatWG's and the W3C's HTML 5,
this was definitely useful and informative data to gather and examine. Hickson said
that one value of the
Class attribute was "baffling": the
value of "link". In the URLs sampled from MAMA, the "link"
class seems to be used often in relation to hyperlinks. The obvious question then
is why an author would not just use an
AREA element selector instead of creating a custom class.
Well, a small sampling of URLs using this class value showed that it was applied to
structures related to a hyperlink just as often as it was applied
to a direct hyperlink itself (for example, to a
table cell encapsulating nothing but a link). In such cases, using a simple CSS
element selector would not be sufficient. Yes, other methods could be used to reference
it (and probably are; CSS selectors are not yet tracked in MAMA), but this method at
least is widely used.
The frequency table of
Class attribute values compares
favorably to Hickson's Google research. In all, 15 of the top 20 values from MAMA's list
are in the top 20 from Google's list, and the top 2 values ("footer"
and "menu") are in the same order in both. The most
popular value, "footer", is twice as popular as its
natural companion "header"; so, could
one say that authors prefer page footers to page headers in their designs? One big
noticeable trend from the
Class value list: there is a
high number of class names of the form "style" followed by an integer.
The popularity of each class value decreases as the integer value at the end
increases. MAMA detected values like this going at least up to
"style117" and probably higher. A high (but untested)
correlation was noticed between class names of this type and the use of Macromedia
Dreamweaver scripting library functions. As Macromedia Dreamweaver is not always
the easiest editor to detect, this correlation will remain a theory.
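One way to surface numbered class names such as "style117" from a list of class values is to match the fixed prefix plus a trailing integer and tally by that integer. The sketch below is only an illustration: the pattern, the function name, and the sample list are assumptions, not part of MAMA.

```python
import re
from collections import Counter

# Hypothetical pattern: the literal prefix "style" followed only by digits.
STYLE_N = re.compile(r"style(\d+)")


def numbered_style_counts(class_names):
    """Tally class names of the form style<integer>, keyed by the integer."""
    counts = Counter()
    for name in class_names:
        match = STYLE_N.fullmatch(name)  # the whole name must match
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = ["style1", "style1", "style2", "footer", "style117", "mystyle3"]
    for number, count in sorted(numbered_style_counts(sample).items()):
        print(f"style{number}: {count}")
```

Sorting the tally by the integer makes it easy to check whether popularity really falls as the counter rises.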
The Style attribute is used to specify CSS at ground level, "in the trenches" so
to speak. Using CSS in this way negates many of its broad control advantages;
styles applied affect only the current element and its descendants. In all, 1,878,916
of MAMA's URLs (53.54%) use it in some fashion. It is used most often with the
DIV and SPAN elements, which is to be expected since these elements don't have any special intrinsic
rendering behaviors on their own. There is a noticeable authoring fondness for
Style with Form-related elements and Table-related
elements (although there are some exceptions with the latter, like the
TR element). Most pages use
Style with Inline/Phrasal markup elements sparingly, while its popularity with most block
elements fares much better.
The Id attribute is used to create a document-wide unique identifier for an element.
The Id attribute was originally meant to supersede the
use of the
Name attribute, but with 1,782,769 of MAMA's
URLs (50.80%) using
Id and 3,220,308 using
Name (91.77%), it is a transition that seems to be
still very much "in progress".
DIV uses the
Id attribute more than twice as often as its nearest neighbors in the frequency
table. DIV usage of
Id is also 5
times as popular as that of its related cousin, SPAN.
IFRAME and Form-related elements have rather high
representation. UL representation is also quite high,
but there doesn't seem to be any obvious justification for that outcome.
Netscape 4.x's proprietary
elements each have over 50% relative usage of Id.
Id attribute frequency
About half of the pages in MAMA used the Id attribute.
Pages that used the attribute averaged about 15.8 uses per document. As with
many other cases in MAMA, the extreme use case was unique: it used 3 times as
many Id attributes as the next-nearest URL. A
full frequency table of Id attribute
quantities is ready for any curious readers.
Id attribute values
At first glance, it does not seem like examining these values would be interesting.
For an attribute that is supposed to contain unique values, the chances of value
overlap between URLs should be much lower than with many other attributes, right?
Not so fast. The really interesting thing to note is that there is considerable overlap
with Class attribute values.
"Footer" is the most popular value for each
attribute, but many of the most popular values for each attribute hold different
relative positions in the value lists. #2 on the Id list,
"Content", is #6 on the Class
list, #3 on the
Id list is #9 on the Class
list, and so on. Hickson's Google study only looked at
Class attribute values, but perhaps should have looked at
Id as well.
It is apparent from the top values in both the Class and
Id rankings that authors continually have to work around
unfilled semantic niches in the standards.
Note: There is an interesting discrepancy between
HTML and CSS treatment of the
Id value. In HTML, an
Id value must begin with a letter
([A-Za-z]), but in CSS there is no such restriction in referencing an
Id value. In theory, the HTML constraint should restrict
Id values to the narrower HTML form, but browsers
are usually more forgiving and allow the CSS interpretation of an
Id value, so in "the wild" there are many cases where Id
values begin with a different character. In MAMA, 135,994 of the URLs using
Id (7.63%) had at least one value that began with a
character other than [A-Za-z].
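The check described in the note can be sketched as a first-character test against [A-Za-z]. The function names and sample data below are hypothetical, and the sketch only tests the leading character; it does not validate the rest of the Id value.

```python
import re

# First character must be an ASCII letter, per the HTML rule quoted above.
LETTER_START = re.compile(r"[A-Za-z]")


def starts_with_letter(id_value):
    """True if the Id value begins with a character in [A-Za-z]."""
    return bool(LETTER_START.match(id_value))


def urls_with_nonletter_ids(ids_by_url):
    """Return URLs having at least one Id value that does not start with a letter.

    ids_by_url is a hypothetical mapping of URL -> list of Id values.
    """
    return [
        url
        for url, ids in ids_by_url.items()
        if any(not starts_with_letter(i) for i in ids)
    ]


if __name__ == "__main__":
    sample = {"a.html": ["footer", "1stColumn"], "b.html": ["menu", "content"]}
    print(urls_with_nonletter_ids(sample))
```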
Id value trends
The full list of
Id attribute values also points out
one other interesting tendency: the top 100 consists of repeating
archetypes where the value varies only by the addition of numeric counters. This
obviously indicates cases where more than one of a single type is used or expected,
such as with "table", "image",
or "menu". Of the top 100
values, over half consist of variations on just 7 of the
value substrings shown in the table below (Fig 5-4). These patterns are used over
and over and do not stop with the top 100. Many other values also
show templated trends, and a substring like "table"
is not used just 10 times in the entire value list; that is only the number of
times it is used in the top 100 values. In actuality, it is used 95 times in the
Id value list (and probably more, given an exhaustive,
complete list of Id values).
The Title attribute is used to set "advisory information" about an element. In
practical terms, this means authors can specify any value they want. It was
found in 1,010,147 of MAMA's URLs (28.79%). It is most popular with hyperlinks
AREA), as well as
TH elements eclipses the relative usage by
TD elements nearly 4:1. A few elements have
extraordinarily high usage rates:
ACRONYM, probably because HTML 4 goes out of its
way to define special
Title behavior with these elements.
At the request of a co-worker, one extra
feature was tracked by MAMA: Newlines in the source code. Historically,
these have caused various problems in some browsers, and it was hoped that
tracking them could be useful in a testing capacity at some point down the line.
Of the pages that used the Title attribute,
21,759 (2.15%) had at least 1 embedded Newline.
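A minimal sketch of how embedded Newlines in Title values might be detected is shown below, using Python's standard `html.parser` module. The class and function names are hypothetical, treating a carriage return as a Newline is an assumption, and this is not a description of how MAMA itself did the tracking.

```python
from html.parser import HTMLParser


class TitleNewlineCounter(HTMLParser):
    """Counts Title attributes and how many of them contain a line break."""

    def __init__(self):
        super().__init__()
        self.titles = 0
        self.titles_with_newline = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "title" and value is not None:
                self.titles += 1
                # Assumption: either LF or CR counts as an embedded Newline.
                if "\n" in value or "\r" in value:
                    self.titles_with_newline += 1


def count_title_newlines(markup):
    """Return (number of Title attributes, number containing a Newline)."""
    parser = TitleNewlineCounter()
    parser.feed(markup)
    parser.close()
    return parser.titles, parser.titles_with_newline


if __name__ == "__main__":
    page = '<a href="x" title="first line\nsecond line">link</a><img title="ok">'
    print(count_title_newlines(page))
```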
The Lang attribute indicates the base language of the content. Its value is an
RFC 1766 language code. In
practice, some values in the full frequency table
stray from this ideal; they occasionally include encodings such as
"utf-8" and other types of "languages".
offers a few head-scratchers:
SPAN is a much more
popular user of
HTML is also quite popular—considerably higher than
application to the
BODY element. Both
also stand out with especially high representation.
The Dir attribute is meant to give the "base direction of directionally neutral
text" for an element's content and attribute values. There are two acceptable
values, "ltr" (left-to-right) and
"rtl" (yep, right-to-left). The attribute was
detected at least once in 136,997 of the URLs MAMA analyzed (only 3.90%).
The complete list of values detected for this
attribute shows some other uses not defined by HTML, aside from the usual typos
and other uninteresting noise. One use is apparently to define a base directory
("dir") associated with the element to which it is applied. However, these other usages
are absolutely trampled by the two main accepted values in terms of popularity.
The less popular of the two, "rtl", occurs more
than 100 times as often as the next nearest non-HTML value. The left-to-right
value "ltr" is more than 10 times as popular as
"rtl". Authors have a clear preference for using
this attribute on the
HTML elements over others.
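The split between the two accepted Dir values and everything else can be illustrated with a small tally. In the sketch below the function names and sample values are hypothetical, and trimming whitespace and ignoring case before comparison are assumptions made for the illustration.

```python
from collections import Counter

# The two values defined by HTML for the Dir attribute.
VALID_DIR_VALUES = {"ltr", "rtl"}


def classify_dir(value):
    """Return 'ltr', 'rtl', or 'other' for a raw Dir attribute value."""
    normalized = value.strip().lower()  # assumption: trim and ignore case
    return normalized if normalized in VALID_DIR_VALUES else "other"


def dir_value_popularity(values):
    """Tally raw Dir values into the two accepted values plus 'other'."""
    return Counter(classify_dir(v) for v in values)


if __name__ == "__main__":
    sample = ["ltr", "LTR", " ltr ", "rtl", "images/"]
    print(dir_value_popularity(sample).most_common())
```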
Dir attribute value popularity
Elements using the Dir attribute
show the strongest
Dir attribute tendency, with the
BLOCKQUOTE element actually having the highest relative
usages are understandable but
BLOCKQUOTE may not be
as obvious. A moment's reflection reveals why—to maintain the full integrity
of a quotation, the natural language and internal direction of the content must
An accesskey is supposed to be a single character used to give focus to an
element. In all, 80,026 URLs had at least one element carrying an
Accesskey attribute; the two most popular elements
that use it are
A (68.57%) and
(44.36%). Three other elements also used this attribute on a regular basis. The
full frequency table of values shows that
authors do well at restricting their values to a single character. Curiously,
the most popular character used is "s" (with no obvious rationale for its
popularity). After that, numbers dominate the list. Ten of the top 15
Accesskey values are the digits "0"-"9" (with "1"
being most popular). Looking beyond the numbers, we find that the entire English
alphabet ranks next, ahead of anything else. The top 36 spots consist of the
digits "0"-"9" and the English alphabet "a"-"z" (MAMA ignored case-sensitivity
when generating this frequency table). The
LABEL elements seem to have a greater affinity for
the attribute than other elements, and
A enjoys a
much higher relative usage rate than the analogous
The Tabindex attribute gives an explicit position in the tabbing order for the current
page. In MAMA, 49,081 URLs use the Tabindex attribute
at least once; usage on the
INPUT element represents
over 70% of that total, while its next most-popular use on the
element trails far behind at just over 30%. HTML 4 defines only a narrow set of
elements that can use
Tabindex, but in common usage
some elements that aren't in the HTML 4 allowed set are more popular than some
of those that are—for example,
usages are more popular than
A look at element-relative usage shows that authors prefer to use this
attribute with form widgets (
BUTTON) over other elements.
The Longdesc attribute is a URL that should provide a "longer description" of the resource.
In MAMA, 26,641 URLs used
Longdesc in some manner. Only 4 elements
were found to use this attribute in noticeable quantities:
INPUT, with the
IMG usage occurring
far more often than any other type (95.39% of all URLs with Longdesc
usage). All the values here should be absolute or relative URLs, but
the reality is a bit different. The full frequency
table for this attribute at first seems rather unremarkable. All of the
frequencies are quite low, indicating the unique nature of the attribute values.
What also stands out is how many of the values are definitely not
URLs, such as:
"boy hunched alone with hands in arms"
A rough estimate is that just over 1/3 of Longdesc
attribute values are not URLs.
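A rough estimate of this kind could be produced with a simple heuristic. The sketch below assumes that a value containing whitespace is descriptive prose rather than a URL; that rule, the function names, and the sample values other than the quoted one are assumptions for illustration, not the method used for the estimate above.

```python
def looks_like_url(value):
    """Heuristic guess: non-empty and free of whitespace once trimmed.

    This is deliberately crude. A one-word description would pass, and a
    URL with unescaped spaces would fail.
    """
    trimmed = value.strip()
    return bool(trimmed) and not any(ch.isspace() for ch in trimmed)


def non_url_share(values):
    """Fraction of Longdesc values that the heuristic rejects as URLs."""
    if not values:
        return 0.0
    rejected = sum(1 for v in values if not looks_like_url(v))
    return rejected / len(values)


if __name__ == "__main__":
    sample = [
        "boy hunched alone with hands in arms",  # example quoted in the text
        "descriptions/photo1.html",              # hypothetical relative URL
        "http://example.com/longdesc.txt",       # hypothetical absolute URL
    ]
    print(f"{non_url_share(sample):.2f}")
```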
The Disabled attribute is not very widely used; only 6,643 URLs were detected to carry
the attribute in any capacity. The
are where it is used most, but there are some other oddities in the list below;
how exactly does a
SPAN element set to
Disabled react any differently from a
SPAN element that doesn't carry the attribute?
This article is licensed under a Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.