Hierarchy

  Jul 19, 2004

For some time now, something has been bothering me about the common way of marking up a document structure. First of all, there doesn't seem to be a consensus on whether or not to use multiple H1 tags, or any at all.

Seems to me, there are three possible alternatives to marking up a document structure: 1) The website name is marked-up with H1, the entry title, or titles if there are several on the same page, are marked-up with H2; or 2) The website name is in the title-tag, the entry titles are marked up with H1, one page might contain several H1 tags; or 3) The website name is in the title-tag, the entry titles are marked up with H2, there is no H1 tag.

Which is the preferred method? In my opinion, marking up the website name as H1 is kind of redundant, seeing as how it's already in the title-tag. Note that I'm only talking about markup here; visual presentation is something completely different, and I'd rather leave that to another discussion. I'm talking about structural markup.

Secondly, since H1-H6 are supposed to denote the structure of a document, do they really have a place in the side column as section headlines? This seems to be how most people, myself included, have chosen to mark-up their menu-column section headlines. When you think about it, it might seem like the preferred way of marking them up, but upon closer inspection, it doesn't look right.

I'm going to use Dan Cederholm's SimpleBits.com as an example. Like most other websites which have a side column with some kind of menu options in it, Dan too has chosen to mark-up those headlines using Hn-tags. The reason I'm using Dan's site as an example, specifically, is because I assume that he has thought long and hard about his markup, that he didn't just look at how someone else had done it and did something similar.

It seems right, doesn't it, to use Heading tags for marking-up what is essentially headlines, even though they reside in the menu column? Well, take a look at this outline of SimpleBits.com, and how the menu headlines appear to belong hierarchically underneath the last entry.

Because SimpleBits.com is likely to change at one point or another, I've copied the current document outline here:

  • Bulletproof Slants (entry title)
  • Overcast: More Stock Web Icons for Sale (entry title)
  • Subscribable Validation (entry title)
  • A Gathering in Boston (entry title)
  • Stockholm; Stock Web Icons for Sale (entry title)
  • Wireless City (entry title)
    • My Book (menu section title)
    • QuickBits (menu section title)
    • Stock Icons For Sale (menu section title)
    • Featured Publication (menu section title)
    • Recommended (menu section title)
      • Misc. Sites
      • Weblogs
    • Misc. (menu section title)

As you can see, the use of Hn-tags in the column menu suggests that the menu section titles are hierarchically below the last entry ("Wireless City").
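
To make the nesting rule concrete, here is a minimal Python sketch of how an outliner might turn a flat run of headings into a tree: each heading becomes a child of the nearest preceding heading with a lower level number. The heading levels below (entries as level 2, menu sections as level 3) are assumptions chosen for illustration, not a reading of SimpleBits' actual markup, and `build_outline` and `render` are hypothetical helper names.

```python
def build_outline(headings):
    """Nest a flat list of (level, title) pairs the way an outliner would.

    Each heading becomes a child of the nearest preceding heading with a
    lower level number; headings with no such predecessor are top-level.
    """
    root = []
    stack = []  # (level, children-list) pairs for the currently open headings
    for level, title in headings:
        # Close every open heading that is at the same depth or deeper.
        while stack and stack[-1][0] >= level:
            stack.pop()
        node = {"title": title, "children": []}
        (stack[-1][1] if stack else root).append(node)
        stack.append((level, node["children"]))
    return root


def render(nodes, depth=0):
    """Return the outline as indented bullet lines."""
    lines = []
    for node in nodes:
        lines.append("  " * depth + "• " + node["title"])
        lines.extend(render(node["children"], depth + 1))
    return lines


# Assumed levels: entry titles are h2, menu section titles are h3.
page = [
    (2, "Subscribable Validation"),
    (2, "Wireless City"),
    (3, "My Book"),
    (3, "Recommended"),
    (4, "Weblogs"),
    (3, "Misc."),
]
outline = build_outline(page)
print("\n".join(render(outline)))
```

Because the menu headings come after the last entry in source order and carry a higher level number, they end up as children of "Wireless City", even though nothing in the markup says they belong to it.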

I assume that a structure such as this is probably what most of us are going for, whether we're aware of it or not:

  • Content
    • Entry title
    • Another entry title
    • Yet another entry title
  • Menu
    • Menu section title
    • Another menu section
    • Yet another menu section
      • Possibly a sub section

I think you'll agree that this outline, as opposed to the previous one, more accurately describes a given weblog's document structure.

My current structure is even worse, seeing as how I've used Hn-tags to mark-up the dates of each entry, as well as my "slogan", which, when you look at the outline, doesn't make sense at all.

A few random examples:

I hope I haven't offended any of the people whose websites I've made examples of, by scrutinizing and picking their respective document structures apart. I made examples of these (random) people's websites only because I have great respect for each of them, and assume that they have thought very carefully about how they mark-up their content.

Run your website through the W3 validator (verbose mode, check the "outline" box); is the structure accurate? In your opinion, how should a document structure be marked-up? Are any of the examples brought up here accurate?
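
If you'd rather check a page locally, a rough approximation of such an outline can be sketched with Python's standard `html.parser` module. This is not the validator's algorithm, only a simple heading extractor under stated assumptions: it indents by heading level alone, and `HeadingCollector`, `heading_outline` and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1-h6 element, in source order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == "h%d" % self._level:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    """Return one indented line per heading, indented by its level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + "h%d %s" % (level, text)
            for level, text in collector.headings]


sample = """
<html><head><title>Example weblog</title></head><body>
<h2>First entry</h2><p>Text.</p>
<h2>Second <em>entry</em></h2><p>Text.</p>
<h3>Links</h3><ul><li>A link</li></ul>
</body></html>
"""
for line in heading_outline(sample):
    print(line)
```

In the sample, the "Links" heading is meant as a side-column section, yet it prints indented beneath "Second entry", which is the same effect described above.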

Update:

These two articles expand on this topic and, in my opinion, explain more eloquently how to pursue a logical document structure using Headings in practice.

Update 2:

The W3C is pretty clear on this issue after all, not in the HTML spec, but in the Web Content Accessibility Guidelines.

Since some users skim through a document by navigating its headings, it is important to use them appropriately to convey document structure. Users should order heading elements properly. For example, in HTML, H2 elements should follow H1 elements, H3 elements should follow H2 elements, etc. Content developers should not "skip" levels (e.g., H1 directly to H3). Do not use headings to create font effects; use style sheets to change font styles for example. HTML Techniques for Web Content Accessibility Guidelines 1.0
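
The "do not skip levels" rule is easy to check mechanically. Below is a minimal Python sketch, assuming the heading levels have already been extracted as a list of integers in source order. Treating a document that starts below H1 as a skip is my own reading of the guideline, not something the quoted text spells out, and `skipped_levels` is a hypothetical name.

```python
def skipped_levels(levels):
    """Return (index, previous, current) for each heading that skips a level.

    The start of the document counts as level 0, so a page whose first
    heading is not h1 is reported as a skip (an assumption, see above).
    """
    problems = []
    previous = 0
    for index, current in enumerate(levels):
        # Going deeper by more than one level at a time is a skip;
        # going back up by any amount is fine.
        if current > previous + 1:
            problems.append((index, previous, current))
        previous = current
    return problems


print(skipped_levels([1, 2, 3, 2, 3]))  # well-ordered headings
print(skipped_levels([1, 3, 2, 5]))     # h1 -> h3 and h2 -> h5 skip levels
```

Note that a check like this only catches skipped levels. A page can pass it and still have its menu headings nested under the last entry.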

Lots of smart ideas are being expressed in the comments section of a related post at Eric Meyer's. I'm especially sympathetic to the notion of separating the content from the website wrapped around it, i.e. not using Headings whatsoever in navigational elements (which I briefly mentioned in this post), but this would require some other way of finding the navigation semantically, such as a new <navigation> element, which the WHAT Working Group is considering.

Update 3:

Simon Collison hosts a panel in which he asks Richard Rutter, Jason Santa Maria, Andy Budd, Andy Clarke, Mike Davidson, D. Keith Robinson, Jon Hicks and Paul Scrivens about their perspective on the use of Headings.


Comments

  1. This is one thread I'll follow carefully, as I don't have anything to contribute but more questions. Seeing your sample outline makes logical sense, but so does applying some form of heading to the site title which doesn't make as much sense in outline view.

    Once again, this is the problem with semantic markup -- the guidelines suggest but don't actually guide usage. It's up to author interpretation, and nine times out of ten we get it wrong until someone comes along and definitively tells us how to do it right.

    It seems ridiculous that the validator's outliner is what's supposed to guide us in this case due to its relative obscurity, but it does look like a useful guide.

    Comment by Dave S. at 15:54, 19 Jul, 2004 #

  2. There is an Extension for Mozilla which shows the outline in the sidebar:
    Outline Extension 0.1
    Hit alt+O after installing it.

    What I use for my site right now is:

    * heading for entry title
    --* possible sub-headings
    --* possible sub-headings
    * heading for entry title
    --* possible sub-headings
    * heading for menu-section
    * heading for menu-section

    However starting with h2. (work in progress)
    h1 will be the Section Title. On second thought this only makes sense for multiple entries per page. When viewing a page with only a single item, the Entry-Title should be h1, theoretically. However this would somewhat complicate using sub-headings within posts, because they would have to be changed depending on context (single-entry vs. entry-list), which most software I know doesn't support right now.

    So for me I'll probably live with the compromise and keep using h2 for entry-headings on single-item pages.

    Comment by Sencer at 17:48, 19 Jul, 2004 #

  3. Very interesting. I've always looked at heading order as a level of importance, and sidebar items aren't as important (to me) as entry titles. But the outline view from the W3C is telling us something different.

    As Dave mentioned earlier, the spec suggests, but doesn't guide us to the correct usage.

    There's been recent debate over the h1 element, and how this shouldn't be repeated -- or saved only for site titles. But looking at headings as an outline, this clearly wouldn't be wrong depending on your structure.

    The combinations are endless, and without clear instruction, we're left to make our own educated guesses when it comes to implementation.

    Comment by Dan Cederholm at 18:06, 19 Jul, 2004 #

  4. How about marking up the page so that the main content and any sub content have an associated h1, but then hiding them using CSS if you don't want them to be displayed?

    h1 - News
    h2 - News item 1
    h2 - News item 2
    .........
    h2 - News item x

    h1 - Secondary Content (hidden using CSS)
    h2 - Latest Links
    h2 - Favourite Sites
    etc

    Comment by Andy Budd at 18:15, 19 Jul, 2004 #

  5. Dan: "I've always looked at heading order as a level of importance"

    Thanks, I couldn't quite find the proper words myself, but yeah, that's what I think most people do; they use headings to denote importance, not realizing that it makes the document structure -- which headings are actually supposed to denote -- seem scrambled.

    My opinion is that H1 should not be used for the site title at all, that's what the title-element is for. Just my opinion, of course (and not one I've had long enough to try it in practice).

    In my example, a heading called "Content" and one called "Menu" would be marked up as H1, and I think this is probably the best way to mark-up the content structure, even though it looks really weird if you view headings as something that denotes significance, as opposed to structure.

    Andy: I agree, that's basically what I was thinking with my example outline, although I didn't mention the visual representation.

    Comment by Tomas Jogin at 18:23, 19 Jul, 2004 #

  6. I've started using H# tags for orders of importance and only for "header" like elements. This makes the most sense to me. It also creates a viewing size of the content when the stylesheet is not present that reads more correctly with regard to emphasis.

    In my new mark-up for DxF, the header tags now denote how important they are in the layout, even if they don't follow a concrete order.

    For what it's worth...

    Comment by Andrei Herasimchuk at 18:24, 19 Jul, 2004 #

  7. BTW, in my new mark-up, I use one and only one H1 tag for the lead article title, then use H2 for sub-section headings. H3 is for document titles in sections if they exist, or things that are less important than sub-headings. H4 is for parenthetical headings and H5 for menu navigation.

    Comment by Andrei Herasimchuk at 18:27, 19 Jul, 2004 #

  8. Andrei: That's the "traditional" way of marking up content, the problem is that the document structure doesn't match the intended page structure.

    To take your website as an example this time, your structure (the front page) suggests that your "Bonfire" and "Hot off the press" sections, etc, are located hierarchically below your latest entry, which of course they're not.

    To think of headings as something that denotes structure, as opposed to emphasis (importance) seems odd and it is certainly not how most people have used it before, but it is nevertheless the correct way to use heading tags.

    To quote the validator: "If this does not look like a real outline, it is likely that the heading tags are not being used properly. (Headings should reflect the logical structure of the document; they should not be used simply to add emphasis, or to change the font size.)"

    Comment by Tomas Jogin at 18:36, 19 Jul, 2004 #

  9. > In my example, a heading called "Content" and one
    > called "Menu" would be marked up as H1, and I think
    > this is probably the best way to mark-up the content

    I disagree. I don't think there's need for yet another layer that separates data and meta-data.

    HTTP-Header vs. (X)HTML-Document
    HTML-Head vs HTML-Body

    And now an additional layer for "real" content vs. menu-fluff? If there is no meaningful title for what you call "content" (like e.g. "Homepage", "Archive", "Webdesign-Section") then I really see no point. But maybe that's already what was meant and I just understood it wrong...

    Comment by Sencer at 18:43, 19 Jul, 2004 #

  10. Sencer: I think you misunderstand me. What I'm suggesting has nothing to do with separating data from meta-data, it only concerns representing an accurate document structure, as opposed to a scrambled document structure.

    Again, headings should be used to denote structure, not emphasis (importance). This is not the way most people use headings today, that's why I wrote this post in hope of getting a few suggestions of how this should be done.

    Comment by Tomas Jogin at 18:47, 19 Jul, 2004 #

  11. Andrei, your structure has a lot of missing headings, however (see Design by Fire outline). Those would mostly be fixed by replacing all your h5's with p's for the subtitles/taglines of the various sections.

    Below is what my structure now looks like (after reading this outline). Please ignore the fake titles and all, I'm still in development of the site :)

    * The KuraFire Network
    ... o Site Navigation
    ... o Latest Blog entry
    ... ... + Some article on CSS
    ... ... ... # Some sub heading
    ... o Previous entries
    ... ... + Some Random Entry Title
    ... ... + Some Random Entry Title
    ... ... + Some Random Entry Title
    ... ... + Some Random Entry Title
    ... ... + Some Random Entry Title
    ... o Recent comments
    ... ... + Doug Bowman on Name of some article that was commented on
    ... ... + Doug Bowman on Name of some article that was commented on
    ... ... + Doug Bowman on Name of some article that was commented on
    ... ... + Doug Bowman on Name of some article that was commented on
    ... ... + Doug Bowman on Name of some article that was commented on
    ... o Additional data
    ... ... + Article Summary
    ... ... + June 2004
    ... ... + Recommended


    I disagree with Tomas on the sitetitle h1-vs-title issue, as his suggestion would prevent one from ever righteously putting the title of a specific entry in the TITLE element of the page, which almost all blogs do (when you view the permanent URI of that entry), and which, imho, makes sense as well. If only for being more practical, as being able to see in your taskbar or tabbar that you're on a particular entry is very useful AND user-friendly.

    Comment by Faruk Ates at 18:53, 19 Jul, 2004 #

  12. Hi Tomas,

    maybe I was. Let's use an analogy (which certainly doesn't mean we have to apply it 1-1 to the net):

    A Book. If you look at the Table of Contents, you'll see

    * Preface
    * Chapter Introduction
    * Chapter Some of this
    * Chapter Some of that
    * Bibliography
    * Glossary
    * Index

    In the same way, My proposal would be to use:

    * Entry about cats
    * Entry about Dogs
    * Entry about Lions
    * Menu

    What I was trying to say with my first and second comment is that, IMHO, there's no benefit in grouping entries under a subsection if you are going to call that section "Content". What is there to gain?

    Comment by Sencer at 18:54, 19 Jul, 2004 #

  13. Faruk: Other than the use of H1 for site titles, it seems we agree whole-heartedly (and I'll have to think some more on that, you're right that the title-element doesn't always contain the title of the web page, didn't think of that).

    Sencer: I'm not sure your book analogy is entirely applicable, seeing as how a table of contents is one thing, and a document structure is something else.

    Comment by Tomas Jogin at 18:57, 19 Jul, 2004 #

  14. "Again, headings should be used to denote structure, not emphasis (importance). This is not the way most people use headings today, that's why I wrote this post in hope of getting a few suggestions of how this should be done."

    I think it's definitely justifiable to consider the name of your site as a root element of the document structure. After all, your site is where all the various sections and documents are - why not include it in the core hierarchy?

    Most of us here don't use H1 for making their site title bigger or more obvious anyway, as we all use some form of image replacement technique.

    Comment by Faruk Ates at 18:59, 19 Jul, 2004 #

  15. I had similar questions and came up with this. (Ironically the markup at fiftyfoureleven doesn't follow that idea.)

    Basically, the H1 holds the title of the page (a blog post title, for example), while H2s can hold the title of the subsections of the page (content, nav etc.).

    I think the idea behind getting to this is to think of the entire contents of an html document (not just the post itself, for example), and then break things down from there.

    Ex: The page is about Green Widgets (h1), the navigation is the navigation for the Green Widgets page (nav=h2) etc.

    Pretty much like your example Tomas, however I would use h2's for 'content' and 'menu'.

    Comment by Mike P. at 19:05, 19 Jul, 2004 #

  16. This problem is the main reason I really like XHTML 2. Since headings are just a generic tag they are only taken in the context of their nesting, which makes much more sense than having to keep track of numbers. This becomes especially useful when storing document fragments.

    But I guess waiting for widespread XHTML 2 support isn't going to help the problem.

    Comment by The other David S at 19:49, 19 Jul, 2004 #

  17. Sencer > The name "content" is a place holder for whatever content will be inserted in the section. The reason it is called "content" instead of the actual name is simply that content is dynamic. I agree that the "entire" site can be considered content within itself versus meta-data, but the word "content" here is being used to describe the meat of the page. In a book, this would be everything between the covers excluding the table of contents, prologues, introductions, etc... There might be a better word out there to use rather than "content" but I wouldn't know what to use. I do like your idea of using the actual name of the content instead. I will have to wrap my head around this a little further. This is off-topic though, as Tomas was not arguing with what we should name our sections with, he was arguing which headings to use for those names. He was using "content" and "menu" to simply describe place holders that we all use for various sections. It wouldn't have done him any good to use actual names in that case as no-one would have any idea what he was talking about.

    Faruk > Thank You! This was my frustration with this whole argument over whether or not to use the H1 for the site title. Tomas, the "title" element, which has already been mentioned, is not necessarily the "title" of the site, it is the title of the page. One individual web page that resides in a web site. Most of us keep this structure by using something like "site title | page title." However, your web site itself, the actual name of the domain, is the root. When I make an outline for a class, I will always make the title of the lecture or class the root number. It structurally makes sense. Site > Sections in the site > sub sections of main sections > etc...

    As far as the actual implementation of this, I would think it would be fairly simple. I have never seen the w3c outline before today but it was very helpful, so why not make an outline in MS Word or Omni Outliner before creating markup. A simple outline with a nested structure of a site would give a clear view of what should be what on a site. I know I will be doing this from now on.

    Tomas > Thank you for taking the time to write on this. This is something that I as well have always been a little frustrated with. I have always taken the same method as everyone here and viewed them from importance, not a nested outline structure. Now that I know they are handled that way I will surely go about it with a different strategy.

    Comment by Josh Bryant at 19:52, 19 Jul, 2004 #

  18. The W3C recommends that the h1 be the same as the title. I interpret this to mean that both should refer to the title of the document. I only use one h1, and have h2s in the content and navigation.

    Comment by kirkaracha at 20:01, 19 Jul, 2004 #

  19. There was a related discussion last year at SimpleBits. (I stole my own comment from there.)

    Comment by kirkaracha at 20:03, 19 Jul, 2004 #

  20. I believe ISO 15445 is as good a place as any to look for guidelines:

    QUOTE cite="http://www.cs.tcd.ie/15445/UG.HTML#H1.NEST"
    ISO-HTML considers that the <H1> [W3C 7.5.5] element specifies the beginning of a major section of a document and contains the title of that major section. [...]
    ISO-HTML considers that the <H1> [W3C 7.5.5] through <H6> [W3C 7.5.5] elements identify sections of increasing depth and requires that the trees formed by the containment of sections be rooted at the <H1> [W3C 7.5.5] element, and that no intermediate level be skipped.

    From this I gather that H1 should not be used for a similar purpose as TITLE (you can and should have many), and that heading levels should form a cohesive hierarchy. I'm sold. :)

    Comment by J. King at 20:17, 19 Jul, 2004 #

  21. In case anyone's interested, I wrote the outliner in comment 2 and, as a result of this post, I've started some debugging - hopefully I'll have an improved version sometime very soon.

    Although I'm not sure that the number of h1 elements on a page makes any difference in the grand scheme of things, the fact that people are talking about this in 2004 rather than 1994 illustrates the fact that no vendor of a popular UA has made any serious attempt to extract semantics from the document to display to the user. Taking the example at hand, other document manipulation programs (Word, Powerpoint, Acrobat Reader) regard presenting an outline view of the document to be a basic function. Yet it's not present in any HTML UAs.

    The situation gets worse if you look at extracting more complex semantics. For example, the first extension I created displayed a list of document accesskeys and tried to extract descriptions for those accesskeys (see the mozillazine post for a link). Now, the extension code is pretty buggy and doesn't work well with all elements, but looking at real-world markup, it's clear that authors don't think with this kind of application in mind (even people who preach semantic markup don't always produce markup from which it's easy to extract semantics). This is, at least partially, a failure on the part of UA authors to provide tools that extract semantics so that the motivation for producing clear, useful markup is there.

    I suppose there are examples to the contrary, such as displaying tooltips for image titles and for acronym elements. Of course, Netscape 4 didn't help by displaying 'alt' as a tooltip. Whilst useful, these examples are pretty trivial and don't really benefit from improved document structure - the extra information is presented if the relevant attribute is present.

    Comment by jgraham at 20:47, 19 Jul, 2004 #

  22. jgraham: "Taking the example at hand, other document manipulation programs (Word, Powerpoint, Acrobat Reader) regard presenting an outline view of the document to be a basic function. Yet it's not present in any HTML UAs."

    I think you're spot on here, I can't think of any browser which treats Headings as anything but a modifier of emphasis. Well, except Lynx I guess.

    Given that all browsers, except Lynx, treat Headings as something that denotes importance/emphasis, as opposed to structure, it's no wonder that markup producers do so too.

    Comment by Tomas Jogin at 20:53, 19 Jul, 2004 #

  23. If you look in Microsoft Word, there is something like a normal view and an outline view (excuse me that I draw on Word for the analogy)

    Normal view is a bit like what you get if you look at a HTML document without any visual formatting and the 'correct' tags to mark-up your document.

    When structuring my HTML documents, I try to give information that is on the same level the same kind of heading number.
    h1 would be page title (not site title), h2 a subsequent text header.
    But I also use h2 as the level for sidebar information headers since most of the time they are on the same kind of information level as the h2 heading in the body text. Giving a sidebar heading level h3 or lower would make them part of the logical tree of the preceding h2, like outline view would show. This is not logical at all.


    But if you would have, say, an index page with a h1 saying "index" and you want to group information that would be a h3 level on another page, then according to the outline view of the W3C validator you made a mistake, while Word's outline view would still nest the information properly.

    Looking at the index page in normal view would give you the same kind of information as you get from another page that has both h2 and h3 information levels.

    If I had to apply the W3C rules I would have to promote all h3 headings on my index page to h2, which would not make any sense in normal view.

    How to solve this?

    The question remains if outline view does make a difference (except maybe for DOM scripting?) in real life, since most people read pages in normal view, which could be of a more consistent logic than the outline view would suggest. Most designers would use that logic in structuring their documents, as the links used in the article show.

    Comment by Martijn ten Napel at 20:57, 19 Jul, 2004 #

  24. forgive a novice's confusion, but... why should i care? what is the significance of having a document structure that is up to snuff? and who decides what's snuff and what isn't?

    Comment by newbee at 21:36, 19 Jul, 2004 #

  25. newbee: Well, unless I'm mistaken, assistive screen reader software reads out the titles of Heading tags. A mess of a structure will thus be conveyed as a mess of a structure.

    If you're new to why web standards and/or accessibility is a good thing at all, this discussion is probably a bit over your head, if you're really interested you might want to take a very deep breath and dive into the wonderful world of accessibility and web standards.

    Comment by Tomas Jogin at 21:44, 19 Jul, 2004 #

  26. Maybe the issue lies in the categorization of the side column. This is really nothing more than a list of other pages/links/resources offered by the site and maybe should be marked up as a list. Or even better, a definition list <DL>, using the <DT> tag for the "headers" and the <DD> tags for the list items. Other formatting provided by CSS of course. This is just my two cents.

    Comment by cedmond at 21:56, 19 Jul, 2004 #

  27. I have gotten in the habit of marking up entire documents as a series of nested dl elements, because it expresses structure and hierarchy so clearly. I don't really see what h1 means aside from the top level in a hierarchy.

    If I need more conventional structure, an H2 for example, it's easy enough to do with xslt.

    Comment by Lucas Gonze at 00:29, 20 Jul, 2004 #

  28. I have a question to add in all this: How would you guys code a date in a blog page? I personally (as seen in my work in progress site) have the dates before actual blog entry titles, sort of like headers themselves, so I coded them accordingly: h2's for dates, h3's for entry titles. Would the proper, structural way to code them be to recode them as a span or something?

    Thanks, by the way, for this discussion. I'm probably gonna recode my code (again) using the ideas here. I love how the community rallies together on topics such as this.

    Comment by Funkatron at 02:43, 20 Jul, 2004 #

  29. Title - Relevant Title pertaining to content of page, i.e. Name of Book
    h1 - Author, i.e. web design company or Book Author
    h2 - Let's say slogan
    h3 - Top Navigation
    h4 - Sub category of Top navigation
    h5 - Article Title
    h6 - Footer


    Isn't that a much more logical list of how to outline your pages' content? If you look at it as just mark-up it makes sense, doesn't it?

    ok sorry about the double post first was a boohbooh. hope no third time.

    Comment by Kay Bentain at 03:18, 20 Jul, 2004 #

  30. This is something I've been thinking about quite a lot recently, especially after reading Dan's book.

    Personally, I have a very simple outline on my home page, but this ignores the weighting/importance of each heading in trying to achieve a logical structure (ie in my case, the sidebar gets the same importance, structure-wise, as the content). Not sure that this is optimal.

    All through school and uni, we were hammered with a structure for writing essays, reports, etc - title at the top, then headings, sub-headings, etc. Why shouldn't this hold for writing on the web as well? It defines a logical page structure, after all.

    And by using CSS, we can retain that structure while still fitting our desired look and feel (in my case, removing the h2s as they are a little redundant).

    Time for a code revisit, methinks. :)

    Comment by Cam at 04:37, 20 Jul, 2004 #

  31. On a side note, the w3 specs on headings do not state that the heading numbers denote hierarchy. Rather, they state:

    "There are six levels of headings in HTML with H1 as the most important and H6 as the least. Visual browsers usually render more important headings in larger fonts than less important ones."

    This leads to news sites like The Age implementing, IMHO correctly, the most important headline of the day as H1 and the other stories as H2.

    As such, I think the w3's outline view is broken since, by the w3s specs, heading numbers only outline importance.

    Comment by Adrian at 07:58, 20 Jul, 2004 #

  32. To take your website as an example this time, your structure (the front page) suggests that your "Bonfire" and "Hot off the press" sections, etc, are located hierarchically below your latest entry, which of course they're not.

    I see what you mean now.

    The biggest problem, imho, is that I don't view my site as an outline of content, and therefore attempting to be semantically correct at this level is somewhat of a pointless exercise in my brain.

    My site hierarchy is such that the latest article is at the top, then there are other sections outside of that, yet of lesser importance. If I were to draw a bulleted outline of them, it would look like:

    H1.0 - Lead article

    H2.0 - Bonfire
    P - Item
    P - Item
    P - Item

    H2.1 Hot off the Press
    H2.1.1 - Title
    P - Item
    H2.1.2 - Title
    H2.1.3 - Title

    The problem with the H# tags is they do not allow this outline structure, where H1.0 and H2.0 and H2.1 can live at the same indentation level. I'm somehow supposed to make the lead article section an H1 (or even H2) tag, and then also make the Bonfire the same H# tag, but then style it so it looks visually less important. However, if this structure is not viewed with the stylesheet, then Bonfire *appears* as important as the lead article due to nothing more than having the same font size along with the same indentation level.

    This is not what I want, nor is it how I think of my content.

    So until then, I'll use the header tags as a means of reflecting the importance in my brain, regardless of whether the hierarchy is incorrect. (For me, Bonfire is H2.0 while Hot off the Press is H2.1, but I can't add the #.# convention to the semantics of the code.) I would rather live with the misuse of H# tags than with inferring the incorrect importance when viewed without the style sheet.

    Am I making sense here?

    Comment by Andrei Herasimchuk at 09:49, 20 Jul, 2004 #

  33. There seems to be some confusion between the W3C specification referenced in comment #31, and the ISO spec (comment #20).

    If (as seems to be suggested by the W3C Validator document structure output that Tomas references throughout the article) the ISO "hierarchical" purpose of heading tags is the correct one, then I think Andy Budd (#4) is probably the closest to the quote-unquote correct way to mark up a document:

    h1. My Blog
    --h2. An entry
    ----p. Some writing
    ---h3. A sub-heading
    ----p. A bit more writing
    --h2. Another entry
    ----p. Some more writing
    --h2. Last entry
    ----p. Even more writing
    h1. Sidebar
    --h2. Blogroll
    ----ul. List'o'links
    --h2. Buy Me Stuff
    ----a. Amazon wishlist link

    et cetera. The structure of the document is correct, although within the design of the site you may not wish to actually display the h1 text.

    It's debatable what the value of having multiple h1 tags would be in terms of search ranking, which I guess is as much behind our desire to mark stuff up correctly (at least on commercial sites) as the need to be semantically (and/or hierarchically) correct.

    Comment by Matthew Pennell at 10:22, 20 Jul, 2004 #
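
An outline like the one in comment 33 can be produced mechanically from a page's heading tags. The following is only a rough sketch of that idea, not the W3C Validator's actual algorithm; the class name `HeadingCollector`, the function `outline`, and the sample markup are all made up for illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for every h1-h6 element it sees."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H1" arrives here as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def outline(html):
    """Return an indented text outline, two dashes per level below h1."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(
        "--" * (level - 1) + f"h{level}. {text}"
        for level, text in collector.headings
    )


# Hypothetical page, loosely following the structure sketched in comment 33.
SAMPLE = (
    "<h1>My Blog</h1>"
    "<h2>An entry</h2><p>Some writing</p>"
    "<h3>A sub-heading</h3><p>A bit more writing</p>"
    "<h2>Another entry</h2><p>Some more writing</p>"
    "<h1>Sidebar</h1>"
    "<h2>Blogroll</h2><ul><li>A link</li></ul>"
)

print(outline(SAMPLE))
```

Only headings end up in the outline; paragraphs and lists are skipped, which is roughly what an outline view is expected to show.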

  34. It seems to me that all of this rather depends on marketing needs than anything else. On a site intended to promote a brand, the name of the brand is going to be the most important thing. On the other hand, sites that sell products should probably use <h1>s for the product name on each product detail page, and so on. The objectives of the site determine what the most important thing on the page is, and surely that should be what you pop in an <h1>?

    Comment by Dave Child at 10:53, 20 Jul, 2004 #

  35. Andrei:

    "Am I making sense here?"

    Yes, yes you are. However, what you want to achieve visually can still be done when using improved hierarchy in H# tags. You just have to add some superfluous tags to denote the overall sections of your page:

    H1.0 - Lead article

    H2.0 - Further DxF Content:

    H3.0 - Bonfire
    P - Item
    P - Item
    P - Item

    H3.1 Hot off the Press
    H3.1.1 - Title
    P - Item
    H3.1.2 - Title
    H3.1.3 - Title


    That would be improved hierarchy for your page, yet still have Bonfire and Hot off the Press as less important when viewing without stylesheets. Then you just use styling to not display the superfluous <h2>Further DxF Content</h2>.

    Comment by Faruk Ates at 13:14, 20 Jul, 2004 #

  36. This is something I put a lot of work into on my site. I'm still working on it, but the goal is to get the XHTML markup to a level where I'll be happy with it for at least 18 months without major changes.

    As you can see from the outline view of my site, the document structure is:

    Document (not site) title
    -- A description of the main content
    ---- Blog entry title
    ---- Blog entry title
    ---- Blog entry title
    ---- Blog entry title
    ---- Blog entry title
    ---- Blog entry title
    ---- Blog entry title
    -- Contextual information
    ---- About the page
    ---- Latest project
    ---- Recent Links
    ---- Syndication

    The <h1> describes what the page is, and the <h2>'s section this document into two parts: the main content and the contextual information (incorrectly labelled "links"). These second-level sections are then further split with <h3>'s.

    I can't see a better way of structuring the page. The big problem is that Movable Type doesn't have nice header support, so for entries with subheadings, the outline screws up. For the individual entry pages, the <h1> should be the title of that entry. I still haven't come up with a solution I'm happy with, other than rolling my own blogging software.

    You'll have to excuse the rambling, there is not yet enough coffee in my brain.

    Comment by David Barrett at 13:19, 20 Jul, 2004 #

  37. It sounds simple, but when I'm asked to design a CSS-based site I always start with the site with NO CSS. This way I can make sure the markup is logical, it's accessed in the correct manner, and it all just makes sense.

    Comment by mark at 13:29, 20 Jul, 2004 #

  38. I disagree that numbering the 'most important' topic on a page as <h{n}> and sibling topics deemed 'less important' as <h{n+1}> is at all useful. Although allowed by the letter of the spec, it certainly doesn't match the examples given, the tighter spec of ISO-HTML or, most significantly, any likely use-case for a user (or user-agent, if you prefer) extracting semantics from the document.

    As far as I can tell, the most likely things that a user would want to do with headings are to create a document outline to aid in the navigation of long documents (this might be very useful for a small-device where scrolling is difficult or an aural UA to prevent unnecessary sections being read) and adding section numbering (as suggested by the spec). Using headings to denote something other than a tree-like structure diminishes the effectiveness of either of these UA features.

    The only benefit I can see for having structural siblings marked up as different heading levels is that the default styling in nonstandard UAs (i.e. ones that the author neglects to provide a stylesheet for) might be better. However I'm not even sure this is the case - look at a site that does this with author styles off and you get a strong impression of a hierarchy of sections rather than a set of siblings.

    I suppose one might argue that marking the most important article of a set as a higher level heading provides useful information, but I'm struggling to think of use cases where this wouldn't be obvious from the context or useful things that a UA could do with this information. All the use cases I can imagine involve specific knowledge of the site purpose and structure and therefore could be implemented just as easily if the additional information were encoded in a specific class or id.

    If you're using different heading levels for content at the same level in your document hierarchy as a means to attach style, you should seriously consider using classes or ids to do the same job.

    Comment by jgraham at 13:37, 20 Jul, 2004 #
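
The section-numbering use case from comment 38 can be sketched in a few lines. This is a hypothetical illustration, not anything a particular user agent is known to do; the function name `number_headings` and both sample heading lists are invented for the example.

```python
def number_headings(headings):
    """Turn (level, text) pairs into strings like '1.2 Some heading'.

    One counter is kept per heading level (h1-h6). Seeing a heading
    increments its own counter and resets every deeper counter.
    """
    counters = [0] * 6
    numbered = []
    for level, text in headings:
        counters[level - 1] += 1
        for deeper in range(level, 6):
            counters[deeper] = 0
        number = ".".join(str(c) for c in counters[:level])
        numbered.append(f"{number} {text}")
    return numbered


# Siblings marked up at the same level (the hierarchical approach).
SIBLINGS = [(1, "Lead article"), (1, "Bonfire"), (1, "Hot off the Press")]

# The same three items with levels used to express importance instead.
BY_IMPORTANCE = [(1, "Lead article"), (2, "Bonfire"), (2, "Hot off the Press")]

print(number_headings(SIBLINGS))
print(number_headings(BY_IMPORTANCE))
```

With the second list the numbering reads as if Bonfire and Hot off the Press were subsections of the lead article, which is the effect the comment is objecting to.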

  39. I don’t think we should be using a heading structure for blogs. To me, blogs are quite different to the type of sites that this structure was intended for. This is why I think we have a problem.

    When applying the heading structure to a “standard” information site, then the structure works - it will only be used to mark up the main content. A document in an information site suits a pyramid style of writing, and also a logical heading structure.

    On a blog however, the main content is displayed outside of a normal document structure, that is, a blog is a list of entries where the hierarchy is purely time-based (latest entry at the top etc.) Other important content, that would fit the heading structure, sits in the sidebar but as this is typically associated with navigation, this causes confusion. I think we're always going to run into a problem when trying to apply a heading structure to this type of site.

    So, I don't think we should be trying to apply this heading structure to blogs - the content just doesn't work with the structure.

    Does anyone agree?

    Comment by Paul Nattress at 13:56, 20 Jul, 2004 #

  40. Cam: That's perfectly optimal as Headings are, according to the W3 validator, not supposed to be used to denote emphasis.

    Andrei: In your latest example outline, your H2 items would be hierarchically (structurally) located below the Lead Article. Does your Hot Off The Press and Bonfire sections really belong inside your Lead Article?

    Comment by Tomas Jogin at 14:00, 20 Jul, 2004 #

  41. Paul: "I don’t think we should be using a heading structure for blogs. To me, blogs are quite different to the type of sites that this structure was intended for."

    I disagree. The very first website, created by the inventor of the WWW and HTML, was a blog, about what was fresh on the web. Blogs aren't something new and completely different, they're just online journals, certainly not something that can't be structurally marked up.

    Comment by Tomas Jogin at 14:04, 20 Jul, 2004 #

  42. Tomas said: "Blogs aren't something new and completely different, they're just online journals, certainly not something that can't be structurally marked up."

    In fact, I would dare be so bold as to say that Blogs are generally easier to be marked up with a proper structure, than non-blog websites. Blogs, by definition, have a very standard structure that is virtually always the same, no matter what blog you go to. Nearly all of them consist of a collection of entries/articles, which each have their own heading, sometimes subheadings, and sometimes comments. Furthermore, nearly all blogs online have similar additional content / sections, like a blogroll, about page, entry archives, and then a few other, less widely-spread things (like a Portfolio).

    Non-blog sites can vary immensely in that aspect, making them far more challenging to give a proper structure, as more often than not you'll have to sit down and take a good long look at what the true hierarchy of the site is or will be. With a blog, you often already have that hierarchy, and it's just your specific adaptation and use of Heading tags that make the difference between blog A and blog B.

    Comment by Faruk Ates at 14:18, 20 Jul, 2004 #

  43. New version of outliner extension

    Requires Firefox 0.9 or above - bug reports in the mozillazine thread or by email, please. You may need to grant forums.mozillazine.org permission to install extensions (although with 0.9 - 0.9.2 I don't think it's an issue).

    Actually using the headings on the page for a purpose (rapid site navigation) gives a rather definite perspective on whether a site has good headings or not and, in particular, you quickly appreciate that blogs that mark up their headings well (several contributors to this discussion) are creating a better user experience than people who mark them up poorly (in particular, the navigation items need to be clearly separated from actual post content, site or article metadata such as ISSNs and author names don't need to be headings) which in turn are much better than blogs that don't mark them up at all (slashdot and doubtless others too).

    Google would also be even more useful if they could find someone with a clue about markup.

    Comment by jgraham at 14:37, 20 Jul, 2004 #

  44. I think a further source of confusion (which possibly makes this problem unsolvable at this point) is the meaning of the term 'document'. As Faruk correctly points out, the title element should identify the contents of a document. Hn elements specify the level of importance of each heading within a document.

    It is common practice to include the name of the site in the title element (which to some may be a diversion from the specification). The question is do the site title, navigation and indeed all other non-content elements of a particular 'page' fall within the scope of the document? Or is a document (in its common form on web pages) actually made up of several documents where the importance of hn elements should be considered to be self-contained?

    We are given the div element to add structure to a document so perhaps the importance of heading elements should only be relative to the div in which they are contained. A possible structure may then be:

    div "header"
    -h1 "Site Title"
    /div

    div "navigation"
    -h1 "This is the navigation"
    --h2 "Articles"
    ---li "Article 1"
    ---li "Article 2"
    --h2 "Photos"
    ---li "Photo Gallery 1"
    ---li "Photo Gallery 2"
    /div

    div "content"
    -h1 "Article title"
    -p "This article is about..."
    --h2 "Section One"
    --p "Body text"
    --h2 "Section Two"
    --p "Body Text"
    -div "Sidebar"
    ---h3 "Sidebar Title"
    ---p "Some interesting but 'less important' information related to the main article"
    -/div
    /div

    Not the definitive answer I'm sure, just a suggestion, but the crux of the issue seems to be the scope of the importance of various heading elements in a page/document relative to each other.

    Comment by Greg Fahy at 16:21, 20 Jul, 2004 #

  45. I really like Greg's structure example, using multiple H1's, one for each div-defined section of the page.

    I can see various reasons why your (Greg) solution would work very well, but I can also see that many people will disagree with it and stick to the approaches most people are currently using.

    For one thing, I think we'll all agree with the notion put forth by several people in this discussion already: there is no clear, well-defined guideline that tells us HOW we are supposed to use Hn tags. Not on their own, but neither when seen with the context of the entire HTML page/document.

    Sounds like we need a flattened, usable XHTML Specification that's written for website builders (and not browser developers)... hey, whaddya know, I'm already working on that! :)

    More at 11.

    Comment by Faruk Ates at 16:39, 20 Jul, 2004 #

  46. This has been interesting reading. Just my .02:
    (I respect a lot of the 'posters' here, and am interested in their reactions to my comments)

    In my opinion, title and h1 present the same information, for different purposes. I look at the title tag as meta-data, staged for the browser title bar, and bookmark links. I look at the header tags as part of the pure outline of the page. I approach the page outline as I would a report when I was in school, and had to produce an outline before I handed in a report. Most often (but not always) the title and h1 tags will be identical, or only worded differently.
    I believe the separation of presentation and content suggest that sidebar stuff (ads, links to other posts, navigation, etc.) is not part of the page content, and therefore does not belong in the outline. Therefore, I would avoid use of heading elements in 'sidebar content' altogether.
    This would lead to this 'academic-styled' outline:
    h1 - One per page - Page Title
    h2 - Sub-head
    h2 - Sub-head
    h3
    etc. (with no nav / other post links).

    Thoughts?

    Comment by Jim C at 17:21, 20 Jul, 2004 #

  47. Jogin.com has an interesting post up about document structure, and the lack of consensus about Hn tags to denote structure....

    Trackback from webgraphics at 18:18, 20 Jul, 2004 #

  48. Okay, in the spirit of the title of this blog, I have to voice a counter opinion. In fact, it got so lengthy that I made it an entry in my long-neglected blog (v2.0 pending!). You can read it there if you like, but my arguments boil down to this:

    Some of the H tags used as suggested here are more meta-data than content (eg. <h1>navigation</h1>, <h1>content</h1>). Thus, to fit into the basic paradigm of markup languages, those additional labels for the outline should be in an attribute, not as content. Secondly, H tags don't enclose blocks, so it's semantically difficult to figure out what they refer to (like when people didn't use closing p tags).

    The traditional use of H tags has a lot of value both by the fact of how it's interpreted by all UAs since the beginning of time, and also how search engines use them.

    To leverage all these new benefits you destroy the traditional use of heading tags, which has clear value to all web pages, in exchange for a descriptive hierarchy system that may not apply well to all conceivable page structures, and presents many technical difficulties in presenting dynamic content in different contexts.

    Before I get flamed, I should mention that I support the general concept of better header hierarchies, but before we go to this level of semantic precision let's take a look at what we're losing by throwing away the tried and true concepts.

    Comment by Gabe at 19:13, 20 Jul, 2004 #

  49. "Does your Hot Off The Press and Bonfire sections really belong inside your Lead Article?"

    No, they don't. That's very true.

    But my point was that by putting them outside the hierarchy structure at the same H# level as the lead, when viewed without the stylesheet those sections now have as much visual weight as the lead, which is *not* what I want.

    Comment by Andrei Herasimchuk at 19:34, 20 Jul, 2004 #

  50. Andrei: I can see what you mean, visual (as opposed to Lynx) browsers treat Heading tags as if they denoted headline emphasis, which the Validator, and the spec if you think about it, says that they do not.

    Comment by Tomas Jogin at 19:38, 20 Jul, 2004 #

  51. jgraham: "The very first website, created by the inventor of the WWW and HTML, was a blog, about what was fresh on the web."

    Sorry to nitpick, just wanted to correct this: the first web pages Tim built were to put up contact directories of the scientists at CERN, according to his book anyway.

    It's no less a noble birth for our humble medium either way. In fact, I've gotten a friend excited about all the blogging I've been incorporating at work and he's going to use blogs to tie together information from the various physics labs he does web development for, some here in San Diego and some in Switzerland. Hmmm, physics labs, in Switzerland, using the web... this sounds familiar... :)

    Comment by Al Abut at 21:34, 20 Jul, 2004 #

  52. Well Tomas, you dismissed my argument in my blog without addressing a single issue. How about some actual reasoning? For starters you could address this issue:

    Search engines use headings to determine relative importance of information on a page. This can be useful. Having the current blog post in H1, the next posts in H2 and the menu headings in H3 means you get better search results. If we go totally structural, what practical means are there to replace this functionality (assuming Google et al. follow your advice)?

    Comment by Gabe at 21:41, 20 Jul, 2004 #

  53. Now that my off-topic anal jerk post is out of the way, let me say thank you for posting an informative article. It made me view the role of document structure in a forehead-smacking new way, which I didn't think would happen with the tired "structural markup" threads out there and given how thoroughly the SimpleQuizzes covered the topic.

    I thought I knew everything about the latest and greatest in good structural markup, and although I busted my butt when redesigning my blog last month to create it "the right way" - don't just use CSS, but use real h/p/ul tags, use no images and instead apply the various IR techniques, cut down on meaningless divs/spans, apply classes and IDs as little and as high in the markup tree as possible - it was still lacking in this area and insensitive to how alternative user agents would view it.

    Now, after a quick bit of tweaking, the new document structure makes a bit more sense. So this blog post had a quick and immediate impact on my work - thank you!

    Comment by Al Abut at 22:03, 20 Jul, 2004 #

  54. Gabe: "Having the current blog post in H1, the next posts in H2 and the menu headings in H3 means you get better search results."

    Please provide proof of some kind that what you're saying is true; that, for instance, using H2 for each of them would -- in fact -- mean worse search results.

    Secondly, your comment implies that this is some kind of scheme that I just invented; that Headings were never supposed to denote structure; that this is some kind of "advice" of mine on the future implementation of user agents. That is just wrong, plain and simple. I didn't invent anything, I didn't think of a new use for Headings, I merely pointed out that Headings are supposed to denote document structure. The W3 says so, not me.

    Comment by Tomas Jogin at 22:04, 20 Jul, 2004 #

  55. Gabe:

    I don't think anyone has suggested that elements of text such as blog post titles or sidebar headings shouldn't be marked up using H tags - simply that for an HTML document to make sense hierarchically, it may be necessary to re-think how we use them; changing from "the sidebar isn't as important as the blog entry, so I'll give it a lesser H tag" to "the sidebar is a new section of the site (a branch of the hierarchy) so I'll give it the same H tag as the other branches of the same level".

    Your point about search engine rankings doesn't stand up to scrutiny either - are you suggesting that having more H tags (i.e. more 'important' information in Google's eyes) is going to hurt your search performance?

    I think not.

    Comment by Matthew Pennell at 22:14, 20 Jul, 2004 #

  56. Some of the H tags used as suggested here are more meta-data than content (eg. <h1>navigation</h1>, <h1>content</h1>). Thus, to fit into the basic paradigm of markup languages, those additional labels for the outline should be in an attribute, not as content.

    Actually, they should be elements not attributes - there should be a <navigation> element, a <content> element and so on. The WHAT group are working on extensions to HTML and I plan to advocate that these elements appear in the Web Apps 1 specification. Additionally, XHTML 2 may contain some of this functionality.

    In the meantime, providing headings to separate out content and non-content elements of a page is useful in a number of situations. It provides a nice outline view of the document. It is excellent for text only browsers, since it's clear whether a given section of the document is content or navigation (and, if the browser provides an outline view or the ability to skip to a specific heading, it allows rapid navigation to different document sections). It's doubly useful for speech browsers since it gives a clue as to whether the following section of the document is likely to be interesting or should be skipped. In fact, any rendering where the layout is linear benefits from clear delimitation of the navigation and content parts into their own section (with a heading) so they are easy to navigate between (with a suitably well-written UA).

    Secondly, H tags don't enclose blocks, so it's semantically difficult to figure out what they refer to (like when people didn't use closing p tags).

    Actually, in theory, it's trivial - all headings refer to the part of the document between the heading and the next heading of equal or greater 'importance'. If this isn't true, you're probably doing something wrong.

    XHTML 2 improves on this as well by replacing h{1-6} with <section> and <h> so it's trivial to know which heading applies to which block of markup.

    The traditional use of H tags has a lot of value both by the fact of how it's interpreted by all UAs since the beginning of time

    Most UAs (at least visual ones) don't do anything special with <h{n}> tags at all. The only feature that they provide is a default stylesheet that gives a particular style to particular heading levels - exactly what one would expect for heading levels of decreasing importance, but ultimately unimportant in 2004 when sophisticated styling solutions are available.

    If, by chance, this default UA rendering isn't what you want, yet your headings actually head sections and provide a clear document hierarchy, it's irrelevant. That's why we have CSS - so authors have control over presentational style. The markup is for the machine, the CSS for the human. The number of people who will see your site with no style is tiny. The number of web browsers that see your site with no style is limited only by the number of visitors. Hacking markup because of issues with the browser default style will almost certainly prevent your site working well with browsers that use the markup in some meaningful way, yet provides almost no benefit.

    and also how search engines use them.
    I'm not familiar with the exact details of search engine algorithms but I can't see this being a big issue. If search engines did place much greater importance on h1 elements than h2 elements and so on down to p, I'm sure we'd see entire sites enclosed in <h1> tags styled to be like paragraphs. If your headings are reasonably descriptive of your content and you are using good, structural, markup, including proper hierarchical headings you should get excellent search engine results (assuming anyone links to you ;) ).

    Comment by jgraham at 22:16, 20 Jul, 2004 #
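
The rule stated in comment 56, that a heading covers everything up to the next heading of equal or greater importance, is easy to express in code. The sketch below is only an illustration of that rule under the assumption that headings have already been reduced to a list of levels; `section_spans` and `LEVELS` are hypothetical names.

```python
def section_spans(levels):
    """For each heading, return (start, end) indexes into the heading list.

    A heading at position `start` owns every following heading up to, but
    not including, the next heading whose level number is the same or
    smaller (that is, of equal or greater importance).
    """
    spans = []
    for start, level in enumerate(levels):
        end = len(levels)  # by default the section runs to the end of the page
        for later in range(start + 1, len(levels)):
            if levels[later] <= level:
                end = later
                break
        spans.append((start, end))
    return spans


# h1, h2, h3, h2, h1, h2: a blog section followed by a sidebar section.
LEVELS = [1, 2, 3, 2, 1, 2]

for (start, end), level in zip(section_spans(LEVELS), LEVELS):
    print(f"h{level} at position {start} covers headings {start + 1}..{end - 1}")
```

Note that the first h1 stops owning content as soon as the second h1 appears, which is why a sidebar marked up with a heading of the same level as the main section no longer falls "inside" it.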

  57. Jogin has an excellent article on using the Hx tags for structural markup. Use the W3C Validator with the Outline option enabled to see what structure your web page is marked up with. The results are quite revealing (ie: mostly the pages have no real...

    Trackback from Elastic Rat at 23:12, 20 Jul, 2004 #

  58. Matthew - no, of course I didn't think anyone is suggesting that. But the point is that where before you had an [h1]page title[/h1] enclosing the page title, now you have to have [h1]content[/h1] [h2]page title[/h2] [h1]navigation[/h1]. It's very typical for search engines to have a total weight that gets split up to words within the page, more heavily favoring higher headings. If you have h1s that really have nothing to do with your content other than describing the structure it dilutes the terms that you actually want favored. In general the terms you want favored coincide with what you want the user to see...

    jgraham -
    Actually, they should be elements not attributes

    Good point, but either way I think the important thing is that the ones we don't want displayed stay out of the content.

    Actually, in theory, it's trivial - all headings refer to the part of the document between the heading and the next heading of equal or greater 'importance'. If this isn't true, you're probably doing something wrong.

    What if you want the heading to show up underneath a block, like a caption? Sure, the proper way might be to use absolute positioning, but there are all sorts of instances where we order the markup in order to give us the right hooks to make CSS work. Of course, if we use tags or attributes for the hierarchy description then it is a moot point.

    If, by chance, this default UA rendering isn't what you want, yet your headings actually head sections and provide a clear document hierarchy, it's irrelevant.

    This gets at the crux of my point. People don't always think in terms of page hierarchies. Many people see an unstyled page that says at the top "Content" and "Navigation" are gonna be confused. Sure for us designers it makes perfect sense, but regular people don't think in those terms.

    That's why we have CSS - so authors have control over presentational style. The markup is for the machine, the CSS for the human.

    Well technically it's the content that's for the human, the CSS is just used to present it. When you have structural information for the purpose of creating a clean hierarchy, it may not make sense to a human as a heading (even if it works in outline mode).

    The number of people who will see your site with no style is tiny.

    But for the ones that do, shouldn't the headings be headings that make sense to the reader?

    Hacking markup because of issues with the browser default style will almost certainly prevent your site working well with browsers that use the markup in some meaningful way, yet provides almost no benefit.

    The way that headings have traditionally been used does have meaning. The meaning of a heading is as fundamental a concept to human readers as the paragraph or list. The fact that it has presentational connotations does not invalidate its semantic meaning. A heading is a description of some content. I am merely arguing that those descriptions should be tailored to the context of the page.

    Besides, you can get just as much meaning from using H tags to specify importance, it's just different information. For instance, you could have:

    [h1]Main Blog Entry[/h1]
    [h2]Previous Entry #1[/h2]
    [h2]Previous Entry #2[/h2]
    [h3]Blogroll[/h3]
    [h3]Articles[/h3]
    etc.

    This conveys some valuable information just as a proper outline does, but the information is in many ways complementary. That's why I'm saying we shouldn't try to force H tags into an entirely new role when there is still some value to the old role. Not to mention the technical issues.

    What if outlines were created from tags like <section label="Content"> that enclosed the blocks and we leave H tags for the visible headings. All in all a more elegant solution. For one thing, you don't have to worry about the numbering... the hierarchy is implicit from the structure so you can incorporate the same content into bigger and bigger pages without renumbering things.

    Look, I realize I'm the odd man out here, but I don't think I can make my point any clearer without examples. I may do that in the future, but does anyone even see what I'm getting at here?

    I'm sure we'd see entire sites enclosed in <h1> tags styled to be like paragraphs.

    See my reply to Matthew at the top.

    Comment by Gabe at 00:03, 21 Jul, 2004 #
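
The idea in comment 58, that the hierarchy could be implied by nesting rather than by heading numbers, can be sketched as follows. The `<section label="...">` markup is the hypothetical element proposed in that comment, not part of HTML 4 or XHTML 1, and the names `SectionOutline`, `section_levels` and `FRAGMENT` are invented for this example.

```python
from html.parser import HTMLParser


class SectionOutline(HTMLParser):
    """Records (depth, label) for each hypothetical <section label="..."> tag."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.entries = []

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.depth += 1
            label = dict(attrs).get("label") or ""
            self.entries.append((self.depth, label))

    def handle_endtag(self, tag):
        if tag == "section":
            self.depth -= 1


def section_levels(markup):
    """Return the (depth, label) pairs found in the markup, in document order."""
    parser = SectionOutline()
    parser.feed(markup)
    parser.close()
    return parser.entries


# A fragment written once, with no level numbers anywhere in it.
FRAGMENT = (
    '<section label="Content">'
    '<section label="Main Blog Entry"></section>'
    '<section label="Previous Entry #1"></section>'
    "</section>"
    '<section label="Navigation">'
    '<section label="Blogroll"></section>'
    "</section>"
)

print(section_levels(FRAGMENT))

# Embedding the same fragment in a larger page shifts every depth by one,
# without touching the fragment itself.
print(section_levels('<section label="Site">' + FRAGMENT + "</section>"))
```

This only demonstrates the "no renumbering" property claimed in the comment; it says nothing about how real user agents or search engines would treat such markup.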

  59. Gabe: "People don't always think in terms of page hierarchies. Many people see an unstyled page that says at the top "Content" and "Navigation" are gonna be confused. Sure for us designers it makes perfect sense, but regular people don't think in those terms."

    It seems to me like you're focusing way too much on the particular placeholder names I gave to the H1 tags in my example outline, those being "Content" and "Navigation".

    A commenter above, with the signature "Al Abut", reworked his document structure according to the essential topic of this discussion. Instead of "Content" and "Navigation", his H1 tags are titled "Latest entries" and "Related links". Those are just his particular choices, of course. This discussion has nothing to do with naming conventions.

    Here's his styled webpage, his outline, and if you have Firefox as well as the Web developer extension (recommended) you can also view his unstyled content by disabling CSS. If you have no way of viewing his structural unstyled content, you'll just have to trust me when I say that it is in no way confusing (although perhaps the dates shouldn't be marked up as Headings).

    Secondly, and I know I've told you so before, but it doesn't seem to stick:

    "That's why I'm saying we shouldn't try to force H tags into an entirely new role when there is still some value to the old role. Not to mention the technical issues."

    There is no new role. This is not a new invention.

    Comment by Tomas Jogin at 01:11, 21 Jul, 2004 #

  60. It seems to me like you're focusing way too much on the particular placeholder names I gave to the H1 tags in my example outline, those being "Content" and "Navigation".

    I just find those headings convenient to illustrate the point that forcing headings into a strict hierarchy does not necessarily yield intuitive headings. Sure it always helps the outline, but it takes something away from the raw document view.

    There is no new role. This is not a new invention.

    As has been mentioned, there is conflict in the W3C specs as to how headings should be used. In practice, most people have used headings to indicate levels of importance. Since that is common practice, it makes sense to refer to the hierarchical principle as 'new' even though the idea was there from the start.

    Look, I'm not saying that there aren't valid compromises. And we would definitely agree on many practical improvements that could be made regarding the use of headers (like that dates should not be headings). I feel like you're taking this personally, but I'm not saying using H as hierarchy is bad.

    I'm merely suggesting that using different levels of H to denote importance (as is also mentioned in the spec) has practical value that should not be dismissed without careful consideration.

    Comment by Gabe at 02:25, 21 Jul, 2004 #

  61. I went through this thought process when building my site’s template.

    I decided to use two <h1>s, one for the page title and one for the navigation. My rationale was that the nav was semantically distinct from the content.

    If you want to see it in plain view using Firefox, hit the stylesheet icon at the bottom left and choose Basic View.

    Comment by Mark Tranchant at 09:59, 21 Jul, 2004 #

  62. Tomas (and I suppose Andrei):
    "Does your Hot Off The Press and Bonfire sections really belong inside your Lead Article?" (wrt comments 40, 49, 50)

    This exposes one way of viewing a page, however (my comments from another site):

    Another POV:
    If one thinks of each page of a site as an individual HTML document (which they are), then it could be argued that each document has one overall title/topic/purpose.

    Within that overall HTML document are the components that make it up, and as such they are each structurally pieces of the whole (the whole being represented by the <h1>My Topic</h1>). Looking at it this way, these components would reside in h2's (and their respective components in h3's - h6's).

    Result?:
    The answer to your question would then be 'yes'. They aren't "inside" the lead article, but components of the HTML document that has as a main title/topic <h1>My Topic</h1>

    Structure or importance?:
    This doesn't address Andrei's concern about the lead article being 'more important' than the components, like the nav etc., but is it really 'more important'? To who? The reader? That depends.

    Looking at it from a structural point of view is, IMO, more objective than trying to determine 'importance'.

    Comment by Mike P. at 10:12, 21 Jul, 2004 #

  63. "In practice, most people have used headings to indicate levels of importance."

    Why is this? The Specs clearly state
    that heading tags “briefly describe[s] the topic of the section it introduces” **.

    That little bit there, if put into practice, will lead to an outline.

    ** it then goes on to say that "Heading information may be used by user agents, for example, to construct a table of contents for a document automatically."

    Comment by Mike P. at 10:19, 21 Jul, 2004 #

  64. Gabe: "I'm merely suggesting that using different levels of H to denote importance (as is also mentioned in the spec) has practical value that should not be dismissed without careful consideration."

    First of all, the spec says so in the chapter "The global structure of an HTML document" (as noted by Andy Budd). Headings are brought up in the structure section of the spec, along with the DIV element. Not in the "Text" section, which contains definitions of tags like em and strong, which are used to indicate emphasis.

    Secondly, the few guides that W3C offers on how to implement the specification suggest using Headings for structural purposes, not to indicate emphasis.

    Thirdly, the fact that you should not skip levels when using Headings, going from H1 directly to H3 or H4 for instance, is not news to any one of us. Think about that. Why is that? Why is it significant to not skip a step when using Headings? Well, if they were used to indicate emphasis or importance, it would make no sense at all, would it? But if they indicated document structure, on the other hand, it makes perfect sense.

    "In practice, most people have used headings to indicate levels of importance."

    Yeah, and, in practice, most people have used tables for page layout.

    Obviously, Headings not being the same thing as Headlines, and implemented in visual browsers as if they were, causes a lot of confusion. The entire reason any one of us is having this discussion is because W3C offers far too little, and far too few, practical guides on how to implement the spec.

    I'm not saying that everybody should immediately stop using Headings to emphasize content (just because the Validator says so, again, this is not my idea, don't attribute this idea to me), but rather that one should contemplate the consequences on the document structure which one's use of Headings implies.

    Everybody who uses a browser from this century will see the styled version, in which case the underlying document structure is transparent and makes no difference one way or the other.

    Those who do not see the styled version, whom the document structure could possibly be relevant to, would be people using a text-only browser and/or screen reader, or other assistive software, in which case a well thought out and accurate document structure is only going to be of help, as navigation through the document hierarchy is consistent and logical (provided that the software offers a way to navigate the document using the Headings, which, unless I'm mistaken, assistive software actually does).

    Comment by Tomas Jogin at 10:41, 21 Jul, 2004 #
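
The "do not skip levels" rule from the comment above can be checked mechanically. What follows is only a minimal sketch, assuming flat h1-h6 markup and treating the start of the document as level 0 (so a page that opens with an h2 is reported as a skip); the names `HeadingCollector` and `skipped_levels` are made up for this illustration and are not taken from any validator.

```python
from html.parser import HTMLParser
import re


class HeadingCollector(HTMLParser):
    """Collects the level (1-6) of each h1-h6 start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            self.levels.append(int(match.group(1)))


def skipped_levels(html):
    """Return (previous, current) pairs where a heading jumps down more than one level."""
    collector = HeadingCollector()
    collector.feed(html)
    skips = []
    previous = 0  # assumption: document start counts as level 0
    for level in collector.levels:
        if level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips


# An h1 followed directly by an h3 is reported as one skip.
print(skipped_levels("<h1>Entry</h1><h3>Menu</h3>"))
```

Moving back up the hierarchy (h3 followed by h2) is not flagged, since only jumps to a deeper level leave a gap in the outline.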

  65. What if you want the heading to show up underneath a block, like a caption?

    Well, in theory CSS, although maybe that's not powerful enough yet. Alternatively, one could use a language with a more flexible heading scheme (XHTML 2). Since neither of these are realistic possibilities, one must either adapt one's design to the limitations of the format or accept that it will not play nicely with advanced UA features. This all assumes that captions should really be marked as headings - I'm not at all sure that this is the case, since they don't head a section.

    Many people see an unstyled page that says at the top "Content" and "Navigation" are gonna be confused.

    Well first, I dispute that "many" people will see this. But, even for those who do, heading up the navigation elements allows them to be quickly skimmed over when looking for the content and vice-versa. I don't see why that's confusing and it's certainly less confusing than:

    Content Title
    - Subsection
    - Subsection
    - Subsection
    - Useful Links

    Where "Useful Links" have the context of the entire site rather than the context of the current article implied from their heading level. The alternative:

    Content Title
    - Subsection
    - Subsection
    - Subsection
    Site Navigation
    - Useful Links

    Is much more transparent, especially where there is no stylesheet to provide a visual clue about the relationship of different elements on the page.

    When you have structural information for the purpose of creating a clean hierarchy, it may not make sense to a human as a heading (even if it works in outline mode).

    display:none;

    I don't see an issue with hiding unnecessary elements in visual browsers but making them available to UAs and users who benefit from a better structure; i.e. all users who don't get visual clues about the role of different bits of content.

    Besides, you can get just as much meaning from using H tags to specify importance, it's just different information. For instance, you could have:

    [h1]Main Blog Entry[/h1]
    [h2]Previous Entry #1[/h2]
    [h2]Previous Entry #2[/h2]
    [h3]Blogroll[/h3]
    [h3]Articles[/h3]
    etc.

    OK, but I don't see any way to extract that information and do something useful with it. How could a UA (other than a search crawler) present the information that you consider 'blogroll' to be less important than 'Main Blog Entry'? What benefit does it offer users? The only uses I can imagine require site-specific knowledge.

    What if outlines were created from tags like that enclosed the blocks and we leave H tags for the visible headings.

    Then, by some odd miracle, we'd all have adopted XHTML 2 and its significantly less hackish than HTML 4 heading model. I've always thought the HTML 4 heading model was ugly but it's all we have for the foreseeable future, so we just have to deal with it, sadly.

    Comment by jgraham at 11:17, 21 Jul, 2004 #
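
The two outlines in the comment above differ only in one extra heading, and the effect on "Useful Links" can be shown with a small sketch. This assumes the usual reading of flat heading markup, where a heading's implied parent is the nearest preceding heading of a higher rank; the function name `parents` and the subsection titles are placeholders for illustration, and titles are assumed to be unique.

```python
def parents(headings):
    """Map each heading title to the title of its implied parent (None at top level).

    headings is a list of (level, title) pairs in document order. The implied
    parent is the nearest preceding heading with a smaller level number.
    """
    stack = []  # (level, title) of the currently open ancestors
    result = {}
    for level, title in headings:
        # Close every open heading that is at the same level or deeper.
        while stack and stack[-1][0] >= level:
            stack.pop()
        result[title] = stack[-1][1] if stack else None
        stack.append((level, title))
    return result


without_nav_heading = [
    (1, "Content Title"),
    (2, "Subsection A"),
    (2, "Subsection B"),
    (2, "Useful Links"),
]
with_nav_heading = [
    (1, "Content Title"),
    (2, "Subsection A"),
    (2, "Subsection B"),
    (1, "Site Navigation"),
    (2, "Useful Links"),
]

print(parents(without_nav_heading)["Useful Links"])
print(parents(with_nav_heading)["Useful Links"])
```

In the first list the links end up under the article title; in the second they sit under the navigation heading, which is the distinction the comment is drawing.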

  66. How should <hx> tags be used? Tomas Jogin poses the question with some interesting points in his blog entry Hierarchy. Using the W3C validator, Tomas clearly shows that common thoughts on the use of heading tags is slightly off, though puts th...

    Trackback from Camaban at 14:55, 21 Jul, 2004 #

  67. Headings are brought up in the structure section of the spec, along with the DIV element. Not in the "Text" section, which contains definitions of tags like em and strong, which are used to indicate emphasis.

    I'm not suggesting they be used for emphasis. I'm suggesting they be used to head sections that can benefit from a description. I'm also suggesting that the headings that are generally useful to people do not always form a natural hierarchy.

    Thirdly, the fact that you should not skip levels when using Headings, going from H1 directly to H3 or H4 for instance, is not news to any one of us

    As I tried to explain before, I am not disputing your basic premise. However, I do take the technical issue with this requirement simply because dynamic content may appear in many different contexts. It's relatively simple to process H tag levels using a content management system, but it still is a little less than practical. But just to be clear, I am not arguing against the idea of a hierarchy per se, just that it has to be weighed against other considerations.

    Everybody who uses a browser from this century will see the styled version

    Great, but I still think that only including headings that are relevant to human comprehension of the content should be there. Yes, they should be in a hierarchy as much as possible, but no I'm not going to add levels so that it forms a nice tree. A tree is not the most robust data structure. Sets represent a wider variety of real world collections in a much truer form.


    heading up the navigation elements allows them to be quickly skimmed over when looking for the content and vice-versa.

    You're thinking in outline mode. But most viewing of the page doesn't occur in outline mode. If outline mode were built into every browser and people were used to it as a feature, then I feel it would definitely outweigh the benefit of keeping out the 'extraneous' headings.

    However, since it's not, I would prefer to keep my HTML free of content that's not relevant when viewing the document as a whole. Furthermore I want H1 to contain the best descriptor of the page I can get, and H2 to describe the next most relevant sections, etc. This gets me the very practical benefit of better search results.

    OK, but I don't see any way to extract that information and do something useful with it.

    Granted, the hierarchy certainly offers more value in this arena because without it you can't choose which H2s relate to the preceding H1 and which are meant to stand-alone. Unfortunately using H tags destroys the importance of sections concept which I feel is important to preserve.

    Nevertheless, a listing of the headings in order of importance is of some benefit. Note that I'm not saying Hs are for emphasis, I'm saying that we might want to know what's the most important section of this page? If you force a strict hierarchy you totally lose that ability, because suddenly it looks as if the first division of the site into major sections is the most significant, when it may really be inconsequential to the user. Again, in hierarchy mode it makes perfect sense, but I'm not about to sacrifice my search engine optimization for better results in a tool that almost no one uses.

    Then, by some odd miracle, we'd all have adopted XHTML 2 and its significantly less hackish than HTML 4 heading model.

    Agreed. But in the meantime I'm going to make my H1 my page title, and start my nav headings at H4, so my top 3 levels of content headings shine through. I won't be doing any hacking to manipulate the H level of content that appears in multiple occurrences because it's a lot of work for a small payoff.

    Comment by Gabe at 15:57, 21 Jul, 2004 #

  68. There is a great discussion going on over at Tomas Jogin's site about the proper way to structure the headings in an HTML document. He brings up the point that using an H1 for the name of the Web site could be seen as redundant since it is already (...

    Trackback from Jeremy Flint - Red Hot and Daily at 16:06, 21 Jul, 2004 #

  69. I'm actually resolving this issue in my way-too-delayed redesign. The current layout has h1 as the big header thingie, h2 for entry titles, h3 for entry info (that's way wrong), and h3 for menu headers.

    My new one uses h2 for the menu headers as well, so they're on the same level as entry headers.

    Comment by Johan Svensson at 16:30, 21 Jul, 2004 #

  70. But most viewing of the page doesn't occur in outline mode.

    No, but people do skim over headings to find the content they're looking for (think newspapers). Clearly delimiting sections with headings (and separating out content from non-content) will help them do this.

    I would prefer to keep my HTML free of content that's not relevant when viewing the document as a whole.

    But it is relevant because a random visitor to your page has no idea what the structure is. A visual user can guess from the style, other users need to look at how you have delimited the page into sections. Having a clear hierarchical structure helps tremendously here (see my previous example where the addition of an extra heading distinguishes between links associated with content and links associated with the site as a whole).

    Note that I'm not saying Hs are for emphasis

    But you are saying that you should use the 'n' in <h{n}> to emphasise more important headings (or those associated with more important content) over 'less important' headings. That's quite similar...

    I'm saying that we might want to know what's the most important section of this page?

    Well I still don't see how any UA can use this information (which is essentially an arbitrary value judgement by the author) in a way that will be portable across various sites.

    suddenly it looks as if the first division of the site into major sections is the most significant, when it may really be inconsequential to the user

    Inconsequential compared to whether sections of navigation should be h2, h3 or h4? That seems pretty inconsequential to me...

    Dividing the site into sections is likely to be helpful to the user because it allows them to quickly identify, navigate to, and use, the most important part of the page. From their point of view, the navigation may be the most important part of the page.

    I'm not about to sacrifice my search engine optimization for better results in a tool that almost no one uses.

    And for a better experience for disabled users, and users with no CSS, and for users of any other tool that tries to interpret your markup for a purpose other than just applying style.

    In any case, it seems I'm unlikely to convince you to change your approach. Still it's only markup.

    Comment by jgraham at 21:47, 21 Jul, 2004 #

  71. Gabe: "Nevertheless, a listing of the headings in order of importance is of some benefit. Note that I'm not saying Hs are for emphasis, I'm saying that we might want to know what's the most important section of this page?"

    If you use Headings to indicate importance (not structurally, but in regards to the content), then you are using them to indicate emphasis. It's the same thing.

    "If you force a strict hierarchy you totally lose that ability, because suddenly it looks as if the first division of the site into major sections is the most significant, when it may really be inconsequential to the user."

    A logical consequent structure appears more inconsequential than an incoherent and illogical structure? Sorry, I don't buy that.

    "Again, in hierarchy mode it makes perfect sense, but I'm not about to sacrifice my search engine optimization for better results in a tool that almost no one uses."

    I'm not so sure that a logical document structure is as detrimental to one's search engine optimization as you imply.

    Finally, since you keep saying that a logical marked-up document structure is likely to confuse visitors, please tell me how/why Richard Rutter's reworked document structure might confuse visitors. Instead of talking about this as some kind of dangerous and possibly detrimental abstract theory, tell me how Richard's real working example might confuse visitors, because it doesn't confuse me at all, and I don't understand how the document structure might be the source of confusion for anyone else.

    Comment by Tomas Jogin at 22:07, 21 Jul, 2004 #

  72. Thomas Jogin recently posted an interesting entry that asks about HTML headings. He starts out this way: For some time now, something has been bothering me about the common way of marking up a document structure. First of all, it doesn't see...

    Trackback from Thoughts From Eric at 01:53, 22 Jul, 2004 #

  73. Gah! I already told you that my problem is not with using a strict hierarchy per se. All your rebuttals are based on an assumption that I think a hierarchy is always wrong. I'm just pointing out benefits of the old way of doing things. If you can't see that there are any benefits then of course I can never convince you. Perhaps Eric Meyer and Anne Van Kesteren will have some effect, since they are more established in web design circles.

    If you use Headings to indicate importance (not structurally, but in regards to the content), then you are using them to indicate emphasis. It's the same thing.

    Firstly, my main claim here is that H1s should be more important than H2s, etc. I don't think you should use H tags to emphasize things, they should be used to label sections, just as you suggest. When I talk of importance I'm speaking only of relative importance between H tags.

    The difference is that the structure of the document has nothing to do with the relative importance of the sections, rather they are generally organized to facilitate the desired presentation via css. Given the very real implications to SEO, I'd like my headings to be ranked by importance to the overall document.

    A logical consequent structure appears more inconsequential than an incoherent and illogical structure? Sorry, I don't buy that.

    Again, you're thinking in outline mode. When a person views a page, they are not thinking in terms of the hierarchy of the document. The only headings that are necessary are the ones that are directly relevant to the content. Some of those sections they need identified will be lower or higher than others in the hierarchy, but we shouldn't feel compelled to put in entries for sections that are self-explanatory just so the headings are at the right relative depth. Neither should we be forced to move headings up to the highest level so that hierarchy stays strict. If there were no search engine implications, or display issues with CSS turned off then I would say why not? But there are, and the only reason you have given so far is so outlining works. Well, there's no reason outlining can't work with skipped H levels, and without the SECTION tag what headings refer to is not well-defined anyway, especially considering that the order of things may have been mangled to meet CSS needs.

    Instead of talking about this as some kind of dangerous and possibly detrimental abstract theory, tell me how Richard's real working example might confuse visitors

    *sigh* I'm not saying it's a dangerous theory. I have said again and again that I think it's a good idea. Given the time, I could work up all my pages to meet both your requirements and mine (although it probably means the navigation would not use headings at all). Initially you jumped on me for using your h1. content h1. navigation structure, now you're saying I'm too abstract. Fine, here is a simple straightforward example of how users can benefit from headings that have relative importance:

    I am the web manager for a large University site. We use the ht://dig search engine. It has options to weight H tags, this is immediately useful to the large proportion of our users that search. I want my headings to look like:

    h1. page title
    h2. section heading
    h3. subsection heading
    h3. subsection heading
    h2. section heading
    h4. menu
    h4. related sites
    h5. internal
    h5. external
    h4. referrers

    The first thing you will notice is that for the most part I use hierarchical principles. The thing is that the hierarchy breaks down since the page is not about the navigation, it's about the main content. Within that content I would be very unlikely to deviate from the strict hierarchy. But the navigation is a different story:

    I don't see how making the navigation use h1s and h2s makes such a huge improvement. Likewise, I don't feel like adding additional headings to balance things out. The search engine gets it right, it looks great without styles, and I have a handy heading convention that I can use across my site which has widely varying content (much of it dynamic and appearing in more than one place) and structure. Letting people know how the navigation and content is structured is fine, but it's not practical if I have to add extraneous markup then hide it with CSS or make the navigation H1s and H2s thus making it look as if the page could be about such banalities as "Menu" or "Links".

    So far you have not acknowledged a single point of mine as having any validity. You have painted me as being against your theory on principle even though I have said again and again that a hierarchy is good with a few caveats as to the use of H tags to specify the hierarchy. Perhaps I have written too hastily, but after everything I've said I find it hard to believe that you can not see where relative importance of H tags has real benefits, and where maintaining a strict hierarchy can become a maintenance nightmare. I'm speaking from the background of institutional web development, where information architecture for one website is a full-time job, and search engine functionality is of critical importance. On the scale of personal websites of course a little extra markup and CSS never hurt anyone, but it can make things difficult when people pass you a 50-page manual to put on-line and they want it marked-up and integrated into the website by the end of the week.

    Anyway, I don't think I can afford to spend any more time on this thread, though I look forward to reading your final rebuttal (if any) and future thought-provoking entries.

    Comment by Gabe at 05:56, 22 Jul, 2004 #
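
The search-weighting argument in the comment above can be made concrete with a toy scorer. This is a sketch under loud assumptions: the weights in `HEADING_WEIGHTS` are invented for illustration and are not ht://dig's actual configuration or ranking algorithm, and the page contents are hypothetical.

```python
# Hypothetical weights: NOT ht://dig's real settings, just a descending scale.
HEADING_WEIGHTS = {1: 6, 2: 5, 3: 4, 4: 3, 5: 2, 6: 1}


def score(headings, term):
    """Sum the weight of every heading whose text contains the term (case-insensitive).

    headings is a list of (level, text) pairs.
    """
    term = term.lower()
    return sum(
        HEADING_WEIGHTS[level]
        for level, text in headings
        if term in text.lower()
    )


# Hypothetical page where navigation headings start at h4.
content_first = [(1, "Admissions"), (2, "How to apply"), (4, "Menu"), (4, "Related sites")]
# Hypothetical page where content and navigation are both top-level sections.
strict_tree = [(1, "Content"), (2, "Admissions"), (3, "How to apply"), (1, "Navigation"), (2, "Menu")]

print(score(content_first, "admissions"), score(strict_tree, "admissions"))
print(score(content_first, "menu"), score(strict_tree, "menu"))
```

Under these made-up weights, pushing the page topic down one level lowers its score while "Menu" gains weight, which is the trade-off being argued about. Whether a real engine behaves this way depends entirely on its configuration.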

  74. Summary: More thoughts on the current debate surrounding headings, hierarchy and document structure.

    Trackback from box of chocolates at 06:56, 22 Jul, 2004 #

  75. Gabe: Ok, so first of all, you are arguing that there are "some" positive aspects of using Headings for emphasizing content (i.e. as usual without consideration for how that affects the document structure), even though it is a good idea to think about the document structure. Sure, I agree.

    Secondly, you argue that different headings have relative importance to each other (H1 more important than H2, etc). Great, nobody has argued differently, and the spec is pretty clear about that; you're wasting your breath.

    "Again, you're thinking in outline mode. When a person views a page, they are not thinking in terms of the hierarchy of the document."

    When a person views the styled page, the underlying structure makes no difference one way or the other, because visual representation is one thing, and presentation is something else completely. Richard Rutter just rethought the underlying structure of Clagnut.com, but if/when he implemented the change it wouldn't have to look different in any way, thanks to CSS.

    Those who do "see" the structure are very likely to be people with disabilities who use assistive software, or people who browse with styles turned off (probably to see better). In either case, a coherent document structure is going to be of help, if it confuses them then one has probably picked some seriously stupid Heading titles (again, see Richard Rutter's non-confusing example.).

    Comment by Tomas Jogin at 11:50, 22 Jul, 2004 #

  76. Wow, Tomas. Nice post, nice discussion. Nice impact - based on all the discussions, and subsequent blog posts that have already been seen, with likely more to come.

    I too had started a comment, then it got too long, so I posted it: Headings, Hierarchy and Document Structure

    Here's the gist: I took a look at what many people are currently doing -- skipping heading levels and ordering components of the site in their source -- and suggest that not allowing to skip levels and forcing a tree structure, or adding in extra headings only to be hidden later may not be what we want moving forward in XHTML 2.

    What I suggest is this: "It may make sense for the content of an article or a resource or blog post or whatever to have a linear structure, but we can’t necessarily force linear structure on the rest of the site wrapped around that content."

    So, moving forward in XHTML 2, why not allow for implied structure through the nesting of section and h elements, but also for explicit declaration of structure with a level attribute? That way we can indicate relative importance/hierarchy via nesting and explicitly, and get the best of both worlds.

    Presumably we'd be able to write software that would respect both implicitly and explicitly declared structure.

    jgraham -- I'm interested in hearing how your extension might or might not be able to work with this example (keeping in mind it is contrived XHTML 2, that uses a form but shouldn't...)

    Comment by Derek Featherstone at 12:48, 22 Jul, 2004 #

  77. Okay, I can't resist one last one.

    Those who do "see" the structure are very likely to be people with disabilities who use assistive software, or people who browse with styles turned off (probably to see better). In either case, a coherent document structure is going to be of help, if it confuses them then one has probably picked some seriously stupid Heading titles (again, see Richard Rutter's non-confusing example.).

    Yes, yes, of course, and if the page was created and ordered with only this in mind, then you could create the perfect hierarchy, and if css 2 support was bug-free you could probably even get the presentation you want via absolute positioning in a majority of cases.

    The problem is that your structure can't always be set up this way because we create markup for the purpose of being styled for visual presentation. Eric Meyer's example of the search box in between the title and the content is a perfect example. Also think of a 3-float column design where the navigation is split into two sections by a center column.

    I think Richard Rutter's example is great. In fact, I think all sites can benefit from using a hierarchy to the extent that it's practical. I just know that the site structure that exists to facilitate presentation does not always reflect the conceptual hierarchy that you would want to present to the user.

    Comment by Gabe at 17:20, 22 Jul, 2004 #

  78. Gabe: "Yes, yes, of course, and if the page was created and ordered with only this in mind, then you could create the perfect hierarchy, and if css 2 support was bug-free you could probably even get the presentation you want via absolute positioning in a majority of cases."

    What do you mean? Why would hiding the presence of extra Headings, for the purpose of marking up document structure, need some sort of CSS trickery? Why would it be more difficult than to simply hide the Heading?

    "Also think of a 3-float column design where the navigation is split into two sections by a center column."

    So use one Heading for each section?

    Comment by Tomas Jogin at 17:29, 22 Jul, 2004 #

  79. Dave Shea wrote: “the guidelines suggest but don’t actually guide usage. It’s up to author interpretation, and nine times out of ten we get it wrong until someone comes along and definitively tells us how to do it right.”

    Interestingly enough, the WCAG 2.0 Working Group is incorporating “techniques” sections for markup, style sheets, scripting, and other technologies. Of course they won’t be the end-all solution to everything, but they will essentially say, “here’s how we would do it, and here’s why.”

    Comment by James at 18:52, 22 Jul, 2004 #

  80. Oh, by the way, here’s the link to the Development of Techniques for WCAG 2.0 and the scripting techniques contributions I just mentioned.

    Comment by James at 19:17, 22 Jul, 2004 #

  81. What do you mean? Why would hiding the presence of extra Headings, for the purpose of marking up document structure, need some sort of CSS trickery? Why would it be more difficult than to simply hide the Heading?

    Because, the document hierarchy could be different than the conceptual hierarchy.

    So use one Heading for each section?

    No comment on the Eric Meyer example? Anyway, what if there is no concept tying the sidebars together? Would you just call one Left Navigation and one Right Navigation? That hardly communicates anything valuable.

    Besides, what if the title of the document is at the top of the page and you want the center content to be directly underneath the title heading in the hierarchy? Well that hierarchy is impossible with a float layout unless you simply remove all headings from the sidebars (which may not be a bad idea, but I digress...). A compromise might be to do something like Eric Meyer, make the sidebar headings h4 and content headings h2 and h3, but it's hardly a proper hierarchy. To get the right hierarchy you have to re-arrange the content, that is where the CSS trickery comes in. With solid absolute positioning support and a little more robustness (eg. a way to clear an absolutely positioned element would be a start) then we'd be good to go.

    Comment by Gabe at 21:52, 22 Jul, 2004 #

  82. Gabe: "what if there is no concept tying the sidebars together? Would you just call one Left Navigation and one Right Navigation? That hardly communicates anything valuable."

    Eh, well, if the navigation is completely and utterly arbitrarily split in two, without the slightest thought as to why or how, then yes. However, if the navigation is split in logical parts, (see Richard Rutter's example) then you'd name those logical parts appropriately.

    Marking-up a document structure doesn't relieve you of actually thinking through a consistent structure, it only marks it up. If you have no logical structure, well, doh, it can't be marked up as a logical structure, either. I assumed that this was beyond the scope of the discussion.

    I'm not sure what it is I'm supposed to say about Meyer's example. He has marked up a document without considering the consequences to the document structure. So what? So have I, so have most people. That does not negate the fact that Headings denote structure, nor that users of assistive software will find a logical structure more useful than an incoherent structure.

    You also bring up a series of questions related to the presentation, which are totally irrelevant to the underlying structure.

    Comment by Tomas Jogin at 22:06, 22 Jul, 2004 #

  83. jgraham -- I'm interested in hearing how your extension might or might not be able to work with this example

    Well, at the moment, it wouldn't at all since it doesn't understand XHTML 2 (I'm fairly sceptical that XHTML 2 will ever be adopted on a widespread basis, but I digress). If I were adding XHTML 2 support, I might initially try using <section> alone to determine the heading level, since this would be sure to produce a decent structure. But it turns out that all the <h{n}> are back in the spec so I guess authors might use <h{n}> without sections or use <section> without regard to outline (as a 'more semantic' replacement for <div>) and then use <h{n}> to mark out the relative heading levels.

    Conclusion: having two mutually incompatible heading schemes will make parsing documents for semantics even more difficult. If all the elements were used correctly, it would be fine and XHTML 2 might be almost as easy as HTML 4 (the actual parsing code would be harder since you'd have to keep track of depth, but it wouldn't be so bad). On the assumption that authors will abuse the language wherever it's not entirely clear what is the 'correct' usage, it looks like being a nightmare.

    Comment by jgraham at 00:17, 23 Jul, 2004 #

  84. To be clear, when I say that I would just use <section> above, I mean that I would ignore both any level attribute and the value of n in <h{n}>.

    Comment by jgraham at 00:51, 23 Jul, 2004 #
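
The section-only approach described in the two comments above (derive the heading level from <section> nesting, ignore any level attribute and the n in <h{n}>) might look roughly like the sketch below. It is not jgraham's extension: the function name `section_outline` and the sample markup are invented for illustration, and the input is assumed to be well-formed XML without namespaces.

```python
import re
import xml.etree.ElementTree as ET


def section_outline(xml_text):
    """Return (depth, heading text) pairs in document order.

    depth counts the enclosing <section> elements. Any level attribute and
    the n in <h{n}> are deliberately ignored.
    """
    root = ET.fromstring(xml_text)
    found = []

    def walk(element, depth):
        for child in element:
            if child.tag == "section":
                walk(child, depth + 1)
            elif re.fullmatch(r"h[1-6]?", child.tag):
                # Matches both a bare <h> and the numbered <h1>..<h6>.
                found.append((depth, (child.text or "").strip()))
            else:
                walk(child, depth)

    walk(root, 0)
    return found


sample = (
    "<body>"
    "<h>Title</h>"
    "<section><h>Content</h><section><h3>Sub</h3></section></section>"
    "<section><h1>Navigation</h1></section>"
    "</body>"
)
print(section_outline(sample))
```

Note that the <h1> inside the second section is reported at depth 1 despite its number, which shows both the appeal of the scheme and why mixing it with author-chosen <h{n}> levels gets messy.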

  85. You also bring up a series of questions relat