Below is the basic structure of an XHTML 1.1 document:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Page Title</title>
</head>
<body>
<!--Your page content goes here-->
</body>
</html>
XHTML is a flavour of XML, and the first line of the above code is the XML declaration, which tells the user agent that the document is written in XML 1.0 with (in this case) a character encoding of UTF-8.
The XML declaration is optional. However, if your page is written using a character set other than one of the default XML encodings
(UTF-8 and UTF-16), it is required that an XML declaration be provided to indicate this,
e.g. <?xml version="1.0" encoding="ISO-8859-1"?>. In fact, it is generally recommended to include an XML declaration
in all XHTML 1.1 documents - see
*XHTML 1.1 - Document Conformance - unless
they are to be served as "text/html" (see The XML Declaration and DOCTYPE Sniffing below).
For reference, there is a *list of approved character sets
published by the IANA (use the
preferred MIME name
of the character set in question, if there is one).
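To illustrate, here is a minimal sketch of a document saved in ISO-8859-1. The IANA list registers this character set under the name ISO_8859-1:1987, with ISO-8859-1 as its preferred MIME name, so the latter is what goes in the declaration. The title and paragraph text are placeholders only.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Page Title</title>
</head>
<body>
<!-- The file itself must really be saved as ISO-8859-1 for the declaration to be true -->
<!-- In ISO-8859-1 the accented letter and the pound sign below are each stored as a single byte -->
<p>Café prices are shown in pounds (£).</p>
</body>
</html>
```

If the same file were saved as UTF-8 or UTF-16 instead, the encoding in the declaration would have to change to match (or, for those two encodings, the declaration could be left out altogether).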
The second line is the DOCTYPE, which informs the user agent of the precise document type. The DOCTYPE is mandatory for your
code to pass W3C Validation and, for this example, it states that the document obeys the
Document Type Definition (DTD) for XHTML 1.1 (or, at least, that it's supposed to!).
If you are writing in pure XHTML 1.1, the DOCTYPE must be written exactly as it appears
in the second (through to third) line of code.
In the DOCTYPE, "-//W3C//DTD XHTML 1.1//EN" is the public identifier and
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" is the system identifier URL.
As a whole, any code prior to the root element (html for XHTML) in an XML document
is known as the XML Prolog: this includes the XML declaration, the DOCTYPE and any Processing Instructions or XML comments.
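As a rough illustration, the sketch below places each kind of prolog item ahead of the root element. The stylesheet processing instruction and its file name (style.css) are hypothetical, included only to show where such an item would sit.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- An XML comment: allowed in the prolog, but never before the XML declaration -->
<?xml-stylesheet type="text/css" href="style.css"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<!-- Everything above this comment (and the comment itself) is prolog; the root element starts below -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Page Title</title>
</head>
<body>
<p>Page content</p>
</body>
</html>
```

The XML declaration, when present, has to be the very first thing in the file; the other prolog items may appear before or after the DOCTYPE. Bear in mind that a prolog this busy has consequences when the document is served as "text/html", as the IE bug described further down shows.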
DOCTYPE sniffing (also known as DOCTYPE switching) may come into play when documents are served as "text/html". XHTML is often served this way because Internet Explorer does not currently support the "application/xhtml+xml" Content-Type.
Many older browsers did not (and, naturally, still don't!) obey the CSS standard and, when concession was made to standards-compliance in more recent browsers, a more compliant 'Standards Mode' (also known as 'Strict Mode' or 'Standards-Compliant Mode') of rendering was introduced. The old 'Quirks Mode' was retained so that existing pages did not immediately break in the newly-compliant (or, in the case of IE6, only partially compliant) browsers.
Newer browsers usually choose between 'Standards Mode' and 'Quirks Mode', for any given "text/html" document, by the technique of DOCTYPE sniffing. This technique uses the presence or absence of a DOCTYPE, and its nature, to decide upon the rendering mode. Broadly speaking, a document without a DOCTYPE will be rendered in 'Quirks Mode' and a document with a DOCTYPE will be rendered in 'Standards Mode'. However, it is actually a little more complicated than that: documents containing certain DOCTYPEs (e.g. HTML 2.0, 3.0 or 3.2 DOCTYPEs and some DOCTYPEs which do not contain a system identifier) will be rendered in 'Quirks Mode'... and, just to make life more interesting, Mozilla has an additional 'Almost Standards Mode' for some Transitional and Frameset DOCTYPEs.
Despite differing behaviours for different DOCTYPEs, the upshot is that the presence of an XHTML DOCTYPE in a "text/html" document should trigger 'Standards Mode' in most recent browsers (but see the next section). Documents served as "application/xhtml+xml" should not be subject to DOCTYPE sniffing and should always be rendered in 'Standards Mode' (in those browsers which support this Content-Type).
The IE bug: To make things even more complicated, a bug in IE6 means that any documents containing an XML declaration or an XML/HTML comment (or indeed anything else, except possibly whitespace) prior to the DOCTYPE are rendered by IE6 in 'Quirks Mode', even if they would have been in 'Standards Mode' if the DOCTYPE were the first line of code. This bug has been fixed in IE7, but only for the XML declaration - a comment will still trigger quirks mode, even if placed between the XML declaration and the DOCTYPE.
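To make the bug concrete, here is a sketch of a document which, going by the behaviour described above, would be rendered in 'Quirks Mode' by both IE6 and IE7 when served as "text/html". The wording of the comment is arbitrary; it is its position that matters.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Any comment ahead of the DOCTYPE, even one placed after the XML declaration -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Page Title</title>
</head>
<body>
<p>Page content</p>
</body>
</html>
```

Delete the comment and IE7 should switch to 'Standards Mode', while IE6 stays in 'Quirks Mode' because of the XML declaration. Delete the XML declaration as well, so that the DOCTYPE is the first line, and both should render in 'Standards Mode'.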
The W3C currently recommends that the XML declaration be omitted from XHTML documents which are served as "text/html", and that such documents contain no XML processing instructions - see the W3C Note (work in progress): *XHTML Media Types - Compatibility Guidelines - Item A-1. Despite this, I have never encountered a problem including the XML declaration in "text/html" documents and, because of my personal preference to trigger quirks mode in IE6 without adding a gratuitous comment, this is the one Compatibility Guideline which I don't adhere to (except with regard to processing instructions). The reason I prefer to have IE6 in quirks mode is that IE6's so-called 'Standards-Compliant Mode' is not actually very compliant at all and I find it easier to hack the 'Quirks Modes' of both IE5 and IE6 in the CSS than to have to cope with an additional, broken rendering algorithm. This is because the 'Quirks Mode' of IE6 is similar (if not quite identical) to that of IE5 and, more importantly, the CSS hacks are similar. So, for me, the IE DOCTYPE switching bug is actually a blessing!
IE8 has recently introduced the use of the X-UA-Compatible HTTP header to specify the version of IE for which the page is designed, thereby supposedly allowing authors to "opt in" to the more rigorous standards compliance which is now available in IE8, without any DOCTYPE sniffing shenanigans. It rankles a bit to be linking to MSDN yet again, but for reference, see: *msdn: META Tags and Locking in Future Compatibility.
Personally, I hate the principle of a compliance opt-in: shouldn't standards compliance be 'standard' - i.e. enabled by default
(with an opt-out if you need it) if we're going to achieve any real forward momentum?
Mind you, if it's only an IE phenomenon and if by including the header
X-UA-Compatible: IE=8
(or the equivalent <meta http-equiv="X-UA-Compatible" content="IE=8" />)
I can be sure that my site will work in IE8 and any subsequent versions then I'm all for it.
I have to say that the idea of kicking Internet Explorer a bit further into the long grass appeals greatly -
I can't possibly quantify the trouble IE has caused me over
the years, but it's at least 20 times as much as any other browser.
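For reference, a sketch of where the meta equivalent would sit in the basic XHTML 1.1 template. Putting it first in the head is a cautious choice on my part: Microsoft's documentation asks for it to appear before all other elements in the head apart from the title and other meta elements.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<!-- Asks IE8 and later to render the page as IE8 would; other browsers ignore it -->
<meta http-equiv="X-UA-Compatible" content="IE=8" />
<title>Page Title</title>
</head>
<body>
<p>Page content</p>
</body>
</html>
```

Where you have control over the server, sending the real X-UA-Compatible HTTP header achieves the same thing without touching the markup.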