#PCDATA & CDATA Sections


 

#PCDATA

#PCDATA stands for Parsed Character Data.

If an element's Content Model (see Element Content Models) specifies that #PCDATA is allowed, text may be enclosed directly within the element (e.g. <p>Text</p>). This text is parsed for markup (unlike text within CDATA Sections, which is all treated as character data) and so may not contain less-than (<) or bare ampersand (&) characters. Any such characters must be escaped as, for example, &lt; or &amp; respectively. (Ampersand characters are allowed only when they begin Character & Entity References, e.g. &#39;.) Additionally, a greater-than (>) character that occurs as part of the string ]]> must be escaped as, for example, &gt;.
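The escaping rules above can be tried out with a short script. This is only a sketch using Python's standard-library XML tools; the sample text and the is_well_formed helper are made up for illustration and are not part of XHTML itself.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "AT&T: 1 < 2"        # hypothetical text with a bare & and <
escaped = escape(raw)       # escape() replaces &, < and > with references

print(escaped)                               # AT&amp;T: 1 &lt; 2
print(is_well_formed(f"<p>{raw}</p>"))       # False: bare & and < look like markup
print(is_well_formed(f"<p>{escaped}</p>"))   # True
print(ET.fromstring(f"<p>{escaped}</p>").text == raw)  # True: references are decoded
print(escape("a ]]> b"))                     # a ]]&gt; b
```

Because escape() always replaces >, it also covers the ]]> case, at the cost of escaping some > characters that could have been left alone.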

It is possible for an element to have more than one section of #PCDATA. For example, in the code <p>That is <strong>not</strong> fair!</p>, the p element has two #PCDATA children:
"That is " and " fair!"
and the strong element has one:
"not".
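One way to see these separate sections of #PCDATA is to parse the same fragment and list the text nodes that are direct children of each element. The sketch below uses Python's xml.dom.minidom; the text_children helper is a made-up name for illustration.

```python
from xml.dom import minidom


def text_children(element):
    """Return the data of each text node that is a direct child of element."""
    return [n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE]


doc = minidom.parseString("<p>That is <strong>not</strong> fair!</p>")
p = doc.documentElement
strong = p.getElementsByTagName("strong")[0]

print(text_children(p))       # ['That is ', ' fair!']
print(text_children(strong))  # ['not']
```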

An empty string is also valid #PCDATA. This means that, for example, it is perfectly valid to have a script element with no content, i.e.
<script type="text/javascript" src="process.js"></script>

CDATA Sections

CDATA Sections may be placed wherever #PCDATA is allowed and are used to escape entire blocks of text. Any & or < characters within the CDATA Section are not treated by the XML parser as the start of markup, but instead as character data (as if they had been explicitly escaped). Because markup is not recognised within a CDATA Section, it is not only unnecessary to escape these characters but also impossible to do so: entity and character references are not recognised as such, and neither is any other markup.

A CDATA Section begins with the string <![CDATA[ and ends with the string ]]>. This means that a CDATA Section cannot contain the string ]]> within it. CDATA Sections may not be nested.
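Text that does contain ]]> can still be carried in CDATA Sections by ending one section after the ]] and starting a new one before the >. This is a common workaround rather than anything specific to XHTML; the sketch below assumes Python's standard-library parser, and to_cdata is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA Sections, splitting any ]]> across two sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


wrapped = to_cdata("if (a[b[0]]>c) {}")

print(wrapped)  # <![CDATA[if (a[b[0]]]]><![CDATA[>c) {}]]>
print(ET.fromstring("<p>" + wrapped + "</p>").text)  # if (a[b[0]]>c) {}
```

The parser joins the character data of the two adjacent sections, so the original text comes back unchanged.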

An example of a CDATA Section is:

<![CDATA[
    Bill & Ben
]]>
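The Bill & Ben example can be checked against its escaped equivalent. The following is a small sketch using Python's standard-library parser, with the p wrapper element chosen only for illustration.

```python
import xml.etree.ElementTree as ET

in_cdata = ET.fromstring("<p><![CDATA[Bill & Ben]]></p>").text
escaped = ET.fromstring("<p>Bill &amp; Ben</p>").text
print(in_cdata == escaped)  # True: both yield the text "Bill & Ben"

# Inside a CDATA Section, references and tags are ordinary characters.
literal = ET.fromstring("<p><![CDATA[&amp; <b>]]></p>").text
print(literal)  # &amp; <b>
```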

CDATA Sections are perhaps most commonly used in XHTML to enclose the content of script or (more rarely) style elements. This allows code to be written without the need to escape problem characters (< and &), and avoids the unintended expansion of Character & Entity References. (This situation arises because these elements are declared with a content model of #PCDATA.) Using external stylesheets or scripts is a more elegant solution and may be necessary if you want to serve your document as "text/html". See the W3C Note (work in progress) XHTML Media Types - Compatibility Guidelines, Item A.4.
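To illustrate why script content is a typical case, the sketch below feeds a script element to Python's standard-library XML parser with and without a CDATA Section. The JavaScript line is a made-up example, and an XML parser is only a stand-in here for an XHTML user agent.

```python
import xml.etree.ElementTree as ET

js = "if (a < b && b < c) { run(); }"   # hypothetical script content

bare = '<script type="text/javascript">' + js + "</script>"
wrapped = '<script type="text/javascript"><![CDATA[' + js + "]]></script>"

try:
    ET.fromstring(bare)
    bare_ok = True
except ET.ParseError:
    bare_ok = False

print(bare_ok)                             # False: < and & are read as markup
print(ET.fromstring(wrapped).text == js)   # True: the code arrives unchanged
```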

Note: CDATA Sections are not supported by most HTML user agents (Opera being a notable exception) when a document is served as "text/html". They should be supported, however, by compliant XHTML user agents where the XHTML has been served as "application/xhtml+xml" (see Serving XHTML 1.1).



Copyright © Sally Maughan 2005-2009 (Page last updated on 16 May 2009)

Valid XHTML 1.1 - hosted by Openstrike

Content based on the W3C Working Draft: XHTML 1.1 and Recommendation: XHTML Modularisation 1.1.

W3C, XHTML, XML, HTML, CSS and MathML are Trademarks of the W3C (MIT, ERCIM, Keio) with which the site's author has no connection.

