Summary:
All text in an XML document is normally parsed by the parser.
The only exception is text inside CDATA sections, which the parser does not interpret as markup.
PCDATA - Parsed character data
XML parsers normally parse all text in an XML document.
When an XML element is parsed, the text between its tags is also parsed:
<message>This text is also parsed</message>
The parser does this because XML elements can contain other elements, as in this example, where an element contains two other elements (first and last):
<name><first>Bill</first><last>Gates</last></name>
The parser breaks it down into sub-elements like this:
<name>
  <first>Bill</first>
  <last>Gates</last>
</name>
Parsed character data (PCDATA) is the term for text data that is parsed by an XML parser.
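The breakdown into sub-elements can be observed with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, chosen only for illustration; the variable names are assumptions and not part of the original example.

```python
import xml.etree.ElementTree as ET

# The <name> document from the example, written on a single line.
doc = "<name><first>Bill</first><last>Gates</last></name>"

root = ET.fromstring(doc)

# The parser exposes each sub-element separately, together with its
# parsed character data (PCDATA).
children = [(child.tag, child.text) for child in root]

print(root.tag)   # name
print(children)   # [('first', 'Bill'), ('last', 'Gates')]
```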
CDATA - (unparsed) character data
The term CDATA refers to text data that should not be parsed by an XML parser.
Characters like "<" and "&" are illegal as literal characters in XML element content.
"<" produces an error because the parser interprets it as the start of a new element.
"&" produces an error because the parser interprets it as the start of an entity reference.
Some text, such as JavaScript code, contains a large number of "<" or "&" characters. To avoid errors, script code can be defined as CDATA.
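The two errors caused by a literal "<" or "&" can be reproduced with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the helper is_well_formed and the sample messages are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts doc, False if it raises ParseError."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# A literal "<" or "&" inside element content is rejected by the parser.
raw_lt = "<message>a < b</message>"
raw_amp = "<message>Tom & Jerry</message>"

# The same text written with the predefined entities &lt; and &amp; is accepted.
escaped = "<message>a &lt; b and Tom &amp; Jerry</message>"

print(is_well_formed(raw_lt))       # False
print(is_well_formed(raw_amp))      # False
print(ET.fromstring(escaped).text)  # a < b and Tom & Jerry
```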
Everything within a CDATA section is treated as plain character data; the parser does not interpret it as markup.
CDATA sections start with "<![CDATA[" and end with "]]>":
<script>
<![CDATA[
function matchwo(a,b)
{
  if (a < b && a < 0)
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
]]>
</script>
In the example above, the parser does not interpret anything within the CDATA section as markup; the script is passed through as plain text.
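To see what an application actually receives from a CDATA section, the following sketch parses a shortened version of the script element. It assumes Python's standard xml.etree.ElementTree module; the one-line script body is an abbreviation made for brevity, not the original example.

```python
import xml.etree.ElementTree as ET

# A shortened <script> element whose body is wrapped in a CDATA section.
doc = "<script><![CDATA[if (a < b && a < 0) { return 1; }]]></script>"

root = ET.fromstring(doc)

# The "<" and "&" characters arrive unchanged as the element's text,
# and no child elements are created from them.
script_text = root.text
child_count = len(root)

print(script_text)   # if (a < b && a < 0) { return 1; }
print(child_count)   # 0
```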
NOTE:
Notes on CDATA sections:
- A CDATA section cannot contain the string ']]>'.
- Nested CDATA sections are also not allowed.
- The "]]>" marking the end of a CDATA section cannot contain spaces or newlines.
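Because a CDATA section ends at the first "]]>", text that contains this string has to be split across two sections. The sketch below shows one common workaround, not the only one; the helper name wrap_in_cdata and the sample payload are hypothetical, and Python's standard xml.etree.ElementTree module is assumed for the round-trip check.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every "]]>" across two sections."""
    # "]]" closes the current section and ">" opens the next one,
    # so the terminator never appears inside a single section.
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

# Sample text that happens to contain the forbidden "]]>" sequence.
payload = "x = a[b[0]]> 1"
doc = "<code>" + wrap_in_cdata(payload) + "</code>"

# The parser joins adjacent sections back into the original text.
round_trip = ET.fromstring(doc).text

print(doc)         # <code><![CDATA[x = a[b[0]]]]><![CDATA[> 1]]></code>
print(round_trip)  # x = a[b[0]]> 1
```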
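Other considerations:
- <![CDATA[ ]]> cannot be used everywhere (for example, not inside attribute values), whereas escape sequences such as &lt; and &amp; can.
- For short strings, <![CDATA[ ]]> is cumbersome to write; for long strings, escaped characters are hard to read.
- Inside <![CDATA[ ]]> the parser does not look for markup, so that text may be processed slightly faster, although the difference is rarely significant.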
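The trade-off between escaping and CDATA can be sketched as follows. The example assumes Python's standard library (xml.etree.ElementTree, plus escape and quoteattr from xml.sax.saxutils); the sample text and element names are hypothetical. It shows that both forms yield the same text in element content, while an attribute value has to rely on escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

text = "if (a < b && a < 0)"

# Element content: either escape the special characters or use CDATA.
escaped_doc = "<script>" + escape(text) + "</script>"
cdata_doc = "<script><![CDATA[" + text + "]]></script>"

# Attribute value: CDATA sections are not allowed here, so escaping is used.
attr_doc = "<rule test=" + quoteattr(text) + "/>"

from_escaped = ET.fromstring(escaped_doc).text
from_cdata = ET.fromstring(cdata_doc).text
from_attr = ET.fromstring(attr_doc).get("test")

print(escaped_doc)                            # <script>if (a &lt; b &amp;&amp; a &lt; 0)</script>
print(from_escaped == from_cdata == text)     # True
print(from_attr == text)                      # True
```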