DTD Basics

Lesson 6: Mixed content
Objective: Write Element Declarations for Mixed Content

Write Element Declarations for Mixed Content and Empty Elements

If an element will contain a mix of text and additional child elements, it is said to have mixed content.
The following three steps walk through an example of mixed content.

1) The REVIEW element in the following fragment has mixed content.
<PRODUCT-REVIEW>
 <PRODUCT-REVIEWED>Shelby's Tools</PRODUCT-REVIEWED>
 <REVIEWER>Ann Marie </REVIEWER>
 <REVIEW>

2)
<!ELEMENT REVIEW (#PCDATA | PRODUCT-NAME)*>
The element declaration for REVIEW in the previous example could be written like this. In other words, the element can contain any number of PRODUCT-NAME elements and any amount of parsed character data, in any order.


3) CORRECT:
<!ELEMENT REVIEW (#PCDATA | PRODUCT-NAME)*>
When an element contains both #PCDATA and child elements, #PCDATA must be listed first, with the child elements listed after it.


As shown in the examples above, in a mixed-content declaration #PCDATA appears first, and it appears only once. Unlike a declaration for an element that contains only child elements, a mixed-content declaration cannot specify the order or the number of the child elements. Instead, the | character separates #PCDATA and the element names. Because text and child elements can appear anywhere and any number of times, the closing ) must be followed immediately by the * symbol, with no space between them.
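As a minimal sketch, the declarations discussed above can be combined into one document with an internal DTD subset. The element names come from the example; the content model for PRODUCT-REVIEW, the text-only declarations, and the wording and product names inside REVIEW are assumptions made only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE PRODUCT-REVIEW [
  <!-- Element content (assumed): children in a fixed order, once each -->
  <!ELEMENT PRODUCT-REVIEW (PRODUCT-REVIEWED, REVIEWER, REVIEW)>
  <!ELEMENT PRODUCT-REVIEWED (#PCDATA)>
  <!ELEMENT REVIEWER (#PCDATA)>
  <!-- Mixed content: #PCDATA first, names separated by |, then )* -->
  <!ELEMENT REVIEW (#PCDATA | PRODUCT-NAME)*>
  <!ELEMENT PRODUCT-NAME (#PCDATA)>
  <!-- Not allowed: (PRODUCT-NAME | #PCDATA)* puts #PCDATA second -->
  <!-- Not allowed: (#PCDATA | PRODUCT-NAME) omits the trailing * -->
]>
<PRODUCT-REVIEW>
  <PRODUCT-REVIEWED>Shelby's Tools</PRODUCT-REVIEWED>
  <REVIEWER>Ann Marie</REVIEWER>
  <!-- Hypothetical review text and product names -->
  <REVIEW>The <PRODUCT-NAME>Example Wrench</PRODUCT-NAME> feels sturdy, but the <PRODUCT-NAME>Example Saw</PRODUCT-NAME> does not.</REVIEW>
</PRODUCT-REVIEW>
```

A validating parser should accept this document. Because the mixed-content model cannot constrain order or count, a REVIEW with no PRODUCT-NAME at all, or with ten of them, would be equally valid.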


Question: What is the purpose of #PCDATA in XML?
In XML, #PCDATA is a keyword used to indicate that an element may contain character data (i.e. plain text) as its content. The "#PCDATA" notation stands for "Parsed Character Data" and is used to define elements that contain text content that can be parsed and processed by XML parsers.
For example, consider an XML element named "title" that contains only text content:
<!ELEMENT title (#PCDATA)>

In this example, the "title" element is defined using the "#PCDATA" keyword to indicate that it can contain only character data as its content. Here is an example of an XML document that uses the "title" element:
<?xml version="1.0" encoding="UTF-8"?>
<book>
  <title>XML for Beginners</title>
  <author>John Smith</author>
  <publisher>Publishing Company</publisher>
  <year>2023</year>
</book>

In this example, the "title" element contains the text "XML for Beginners", which is the title of the book.
The #PCDATA keyword is often used in conjunction with other keywords to define more complex element content models, such as mixed content models that allow both text and child elements within an element. For example:
<!ELEMENT paragraph (#PCDATA | emphasis)*>
<!ELEMENT emphasis (#PCDATA)>

In this example, the "paragraph" element can contain either character data or the "emphasis" element in any order, allowing for mixed content within the element. The "emphasis" element is defined as containing only character data using the #PCDATA keyword.
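To make the two declarations concrete, here is a small sketch of a document that should validate against them. The sentence inside the paragraph element is invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE paragraph [
  <!ELEMENT paragraph (#PCDATA | emphasis)*>
  <!ELEMENT emphasis (#PCDATA)>
]>
<!-- Text and emphasis elements alternate freely inside paragraph -->
<paragraph>Mixed content lets <emphasis>text</emphasis> and <emphasis>child elements</emphasis> appear in any order.</paragraph>
```

Since the * allows zero or more occurrences, a paragraph containing only text, only emphasis elements, or nothing at all would also be valid.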

Declaring Empty Elements

To declare an empty element in a DTD, you must use the following syntax:

<!ELEMENT elementName EMPTY> 

Note that this declaration allows no element content. If it did, the element would not be empty. When adding an empty element to your XML document, you normally write it as an empty-element tag, with the / character before the closing angle bracket. A start tag immediately followed by an end tag, with nothing in between, is also valid. If you have an element with this type of declaration:

<!ELEMENT Marker EMPTY> 

you would include the tag in an XML document as follows:
<Marker/> 
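The sketch below places the Marker declaration in a complete document. Route is a hypothetical container element, introduced only so that Marker has a parent.

```xml
<?xml version="1.0"?>
<!DOCTYPE Route [
  <!-- Route is a hypothetical container, added only for this illustration -->
  <!ELEMENT Route (Marker)*>
  <!ELEMENT Marker EMPTY>
]>
<Route>
  <!-- Empty-element tag: the usual way to write an EMPTY element -->
  <Marker/>
  <!-- A start tag followed directly by an end tag is also valid -->
  <Marker></Marker>
  <!-- Not valid: <Marker> </Marker> because white space counts as content -->
</Route>
```

For an element declared EMPTY, even a single space between the start and end tags makes the document invalid, which is one reason the empty-element tag form is the safer habit.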

Comments in the DTD use the same syntax as comments in an XML document. You can place a comment between any two declarations in the DTD, but not inside a declaration:
<!-- comment text here -->
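As a short sketch of comment placement, the following document carries comments both in its internal DTD subset and in the document itself. The Route element is hypothetical and serves only as a parent for Marker.

```xml
<?xml version="1.0"?>
<!DOCTYPE Route [
  <!-- A comment can sit between declarations in the DTD -->
  <!ELEMENT Route (Marker)*>
  <!-- Marker carries no content of its own -->
  <!ELEMENT Marker EMPTY>
]>
<!-- The same syntax works in the document body -->
<Route><Marker/></Route>
```

In both places, the comment text must not contain two consecutive hyphens.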

Mixed Content

The XML Recommendation does not really talk about mixed content or text content on its own. Instead, it specifies that any element with text in its content is a mixed content model element. Within mixed content models, text can appear by itself or it can be interspersed between elements.
Note: In everyday usage, people refer to elements that can contain only text as text-only elements or text-only content.
The rules for mixed content models are similar to the element content model rules that you learned in the preceding section. You have already seen some examples of the simplest mixed content model, which is text-only:
<!ELEMENT first (#PCDATA)>

The preceding declaration specifies the keyword #PCDATA within the parentheses of the content model. PCDATA is a keyword derived from Parsed Character DATA. It indicates that the parser parses the character data in the element: the parser scans the text for markup and expands entity references, so a literal < or & in the text must be written as &lt; or &amp;. Here is an example element that conforms to this declaration:
<first>John</first>
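Because the character data is parsed, markup characters that belong to the text itself have to be escaped. The sketch below uses the same declaration; the text content is invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE first [
  <!ELEMENT first (#PCDATA)>
]>
<!-- &amp;, &lt; and &gt; are predefined entity references -->
<first>John &amp; Jane &lt;editors&gt;</first>
```

After parsing, an application reading the first element sees the text John & Jane <editors>, with the entity references already replaced.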

Mixed content models can also contain elements interspersed within the text. Suppose you wanted to include a description of each contact in your XML document. You could create a new element <description>:
<description>Tom is a developer and author for <title>Beginning XML</title>, now in its <detail>5th Edition</detail></description>

In this example, the text within the <description> element is interspersed with the elements <title> and <detail>.
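A DTD for this description element might look like the following sketch. Declaring title and detail as text-only is an assumption; the text above does not say what those elements may contain.

```xml
<?xml version="1.0"?>
<!DOCTYPE description [
  <!-- Mixed content: text interspersed with title and detail elements -->
  <!ELEMENT description (#PCDATA | title | detail)*>
  <!-- Assumed to be text-only for this illustration -->
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT detail (#PCDATA)>
]>
<description>Tom is a developer and author for <title>Beginning XML</title>, now in its <detail>5th Edition</detail></description>
```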
The next lesson covers referencing DTD declarations in XML.
