XML Syntax - Learn The Basic Structure of a Markup Language

By Daniel Imbellino
Feb 28, 2013

  XML follows the same syntax rules as XHTML, which is no surprise, since XHTML is itself an application of XML. Just as elements in XHTML have attributes and attribute values, so do XML elements. Take the XHTML tag <img src="example-image.png" /> as an example. Here “img” is the element name, “src” is an attribute, and “example-image.png” is the attribute value. We could add other attributes to the “img” tag, such as a width or height for the image. Every XML document also requires a root element, just as in XHTML. What makes XML so useful compared to other markup languages is that it allows truly semantic markup! Meaning, the element names used to mark up data can directly describe the content they contain (hence, the tags and content are related to each other).

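  In XHTML the root element is “html”, and every XML document must likewise have exactly one root element. All XML languages follow a nested element structure. You can nest one element inside another like this:

<html>
  <head>
    <title>Page Title</title>
  </head>
  <body>
    <h1>Heading</h1>
    <p>Paragraph text.</p>
  </body>
</html>
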
This is what we call a “parent-child” relationship. Notice the <title> tag is nested inside the <head> tag, which is in turn nested inside the <html> tag. Also notice the <h1> and <p> tags are nested inside the <body> tag. Since the <title> tag is nested inside the <head> tag, the <title> tag is a “child” element of the <head> tag. Likewise, the <head> tag is the “parent” element of the <title> tag. The <html> tag is our root element, and every other tag in the document is nested somewhere inside it.
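The parent-child relationship can also be inspected programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample document and the helper name child_map are assumptions made for illustration, not part of the original example.

```python
import xml.etree.ElementTree as ET

# A small XHTML-style document, assumed here for illustration.
DOC = """<html>
  <head>
    <title>Page Title</title>
  </head>
  <body>
    <h1>Heading</h1>
    <p>Paragraph text.</p>
  </body>
</html>"""


def child_map(xml_text):
    """Return {parent tag: [child tags]} for every element that has children.

    Note: keys are tag names, so this only suits documents where each
    parent tag name appears once, as in the sample above.
    """
    root = ET.fromstring(xml_text)
    return {el.tag: [child.tag for child in el] for el in root.iter() if len(el)}


relations = child_map(DOC)
for parent, children in relations.items():
    print(parent, "->", children)
```

Each printed line names a parent element followed by its direct children, so <html> lists <head> and <body>, while <head> lists only <title>.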

We can do the same thing with XML, nesting one element inside another as shown above. We can define our own descriptive element names, attributes, and attribute values; we just need to remember to specify a single “root” element for our language.
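To see why the root element matters, here is a short sketch, again assuming Python's standard xml.etree.ElementTree parser. The element names library and book and the helper is_well_formed are hypothetical and chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One root element wrapping everything: accepted.
ONE_ROOT = "<library><book>Dune</book><book>Emma</book></library>"

# Two top-level elements and no single root: rejected.
TWO_ROOTS = "<book>Dune</book><book>Emma</book>"

print(is_well_formed(ONE_ROOT))
print(is_well_formed(TWO_ROOTS))
```

The second document fails because the parser finds more markup after the first top-level element has already closed.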

An example XML document is shown below. This could be a markup language for the Xbox 360:

<xbox>
  <model type="360" />
  <game>
    <title>FarCry Instincts Predator</title>
    <isbn id="BN76932J10" />
    <gender>First-Person Shooter</gender>
  </game>
</xbox>

Notice we declared the <xbox> tag as our root element. We created a <game> tag to encapsulate our <title>, <isbn />, and <gender> elements. Notice also that two of our tags, <model /> and <isbn />, have no separate closing tag. These are “empty elements” (sometimes loosely called “open elements”): they hold no content of their own and carry their information in attributes, here “type” and “id”, each followed by an attribute value. An empty element written this way must end with a forward slash followed by a greater-than sign in order to be well-formed XML. The two empty elements we defined resemble the <img /> tag from XHTML. Same principles, same concept.

Empty elements follow this pattern:
<element attribute="attribute value" />
The attribute can be restricted to a fixed set of allowed values if we choose to define one in an XML schema. Schemas also let us represent binary data in our documents, such as images or video files, as encoded text (for example with the base64Binary type).
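As a rough illustration of how a program reads attributes from empty elements, the sketch below parses the Xbox example with Python's standard xml.etree.ElementTree module. It assumes the full document includes the <xbox> root and the <game> wrapper described in the text; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The Xbox example, assumed to include the <xbox> root and <game> wrapper.
XBOX = """<xbox>
  <model type="360" />
  <game>
    <title>FarCry Instincts Predator</title>
    <isbn id="BN76932J10" />
    <gender>First-Person Shooter</gender>
  </game>
</xbox>"""

root = ET.fromstring(XBOX)
model = root.find("model")      # empty element, direct child of the root
isbn = root.find("game/isbn")   # empty element nested inside <game>
title = root.find("game/title")

print(root.tag)          # name of the root element
print(model.attrib)      # all attributes of <model /> as a dict
print(isbn.get("id"))    # a single attribute value
print(title.text)        # text content of a non-empty element
print(len(model))        # number of child elements inside <model />
```

The empty elements expose their data only through attributes, while <title> exposes it as text content.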

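XML documents should also begin with an XML declaration like the one shown here:
<?xml version="1.0"?>
The declaration is optional in XML 1.0 but recommended, and when present it must be the very first thing in the file. Along with the ".xml" file extension, it identifies the document as XML to user agents.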
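The placement and quoting rules for the declaration can be checked with a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parses and the three sample inputs are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses(xml_bytes):
    """Return True if the parser accepts the input as well-formed XML."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Declaration at the very start of the document: accepted.
GOOD = b'<?xml version="1.0"?>\n<xbox><model type="360" /></xbox>'

# A blank line before the declaration: rejected.
LATE = b'\n<?xml version="1.0"?>\n<xbox />'

# Closing quote missing from the version value: rejected.
BAD_QUOTE = b'<?xml version="1.0?>\n<xbox />'

for sample in (GOOD, LATE, BAD_QUOTE):
    print(parses(sample))
```

Only the first input is accepted, which is why the declaration belongs on the first line with both quotes in place.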

XML gives web publishers the power to define languages based on descriptive information, unlike XHTML, which doesn’t describe its content very well. XHTML only specifies the structure and formatting of a web document; it doesn’t define what that document actually consists of, while an informative XML language does. Think about it. The <p> tag in XHTML marks a paragraph, but it says nothing about what that paragraph contains. Our root element above (<xbox>), by contrast, provides key information that helps identify what the document is about.

Notice that so far we have only shown what a document in our XML language looks like; we have not formally defined the rules for it. You create an XML schema to define the rules and structure for your language. We will be discussing schemas in another tutorial.

Also, case does matter! <ELEMENT> and <element> are two separate elements, and an opening tag must match its closing tag exactly, including case. It's perfectly legal to use uppercase, and some markup languages do. For the sake of clarity, stick to one convention in your markup, either uppercase or lowercase.
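Case sensitivity is easy to confirm with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the element names catalog, ITEM, and item are hypothetical.

```python
import xml.etree.ElementTree as ET

# <ITEM> and <item> are treated as two different element names.
root = ET.fromstring("<catalog><ITEM>first</ITEM><item>second</item></catalog>")
upper = root.findall("ITEM")
lower = root.findall("item")
print([el.text for el in upper])
print([el.text for el in lower])

# An opening tag and closing tag that differ only in case do not match.
try:
    ET.fromstring("<Item>broken</item>")
    mismatch_accepted = True
except ET.ParseError:
    mismatch_accepted = False
print(mismatch_accepted)
```

Each lookup finds only the element whose name matches in case, and the mismatched pair is rejected as not well-formed.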