An Overview of XML
XML (eXtensible Markup Language) is a method for putting structured data in a text file.
Although an XML document superficially resembles an HTML document, since both are written in a markup language that uses tags, the similarity ends there. XML avoids the common pitfalls of loosely structured markup languages, of which HTML is the most widely used.
XML produces files that avoid the problems of ambiguity, lack of extensibility, lack of support for internationalization and localization, and platform dependency.
XML is often called a meta-markup language because its tags are used differently from those in conventional markup languages. Whereas <p> means paragraph in HTML, <p> in XML could mean anything, depending on the stylesheet or application that processes the document. XML tags delimit pieces of data to give them structure, but leave the interpretation of that data entirely to the application that reads it.
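As a rough illustration of that point, the following Python sketch reads two documents that both use a <p> tag. The documents, the tag meanings, and the function names are all invented for this example; only the standard-library xml.etree.ElementTree module is assumed.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents. Both use <p>, with unrelated meanings.
CATALOG = "<catalog><item><name>Widget</name><p>9.50</p></item></catalog>"
STAFF = "<staff><p>Ada</p><p>Grace</p></staff>"


def total_price(xml_text):
    """Application 1: interpret every <p> as a price and sum them."""
    root = ET.fromstring(xml_text)
    return sum(float(p.text) for p in root.iter("p"))


def person_names(xml_text):
    """Application 2: interpret every <p> as a person's name."""
    root = ET.fromstring(xml_text)
    return [p.text for p in root.iter("p")]


if __name__ == "__main__":
    print(total_price(CATALOG))
    print(person_names(STAFF))
```

The parser treats both documents identically. The meaning of <p> exists only in the code that consumes the parsed tree.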
The rules of XML syntax are much stricter than those of HTML. HTML tolerates forgotten tags or attributes, but the official XML specification prohibits an application from second-guessing the meaning of a broken XML file: if the file is broken, the application has to stop and issue an error.
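A minimal sketch of this behavior, using Python's standard-library xml.etree.ElementTree parser as one example of a conforming reader. The helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser stops at the first error instead of guessing a repair.
        return False
    return True


if __name__ == "__main__":
    good = "<note><to>Ann</to></note>"
    bad = "<note><to>Ann</note>"  # <to> is never closed
    print(is_well_formed(good))
    print(is_well_formed(bad))
```

An HTML browser would typically render something for the second string. The XML parser raises an error and produces no tree at all.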
XML files are text files. As such, they are larger than equivalent binary files, but they can be easily compressed. XML files are not intended to be read by people the way HTML files are, but they remain readable when needed, for example for debugging.
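The claim about compression can be checked with a small experiment. This sketch builds a synthetic, repetitive XML document (the row, id, and status tag names are made up) and compresses it with Python's standard zlib module. Real documents will compress by different amounts.

```python
import zlib


def build_sample(n):
    """Build a hypothetical XML document with n similar <row> records."""
    rows = "".join(
        f"<row><id>{i}</id><status>ok</status></row>" for i in range(n)
    )
    return f"<rows>{rows}</rows>".encode("utf-8")


def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    sample = build_sample(1000)
    print(len(sample), "bytes before compression")
    print(f"ratio after zlib: {compression_ratio(sample):.2f}")
```

Repeated tag names are what make XML verbose, and that same repetition is what a general-purpose compressor removes most effectively.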
The XML family of technologies
A family of technologies has developed around the XML 1.0 specification, which defines what tags and attributes are. These include:
XLink describes a standard way to add hyperlinks to an XML file.
CSS, the stylesheet language, is applicable to XML as well as to HTML.
XSL is the advanced language for expressing stylesheets. It is based on XSLT, a transformation language for rearranging, adding, or deleting tags and attributes.
DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language.
XML Namespaces is a specification that describes how to associate a URI (usually a URL) with every tag and attribute in an XML document.
XML Schema (Part 1: Structures and Part 2: Datatypes) helps developers precisely define their own XML-based formats.
What is the origin and future of XML?
Development of XML started in 1996, and it became a World Wide Web Consortium (W3C) standard in February 1998. It evolved from SGML, developed in the early '80s and an ISO standard since 1986, and from HTML, whose development started in 1990.
The designers of XML took the best parts of SGML, guided by their experience with HTML, and set out to produce something simpler to use. The growth of XML has outpaced the most optimistic projections.
Choosing XML for data interchange is analogous to choosing SQL for databases: the technology is universally used, but you still have to build your own database and your own programs and procedures to manipulate it. Many tools are already available, and many software developers are proficient with XML.