The Simple Answer
This question crops up quite often, and the simple answer is "yes". For example, the following document shows how the
u-form for New York City can be represented in XML:
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:cityrole="http://uform.civium.net/~01de90b7c4e2d111d689643ee15b07581d/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <rdf:Description rdf:about="urn:uuid:~01fddbd808e2da11d6b9d507eb2135547d">
    <cityrole:metro_population>17967900</cityrole:metro_population>
    <cityrole:population>7262700</cityrole:population>
    <cityrole:country_name>US</cityrole:country_name>
    <cityrole:country_code>US</cityrole:country_code>
    <cityrole:latitude>40.75</cityrole:latitude>
    <cityrole:name>New York</cityrole:name>
    <cityrole:country>None</cityrole:country>
    <cityrole:longitude>-74.0</cityrole:longitude>
  </rdf:Description>
</rdf:RDF>
This document uses XML to represent the u-form for New York in RDF (Resource Description Framework), the format specified by the World Wide Web Consortium (W3C) for the Semantic Web. It was automatically (and very easily) generated using the Python RDF library (from http://www.rdflib.net) in conjunction with the Information Commons Python API from MAYA Design.
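To make the mapping concrete, here is a minimal sketch of how a u-form, treated as nothing more than a UUID plus a set of attribute-value pairs, could be written out as RDF/XML using only Python's standard library. The uform_to_rdfxml helper and the dictionary representation are hypothetical. This is neither the Information Commons Python API nor the RDF library mentioned above, just an illustration of the idea. The sketch writes each value as the text of its property element.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CITYROLE_NS = "http://uform.civium.net/~01de90b7c4e2d111d689643ee15b07581d/"


def qname(namespace, local_name):
    """Build the '{namespace}local' form that ElementTree uses for qualified names."""
    return "{%s}%s" % (namespace, local_name)


def uform_to_rdfxml(uuid, attributes):
    """Hypothetical helper: serialize a UUID plus attribute-value pairs as RDF/XML."""
    # Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("cityrole", CITYROLE_NS)

    root = ET.Element(qname(RDF_NS, "RDF"))
    description = ET.SubElement(
        root,
        qname(RDF_NS, "Description"),
        {qname(RDF_NS, "about"): "urn:uuid:" + uuid},
    )
    # One property element per attribute; the value becomes the element text.
    for name, value in attributes.items():
        prop = ET.SubElement(description, qname(CITYROLE_NS, name))
        prop.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed in-memory form of the New York u-form: a plain dictionary.
new_york = {
    "name": "New York",
    "country_code": "US",
    "population": 7262700,
    "metro_population": 17967900,
    "latitude": 40.75,
    "longitude": -74.0,
}

xml_text = uform_to_rdfxml("~01fddbd808e2da11d6b9d507eb2135547d", new_york)
print(xml_text)
```

Note that every value, whatever its type in the dictionary, goes through str() on its way into the XML.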
The More Complex Answer
The next question that normally comes up is "Why don't you just use XML then, instead of u-forms?"
While not surprising, this question is deeply mistaken: u-forms are information, whereas XML is a format (or serialization) for information. It's a bit like asking "Why don't you just use XML instead of bibliographic citations in your papers?" Clearly, you can use XML to represent your bibliographic database, just as you can use XML to represent any other type of information held in u-forms; the u-form for New York above is just one example. That doesn't mean that XML is a substitute for the information itself!
However, there is a properly formed question that people rightly ask once they know more about the Information Commons: "Why do you use VSMF as a serialization instead of XML?" Since so many people use XML as an extensible serialization method for data that is stored and passed between machines, it is sensible to ask why we'd want to use another serialization, namely the Visage Standard Message Format (VSMF).
To understand this, you need to know a little more about the history of XML and about the goals of the Universal Database architecture upon which the Information Commons is built. XML was designed to be a human-readable markup language: you can use it to give well-formed instructions to machines, but you can also write documents that a human can read. While this is a laudable goal in some ways, it puts extra burdens on XML, because it is trying to do two things at once. You can see from the New York example above that the XML document is pretty formal in nature, and it's questionable whether it's "really" readable by anyone except experts. And this is a pretty simple example. Once the information contains lots of links (for example, URI, URN or UUID references) to other information objects, it isn't really human-readable at all. But even though XML isn't very human-readable in practice, the goal of readability forces compromises on other parts of the architecture. For example, any other datatypes that you want to represent (such as Boolean values, integers and floating point numbers) tend to be built out of strings, instead of being built directly out of bits and bytes.
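The difference between building a number out of a string and building it out of bytes can be shown in a few lines of Python. This is only a sketch: the element name is illustrative, and the binary layout is whatever Python's struct module produces for a 4-byte unsigned integer, not the VSMF encoding.

```python
import struct

metro_population = 17967900

# Text route: the number travels as characters wrapped in markup
# (the element name here is illustrative).
as_xml = "<metro_population>%d</metro_population>" % metro_population
xml_bytes = as_xml.encode("utf-8")

# The receiver has to find the digits and parse them back into a number.
digits = as_xml[as_xml.index(">") + 1 : as_xml.rindex("<")]
parsed_from_text = int(digits)

# Binary route: the same number as a 4-byte big-endian unsigned integer.
binary_bytes = struct.pack(">I", metro_population)
(parsed_from_binary,) = struct.unpack(">I", binary_bytes)

print(len(xml_bytes), "bytes as XML text")
print(len(binary_bytes), "bytes as a packed integer")
```

Both routes recover the same value, but the text route carries the markup and the digit characters on the wire and needs a parsing step at each end.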
Obviously, as far as machines are concerned, it's very important to be able to manipulate bits and bytes directly, for the sake of efficiency, without using strings as an intermediary. You might argue that the days when this mattered are over, and that such penny-pinching engineering doesn't lead to semantic interoperability and open standards. But here, we disagree. We believe that it's very important to build an architecture where small messages are possible, because we're not just creating webpages with rich markup; we're building a global information architecture where even a tiny smoke detector or battery-operated light switch can send a message. This is why the shortest well-formed message in VSMF has a length of just 1 byte.
This isn't just a bonus for small devices; it's also important for huge databases. Large genetic databases may contain hundreds of thousands of gene-marker Boolean values for each of thousands of samples. Multiply those together and you've got a lot of Boolean values. In order to make genetic databases semantically interoperable with other knowledge sources in the Information Commons, we can't ask a researcher who already has terabytes of information stored in a compressed format to start using XML strings instead.
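The arithmetic behind this is easy to sketch. The figures below (100,000 markers and 1,000 samples) are illustrative stand-ins for "hundreds of thousands" and "thousands", and the bit-packing helpers are hypothetical; they do not describe how VSMF or any particular genetic database actually stores its data.

```python
def pack_bits(flags):
    """Pack a list of Booleans into bytes, eight values per byte."""
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, count):
    """Recover the first `count` Booleans from the packed bytes."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(count)]


# A tiny example: ten marker values fit into two bytes.
markers = [True, False, True, True, False, False, False, True, True, False]
packed = pack_bits(markers)

# Illustrative database size (assumed figures, see the text above).
markers_per_sample = 100_000
samples = 1_000
total_values = markers_per_sample * samples

# One bit per value when packed.
total_packed_bytes = total_values // 8

# Lower bound for text: even if every value were the 4-character string
# "true" with no surrounding markup at all, it would need 4 bytes per value.
minimum_text_bytes = total_values * len("true")

print(total_packed_bytes, "bytes packed")
print(minimum_text_bytes, "bytes as bare text, before any markup")
```

Even against this generous lower bound for text, the packed form is 32 times smaller, and real XML adds element names and angle brackets on top.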
So, where appropriate, it is perfectly possible to represent the Information Commons model using XML, and this is precisely what we do to be interoperable with other systems where necessary. But we don't want to be constrained to using only XML, because there are problems that are best solved using other representation formats.
