Monday, December 9, 2013

From XML data to an EMF model

Recently I had to integrate a large XML data store into my RCP application. No schema was available and the data was structured rather awkwardly. Still, the XML structure had to be preserved, as other tools depend on the data format. Having played around with EMF for a long time without ever getting serious about it, I thought: "Well, wouldn't it be nice if EMF could handle all the XML parsing and writing?"

Source code for this tutorial is available on Google Code as a single zip archive or as a Team Project Set, or you can check out the SVN projects directly.

Prerequisites

We need some XML example data to work on. So I chose a small storage for CDs:
<?xml version="1.0" encoding="UTF-8"?>
<collection>
 <disc title="The wall">
  <artist>Pink Floyd</artist>
  <track pos="1">In the flesh</track>
  <track pos="2">The thin ice</track>
  <track pos="3">Another brick in the wall</track>
 </disc>
 <disc title="Christmas compilation">
  <track pos="1">Last Christmas<artist>WHAM</artist></track>
  <track pos="2">Driving home for Christmas<artist>Chris Rea</artist></track>
 </disc>
</collection>
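
To see why this layout is awkward to handle by hand, here is a minimal sketch that reads it with the plain JDK DOM API. The class and method names (CollectionDump, describeTracks) are made up for illustration and are not part of the tutorial sources. Note that the artist may sit on the disc or inside a track, and that a track mixes its own text with a child element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CollectionDump {

    /** Returns one "disc title / pos / track title / artist" line per track. */
    static List<String> describeTracks(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            NodeList discs = doc.getElementsByTagName("disc");
            for (int i = 0; i < discs.getLength(); i++) {
                Element disc = (Element) discs.item(i);
                String discArtist = directChildText(disc, "artist");
                NodeList tracks = disc.getElementsByTagName("track");
                for (int j = 0; j < tracks.getLength(); j++) {
                    Element track = (Element) tracks.item(j);
                    // A track-level artist overrides the disc-level one.
                    String trackArtist = directChildText(track, "artist");
                    String artist = trackArtist != null ? trackArtist : discArtist;
                    lines.add(disc.getAttribute("title") + " / " + track.getAttribute("pos")
                            + " / " + ownText(track) + " / " + artist);
                }
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse collection", e);
        }
    }

    /** Text of the first direct child element with the given name, or null if absent. */
    static String directChildText(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    /** Only the element's own text nodes, skipping nested elements (mixed content). */
    static String ownText(Element element) {
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                text.append(n.getNodeValue());
            }
        }
        return text.toString().trim();
    }
}
```

Every such special case has to be coded and maintained manually, which is exactly the work we would like EMF to take over.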

Step 1: Creating a schema

EMF can create a model from an existing schema. As we do not have one yet, we need to create it. Fortunately, we do not have to do this by hand. For this tutorial we will use an online converter; alternatively, you could use xsd.exe from the Microsoft Windows SDK for .NET Framework 4 if you prefer a local tool.

Your schema.xsd should look like this:
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="collection">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="disc" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element type="xs:string" name="artist" minOccurs="0"/>
              <xs:element name="track" maxOccurs="unbounded" minOccurs="0">
                <xs:complexType mixed="true">
                  <xs:sequence>
                    <xs:element type="xs:string" name="artist" minOccurs="0"/>
                  </xs:sequence>
                  <xs:attribute type="xs:byte" name="pos" use="optional"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute type="xs:string" name="title" use="optional"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

A generated schema might need some tweaking here and there. Eclipse offers a nice visual schema editor for this purpose. Just make sure you do not alter the schema too much: your sample data still needs to be valid against it. To verify this, select both your XML and XSD files and choose Validate from the context menu.
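
If you prefer to run the same check outside of Eclipse, for example in a unit test, the JDK's javax.xml.validation API can do it. The following is only a sketch; the class name SchemaCheck and the command line arguments are assumptions for illustration, not part of the tutorial sources.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class SchemaCheck {

    /** Returns null if the XML is valid against the schema, otherwise the first error message. */
    static String validate(Source schemaSource, Source xmlSource) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(schemaSource);
            Validator validator = schema.newValidator();
            // Without a custom ErrorHandler, the first validation error is thrown as a SAXException.
            validator.validate(xmlSource);
            return null;
        } catch (SAXException | IOException e) {
            return e.getMessage() != null ? e.getMessage() : e.toString();
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: java SchemaCheck schema.xsd data.xml
        String error = validate(new StreamSource(new File(args[0])),
                                new StreamSource(new File(args[1])));
        System.out.println(error == null ? "valid" : "invalid: " + error);
    }
}
```

Running this after each schema tweak gives the same answer as the Validate action, without opening the IDE.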

Step 2: Create a model

Before we can create a model, we need to install some additional components. Assuming you started with a vanilla Eclipse for RCP Developers, you need to install EMF - Eclipse Modeling Framework SDK. Also install the XSD Ecore Converter (uncheck Group items by category to find it).

You should already have a Plug-in project to store your files in. Now create a new EMF Generator Model. Name it Catalog.genmodel and select XML Schema for the model import.

On the next page select the schema.xsd file for the model input.

Finally, rename the file from Schema.ecore to Catalog.ecore. You will end up with both a Catalog.genmodel and a Catalog.ecore file.

Step 3: Adapting the model

Looking at the ecore model, we can see that all elements carry an ExtendedMetaData annotation. These annotations store the XML representation of the model elements, which allows us to rename model classes and attributes without breaking XML import/export. For example, we could get rid of the Type suffix that was added to all EClasses.

Step 4: Using the model

Now that we have a model, we can generate code from the genmodel just as we would for any other model: open the genmodel and select Generate All from the context menu of the Catalog node. Run your application, copy over your XML data, rename the data file to something.schema and open it in the generated editor.
 To load the model from code, you may use the following snippet:
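  // Touch the generated package so it registers itself with EMF before loading.
  SchemaPackageImpl.eINSTANCE.eClass();

  // Load the XML file; "true" creates and loads the resource on demand.
  Resource resource = new ResourceSetImpl().getResource(URI.createFileURI("C:\\your\\path\\Sample data.schema"), true);

  // The first root object is the DocumentRoot wrapping the <collection> element.
  EObject root = resource.getContents().get(0);
  if (root instanceof DocumentRoot) {
   for (Disc disc : ((DocumentRoot) root).getCollection().getDisc()) {
    System.out.println(disc.getTitle());
   }
  }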