Here we give some examples of schemas that can be used in the conversion, as well as schemas that can be compared with other DFDL solutions. If you click the "try it" button in a non-IE browser, the sample data will show up in the converter so you can try it out. Hopefully this gives some insight into how the storage format commands work.
A schema that does a hex-dump of a binary file
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xsd:element name="hexDump" type="xsd:hexBinary" data:dataCounter="EOF"/> </xsd:schema>
A schema that generates an XML list type
040100020003000400
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xsd:element name="listOfIntegers" data:dataCounter="xsd:unsignedByte"> <xsd:simpleType> <xsd:list itemType="xsd:short"/> </xsd:simpleType> </xsd:element> </xsd:schema>
<listOfIntegers>1 2 3 4</listOfIntegers>
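To make the role of the counter concrete, here is a small Python sketch (not part of the converter) that decodes the sample bytes by hand. It assumes the count is a single unsigned byte, as dataCounter="xsd:unsignedByte" suggests, and that the xsd:short items are little-endian, which is what the sample bytes imply. The function name decode_counted_shorts is made up for this illustration.

```python
import struct

def decode_counted_shorts(data: bytes) -> list[int]:
    """Decode one unsignedByte count followed by that many 16-bit integers."""
    count = data[0]
    # '<' = little-endian (assumed from the sample bytes), 'h' = signed 16-bit
    return list(struct.unpack_from(f"<{count}h", data, 1))

sample = bytes.fromhex("040100020003000400")
print(" ".join(str(v) for v in decode_counted_shorts(sample)))  # 1 2 3 4
```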
Booleans in attributes, elements and lists
010100010104000101000102000100
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xsd:annotation> <xsd:appinfo> <data:format dataCounter="xsd:unsignedShort"/> </xsd:appinfo> </xsd:annotation> <xsd:element name="booleanTest"> <xsd:complexType> <xsd:sequence> <xsd:element name="booleanList"> <xsd:simpleType> <xsd:list itemType="xsd:boolean"/> </xsd:simpleType> </xsd:element> <xsd:element name="booleanElement" type="xsd:boolean" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> <xsd:attribute name="attrib1" type="xsd:boolean"/> <xsd:attribute name="attrib2" type="xsd:boolean"/> <xsd:attribute name="attrib3" type="xsd:boolean"/> </xsd:complexType> </xsd:element> </xsd:schema>
<booleanTest attrib1="true" attrib3="true"> <booleanList>true true false true</booleanList> <booleanElement>true</booleanElement> <booleanElement>false</booleanElement> </booleanTest>
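The byte layout of this sample can be worked out by hand. The Python sketch below is our reading of the sample data, not the converter's code: it assumes each optional attribute is stored as a one-byte presence flag followed, when present, by a one-byte value, and that each group of booleans is preceded by a little-endian unsignedShort count. The function name decode_boolean_test is hypothetical.

```python
import struct

def decode_boolean_test(data: bytes) -> dict:
    """Decode the unpacked boolean sample under the layout assumed above."""
    pos = 0

    def read_byte() -> int:
        nonlocal pos
        value = data[pos]
        pos += 1
        return value

    def read_count() -> int:
        nonlocal pos
        (value,) = struct.unpack_from("<H", data, pos)  # unsignedShort counter
        pos += 2
        return value

    attributes = {}
    for name in ("attrib1", "attrib2", "attrib3"):
        if read_byte():  # presence flag (inferred from the sample)
            attributes[name] = bool(read_byte())

    boolean_list = [bool(read_byte()) for _ in range(read_count())]
    boolean_elements = [bool(read_byte()) for _ in range(read_count())]
    return {
        "attributes": attributes,
        "booleanList": boolean_list,
        "booleanElement": boolean_elements,
    }

sample = bytes.fromhex("010100010104000101000102000100")
print(decode_boolean_test(sample))
```

Under these assumptions the 15 bytes are consumed exactly, and attrib2 is absent because its presence flag is 0.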
The same XML from packed boolean fields
0504000B020001
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xsd:annotation> <xsd:appinfo> <data:format dataCounter="xsd:unsignedShort" bitField="lowBitFirst"/> </xsd:appinfo> </xsd:annotation> <xsd:element name="booleanTest"> <xsd:complexType> <xsd:sequence> <xsd:element name="booleanList"> <xsd:simpleType> <xsd:list itemType="xsd:boolean"/> </xsd:simpleType> </xsd:element> <xsd:element name="booleanElement" type="xsd:boolean" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> <xsd:attribute name="attrib1" type="xsd:boolean"/> <xsd:attribute name="attrib2" type="xsd:boolean"/> <xsd:attribute name="attrib3" type="xsd:boolean"/> </xsd:complexType> </xsd:element> </xsd:schema>
<booleanTest attrib1="true" attrib3="true"> <booleanList>true true false true</booleanList> <booleanElement>true</booleanElement> <booleanElement>false</booleanElement> </booleanTest>
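The packed form can be checked the same way. This Python sketch is only an interpretation of the seven sample bytes: it assumes that bitField="lowBitFirst" means the lowest bit of a byte holds the first boolean, that the three attributes share one byte, and that each counted group of booleans starts on a fresh byte. The names unpack_bits and decode_packed are made up for this illustration, and the sketch does not decide how a 0 bit maps to an absent versus a false attribute.

```python
import struct

def unpack_bits(packed: bytes, count: int) -> list[bool]:
    """Return `count` booleans, taking the lowest bit of each byte first."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(count)]

def decode_packed(data: bytes) -> dict:
    pos = 0
    attribute_bits = unpack_bits(data[pos:pos + 1], 3)  # attrib1..attrib3
    pos += 1
    groups = []
    for _ in range(2):  # booleanList, then the booleanElement occurrences
        (count,) = struct.unpack_from("<H", data, pos)  # unsignedShort counter
        pos += 2
        n_bytes = (count + 7) // 8  # whole bytes needed for `count` bits
        groups.append(unpack_bits(data[pos:pos + n_bytes], count))
        pos += n_bytes
    return {
        "attributeBits": attribute_bits,
        "booleanList": groups[0],
        "booleanElement": groups[1],
    }

sample = bytes.fromhex("0504000B020001")
print(decode_packed(sample))
```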
Here is an example from the BizTalk 2004 Flat File Schema Tutorial by Tomas Restrepo, along with how to handle it in this converter. You can compare the BizTalk solution with ours.
CITYLIST CSEATTLE WA00198776 SWASHINGTON WA SARIZONA AZ CTUCSON AZ89112299 . MILKPRICES SEATTLE 1000USD TUCSON 19200USD
434954594C4953540D0A4353454154544C452020202020202020202020202020 202020202020202020574130303139383737360D0A5357415348494E47544F4E 202020202020202020202020202020202020202057410D0A534152495A4F4E41 2020202020202020202020202020202020202020202020415A0D0A4354554353 4F4E202020202020202020202020202020202020202020202020415A38393131 323239390D0A2E0D0A4D494C4B5052494345530D0A53454154544C4520202020 202020202020202020202020202020202020202020313030305553440D0A5455 43534F4E20202020202020202020202020202020202020202020202020313932 30305553440D0A
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:annotation> <xs:appinfo> <data:format dataCounter="" fieldWidth="xs:maxLength"/> </xs:appinfo> </xs:annotation> <xs:element name="MultiFile"> <xs:complexType> <xs:sequence> <xs:element name="CityList"> <xs:complexType> <xs:choice minOccurs="0" maxOccurs="unbounded" data:itemCounter=".\r\n"> <xs:element name="City" data:choiceTag="C"> <xs:complexType> <xs:sequence> <xs:element name="Name"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="30"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="State"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:length value="2"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="NumSewers" data:printf="%0*u\r\n"> <xs:simpleType> <xs:restriction base="xs:nonNegativeInteger"> <xs:totalDigits value="8"/> </xs:restriction> </xs:simpleType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="State" data:choiceTag="S"> <xs:complexType> <xs:sequence> <xs:element name="Name"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="30"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="Initials" type="xs:string" data:dataCounter="\r\n"/> </xs:sequence> </xs:complexType> </xs:element> </xs:choice> <xs:attribute name="readFrom" use="required" type="xs:string" data:dataCounter="\r\n"/> </xs:complexType> </xs:element> <xs:element name="MilkPrices"> <xs:complexType> <xs:sequence> <xs:element minOccurs="0" maxOccurs="unbounded" name="MilkPrice" data:itemCounter="EOF"> <xs:complexType> <xs:sequence> <xs:element name="CityName"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:maxLength value="31"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="Price" data:printf="% *u"> <xs:simpleType> <xs:restriction base="xs:nonNegativeInteger"> <xs:totalDigits value="5"/> </xs:restriction> </xs:simpleType> 
</xs:element> <xs:element name="Currency" type="xs:string" data:printf="%3s\r\n"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> <xs:attribute name="readFrom" use="required" type="xs:string" data:dataCounter="\r\n"/> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
<?xml version="1.0"?> <MultiFile> <CityList readFrom="CITYLIST"> <City> <Name>SEATTLE</Name> <State>WA</State> <NumSewers>00198776</NumSewers> </City> <State> <Name>WASHINGTON</Name> <Initials>WA</Initials> </State> <State> <Name>ARIZONA</Name> <Initials>AZ</Initials> </State> <City> <Name>TUCSON</Name> <State>AZ</State> <NumSewers>89112299</NumSewers> </City> </CityList> <MilkPrices readFrom="MILKPRICES"> <MilkPrice> <CityName>SEATTLE</CityName> <Price>1000</Price> <Currency>USD</Currency> </MilkPrice> <MilkPrice> <CityName>TUCSON</CityName> <Price>19200</Price> <Currency>USD</Currency> </MilkPrice> </MilkPrices> </MultiFile>
Note that in this example and the following examples, where we compare our solution with other DFDL solutions, we try to use a schema whose structure is similar to the schema we are comparing with.
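For readers who want to see the city list layout spelled out, here is a rough Python sketch of the same idea: each record starts with a one-character tag (the choiceTag), a line containing only "." ends the list (the itemCounter), and the fields have fixed widths. The widths come from the schema above; the function name parse_city_list and the tuple-based result are our own choices for illustration, not how the converter works internally.

```python
def parse_city_list(lines: list[str]) -> list[tuple[str, dict]]:
    """Parse tagged fixed-width records until the '.' terminator line."""
    entries = []
    for line in lines:
        if line == ".":  # terminator record, like data:itemCounter=".\r\n"
            break
        tag, body = line[0], line[1:]
        name = body[:30].rstrip()  # Name is padded to 30 characters
        rest = body[30:]
        if tag == "C":
            # State is 2 characters, NumSewers is 8 zero-padded digits
            entries.append(("City", {"Name": name, "State": rest[:2],
                                     "NumSewers": rest[2:10]}))
        elif tag == "S":
            # Initials run to the end of the line
            entries.append(("State", {"Name": name, "Initials": rest}))
        else:
            raise ValueError(f"unknown record tag: {tag!r}")
    return entries

sample = [
    "C" + "SEATTLE".ljust(30) + "WA" + "00198776",
    "S" + "WASHINGTON".ljust(30) + "WA",
    ".",
    "MILKPRICES",  # not reached: parsing stops at the terminator
]
for kind, fields in parse_city_list(sample):
    print(kind, fields)
```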
Here is an example from the Virtual XML Garden
01 01 00 01 00 00 00 01 00 00 00 00 00 00 00 00 00 80 3f 00 00 00 00 00 00 f0 3f 02 02 00 02 00 00 00 02 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 40
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:annotation> <xs:appinfo> <data:format byteOrder="littleEndian"/> </xs:appinfo> </xs:annotation> <xs:element name="sextet"> <xs:complexType> <xs:sequence> <xs:element name="group" minOccurs="0" maxOccurs="unbounded" data:itemCounter="EOF"> <xs:complexType> <xs:sequence> <xs:element name="byte" type="xs:byte"/> <xs:element name="short" type="xs:short"/> <xs:element name="int" type="xs:int"/> <xs:element name="long" type="xs:long"/> <xs:element name="float" type="xs:float"/> <xs:element name="double" type="xs:double"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
<?xml version="1.0"?> <sextet> <group> <byte>1</byte> <short>1</short> <int>1</int> <long>1</long> <float>1</float> <double>1</double> </group> <group> <byte>2</byte> <short>2</short> <int>2</int> <long>2</long> <float>2</float> <double>2</double> </group> </sextet>
We are able to read and write the binary numeric data types.
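As a cross-check of the sample bytes, the same six little-endian fields can be read with Python's struct module. This is only an independent illustration of the layout (byte, short, int, long, float, double with no padding), not the converter itself; the name GROUP is ours.

```python
import struct

# '<' = little-endian without padding; b/h/i/q/f/d = byte, short, int,
# long, float, double, matching the element order in the schema.
GROUP = struct.Struct("<bhiqfd")

sample = bytes.fromhex(
    "01 01 00 01 00 00 00 01 00 00 00 00 00 00 00 00 00 80 3f "
    "00 00 00 00 00 00 f0 3f "
    "02 02 00 02 00 00 00 02 00 00 00 00 00 00 00 00 00 00 40 "
    "00 00 00 00 00 00 00 40"
)

# Read groups until the data is exhausted, like data:itemCounter="EOF".
groups = list(GROUP.iter_unpack(sample))
for group in groups:
    print(group)
```

Each group occupies 27 bytes, so the 54 sample bytes hold exactly two groups.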
Here is another example from the Virtual XML Garden
ROSE KRISTOFFER 0402025555555 ROSE SOFUS 0060000000000
524F5345202020202020202020202020202020204B524953544F464645522020 202020303430323032353535353535352020202020202020200D0A524F534520 202020202020202020202020202020534F465553202020202020202020203030 36303030303030303030302020202020202020200D0A
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:annotation> <xs:appinfo> <data:format byteOrder="littleEndian" dataCounter="" fieldWidth="xs:maxLength"/> </xs:appinfo> </xs:annotation> <!-- COPYBOOK types --> <xs:simpleType name="last-name"> <xs:restriction base="xs:string"> <xs:maxLength value="20"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="first-name"> <xs:restriction base="xs:string"> <xs:maxLength value="15"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="age" data:printf="%0*u"> <xs:restriction base="xs:positiveInteger"> <xs:totalDigits value="3"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="phone" data:printf="%-19s\r\n"> <xs:restriction base="xs:string"> <xs:pattern value="\d{10}"/> </xs:restriction> </xs:simpleType> <xs:element name="copybook"> <xs:complexType> <xs:sequence> <xs:element name="CUSTOMER-RECORD" minOccurs="0" maxOccurs="unbounded" data:itemCounter="EOF"> <xs:complexType> <xs:sequence> <xs:element name="CUSTOMER-LAST-NAME" type="last-name"/> <xs:element name="CUSTOMER-FIRST-NAME" type="first-name"/> <xs:element name="CUSTOMER-AGE" type="age"/> <xs:element name="CUSTOMER-PHONE" type="phone"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
<?xml version="1.0"?> <copybook> <CUSTOMER-RECORD> <CUSTOMER-LAST-NAME>ROSE</CUSTOMER-LAST-NAME> <CUSTOMER-FIRST-NAME>KRISTOFFER</CUSTOMER-FIRST-NAME> <CUSTOMER-AGE>40</CUSTOMER-AGE> <CUSTOMER-PHONE>2025555555</CUSTOMER-PHONE> </CUSTOMER-RECORD> <CUSTOMER-RECORD> <CUSTOMER-LAST-NAME>ROSE</CUSTOMER-LAST-NAME> <CUSTOMER-FIRST-NAME>SOFUS</CUSTOMER-FIRST-NAME> <CUSTOMER-AGE>6</CUSTOMER-AGE> <CUSTOMER-PHONE>0000000000</CUSTOMER-PHONE> </CUSTOMER-RECORD> </copybook>
<copybooks CUSTOMER-RECORD-COUNT="2"> <CUSTOMER-RECORD> <CUSTOMER-LAST-NAME>ROSE </CUSTOMER-LAST-NAME> <CUSTOMER-FIRST-NAME>KRISTOFFER </CUSTOMER-FIRST-NAME> <CUSTOMER-AGE>40</CUSTOMER-AGE> <CUSTOMER-PHONE>2025555555</CUSTOMER-PHONE> </CUSTOMER-RECORD> <CUSTOMER-RECORD> <CUSTOMER-LAST-NAME>ROSE </CUSTOMER-LAST-NAME> <CUSTOMER-FIRST-NAME>SOFUS </CUSTOMER-FIRST-NAME> <CUSTOMER-AGE>6</CUSTOMER-AGE> <CUSTOMER-PHONE>0000000000</CUSTOMER-PHONE> </CUSTOMER-RECORD> </copybooks>
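The fixed-width layout of a customer record can be sketched in a few lines of Python. The widths (20, 15, 3 and 19 characters) are taken from the schema above; trimming the trailing spaces and converting the age to an integer mirror what the first output shows. The names FIELDS and parse_record are hypothetical and only serve this illustration.

```python
FIELDS = (
    ("CUSTOMER-LAST-NAME", 20),
    ("CUSTOMER-FIRST-NAME", 15),
    ("CUSTOMER-AGE", 3),
    ("CUSTOMER-PHONE", 19),
)

def parse_record(line: str) -> dict:
    """Slice one fixed-width record and strip the padding."""
    record = {}
    pos = 0
    for name, width in FIELDS:
        record[name] = line[pos:pos + width].rstrip()
        pos += width
    # The age is stored as zero-padded digits, e.g. "040"
    record["CUSTOMER-AGE"] = int(record["CUSTOMER-AGE"])
    return record

sample = ("ROSE".ljust(20) + "KRISTOFFER".ljust(15) + "040"
          + "2025555555".ljust(19))
print(parse_record(sample))
```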
Here is another example from the Virtual XML Garden
[homes] comment=Home Directories browseable=no [printers] comment=All Printers browseable=no
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:annotation> <xs:appinfo> <data:format dataCounter="\r\n"/> </xs:appinfo> </xs:annotation> <xs:element name="configuration"> <xs:complexType> <xs:sequence> <xs:element name="section" minOccurs="0" maxOccurs="unbounded" data:itemCounter="EOF"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string" data:printf="[%s]"/> <xs:element name="properties"> <xs:complexType> <xs:all> <xs:element name="comment" type="xs:string" data:choiceTag=" comment="/> <xs:element name="browseable" minOccurs="0" type="xs:string" data:choiceTag=" browseable="/> </xs:all> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
<?xml version="1.0"?> <configuration> <section> <name>homes</name> <properties> <comment>Home Directories</comment> <browseable>no</browseable> </properties> </section> <section> <name>printers</name> <properties> <comment>All Printers</comment> <browseable>no</browseable> </properties> </section> </configuration>
<configuration> <section> <name>homes</name> <key><name>comment</name><value>Home Directories</value></key> <key><name>browseable</name><value>no</value></key> <name>printers</name> <key><name>comment</name><value>All Printers</value></key> <key><name>browseable</name><value>no</value></key> </section> </configuration>
We are using a different approach here. The key-value pair approach reminds me of the Mac OS X plist XML, which I really hate. In the current approach, you have to write a more elaborate schema. However, this also means that you get better validation of your data.
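To illustrate the named-element approach outside the converter, here is a short Python sketch that reads the same kind of configuration text into one entry per section, with each property stored under its own name rather than as a generic key/value pair. The function name parse_sections and the dictionary layout are assumptions made for this example only.

```python
def parse_sections(text: str) -> list[dict]:
    """Group 'key=value' lines under the preceding '[section]' header."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append({"name": line[1:-1], "properties": {}})
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            # Each property keeps its own name, like <comment> or <browseable>
            sections[-1]["properties"][key.strip()] = value.strip()
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return sections

sample = (
    "[homes]\r\n"
    " comment=Home Directories\r\n"
    " browseable=no\r\n"
    "[printers]\r\n"
    " comment=All Printers\r\n"
    " browseable=no\r\n"
)
for section in parse_sections(sample):
    print(section)
```

Because every property has a known name, a check such as "comment is required, browseable is optional" can be stated directly, which is the validation benefit mentioned above.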
Here is a similar example from XML Convert and XFlat by Unidex Inc.
[contact] name=Nancy Magill email=lil.magill@blackmountainhills.com phone=(100) 555-9328 [contact] email=molly.jones@oblada.com name=Molly Jones [contact] phone=(200) 555-3249 name=Penny Lane email=plane@bluesuburbanskies.com
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:annotation> <xs:appinfo> <data:format dataCounter="\r\n"/> </xs:appinfo> </xs:annotation> <xs:element name="contacts"> <xs:complexType> <xs:sequence> <xs:choice minOccurs="0" maxOccurs="unbounded" data:itemCounter="EOF"> <xs:element name="contact" data:choiceTag="[contact]\r\n"> <xs:complexType> <xs:all> <xs:element name="name" type="xs:string" data:choiceTag="name="/> <xs:element name="phone" minOccurs="0" type="xs:string" data:choiceTag="phone="/> <xs:element name="email" minOccurs="0" type="xs:string" data:choiceTag="email="/> </xs:all> </xs:complexType> </xs:element> </xs:choice> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
<?xml version="1.0"?> <contacts> <contact> <name>Nancy Magill</name> <email>lil.magill@blackmountainhills.com</email> <phone>(100) 555-9328</phone> </contact> <contact> <email>molly.jones@oblada.com</email> <name>Molly Jones</name> </contact> <contact> <phone>(200) 555-3249</phone> <name>Penny Lane</name> <email>plane@bluesuburbanskies.com</email> </contact> </contacts>
Here is another XFlat example, from Generating XML Instances from Flat Files in the XML Journal. It is for a comma-separated value file. In fact, csv is used twice, because the name field is then split into first name and last name using a space as the delimiter.
123456789,Ram Singh,100000.00 444556666,Barr Clark,87000.00 777227878,Simi? D Roy,123000.00 998877665,Charr Lee,92000.00
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:element name="employees"> <xs:complexType> <xs:sequence> <xs:element name="employee" type="employee" minOccurs="0" maxOccurs="unbounded" data:itemCounter="EOF"/> </xs:sequence> </xs:complexType> </xs:element> <xs:complexType name="employee"> <xs:sequence data:csv="true" > <xs:element name="ssn" type="xs:string"/> <xs:element name="name" type="name"/> <xs:element name="salary" type="xs:decimal" data:printf="%.2f"/> </xs:sequence> </xs:complexType> <xs:complexType name="name"> <xs:sequence data:csvSeparateChar=" " data:csvEscapeChar="?"> <xs:element name="fName" type="xs:string"/> <xs:element name="lName" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:schema>
<employees> <employee> <ssn>123456789</ssn> <name> <fName>Ram</fName> <lName>Singh</lName> </name> <salary>100000</salary> </employee> <employee> <ssn>444556666</ssn> <name> <fName>Barr</fName> <lName>Clark</lName> </name> <salary>87000</salary> </employee> <employee> <ssn>777227878</ssn> <name> <fName>Simi D</fName> <lName>Roy</lName> </name> <salary>123000</salary> </employee> <employee> <ssn>998877665</ssn> <name> <fName>Charr</fName> <lName>Lee</lName> </name> <salary>92000</salary> </employee> </employees>
123456789,Ram Singh,100000.00 444556666,Barr Clark,87000.00 777227878,"""Simi D"" Roy",123000.00 998877665,Charr Lee,92000.00
123456789 Ram Singh 100000 444556666 Barr Clark 87000 777227878 Simi? D Roy 123000 998877665 Charr Lee 92000 123456789 Ram Singh 100000 444556666 Barr Clark 87000 777227878 "Simi D" Roy 123000 998877665 Charr Lee 92000
These two sets of data are equivalent and produce the same XML data. Yet the second set is nicer to read in other csv-aware applications. We shall talk more about this in the discussion of the commands.
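The equivalence of the two encodings can be demonstrated with Python's standard csv module, used here purely as an independent illustration and not as a description of how the converter works. The outer split uses a comma; the inner split of the name field uses a space, once with "?" as the escape character and once with ordinary csv quoting. The helper names are hypothetical.

```python
import csv

def split_name_escaped(name: str) -> list[str]:
    """Split on spaces; a '?' makes the following space literal."""
    return next(csv.reader([name], delimiter=" ", escapechar="?"))

def split_name_quoted(name: str) -> list[str]:
    """Split on spaces; double quotes protect an embedded space."""
    return next(csv.reader([name], delimiter=" "))

# Outer level: ordinary comma-separated records
row_escaped = next(csv.reader(["777227878,Simi? D Roy,123000.00"]))
row_quoted = next(csv.reader(['777227878,"""Simi D"" Roy",123000.00']))

# Inner level: the name field is itself space-separated
print(split_name_escaped(row_escaped[1]))
print(split_name_quoted(row_quoted[1]))
```

Both calls yield the first name "Simi D" and the last name "Roy", which is why the two data sets map to the same XML.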
We can also use csv on records with a variable number of fields. The problem
is that one may not know which field belongs to which element. However, this
is not a problem if only a single element may have variable occurrence. In
fact, the most common case is that only the last element has variable occurrence.
This is a very common scenario; I often store data in spreadsheets in this manner.
Yet I have not seen other converters deal with this problem.
This is the case we implement now.
Consider a families file in which each family consists of a father, a mother and children.
So the only variable is the number of children.
Smith,John,Mary,Peter Chan,Joseph,Yung,Chris,Jane Clark,Bob,Patricia Kane,Bill,Judi,Jerry
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:data="http://www.datamech.com/storage"> <xs:element name="families"> <xs:complexType> <xs:sequence> <xs:element name="family" type="family" minOccurs="0" maxOccurs="unbounded" data:itemCounter="EOF"/> </xs:sequence> </xs:complexType> </xs:element> <xs:complexType name="family"> <xs:sequence data:csv="true" > <xs:element name="familyName" type="xs:string"/> <xs:element name="father" type="xs:string"/> <xs:element name="mother" type="xs:string"/> <xs:element name="child" type="xs:string" minOccurs="0" maxOccurs="unbounded"/> </xs:sequence> </xs:complexType> </xs:schema>
<?xml version="1.0"?> <families> <family> <familyName>Smith</familyName> <father>John</father> <mother>Mary</mother> <child>Peter</child> </family> <family> <familyName>Chan</familyName> <father>Joseph</father> <mother>Yung</mother> <child>Chris</child> <child>Jane</child> </family> <family> <familyName>Clark</familyName> <father>Bob</father> <mother>Patricia</mother> </family> <family> <familyName>Kane</familyName> <father>Bill</father> <mother>Judi</mother> <child>Jerry</child> </family> </families>
So here we have families with no, one or two children, all handled without any problem.
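The reason a single variable-occurrence element at the end causes no ambiguity can be shown with a tiny Python sketch: the first three fields are assigned by position, and whatever remains belongs to the repeating element. The function name parse_family is made up for this illustration, and the sketch ignores csv quoting for brevity.

```python
def parse_family(line: str) -> dict:
    """Assign the fixed fields by position; the rest are children."""
    family_name, father, mother, *children = line.split(",")
    return {
        "familyName": family_name,
        "father": father,
        "mother": mother,
        "child": children,  # zero or more occurrences
    }

sample = [
    "Smith,John,Mary,Peter",
    "Chan,Joseph,Yung,Chris,Jane",
    "Clark,Bob,Patricia",
]
families = [parse_family(line) for line in sample]
for family in families:
    print(family)
```

If two elements in the same record could repeat, the split between them would no longer be determined by position alone, which is the problem described above.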