This is not a functional gap in XSLT, but the consequence of a very consistent design built around the nature of its input: a tree of nodes. A flat input file is not a tree; it is at most a single big node. We have to build the tree...

XSL and ReverseXSL complement each other

Equipped with both XSL and reverseXSL capabilities, you can generalize the Transformer class to handle any-to-any traffic, and even let it determine dynamically which mapping profile to apply to which input. You drop XSL templates and reverseXSL DEF files on the classpath, update the mapping decision table, and your application can take advantage of new data transformations without any programming changes.

Parsing a flat file, EDI file, CSV or text file on the one hand, and processing a tree of nodes (an XML document) on the other hand are very different jobs by nature. A sister article explains the fundamental difference.

The reverseXSL software contains an original Parser component dedicated to the four tasks required to build an XML document tree from input data embedded in repeating and nested character-based structures: identify, cut, extract and validate.

Just as an XSLT processor takes its instructions about the XML node transformations to perform from meta-data in an XSL file, the reverseXSL parser uses a so-called DEF file that describes the input syntax, enhanced with meta-data for constructing the target XML document.

The reverseXSL parser does much more than insert '<..>' and '</..>' here and there in the input file in order to claim that the result is XML. Instead of getting something like:

<item>
<1>C</1>
<2>54-12345-00007</2>
<3>LENS</3>
<4>12</4>
<5>BX</5>
<6>2350</6>
</item>

 

You should expect:

<LineItem status="Confirmed">
  <SKU>54-12345-00007</SKU>
  <Name>LENS</Name>
  <Qty Unit="Boxes">12</Qty>
  <UnitPrice>23.50</UnitPrice>
</LineItem>
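
The gap between the two listings above corresponds to the four parsing tasks: identify, cut, extract and validate. The following is only a toy sketch in plain Java; it is not the reverseXSL Parser and does not use DEF syntax. The pipe-delimited input record, the `ITM` record tag, the code tables and the SKU pattern are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.regex.Pattern;

final class LineItemSketch {
    // Hypothetical code tables; only the mapped values appear in the article's example.
    private static final Map<String, String> STATUS = Map.of("C", "Confirmed");
    private static final Map<String, String> UNITS = Map.of("BX", "Boxes");
    // Assumed SKU shape, inferred from the single sample value.
    private static final Pattern SKU = Pattern.compile("\\d{2}-\\d{5}-\\d{5}");

    static String toXml(String record) {
        // 1. Identify: is this record a line item at all? ("ITM" is an assumed tag.)
        if (!record.startsWith("ITM|")) {
            throw new IllegalArgumentException("Not a line item: " + record);
        }
        // 2. Cut: split the record into its fields.
        String[] f = record.split("\\|", -1);
        if (f.length != 7) {
            throw new IllegalArgumentException("Expected 7 fields, got " + f.length);
        }
        // 3. Extract: pick values and map codes to meaningful names.
        String status = STATUS.get(f[1]);
        String unit = UNITS.get(f[5]);
        // 4. Validate: reject values that do not fit the expected syntax.
        if (status == null || unit == null || !SKU.matcher(f[2]).matches()) {
            throw new IllegalArgumentException("Invalid line item: " + record);
        }
        int qty = Integer.parseInt(f[4]);
        // The price is assumed to be transmitted in hundredths (2350 means 23.50).
        BigDecimal price = new BigDecimal(f[6]).movePointLeft(2);

        return "<LineItem status=\"" + status + "\">\n"
             + "  <SKU>" + f[2] + "</SKU>\n"
             + "  <Name>" + esc(f[3]) + "</Name>\n"
             + "  <Qty Unit=\"" + unit + "\">" + qty + "</Qty>\n"
             + "  <UnitPrice>" + price.toPlainString() + "</UnitPrice>\n"
             + "</LineItem>";
    }

    // Minimal escaping of XML special characters in element content.
    private static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(toXml("ITM|C|54-12345-00007|LENS|12|BX|2350"));
    }
}
```

Hard-coding the syntax like this is exactly what a declarative description of the input is meant to avoid; the sketch only shows where each of the four tasks takes place.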

 

The parser will readily generate the format that you need or, in some cases, something very close to it.

Why only 'very close to' in some cases, and which 'cases'?

We took great care not to duplicate XSL functionality, which would have pushed complexity into the Parser for tasks that are very easily achieved in XSLT. The 'cases' are simply these: the Parser does not re-order elements, and it does not compute data (e.g. the total price of all article items). The Parser can flatten hierarchical structures as well as increase nesting levels, it can hide elements in the XML or make explicit the groupings that are only implicit in the input message, and it can supply default values and even map codes, but it neither re-orders elements nor computes new data. Therefore, after the Parsing step, the reverseXSL Transformer may invoke an optional XSL transformation step, as required by the selected mapping profile, in order to execute such tasks.
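
As an illustration of what is left to the XSL step, here is a minimal sketch that uses only the standard JAXP classes from javax.xml.transform (not the reverseXSL API). It assumes a hypothetical parser output with an `Order` root holding `LineItem` elements; the stylesheet re-orders `Name` before `SKU` and computes a `LineTotal` and a `TotalQty`, all of which are names invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltStepSketch {
    // XSLT 1.0 stylesheet: re-orders elements and computes values.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/Order'>"
      + "<Order>"
      + "<xsl:for-each select='LineItem'>"
      + "<LineItem>"
      + "<Name><xsl:value-of select='Name'/></Name>"
      + "<SKU><xsl:value-of select='SKU'/></SKU>"
      + "<LineTotal><xsl:value-of select=\"format-number(Qty * UnitPrice, '0.00')\"/></LineTotal>"
      + "</LineItem>"
      + "</xsl:for-each>"
      + "<TotalQty><xsl:value-of select='sum(LineItem/Qty)'/></TotalQty>"
      + "</Order>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the result as a string.
    static String apply(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL step failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical output of a parsing step, wrapped in an assumed Order root.
        String parsed =
            "<Order>"
          + "<LineItem status='Confirmed'><SKU>54-12345-00007</SKU><Name>LENS</Name>"
          + "<Qty Unit='Boxes'>12</Qty><UnitPrice>23.50</UnitPrice></LineItem>"
          + "<LineItem status='Confirmed'><SKU>54-12345-00008</SKU><Name>CAP</Name>"
          + "<Qty Unit='Boxes'>3</Qty><UnitPrice>10.00</UnitPrice></LineItem>"
          + "</Order>";
        System.out.println(apply(parsed));
    }
}
```

Both operations take a few lines of XSLT, which is why it makes sense to keep them out of the Parser.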

The reverseXSL Transformer software can execute a Parsing step alone with a given DEF file, or the XSLT step alone with a precise XSL template, or else it performs both steps in sequence before returning the output, as illustrated in the figure.

Such flexibility makes it possible to delegate to the reverseXSL Transformer the task of selecting the mapping profile that matches the input at hand. By definition, a mapping profile contains a pattern identifying the kind of input, together with instructions for performing the Parsing, or the XSL Transformation, or both, or even no mapping at all. The name of the selected mapping profile is also returned, so that the application can branch to different processing areas according to what has been received, and possibly mapped.
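
The idea of a mapping decision table can be sketched as follows. This is a hypothetical illustration in plain Java, not the reverseXSL implementation or its configuration format: the `Profile` class, the profile names, the regular expressions and the file names are all invented for the example. The first profile whose pattern matches the input is selected, and either step is skipped when its file is absent.

```java
import java.util.List;
import java.util.regex.Pattern;

final class ProfileSelectionSketch {
    // One row of a hypothetical mapping decision table.
    static final class Profile {
        final String name;
        final Pattern pattern;
        final String defFile; // null means: no Parsing step
        final String xslFile; // null means: no XSL Transformation step

        Profile(String name, String regex, String defFile, String xslFile) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
            this.defFile = defFile;
            this.xslFile = xslFile;
        }
    }

    // Rows are tried in order; the last one is a catch-all with no mapping at all.
    static final List<Profile> TABLE = List.of(
        new Profile("LineItemsToOrder", "^ITM\\|", "lineitem.def", "order.xsl"),
        new Profile("XmlRestyle", "^\\s*<", null, "restyle.xsl"),
        new Profile("PassThrough", "(?s).*", null, null));

    // Returns the first profile whose pattern is found in the input.
    static Profile select(String input) {
        for (Profile p : TABLE) {
            if (p.pattern.matcher(input).find()) {
                return p;
            }
        }
        throw new IllegalStateException("Decision table has no catch-all profile");
    }

    public static void main(String[] args) {
        Profile p = select("ITM|C|54-12345-00007|LENS|12|BX|2350");
        // The caller can branch on the profile name, as described above.
        System.out.println(p.name + ": parse=" + (p.defFile != null)
                + ", xslt=" + (p.xslFile != null));
    }
}
```

Keeping the table as data is what allows new transformations to be added by editing the table and dropping in new files, rather than by changing the calling code.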

Java API and Command-line interfaces

The reverseXSL software extends JAXP (the Java API for XML Processing) with a pair of classes, com.reverseXSL.transform.TransformerFactory and Transformer, modelled on javax.xml.transform.TransformerFactory and Transformer. The new Transformer is capable of transforming any character-based data into XML, in addition to the XML-to-XML and XML-to-any capabilities of XSLT alone.

In addition to the Java API, the reverseXSL software contains command-line programs (running on Windows, UNIX, MacOS and other operating systems with a suitable Java runtime) that invoke file-based mappings from the Parser alone (Parse ..args..), or from the complete Transformer (Transform ..args..).

The Java source code of the command-line programs is included in the software distribution archive (.jar), both as an example of using the APIs and as a basis for customizing the command-line tools.