There is an amazing parallelism between XSLT and ReverseXSL

More precisely, we have an inverted parallelism, which we can summarize as Tree for Processing (XSLT) and Processing for Tree (ReverseXSL).

XSLT is built over XPath expressions. ReverseXSL (RXSL) is built over regular expressions.
XSLT adds data processing and control flow on top of XPath expressions, which mostly tell what to take from the input XML document. ReverseXSL adds the tree structure and organizes the regular expressions that do the processing, namely identify, cut, extract, and validate. In fact, parsing an input text file requires four different processing activities, denoted (i), (c), (e), and (v) below.
An XML document starts as a tree of nodes → XPath selects the nodes to process → XSLT organizes the sequence of selections and the processing that is applied → the output is simply the sequential concatenation of the outcomes of the processing activities. A text file starts as one large block of content → regular expressions identify the structures (i), or cut content into smaller pieces (c), or extract values (e), or validate data (v) → ReverseXSL provides the tree structure under which the four activities (i)(c)(e)(v) are organized → the output document matches the ReverseXSL tree. Being a tree, it is most naturally rendered in XML.
XSLT is part of the standard Java API for XML Processing (JAXP). Its core classes are TransformerFactory and Transformer. The ReverseXSL software is supplied as an additional API library in Java archive (jar) form. Its core classes are likewise named TransformerFactory and Transformer.
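
To make the XSLT side of the comparison concrete, here is a minimal sketch of the standard JAXP calls (javax.xml.transform.TransformerFactory and Transformer). The stylesheet, the sample order document, and the class name JaxpXsltSketch are made up for illustration. The ReverseXSL classes of the same names are not shown, because their exact signatures are not given in this text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class JaxpXsltSketch {

    // Hypothetical stylesheet: XPath selects the nodes, XSLT sequences the output.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/order'>"
      + "<xsl:for-each select='item'>"
      + "<xsl:value-of select='@sku'/><xsl:text>=</xsl:text>"
      + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet above to an XML string and returns the text output. */
    static String transform(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><item sku='A1'>2</item><item sku='B7'>5</item></order>";
        System.out.println(transform(xml)); // output is the concatenation of each item's result
    }
}
```

The output is built exactly as described above: each selected node contributes a fragment, and the fragments are concatenated in processing order.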

Both XSLT and ReverseXSL create and maintain a processing context that is highly recursive.

XSLT applies processing templates in a hierarchy, maintaining a context by reference to the source document tree, so that the same templates with relative XPath expressions are applied to sub-trees at different levels of the original tree. ReverseXSL maintains a segmentation context, where the sub-pieces produced by a cut or extraction at some level become in turn the new (sub)message to cut further down, until we reach the atoms of information.
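
The recursive segmentation context can be sketched with plain java.util.regex. The code below is an illustration only: the Def class, the HDR/ITM message layout, and all patterns are assumptions made for this example, and they do not reproduce the real DEF file syntax or the ReverseXSL implementation. Each node identifies (i) its piece, extracts (e) the content to keep, then either cuts (c) it into sub-pieces that are parsed recursively, or validates (v) it as an atom.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SegmentationSketch {

    /** One node of a hypothetical definition tree; NOT the real DEF file syntax. */
    static final class Def {
        final String name;       // XML element emitted for a matching piece
        final Pattern identify;  // (i) does this definition apply to the piece?
        final Pattern extract;   // (e) group 1 is the content that is kept
        final Pattern cut;       // (c) separator for sub-pieces; null on a leaf
        final Pattern validate;  // (v) full-match check on a leaf value; null on a branch
        final List<Def> children;

        private Def(String name, String identify, String extract,
                    String cut, String validate, Def... children) {
            this.name = name;
            this.identify = Pattern.compile(identify);
            this.extract = Pattern.compile(extract);
            this.cut = cut == null ? null : Pattern.compile(cut);
            this.validate = validate == null ? null : Pattern.compile(validate);
            this.children = Arrays.asList(children);
        }

        static Def branch(String name, String identify, String extract, String cut, Def... children) {
            return new Def(name, identify, extract, cut, null, children);
        }

        static Def leaf(String name, String identify, String extract, String validate) {
            return new Def(name, identify, extract, null, validate);
        }
    }

    // Hypothetical definition tree for messages such as "HDR:id=1234\nITM:sku=A1;qty=2".
    static final Def ORDER = Def.branch("order", "^HDR:", "(?s)^(.*)$", "\n",
        Def.branch("header", "^HDR:", "^HDR:(.*)$", ";",
            Def.leaf("id", "^id=", "^id=(.*)$", "\\d+")),
        Def.branch("item", "^ITM:", "^ITM:(.*)$", ";",
            Def.leaf("sku", "^sku=", "^sku=(.*)$", "[A-Z]\\d"),
            Def.leaf("qty", "^qty=", "^qty=(.*)$", "\\d+")));

    /** Parses one piece; each sub-piece becomes the new (sub)message one level down. */
    static String parse(Def def, String piece) {
        Matcher m = def.extract.matcher(piece);
        if (!m.find()) throw new IllegalArgumentException("Cannot extract " + def.name);
        String content = m.group(1);
        StringBuilder xml = new StringBuilder("<").append(def.name).append(">");
        if (def.cut == null) {
            if (!def.validate.matcher(content).matches())
                throw new IllegalArgumentException("Invalid " + def.name + ": " + content);
            xml.append(content); // validated values here need no XML escaping
        } else {
            for (String sub : def.cut.split(content)) {
                for (Def child : def.children) {
                    if (child.identify.matcher(sub).find()) {
                        xml.append(parse(child, sub));
                        break; // first matching definition wins; unidentified pieces are skipped
                    }
                }
            }
        }
        return xml.append("</").append(def.name).append(">").toString();
    }

    static String toXml(String message) {
        if (!ORDER.identify.matcher(message).find())
            throw new IllegalArgumentException("Not an order message");
        return parse(ORDER, message);
    }

    public static void main(String[] args) {
        System.out.println(toXml("HDR:id=1234\nITM:sku=A1;qty=2\nITM:sku=B7;qty=5"));
    }
}
```

Note that the emitted elements follow the order of the input pieces: the sketch, like the parser described here, does not re-order anything, and the output tree mirrors the definition tree.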

How does the ReverseXSL Transformer work?

The ReverseXSL Transformer performs three activities in turn:

  1. Identify the brand of message to process, and dynamically associate none, one or both of the next two steps. To do so, the ReverseXSL transformer loads message identification patterns and associated parameters from a mapping selection table in a simple text file.
  2. Parsing Step (in case the input message is non-XML): Parse the input message and transform it to an XML document. To do so, the ReverseXSL parser loads instructions from a so-called DEF file.
  3. XSLT Step (typically if we need to re-order XML elements): Invoke an XSL transformation. To do so, XSLT loads instructions from an XSL file. 

Note that, unlike with plain XSLT, we do not need to indicate explicitly which XSL template, or DEF file in our case, to apply to the input. Of course, one can also impose a precise DEF and/or XSL a priori on every input message entering a given transformation thread, but the ability to delegate the selection of transformations brings significant operational advantages: we can make a system capable of handling variant or new messages just by loading metadata. No new channel, pipeline, or dedicated process flow needs to be created.

Note also the option to associate none, one, or both transformation steps (a ReverseXSL Parsing step and an XSLT step) with each input message brand.

  • None clearly means pass-through. In other words, the input in question (be it XML or not!) is good enough for direct processing by the target application.
  • One can consist of the parsing step alone. A non-XML data message becomes an XML document.
  • One can also mean XSLT alone. An XML document must be adjusted to comply with a schema required by the target application.
  • Two combines parsing with an XSL post-transformation, therefore extending the XML transformation capabilities of the ReverseXSL parser with the rich data-types and element processing functions in XSLT. As explained elsewhere, the ReverseXSL parser does not re-order elements from the input data message. This job is quite easily achieved with XSLT.
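
The selection among these four combinations can be pictured as a lookup in a table of identification patterns. The sketch below is hypothetical throughout: the Mapping class, the patterns, and the file names (order.def, legacyOrder.xsl, and so on) are invented for illustration, and the actual layout of the mapping selection table file is not described in this text. It only returns a description of the steps that would run, rather than invoking a parser or XSLT.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class MappingSelectionSketch {

    /** One row of a hypothetical selection table. */
    static final class Mapping {
        final Pattern identification; // recognizes the message brand
        final String defFile;         // parsing step, or null to skip it
        final String xslFile;         // XSLT step, or null to skip it

        Mapping(String identification, String defFile, String xslFile) {
            this.identification = Pattern.compile(identification, Pattern.DOTALL);
            this.defFile = defFile;
            this.xslFile = xslFile;
        }

        /** Describes which of the two optional steps this row associates with a message. */
        String plan() {
            if (defFile == null && xslFile == null) return "pass-through";
            if (xslFile == null) return "parse[" + defFile + "]";
            if (defFile == null) return "xslt[" + xslFile + "]";
            return "parse[" + defFile + "] then xslt[" + xslFile + "]";
        }
    }

    // Assumed table contents: none, parsing only, XSLT only, and both steps.
    static final List<Mapping> TABLE = Arrays.asList(
        new Mapping("^<invoice[\\s>]", null, null),
        new Mapping("^HDR:", "order.def", null),
        new Mapping("^<legacyOrder[\\s>]", null, "legacyOrder.xsl"),
        new Mapping("^UNB\\+", "interchange.def", "reorder.xsl"));

    /** Returns the plan of the first row whose pattern identifies the message. */
    static String planFor(String message) {
        for (Mapping row : TABLE) {
            if (row.identification.matcher(message).find()) return row.plan();
        }
        throw new IllegalArgumentException("No mapping identifies this message");
    }

    public static void main(String[] args) {
        System.out.println(planFor("<invoice><total>10</total></invoice>"));
        System.out.println(planFor("HDR:id=1234\nITM:sku=A1;qty=2"));
        System.out.println(planFor("<legacyOrder id='7'/>"));
        System.out.println(planFor("UNB+UNOA:1"));
    }
}
```

With this arrangement, supporting a new message brand amounts to adding a row to the table, which is the operational advantage discussed above.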

The ReverseXSL parser recursively identifies, cuts, extracts, and validates smaller and smaller segments of the original data

The ReverseXSL parser requires two things to perform its job:

  • the input data message, of course
  • a DEF file, standing for the message DEFinition

The DEF file contains metadata that describes the input message syntax. The structure of the file mimics that of the input message itself. At present, the structure is proprietary and easily edited with any text editor, so you can use your favorite development workstation to automate test runs and develop your DEFinitions incrementally. Typically, you work on one chunk of the message at a time and focus on the relevant parsing details. You can immediately check the outcome, leaving the as-yet unparsed sections of the message as raw data.

The overview of the documentation of Message DEFinition files explains further how the parsing proceeds.