We wanted to take advantage of XML technologies. They are well integrated in every product now, and with XSLT you can quickly produce reliable transformations from XML to anything. But the reverse...

For the reverse, especially if you face screwy syntaxes, regular expressions are quite helpful. The relevant classes ship with the Java Runtime Environment, and with capturing groups you can extract a piece of string from any kind of text structure: delimited, fixed-width, or just 'looking like' something, containing numerous optional parts and variants, and including control characters. However:

  • One can hardly describe the whole message in a single monstrous regular expression! It would be... monstrous! So you need at least some sequencing and repetition management logic on top of a collection of regular expressions.
  • Regular expressions can be used for 'identification', for 'extraction', for 'validation', and even for cutting a string into smaller pieces, namely 'segmentation'. Mixing these four roles yields complex code and expressions, whereas segregating the roles produces much simpler regular expressions. The supporting code becomes elegant, and you can enrich the parsing experience with references to specifications, inter-dependency conditions, and details about the XML document to generate.
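As an illustration of the four roles kept apart, here is a minimal sketch using only the standard java.util.regex classes. The "FLT/<flight>/<date>" line format, the class name RegexRoles and the generated <flight> element are invented for this example; they do not come from the reverseXSL software or from any real message standard.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical line format, invented for this sketch: FLT/<carrier+number>/<ddMMM>
class RegexRoles {
    // Identification: a loose pattern that only answers "is this a flight line?"
    static final Pattern IDENTIFY = Pattern.compile("^FLT/.*");
    // Segmentation: cut the line into fields on the delimiter
    static final Pattern SEGMENT = Pattern.compile("/");
    // Extraction: capturing groups pull carrier and flight number out of one field
    static final Pattern EXTRACT = Pattern.compile("([A-Z]{2})(\\d{1,4})");
    // Validation: strict check of a date field such as 12MAR
    static final Pattern VALIDATE_DATE =
        Pattern.compile("\\d{2}(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)");

    // Returns an XML fragment, or null when the line is not a flight line at all.
    static String toXml(String line) {
        if (!IDENTIFY.matcher(line).matches()) {
            return null;
        }
        String[] fields = SEGMENT.split(line);
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        Matcher m = EXTRACT.matcher(fields[1]);
        if (!m.matches()) {
            throw new IllegalArgumentException("bad flight field: " + fields[1]);
        }
        if (!VALIDATE_DATE.matcher(fields[2]).matches()) {
            throw new IllegalArgumentException("bad date field: " + fields[2]);
        }
        return "<flight carrier=\"" + m.group(1) + "\" number=\"" + m.group(2)
            + "\" date=\"" + fields[2] + "\"/>";
    }

    public static void main(String[] args) {
        System.out.println(toXml("FLT/LX123/12MAR"));
    }
}
```

Each pattern stays short because it has a single job. A single expression doing all four at once would already be harder to read for this three-field line.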

That is the point where earlier experience paid off. Indeed, during the nineties we had developed a sophisticated any-to-any EDI converter based on the theory of formal grammars (the product was quite successful and extremely flexible). So we built the reverseXSL parser around the following very simple idea: providing a grammar-like structure over a collection of regular expressions, so that the entire message syntax can be identified, segmented, and validated, and data can be extracted, with the help of a single meta-data file.
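To make the idea concrete, here is a toy sketch of a grammar-like structure over regular expressions. It is not the reverseXSL meta-data format or its parser: the Rule class, the HDR/ITM/RMK line syntax and the element names are assumptions made up for this example. The point is only that an ordered list of rules, each with a pattern and occurrence bounds, supplies the sequencing and repetition logic, and that such a list is plain data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniGrammar {
    // One rule: XML element name, a pattern with one capturing group, occurrence bounds.
    static final class Rule {
        final String element;
        final Pattern pattern;
        final int min;
        final int max;

        Rule(String element, String regex, int min, int max) {
            this.element = element;
            this.pattern = Pattern.compile(regex);
            this.min = min;
            this.max = max;
        }
    }

    // Hypothetical rule set; in a data-driven design it would be loaded from a file.
    static List<Rule> demoRules() {
        return Arrays.asList(
            new Rule("header", "HDR (\\w+)", 1, 1),
            new Rule("item", "ITM (\\w+)", 1, 99),
            new Rule("remark", "RMK (.*)", 0, 1));
    }

    // Applies the rules in order to the lines of the message.
    // Captured text is copied without XML escaping, for brevity.
    static String parse(List<Rule> rules, String message) {
        String[] lines = message.split("\\r?\\n");
        StringBuilder xml = new StringBuilder("<message>");
        int i = 0;
        for (Rule rule : rules) {
            int count = 0;
            while (i < lines.length && count < rule.max) {
                Matcher m = rule.pattern.matcher(lines[i]);
                if (!m.matches()) {
                    break; // this rule is done; let the next rule try the line
                }
                xml.append('<').append(rule.element).append('>')
                   .append(m.group(1))
                   .append("</").append(rule.element).append('>');
                i++;
                count++;
            }
            if (count < rule.min) {
                throw new IllegalArgumentException(
                    "missing " + rule.element + " at line " + (i + 1));
            }
        }
        if (i < lines.length) {
            throw new IllegalArgumentException(
                "unexpected line " + (i + 1) + ": " + lines[i]);
        }
        return xml.append("</message>").toString();
    }

    public static void main(String[] args) {
        System.out.println(parse(demoRules(), "HDR ORDER1\nITM APPLE\nITM PEAR"));
    }
}
```

In this sketch, supporting a new message syntax means supplying a different rule list, while the parse method stays untouched.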

Generating code is an alternative that you often encounter in commercial products of this kind. In our case it had to be ruled out for numerous reasons, foremost among them the million-messages-a-day production load: operations staff would never let new code enter live systems at the pace required for new message syntaxes and changes. There is also the cascade of verifications that comes with new code. But loading and changing data is a completely different story; it is simply part of using the system, and we can change meta-data every hour if we wish to. Consequently, a new message had to mean only new data; period! We can do that very naturally with XSL templates, so why not for the reverse mapping direction?

With the ReverseXSL transformer you can:

  • Process any character-based data interchange formats, including exotic ones
  • Identify an input message and dynamically select the required transformation steps
  • Just deploy data (no code) for every new kind of message
  • Take advantage of the rich function set in XSLT, in both mapping directions
  • Learn it fast
  • Develop new message definitions really fast