Mirth Community


dkirwilliam 06-03-2011 03:15 AM

Converting RTF to HTML/XML ?
Hey all,

I'm trying to find a good way of converting an RTF document (embedded in OBX-5 of an ORU message) into HTML or XML.

I notice in the source code that there are libraries used from iText, which can handle XML, PDF, HTML and RTF... I just don't quite know how I would implement them. There's also the built-in Java RTF editor, but I don't think this supports tables.

Anyone have any ideas?

bradd 06-03-2011 01:19 PM

I have actually had to use this before:


It's not free, but there is a free evaluation lib.

AlexToft 06-04-2011 12:29 AM

The iText libs vary in their ability, and the RTF classes are somewhat old and unmaintained now (there is a SourceForge project for them, but no content there). They do not, for example, do a good job of preserving formatting in Cerner Millennium discharges. The fact that you're in London suggests to me that you may be using the same system.

The product mentioned by bradd does a decent job, but personally I have a number of reservations about manipulating the contents of what is essentially a schema-less clinical document. Every tool I've used to try to get such data into a usable format has run into issues at one point or another, which is something of a deal breaker for me.

After playing around with it for a number of weeks I settled for a straight convert to PDF using jodconverter (which can be plugged into Mirth as a custom java lib) and the OpenOffice document converter running as a daemon. PDFs can then be sent out via DTS (or whatever) to the GP. Works really well.
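For anyone wanting to try the route described above, here is a minimal sketch of what the Mirth side might look like. It assumes JODConverter 2.x (the com.artofsolving.jodconverter packages) has been added to Mirth's custom libs, and that OpenOffice was started separately in headless mode with a socket listener. The function name convertRtfToPdf and the port 8100 are illustrative only; they are not from the post above.

```javascript
// Sketch only: assumes the JODConverter 2.x jar is available to Mirth and
// an OpenOffice instance is already listening on the given port.
function convertRtfToPdf(inputPath, outputPath, port) {
    var jod = Packages.com.artofsolving.jodconverter.openoffice;

    // Connect to the running OpenOffice daemon
    var connection = new jod.connection.SocketOpenOfficeConnection(port);
    connection.connect();
    try {
        // The target format is inferred from the output file extension
        var converter = new jod.converter.OpenOfficeDocumentConverter(connection);
        converter.convert(new Packages.java.io.File(inputPath),
                          new Packages.java.io.File(outputPath));
    } finally {
        // Always release the connection, even if the conversion fails
        connection.disconnect();
    }
    return outputPath;
}
```

From a transformer step this could be called as convertRtfToPdf('/tmp/input.rtf', '/output/report.pdf', 8100), with both paths being placeholders for whatever your channel uses.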

AlexToft 06-04-2011 12:42 AM

Incidentally, if you want to play around with the iText stuff, here's some code I dug out from my testing. It takes the ORU, parses the saved RTF (not very well!) and spits out a PDF. You could simply swap the PdfWriter class for the XML or HTML writer (com.lowagie.text.xml.XmlWriter and com.lowagie.text.html.HtmlWriter) as you wish and examine the results for yourself.


// Pull the RTF from the OBX and unescape it...
var contents = msg['OBX']['OBX.5']['OBX.5.5'].toString();
contents = contents.replace(/\\E\\/g, "\\");
contents = contents.replace(/\\\.br\\/g, "\r\n"); // dot escaped so only a literal \.br\ matches
FileUtil.write('/tmp/input.rtf', false, contents);

// Generate a unix time stamp for use as the output filename (we'll use something a little more robust for prod, but this is useful for test)
var now = new Date();
var unixtime = Math.floor(now.getTime() / 1000);

// Set the variables for the input file and output file
var inputfile = "/tmp/input.rtf";
var outputfile = "/output/"+unixtime+".pdf";

// Create the respective streams for the files
var inputstream = new Packages.java.io.FileInputStream(inputfile);
var outputstream = new Packages.java.io.FileOutputStream(outputfile);

// Create an iText document
var myDocument = new Packages.com.lowagie.text.Document();

// Create a PDF writer object which we'll use to save the PDF in a moment
var pdfwriter = Packages.com.lowagie.text.pdf.PdfWriter.getInstance(myDocument, outputstream);

// Open the iText document we created a moment ago so we can modify it
myDocument.open();

// Create a parser which will load the RTF file in a moment
var parser = new Packages.com.lowagie.text.rtf.parser.RtfParser(null);

// Parse the RTF input and pass it to the PDF writer object
parser.convertRtfDocument(inputstream, myDocument);

// Close the document and hopefully it will contain what we want!
myDocument.close();
inputstream.close();

// Remove the temporary RTF file
var tidyUp = new Packages.java.io.File(inputfile);
tidyUp['delete'](); // bracket form because delete is a reserved word in JavaScript
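
The two-step unescape at the top of the script can go wrong when one replacement produces text that the next one then matches: a literal backslash escaped as \E\ sitting next to ".br" would be turned into a line break. Below is a minimal sketch of a single-pass alternative, assuming the default HL7 v2 delimiters (| ^ & ~ \). The names unescapeHl7 and HL7_ESCAPES are illustrative and not part of Mirth, and other sequences such as \Xdd\ or \.sp\ are not handled.

```javascript
// Sketch only: decode common HL7 v2 escape sequences in one pass.
// Assumes the default delimiters; adjust the map if MSH-1/MSH-2 differ.
var HL7_ESCAPES = {
    'F': '|',      // field separator
    'S': '^',      // component separator
    'T': '&',      // subcomponent separator
    'R': '~',      // repetition separator
    'E': '\\',     // escape character
    '.br': '\r\n'  // line break
};

function unescapeHl7(text) {
    // Each escape is matched once against the original text, so a backslash
    // produced by \E\ is never re-read as the start of another escape.
    return String(text).replace(/\\(F|S|T|R|E|\.br)\\/g, function (match, code) {
        return HL7_ESCAPES[code];
    });
}
```

In the script above this would replace the two contents.replace(...) lines, for example: var contents = unescapeHl7(msg['OBX']['OBX.5']['OBX.5.5'].toString());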

dkirwilliam 06-06-2011 03:24 AM

Thanks Alex, you're right on the source of my RTFs. I had looked into the OpenOffice/jodconverter route, but not to the point that I'd tried implementing it. As long as it handles the RTF tables I think I should be fine.

I'm still a bit of a Java newbie so it could take me a while to figure it out :)

Cheers for the help. I do agree: only do the conversions if you are very, very sure that they are perfect.
