Converting RTF to HTML/XML ? - Mirth Community

#1
06-03-2011, 02:15 AM
dkirwilliam
Mirth Newb

Join Date: May 2011
Location: London, UK
Posts: 16
Converting RTF to HTML/XML ?

Hey all,

I'm trying to find a good way of converting an RTF document (embedded in OBX-5 of an ORU message) into HTML or XML.

I notice in the source code that there are libraries used from iText, which can handle XML, PDF, HTML and RTF... I just don't quite know how I would implement them. There's also the built-in Java RTF editor, but I don't think it supports tables.

Anyone have any ideas?
#2
06-03-2011, 12:19 PM
bradd
Mirth Employee

Join Date: May 2009
Location: Irvine, CA
Posts: 133

I have actually had to use this before:

http://www.rtf-to-xml.com/

It's not free, but there is a free evaluation lib.
#3
06-03-2011, 11:29 PM
AlexToft
OBX.3 Kenobi

Join Date: Sep 2010
Location: Leeds, UK
Posts: 160

The iText libs vary in their ability, and the RTF classes are somewhat old and unmaintained now (there is a SourceForge project for them, but no content there). They do not, for example, do a good job of preserving formatting in Cerner Millennium discharges. The fact that you're in London suggests to me that you may be using the same system.

The product mentioned by bradd does a decent job, but personally I have a number of reservations about manipulating the contents of what is essentially a schema-less clinical document. Every tool I've used to try and get such data into a usable format has experienced issues at one point or another, which is something of a deal breaker for me.

After playing around with it for a number of weeks I settled for a straight conversion to PDF using jodconverter (which can be plugged into Mirth as a custom Java lib) and the OpenOffice document converter running as a daemon. PDFs can then be sent out via DTS (or whatever) to the GP. Works really well.
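
For anyone wanting to try the jodconverter route, here is a rough sketch of what the call might look like from a Mirth JavaScript step. It assumes the JODConverter 2.x jar (the com.artofsolving.jodconverter packages) is in Mirth's custom lib folder and that an OpenOffice daemon is already listening on the given port. The function name convertWithJodConverter and the javaPackages parameter are made up for this illustration; in Mirth you would pass the Rhino Packages object.

```javascript
// Sketch only: convert a file through a running OpenOffice daemon using
// the JODConverter 2.x API. "javaPackages" is assumed to be the Rhino
// "Packages" root; it is passed in so the function has no hidden globals.
function convertWithJodConverter(javaPackages, inputPath, outputPath, port) {
    var jod = javaPackages.com.artofsolving.jodconverter.openoffice;
    var File = javaPackages.java.io.File;

    // Open a socket connection to the OpenOffice daemon
    var connection = new jod.connection.SocketOpenOfficeConnection(port);
    connection.connect();
    try {
        var converter = new jod.converter.OpenOfficeDocumentConverter(connection);
        // Input and output formats are inferred from the file extensions
        converter.convert(new File(inputPath), new File(outputPath));
    } finally {
        // Always release the connection, even if the conversion fails
        connection.disconnect();
    }
    return outputPath;
}
```

In a channel this might be called as convertWithJodConverter(Packages, "/tmp/input.rtf", "/output/report.pdf", 8100), where 8100 is the port the daemon was started on (for example with something like soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard; check the exact flags against your OpenOffice version).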
#4
06-03-2011, 11:42 PM
AlexToft
OBX.3 Kenobi

Join Date: Sep 2010
Location: Leeds, UK
Posts: 160

Incidentally, if you want to play around with the iText stuff, here's some code I dug out from my testing which will take the ORU, parse the saved RTF (not very well!) and spit out a PDF; you could simply change the pdfwriter class to xml/html (com.lowagie.text.xml.XmlWriter and com.lowagie.text.html.HtmlWriter) as you wish and examine the results for yourself.

Code:
// Pull the RTF from the OBX and unescape it...
var contents = msg['OBX']['OBX.5']['OBX.5.5'].toString();
// HL7 \E\ is an escaped backslash
contents = contents.replace(/\\E\\/g, "\\");
// HL7 \.br\ is a line break (the dot is escaped so it only matches a literal ".")
contents = contents.replace(/\\\.br\\/g, "\r\n");

// Generate a unix time stamp for use as the output filename (we'll use something a little more robust for prod, but this is useful for test)
var now = new Date();
var unixtime = Math.floor(now.getTime() / 1000);

// Set the variables for the input file and output file
var inputfile = "/tmp/input.rtf";
var outputfile = "/output/" + unixtime + ".pdf";

// Save the unescaped RTF to the temporary input file
FileUtil.write(inputfile, false, contents);

// Create the respective streams for the files
var inputstream = new Packages.java.io.FileInputStream(inputfile);
var outputstream = new Packages.java.io.FileOutputStream(outputfile);

// Create an iText document
var myDocument = new Packages.com.lowagie.text.Document();

// Create a PDF writer object which we'll use to save the PDF in a moment
// (getInstance is a static factory method, so it is called without "new")
var pdfwriter = Packages.com.lowagie.text.pdf.PdfWriter.getInstance(myDocument, outputstream);

// Open the iText document we created a moment ago so we can modify it
myDocument.open();

// Create a parser which will load the RTF file in a moment
var parser = new Packages.com.lowagie.text.rtf.parser.RtfParser(null);

// Parse the RTF input and pass it to the PDF writer object
parser.convertRtfDocument(inputstream, myDocument);

// Close the document and hopefully it will contain what we want!
myDocument.close();

// Release the file handles before deleting the temporary file
inputstream.close();
outputstream.close();

// Remove the temporary RTF file
var tidyUp = new Packages.java.io.File(inputfile);
tidyUp["delete"]();
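
A side note on the unescaping step at the top of that script: doing the two replacements one after the other means the output of the first can be misread by the second (a literal backslash that arrives as \E\ can end up forming a fake \.br\). A single-pass version might look like the sketch below. It only covers the two HL7 escape sequences used in this thread, and unescapeHl7 is a name made up for the example.

```javascript
// Sketch only: decode the HL7 escapes \E\ (literal backslash) and
// \.br\ (line break) in one pass, so that text produced by one
// replacement is never re-scanned as another escape sequence.
function unescapeHl7(text) {
    return text.replace(/\\E\\|\\\.br\\/g, function (match) {
        return match === "\\E\\" ? "\\" : "\r\n";
    });
}
```

In the script above this would replace the two contents.replace(...) lines with contents = unescapeHl7(contents);.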

Last edited by AlexToft; 06-04-2011 at 12:10 AM.
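
For the HTML variant mentioned at the top of this post, the swap might look roughly like the sketch below. It assumes the iText version bundled with Mirth still ships com.lowagie.text.html.HtmlWriter with the same getInstance(document, stream) factory as PdfWriter. The function name rtfToHtmlWithIText and the javaPackages parameter are hypothetical; in Mirth you would pass the Rhino Packages object. Expect the same formatting limitations as the PDF version.

```javascript
// Sketch only: same flow as the PDF script, but with iText's HtmlWriter
// attached to the document instead of PdfWriter. "javaPackages" is
// assumed to be the Rhino "Packages" root.
function rtfToHtmlWithIText(javaPackages, inputPath, outputPath) {
    var io = javaPackages.java.io;
    var text = javaPackages.com.lowagie.text;

    var input = new io.FileInputStream(inputPath);
    var output = new io.FileOutputStream(outputPath);
    try {
        var doc = new text.Document();
        // Static factory method, so no "new" here
        text.html.HtmlWriter.getInstance(doc, output);
        doc.open();

        // Parse the RTF and feed its content into the open document
        var parser = new text.rtf.parser.RtfParser(null);
        parser.convertRtfDocument(input, doc);
        doc.close();
    } finally {
        // Release both file handles whether or not the parse succeeded
        input.close();
        output.close();
    }
    return outputPath;
}
```

Called as rtfToHtmlWithIText(Packages, "/tmp/input.rtf", "/output/report.html"), this should leave an HTML file you can compare against the PDF output.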
#5
06-06-2011, 02:24 AM
dkirwilliam
Mirth Newb

Join Date: May 2011
Location: London, UK
Posts: 16

Thanks Alex, you're right about the source of my RTFs. I had looked into the OpenOffice/jodconverter route, but not to the point that I'd tried implementing it. As long as it handles the RTF tables I think I should be fine.

I'm still a bit of a Java newbie so it could take me a while to figure it out.

Cheers for the help. I do agree: only do the conversions if you are very, very sure that they are perfect.
All times are GMT -8.

