Convert PDF into TXT - Page 3 - Mirth Community

#21  07-19-2017, 05:23 AM
seaston

I'm trying to do something similar, but from an attachment.
If I save the attachment out to a file and read it in using the code above, it works.

This doesn't, but I cannot see why...

Code:
var inputstream = new Packages.java.io.ByteArrayInputStream(getAttachments().get(0).getContent());
var reader = new Packages.com.itextpdf.text.pdf.PdfReader(inputstream);
for (var i = 1; i <= reader.getNumberOfPages(); i++) {
    logger.info(Packages.com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(reader, i));
    //logger.info(Packages.com.lowagie.text.pdf.parser.PdfTextExtractor.getTextFromPage(reader, i));
}
reader.close();
inputstream.close();

I get the following error, which suggests it cannot read the attachment:

Code:
Caused by: com.itextpdf.text.exceptions.InvalidPdfException: PDF header signature not found.
	at com.itextpdf.text.pdf.PRTokeniser.getHeaderOffset(PRTokeniser.java:227)
	at com.itextpdf.text.pdf.PdfReader.getOffsetTokeniser(PdfReader.java:442)

Can anyone advise what is wrong here?

#22  07-19-2017, 10:08 AM
seaston

This seems to work, but is it the best way?

Code:
var content = getAttachments().get(0).getContent();
// FileUtil.decode Base64-decodes the string into a byte array
var decoded = FileUtil.decode(new Packages.java.lang.String(content));
var inputstream = new Packages.java.io.ByteArrayInputStream(decoded);
var reader = new Packages.com.itextpdf.text.pdf.PdfReader(inputstream);
try {
    for (var i = 1; i <= reader.getNumberOfPages(); i++) {
        logger.info(Packages.com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(reader, i));
        // do other stuff here to regex match content
    }
} finally {
    // release the reader and stream even if text extraction throws
    reader.close();
    inputstream.close();
}
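
A possible explanation for why the decode step is needed, as a minimal sketch rather than a confirmed diagnosis: a PDF file starts with the bytes "%PDF-", and when those bytes are Base64-encoded (followed by a version digit) the text starts with "JVBERi0". If the attachment content is Base64 text, iText sees "JVBERi0..." instead of "%PDF-" and reports that the header signature is missing. The helper name classifyPdfContent and the sample strings below are hypothetical, made up for illustration. The sketch uses console.log so it runs standalone; inside a Mirth script you would use logger.info instead and first turn the attachment bytes into a string.

```javascript
// Hypothetical helper (not part of Mirth or iText): look at the first
// characters of some content and guess whether it is a raw PDF or a
// Base64-encoded one.
function classifyPdfContent(text) {
    // Ignore leading whitespace, then keep only the first 7 characters.
    var head = String(text).replace(/^\s+/, "").substring(0, 7);
    if (head.indexOf("%PDF-") === 0) {
        return "raw";      // PDF header is present as-is
    }
    if (head === "JVBERi0") {
        return "base64";   // Base64 form of "%PDF-" plus a version digit
    }
    return "unknown";
}

// Sample values for illustration only, not taken from a real message.
var rawSample = "%PDF-1.4 (rest of file)";
var encodedSample = "JVBERi0xLjQK (rest of encoded file)";

console.log(classifyPdfContent(rawSample));     // prints "raw"
console.log(classifyPdfContent(encodedSample)); // prints "base64"
```

If a check like this reports "base64" for the attachment content, decoding before building the ByteArrayInputStream, as in the code above, is the expected fix.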

#23  07-21-2017, 07:37 PM
pacmano

Perhaps off topic, but I would most certainly do this outside of Mirth with command-line utilities, delivering the extracted text files to a folder for downstream Mirth consumption.

https://stackoverflow.com/questions/...ext-extraction has some tools listed.

There are cases like this one where other tools seem easier, at least for me.

#24  11-12-2017, 06:50 PM
paulorades
Error in PDF To text

Hi Vivian. I get this error when I send a message. My Mirth version is 3.5.


....

DETAILS: TypeError: [JavaPackage com.itextpdf.text.pdf.PdfReader] is not a function, it is object.
at 272b44b5-fffe-46a9-b446-e0e7ceb291e2:3419 (extractText)
at 272b44b5-fffe-46a9-b446-e0e7ceb291e2:3428 (doTransform)
at 272b44b5-fffe-46a9-b446-e0e7ceb291e2:3450 (doScript)
at 272b44b5-fffe-46a9-b446-e0e7ceb291e2:3452 .....