convert html string to text - Mirth Community

#1 | adidob | 03-13-2019, 03:07 PM

My JavaScript Reader reads, among other things, an HTML block, e.g. "<div><li><span>blah</span></div>".
I then need to extract only the text content from this HTML block, similar to document.getElementById("myelement").textContent.
Any idea how I can achieve this in the source connector?

#2 | agermano | 03-13-2019, 03:49 PM

I think you'd need to use an HTML parser. You can't use E4X, since the input isn't XHTML (the <li> tag is never closed).

It looks like the Mirth Document Writer uses com.lowagie.text.html.HtmlParser from the iText library, if you want to look into that.

This might be helpful, too: https://stackoverflow.com/questions/...-to-plain-text
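
To make the well-formedness point concrete, here is a rough sketch in plain JavaScript that lists tags which are opened but never closed. `findUnclosedTags` is a hypothetical helper name, not part of Mirth or E4X, and it is not a real XML validator: it assumes simple tags and ignores comments, CDATA, and attribute values containing `>`.

```javascript
// Hypothetical helper: returns the names of tags that are opened but never closed.
// Simplified on purpose; only meant to show why the sample is not valid XML.
function findUnclosedTags(html) {
  var tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>/g;
  var open = [];      // stack of tag names that are currently open
  var unclosed = [];  // tag names that never received a matching close
  var match;
  while ((match = tagPattern.exec(html)) !== null) {
    var isClose = match[1] === '/';
    var name = match[2].toLowerCase();
    var selfClosing = match[3] === '/';
    if (selfClosing) {
      continue; // e.g. <br/> opens and closes itself
    }
    if (!isClose) {
      open.push(name);
      continue;
    }
    var idx = open.lastIndexOf(name);
    if (idx === -1) {
      continue; // stray closing tag, ignored in this sketch
    }
    // Anything opened after the matching tag was never closed.
    while (open.length > idx + 1) {
      unclosed.push(open.pop());
    }
    open.pop();
  }
  // Whatever is still on the stack was never closed either.
  return unclosed.concat(open.reverse());
}

findUnclosedTags('<div><li><span>blah</span></div>'); // → ['li']
```

An XML parser has to reject input like this, which is why a lenient HTML parser is the safer route.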

#3 | ishpreetsingh | 03-19-2019, 05:01 PM

When converting HTML to text, do you want to preserve newlines and spacing in the text?

I have used regex before, but I found it can lead to issues, specifically if there are < or > characters in your text.

I have successfully used the jsoup HTML parser, which has worked well for me.
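
As an illustration of the regex pitfall mentioned above, here is a minimal sketch in plain JavaScript. `stripTagsNaive` is a hypothetical name chosen for this example; it simply deletes anything that looks like a tag, and it does not decode entities or preserve line breaks.

```javascript
// Naive approach: remove every "<...>" run. Hypothetical helper for illustration only.
function stripTagsNaive(html) {
  return html.replace(/<[^>]*>/g, '');
}

// Works for the sample from the first post, even with the unclosed <li>:
stripTagsNaive('<div><li><span>blah</span></div>'); // → 'blah'

// Breaks when the text itself contains < and >:
// the span "< b and c >" is mistaken for a tag and deleted.
stripTagsNaive('<p>if a < b and c > d</p>'); // → 'if a  d'
```

With jsoup, the equivalent call from a Mirth script would presumably be org.jsoup.Jsoup.parse(html).text(), assuming the jsoup jar is available on the channel's classpath. Jsoup.parse and text() are real jsoup methods, but whether the library is already bundled depends on your Mirth version.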
Tags
connector, html, source, text
All times are GMT -8.