convert html string to text - Mirth Community

03-13-2019, 04:07 PM
adidob
Mirth Newb
Join Date: Mar 2019
Posts: 8
convert html string to text

My JavaScript Reader reads, among other things, an HTML block, e.g. "<div><li><span>blah</span></div>".
I then need to extract only the text content from this HTML block, similar to document.getElementById("myelement").textContent.
Any idea how I can achieve this in the source connector?
03-13-2019, 04:49 PM
agermano
Mirth Guru
Join Date: Apr 2017
Location: Indiana, USA
Posts: 1,009

I think you'd need to use an HTML parser. You can't use E4X, since the markup isn't well-formed XHTML (the <li> tag is never closed).

It looks like the Mirth Document Writer uses com.lowagie.text.html.HtmlParser from the iText library, if you want to look into that.

This might be helpful, too. https://stackoverflow.com/questions/...-to-plain-text
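
To illustrate why E4X rejects the sample block, here is a minimal sketch of a tag-balance check in plain JavaScript. The helper name findUnclosedTag is made up for this example. It only looks at tag names and ignores comments and CDATA, and a bare <br> counts as unclosed, as it would in XML, so treat it as an illustration rather than a validator.

```javascript
// Hypothetical helper (not part of Mirth or E4X): returns the name of the
// first tag that breaks XML-style nesting, or null if the tags balance.
function findUnclosedTag(markup) {
  // Captures: optional leading "/", tag name, optional trailing "/".
  var tagPattern = /<(\/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(\/?)>/g;
  var open = [];
  var match;
  while ((match = tagPattern.exec(markup)) !== null) {
    var isClosing = match[1] === "/";
    var name = match[2].toLowerCase();
    var isSelfClosing = match[3] === "/";
    if (isSelfClosing) {
      continue; // <br/> opens and closes itself
    }
    if (!isClosing) {
      open.push(name);
      continue;
    }
    // A closing tag must match the most recently opened tag.
    var expected = open.pop();
    if (expected !== name) {
      // Report the tag left open, or the stray closing tag if nothing was open.
      return expected || name;
    }
  }
  // Anything still on the stack was never closed.
  return open.length > 0 ? open[open.length - 1] : null;
}
```

For the block in the original question, findUnclosedTag("<div><li><span>blah</span></div>") returns "li", which is the tag that keeps the markup from being parsed as XML.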
03-19-2019, 06:01 PM
ishpreetsingh
What's HL7?
Join Date: Oct 2017
Posts: 4

When converting the HTML to text, do you want to preserve newlines and spacing in the text?

I have used regex before, but I found it can lead to issues, specifically if there are < or > characters in your text.

I have successfully used the jsoup HTML parser, which has worked well for me.
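
As a rough sketch of the regex approach and its pitfall, the plain JavaScript below strips anything that looks like a tag and decodes a handful of entities. The function name stripTagsNaive is invented for this example, only a few entities are handled, and all whitespace is collapsed to single spaces, so newlines and spacing are not preserved.

```javascript
// Hypothetical helper: naive tag stripping, not a substitute for an HTML parser.
function stripTagsNaive(html) {
  return html
    .replace(/<[^>]*>/g, " ")     // drop anything that looks like a tag
    .replace(/&nbsp;/g, " ")      // decode a few common entities only
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&")       // decode &amp; last so "&amp;lt;" becomes "&lt;"
    .replace(/\s+/g, " ")         // collapse runs of whitespace
    .replace(/^\s+|\s+$/g, "");   // trim both ends
}

// Works for the block in the question:
var ok = stripTagsNaive("<div><li><span>blah</span></div>");   // "blah"

// Pitfall: raw < and > in the text are mistaken for a tag,
// so "< b and c >" is silently dropped:
var lossy = stripTagsNaive("<p>if a < b and c > d</p>");       // "if a d"
```

A real parser avoids that loss. Assuming the jsoup jar has been made available to the channel, the parser-based equivalent would be org.jsoup.Jsoup.parse(html).text().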

connector, html, source, text
