special characters - Mirth Community

#1
11-11-2014, 08:18 AM
smetz02 (Mirth Newb)
Join Date: Feb 2013
Posts: 23
special characters

Hello,
I have a sending system that is not correctly escaping special characters. For instance, they are sending a Latin diphthong (\x92) instead of an apostrophe, probably because someone copied and pasted from a Word document somewhere.

I would like to implement a preprocessor script to handle this, but I'm having a terrible time with it. Mirth, by default, seems to translate this character into \xBF, and I've been unable to make it do anything else.

Have tried all of the following to no avail:

msg.replace(/\u0092/g,"'");

msg.replace(/\x92/g,"'");

msg.replace("’","'");

Any suggestions?

Thanks!
Steve

#2
11-11-2014, 11:45 AM
narupley (Mirth Employee)
Join Date: Oct 2010
Posts: 7,115

First off, if you're talking about code point 92, that's not a diphthong or anything, it's just a specialized quote character: http://www.fileformat.info/info/unic...r/92/index.htm

Also, in the preprocessor you do not have access to "msg". You have access to "message". The default preprocessor script even says this:

Code:
// Modify the message variable below to pre process data
return message;
And, if you drag over the Remove Illegal XML Characters code template, you'll see this:

Code:
var newMessage = message.replace(/[\x00-\x08]|[\x0B-\x0C]|[\x0E-\x1F]/g, '');
So you should be calling "message.replace", not "msg.replace".
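
A minimal sketch of the point above, assuming the preprocessor body is wrapped in a hypothetical `preprocess(message)` function so it can run outside Mirth (inside Mirth, `message` is already in scope and the body simply ends with `return`). It also shows two things that are easy to miss with `replace`: it returns a new string rather than changing the original, and a plain string pattern only replaces the first match.

```javascript
// Hypothetical wrapper so the sketch runs standalone; not part of Mirth itself.
function preprocess(message) {
    // replace() returns a new string, so the result has to be captured and returned.
    var cleaned = message.replace(/[\x00-\x08]|[\x0B-\x0C]|[\x0E-\x1F]/g, '');
    return cleaned;
}

var sample = 'OBX|1|TX|abc\x07def';
var result = preprocess(sample);   // control character removed
// `sample` itself is unchanged, because JavaScript strings are immutable.

// A string pattern (no regex, no /g flag) replaces only the first occurrence.
var firstOnly = 'it\u2019s Bob\u2019s'.replace('\u2019', "'");
// A regex with the /g flag replaces every occurrence.
var all = 'it\u2019s Bob\u2019s'.replace(/\u2019/g, "'");
```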
__________________
Step 1: JAVA CACHE...DID YOU CLEAR ...wait, ding dong the witch is dead?

Nicholas Rupley
Work: 949-237-6069
Always include what Mirth Connect version you're working with. Also include (if applicable) the code you're using and full stacktraces for errors (use CODE tags). Posting your entire channel is helpful as well; make sure to scrub any PHI/passwords first.


- How do I foo?
- You just bar.

#3
11-11-2014, 04:16 PM
smetz02

Doesn't seem to work, though:

var esc = message.replace(/[\x00-\x08]|[\x0B-\x0C]|[\x0E-\x1F]/g, '');
channelMap.put("escape",esc)
return esc;


obx.3.1 from escape variable in mappings tab: OBX|1|TX|What is your child¿s hearing condition?

What am I doing incorrectly?

#4
11-11-2014, 04:21 PM
narupley

That code template does not include code point 92. You need to find out the correct code point and include it. "¿" is 0xBF, so you might try that.
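
One way to find the code point, sketched under the assumption that the message is available as an ordinary JavaScript string: walk the string and report every character above the ASCII range. The helper name `listNonAscii` is made up for illustration.

```javascript
// Hypothetical helper: report each non-ASCII character with its position and code point.
function listNonAscii(text) {
    var found = [];
    for (var i = 0; i < text.length; i++) {
        var code = text.charCodeAt(i);
        if (code > 0x7F) {
            // Zero-pad to four hex digits, e.g. 0xBF becomes "U+00BF".
            var hex = ('0000' + code.toString(16).toUpperCase()).slice(-4);
            found.push({ index: i, hex: 'U+' + hex });
        }
    }
    return found;
}

var report = listNonAscii('your child\u2019s hearing');
// report[0].hex is "U+2019" and report[0].index is 10
```

In a channel, the result could be logged or stored in a map and inspected, which shows whether the character that reaches the script is really 0x92, 0xBF, or something else.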
#5
11-11-2014, 04:53 PM
smetz02

Still struggling.
Incoming OBX with \x92:
OBX|31|TX|What is your child’s hearing condition?|1|Normal hearing|||||||||20090415|||||||||||

Script, which should theoretically replace anything from \x7F through \xFF with nothing:

var esc = message.replace(/[\x7F-\xFF]/g, '');
channelMap.put("escape", esc);
return esc;

script output:
OBX|1|TX|What is your child¿s hearing condition?|1|Normal hearing|||||||||20090415|||||||||||

Sorry to keep bugging you.

#6
11-12-2014, 08:48 AM
narupley

This works perfectly for me:

Code:
return message.replace(/’/g, '');
This also works fine:

Code:
return message.replace(/\u2019/g, '');
"’" is code point U+2019. But that's only the numeric value; the actual bytes in the file/stream/whatever depend on the charset being used. For example if you wrote the character "’" out in UTF-8, you'd get three bytes (0xE2, 0x80, 0x99). If you then read it back in with a windows-1250, you would not get "’" back, instead you would get "’".

You're not giving enough information. Like I said before, you need to find the actual code point in the message (e.g. with charCodeAt) and replace that.
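
Once the code point is known, the replacement itself is short. The sketch below assumes the offending characters turn out to be the typographic single quotes (U+2018, U+2019) or U+0092, which is what byte 0x92 becomes if the data is decoded as ISO-8859-1; those choices are assumptions, so adjust the character class to whatever `charCodeAt` actually reports. The function name is hypothetical, and it substitutes a plain apostrophe, as the original question asked, rather than removing the character.

```javascript
// Hypothetical helper: map assumed "smart quote" code points to a plain apostrophe.
function normalizeApostrophes(message) {
    // U+2018 / U+2019: typographic single quotes.
    // U+0092: result of decoding byte 0x92 as ISO-8859-1 (an assumption about the input).
    return message.replace(/[\u2018\u2019\u0092]/g, "'");
}

var fixed = normalizeApostrophes('OBX|31|TX|What is your child\u2019s hearing condition?');
// fixed now contains "child's" with a plain ASCII apostrophe
```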
Tags
encoding, regex, replace, special characters
