A problem that we often face is that characters copied from MS Word are not displayed correctly on a web page, especially when the page is XML. The reason is that those characters may be encoded differently (such as ASCII 39) from what HTML expects, so you have to convert them.
You might find that your editor doesn’t have UTF-8 support and is actually using plain US-ASCII. That’s OK, because UTF-8 starts with exactly the same characters as US-ASCII. However, it’s more likely that your editor is using Windows-1252 or some other proprietary encoding, which will probably start with US-ASCII too, but whose characters from 128 to 255 might not match those in UTF-8. This is why the problem so often shows up with curly quotes: in Windows-1252 they sit between 145 and 148, whereas in Unicode the single quotes are 8216 and 8217 and the double quotes are 8220 and 8221.
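To make the numbers above concrete, here is a minimal Java sketch. The class name WordCharEscaper and both method names are made up for this illustration, and it assumes your JDK ships the windows-1252 charset, which standard JDKs normally do. The first method shows how a Windows-1252 byte such as 147 turns into code point 8220. The second replaces every non-ASCII character with a numeric reference like &#8220; so the text survives in an HTML or XML page.

```java
import java.nio.charset.Charset;

final class WordCharEscaper {
    /** Decodes a single Windows-1252 byte and returns its Unicode code point. */
    static int cp1252ToCodePoint(int byteValue) {
        byte[] one = { (byte) byteValue };
        // Assumption: the windows-1252 charset is available in this JDK.
        return new String(one, Charset.forName("windows-1252")).codePointAt(0);
    }

    /** Replaces every character outside US-ASCII with a numeric reference. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // Walk by code point so characters outside the BMP are not split.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(cp1252ToCodePoint(147));                       // 8220
        System.out.println(escapeNonAscii("\u201Ccrunch time\u201D"));    // &#8220;crunch time&#8221;
    }
}
```

Note that this sketch only handles the non-ASCII characters. Markup characters such as & and < still need their usual escaping.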
Frequently used escape characters
A web server such as Tomcat has its own URL encoder and decoder. All your parameters are encoded when you send a request and are decoded again for further processing. However, depending on the characters you pass, they can be mis-encoded or mis-decoded. For example, say you have a search form whose queryString parameter equals “crunch time”. When you submit the form to Tomcat, the value is sent encoded as %e2%80%9ccrunch+time%e2%80%9d and is then decoded by the Tomcat container. However, the quotation marks are mis-decoded by Tomcat, so you do not get the expected value (“crunch time”) for further processing.
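One likely cause, and this is an assumption rather than something checked against your setup, is that the container decodes the bytes as ISO-8859-1 instead of UTF-8 (ISO-8859-1 was the default URIEncoding in older Tomcat versions). The sketch below reproduces that effect with the standard java.net.URLDecoder. The class name MisDecodeDemo is invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class MisDecodeDemo {
    /** Percent-decodes raw using the named charset. */
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "%e2%80%9ccrunch+time%e2%80%9d";
        String right = decode(raw, "UTF-8");      // quotes become one character each
        String wrong = decode(raw, "ISO-8859-1"); // each quote becomes three characters
        System.out.println(right.length()); // 13
        System.out.println(wrong.length()); // 17
    }
}
```

With the wrong charset, each curly quote comes back as three unrelated characters, which is the kind of garbage you see in the page.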
To solve this problem, you can strip the raw value out of the query string and decode it manually as UTF-8 before it is decoded by Tomcat. The stripped-out value, valueStrippedOut, should be %e2%80%9ccrunch+time%e2%80%9d.
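A rough sketch of that manual step could look like the following. The class QueryStringDecoder and its methods are hypothetical helpers, not part of Tomcat. In a servlet you would probably pass in request.getQueryString(), which returns the query string without decoding it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class QueryStringDecoder {
    /** Returns the still-encoded value of the named parameter, or null if absent. */
    static String rawValue(String queryString, String name) {
        if (queryString == null) {
            return null;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                return eq < 0 ? "" : pair.substring(eq + 1);
            }
        }
        return null;
    }

    /** Decodes a percent-encoded value as UTF-8. */
    static String decodeUtf8(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String query = "page=2&queryString=%e2%80%9ccrunch+time%e2%80%9d";
        String valueStrippedOut = rawValue(query, "queryString");
        // Prints the search text with its curly quotes intact.
        System.out.println(decodeUtf8(valueStrippedOut));
    }
}
```

The parameter name is compared without decoding, which is fine for simple names like queryString but would need more care for names that contain escaped characters.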
You should also take care when you want to create a hyperlink from what you searched for, such as a link that carries “crunch time” in its query string. The problem is that Tomcat doesn’t know these quotation characters and will mis-encode them. In this case you should pass the percent-encoded value, %e2%80%9ccrunch+time%e2%80%9d, in the query instead.
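If you build the link in Java, one way to get the encoded form is java.net.URLEncoder with an explicit UTF-8 charset. This is only a sketch: the class SearchLinkBuilder and the /search path are invented for the example, while queryString is the parameter name used above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SearchLinkBuilder {
    /** Builds a link whose queryString parameter carries the encoded search text. */
    static String searchLink(String basePath, String searchText) {
        try {
            return basePath + "?queryString=" + URLEncoder.encode(searchText, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // The search text is crunch time wrapped in curly double quotes.
        System.out.println(searchLink("/search", "\u201Ccrunch time\u201D"));
        // /search?queryString=%E2%80%9Ccrunch+time%E2%80%9D
    }
}
```

URLEncoder writes the hex digits in upper case. That is equivalent to the lower-case form shown earlier.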
Another point about hyperlinks, which you may know already but still have to be careful with, is that the value of a parameter you pass must not contain a raw “&” character. You have to replace each one with %26.
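Encoding every value before joining the pairs takes care of this automatically, because the encoder turns & into %26. Below is a small sketch of the idea. The QueryBuilder class is hypothetical, and a LinkedHashMap is used only so the parameters keep their insertion order.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryBuilder {
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Joins encoded name=value pairs with &, so a literal & inside a value becomes %26. */
    static String buildQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("queryString", "R&D budget");
        params.put("page", "2");
        System.out.println(buildQuery(params)); // queryString=R%26D+budget&page=2
    }
}
```

Only the & characters that separate parameters stay unescaped, so the server can split the query correctly.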