Java Forum / General / March 2007

Charset conversion question

djthomp - 05 Feb 2007 16:00 GMT
To put it simply, I need some help dealing with the 'smart' character
corrections that Word automatically performs (quotes, hyphens,
fractions, etc.), specifically after the text has been copied from
Word and pasted into my web form.

I am working on a project using JSP that has a web form with a few
essay questions.  Because of the nature of the form (a scholarship
application form), it is very often filled out by applicants who
write their essays in Word and then cut and paste them into the form.
Often those essays have quotes or other punctuation and special
characters that Word has changed from plain ASCII into something
else.  This causes two problems for me: the character counts are off,
and the data is corrupted after it is submitted (each of the modified
characters shows up as 2-3 garbage characters).
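For readers wondering where "2-3 garbage characters" comes from, here is a minimal sketch of one common cause. It assumes (the post does not say) that the browser sent the text as UTF-8 and the server decoded the bytes as ISO-8859-1. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    /** Encodes text as UTF-8, then decodes those bytes as ISO-8859-1 (the assumed mismatch). */
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with the same charset, so the text survives unchanged. */
    static String roundTrip(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String leftCurlyQuote = "\u201C"; // Word's opening double quote, 3 bytes in UTF-8
        String oneHalf = "\u00BD";        // Word's one-half fraction, 2 bytes in UTF-8

        System.out.println(misdecode(leftCurlyQuote).length()); // 3
        System.out.println(misdecode(oneHalf).length());        // 2
        System.out.println(roundTrip(leftCurlyQuote).equals(leftCurlyQuote)); // true
    }
}
```

Under that assumption each pasted character turns into one garbage character per UTF-8 byte, which would also explain the inflated character counts.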

I've tried to find a solution using the various bits of String
functionality that take a character set name (public byte[]
getBytes(String charsetName) and public String(byte[] bytes, String
charsetName) in particular), but either it's the wrong approach, or
I'm not doing it right, or I haven't found the right charsetName yet.
I have not yet tried CharsetEncoder or CharsetDecoder, as I am a
little uncertain where to begin with them.

I do have a working fix for this, but since it consists of a loop
through the string in question that finds and fixes the problem spots
with some hard-coded comparisons, I really don't believe it's a good
long-term solution.
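The poster's actual loop is not shown, so the following is only a guess at what a hard-coded fix of this kind might look like. The class name, method name, and the particular set of characters handled are all assumptions. It also only helps if the string was decoded correctly in the first place; once a curly quote has already become three garbage characters, these comparisons will not match.

```java
final class SmartPunctuation {

    /** Replaces a few of Word's typographic characters with plain ASCII equivalents. */
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\u2018': // left single quote
                case '\u2019': // right single quote or apostrophe
                    out.append('\'');
                    break;
                case '\u201C': // left double quote
                case '\u201D': // right double quote
                    out.append('"');
                    break;
                case '\u2013': // en dash
                case '\u2014': // em dash
                    out.append('-');
                    break;
                case '\u2026': // horizontal ellipsis
                    out.append("...");
                    break;
                case '\u00BD': // one-half fraction
                    out.append("1/2");
                    break;
                default:
                    out.append(c); // everything else passes through untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("\u201CHi\u201D \u2013 it\u2019s \u00BD"));
    }
}
```

The weakness the poster points out is visible here: every new character Word introduces needs another case, and legitimate non-ASCII text (accented names, for example) is left to chance.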
opalpa opalpa@gmail.com http://opalpa.info - 05 Feb 2007 16:33 GMT
> To put it simply, I need some help dealing with the 'smart' character
> corrections that Word automatically performs (quotes, hyphens,
[quoted text clipped - 23 lines]
> problem spots by finding them with some hard coded comparisons, I
> really don't believe its a good long term solution.

Maybe helpful: http://www.ljmu.ac.uk/cis/webpublishing/81434.htm

opalpa
opalpa@gmail.com
http://opalpa.info/
djthomp - 07 Feb 2007 14:36 GMT
On Feb 5, 10:33 am, "opalpa opa...@gmail.com http://opalpa.info"
<opa...@gmail.com> wrote:

> Maybe helpful: http://www.ljmu.ac.uk/cis/webpublishing/81434.htm
>
> opalpa
> opa...@gmail.com http://opalpa.info/

Unfortunately, we don't really want to give the users of our site the
additional instructions they would need so that they only paste
'clean' characters into the form.  We're looking for as simple and
clean an application process as possible, and want a solution for
this that requires no additional user-side effort.

I ended up cleaning out the quotes with a little client-side
JavaScript, but I'm still looking for a server-side Java method.
djthomp - 09 Mar 2007 14:41 GMT
> Unfortunately, we don't really want give the users of our site the
> additional instructions they would need so that they only paste
[quoted text clipped - 4 lines]
> I ended up cleaning out the quotes with a little client-side
> javascript, but I'm still looking for a server-side java method.

Well, I finally found my server-side solution to this.  The right
Google search led me to
<a href='http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/'>this page</a>.
After that it was just a question of using the proper page directive
and meta tag attributes, and calling
request.setCharacterEncoding(encodingName) before reading any request
parameters (all of which was detailed pretty clearly on the Sun page I
found).
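To show why the charset has to be chosen before the parameters are read, here is a small stand-alone sketch that imitates the round trip without a servlet container. It assumes UTF-8 as the page encoding (the post never names the encoding it settled on), and the class and method names are invented. In a real servlet or JSP the decoding step is done by the container, and request.setCharacterEncoding is what tells it which charset to use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class FormEncodingDemo {

    /** Imitates the browser: percent-encodes a field value using the page's charset. */
    static String submit(String text, String pageCharset) {
        try {
            return URLEncoder.encode(text, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + pageCharset, e);
        }
    }

    /** Imitates the container: decodes the submitted value using the request's charset. */
    static String readParameter(String encodedValue, String requestCharset) {
        try {
            return URLDecoder.decode(encodedValue, requestCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + requestCharset, e);
        }
    }

    public static void main(String[] args) {
        String essay = "\u201Cquoted\u201D"; // 8 characters, with curly quotes
        String wire = submit(essay, "UTF-8");

        // Mismatched charsets: each curly quote becomes three garbage characters.
        System.out.println(readParameter(wire, "ISO-8859-1").length()); // 12

        // Matching charsets, the effect of calling
        // request.setCharacterEncoding("UTF-8") before the first getParameter call.
        System.out.println(readParameter(wire, "UTF-8").equals(essay)); // true
    }
}
```

The page directive and meta tag mentioned above cover the other half: they tell the browser which charset to use when it submits the form, so that both sides agree.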

Just thought I'd give a success update with the answer.  When I was
looking I found the same question being asked a lot, but not this
particular answer.  Hope that others might find it useful.
djthomp - 09 Mar 2007 14:43 GMT
Doh, that link got mangled; it's at: http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
opalpa opalpa@gmail.com http://opalpa.info - 12 Mar 2007 17:31 GMT

