Java Forum / General / February 2006

can a browser read unicode?

Shinya Koizumi - 24 Feb 2006 22:26 GMT
Is it possible to send a Unicode document
from a servlet to the browser so that the browser can
tell what language the document is written in?

SK

Carl - 24 Feb 2006 22:40 GMT
> Is it possible to send a Unicode document
> from a servlet to the browser so that the browser can
> tell what language the document is written in?
>
> SK

I'm not sure I completely understand your question, but these may help:
http://www.cl.cam.ac.uk/~mgk25/unicode.html#web
http://www.utf-8.com/

Carl.

John O'Conner - 25 Feb 2006 08:06 GMT
> Is it possible to send a Unicode document
> from a servlet to the browser so that the browser can
> tell what language the document is written in?
>
> SK

The problem of language detection is quite difficult. Instead of forcing
a browser to use heuristics to determine document language, an HTML attribute
can easily announce the same information. See the HTML "lang" attribute:

http://www.w3.org/International/tutorials/tutorial-lang/
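
As a rough illustration of the above, here is a minimal sketch of generating a page
that declares its language on the root element. The class name, the method name and
the sample text are made up for this example, and it assumes the body text needs no
HTML escaping.

```java
final class LangAttributeSketch {
    // Builds a minimal HTML page whose root element declares its language.
    // "lang" is a language tag such as "en", "ja", or "fr-CA".
    // Assumption: bodyText contains no characters that need HTML escaping.
    static String page(String lang, String bodyText) {
        return "<!DOCTYPE html>\n"
             + "<html lang=\"" + lang + "\">\n"
             + "<head><meta charset=\"UTF-8\"><title>Example</title></head>\n"
             + "<body><p>" + bodyText + "</p></body>\n"
             + "</html>\n";
    }

    public static void main(String[] args) {
        // The escaped text is a Japanese greeting written in hiragana.
        System.out.print(page("ja", "\u3053\u3093\u306b\u3061\u306f"));
    }
}
```

The browser does not have to guess here: the language is stated by the document itself.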

Regards,
John O'Conner

John C. Bollinger - 25 Feb 2006 08:48 GMT
> Is it possible to send a Unicode document
> from a servlet to the browser so that the browser can
> tell what language the document is written in?

If you mean the character encoding, that is the purpose of the "charset="
parameter in an HTTP "Content-Type:" header (it identifies the encoding,
not the human language).  There are various ways to put such a header into
your response, but this is probably the easiest:

(in servlet code:)

    response.setContentType("text/html; charset=UTF-8");

You do that before obtaining the response's Writer or writing any of the
actual response text.  You then ensure that you actually do encode the text
in UTF-8, either by getting and using the response's Writer, or by wrapping
the response's OutputStream in an OutputStreamWriter configured to use UTF-8.
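
The OutputStreamWriter option might look roughly like the sketch below. To keep it
runnable without a servlet container, a ByteArrayOutputStream stands in for the
stream that response.getOutputStream() would return; the class and method names
are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class Utf8WriterSketch {
    // Encodes text as UTF-8 onto any OutputStream. In a servlet, "out" would be
    // the response's OutputStream (an assumption of this sketch).
    static void writeUtf8(OutputStream out, String text) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(text);
        writer.flush(); // push the encoded bytes through; the caller owns the stream
    }

    // Convenience wrapper that returns the encoded bytes for inspection.
    static byte[] toUtf8Bytes(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            writeUtf8(sink, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // "caf" plus e-acute: three 1-byte characters and one 2-byte character.
        System.out.println(toUtf8Bytes("caf\u00e9").length + " bytes"); // prints "5 bytes"
    }
}
```

The point is that the declared charset and the encoding actually used to produce
the bytes must match; the header alone does not convert anything.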

John Bollinger
jobollin@indiana.edu

Roedy Green - 25 Feb 2006 09:38 GMT
On Fri, 24 Feb 2006 22:26:53 GMT, "Shinya Koizumi"
<xxxx@smartestdesign.com> wrote, quoted or indirectly quoted someone
who said :

>Is it possible to send a Unicode document
>from a servlet to the browser so that the browser can
>tell what language the document is written in?

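It could have a BOM, which identifies the encoding (though not the
language). In the other direction, the browser could set fields in the
request header, such as Accept-Language and Accept-Charset, to give the
server a hint about what it prefers.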

See http://mindprod.com/jgloss/bom.html
http://mindprod.com/jgloss/http.html
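
A rough sketch of both hints, assuming you already have the first bytes of the
document and the raw value of an Accept-Language header as a String. The class
and method names are invented for illustration, the BOM check ignores UTF-32,
and the header parsing is deliberately simplistic.

```java
final class HintsSketch {
    // Returns the encoding named by a leading byte order mark, or null if none.
    // Assumption: UTF-32 BOMs are not considered in this sketch.
    static String detectBom(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    // Picks the language tag with the highest q value from a header value
    // such as "ja,en-US;q=0.8,en;q=0.5". Returns null if no tag is present.
    static String preferredLanguage(String header) {
        String best = null;
        double bestQ = -1.0;
        for (String part : header.split(",")) {
            String[] pieces = part.trim().split(";");
            String tag = pieces[0].trim();
            if (tag.isEmpty()) {
                continue;
            }
            double q = 1.0; // a missing q parameter means full preference
            for (int i = 1; i < pieces.length; i++) {
                String p = pieces[i].trim();
                if (p.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(p.substring(2));
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed weight as lowest preference
                    }
                }
            }
            if (q > bestQ) {
                bestQ = q;
                best = tag;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] start = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detectBom(start));                               // prints "UTF-8"
        System.out.println(preferredLanguage("ja,en-US;q=0.8,en;q=0.5"));   // prints "ja"
    }
}
```

In a servlet the header value would come from request.getHeader("Accept-Language"),
and request.getLocale() already does this kind of parsing for you.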

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Adam Maass - 27 Feb 2006 06:57 GMT
"Shinya Koizumi" <xxxx@smartestdesign.com> wrote:
> Is it possible to send a Unicode document
> from a servlet to the browser so that the browser can
> tell what language the document is written in?

Inferring a language from the characters contained in any given document is
difficult work, even for humans. One could write some heuristics to make
guesses, but those guesses will often be wrong. You don't say how much being
wrong matters for your application.
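
To give an idea of how crude such a heuristic can be, here is a minimal sketch that
only reports the dominant Unicode script of a string. It is not a real language
detector; the class and method names are invented for the example.

```java
import java.util.EnumMap;
import java.util.Map;

final class ScriptGuessSketch {
    // Returns the script that most letters in the text belong to, or null if the
    // text contains only script-neutral characters (digits, punctuation, spaces).
    static Character.UnicodeScript dominantScript(String text) {
        Map<Character.UnicodeScript, Integer> counts =
                new EnumMap<>(Character.UnicodeScript.class);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script == Character.UnicodeScript.COMMON
                    || script == Character.UnicodeScript.INHERITED
                    || script == Character.UnicodeScript.UNKNOWN) {
                continue; // these code points say nothing about the script
            }
            counts.merge(script, 1, Integer::sum);
        }
        Character.UnicodeScript best = null;
        int bestCount = 0;
        for (Map.Entry<Character.UnicodeScript, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(dominantScript("Hello, world"));                   // LATIN
        System.out.println(dominantScript("Bonjour le monde"));               // LATIN
        System.out.println(dominantScript("\u3053\u3093\u306b\u3061\u306f")); // HIRAGANA
    }
}
```

English and French both come out as LATIN, so the script alone cannot tell them
apart; anything better needs word lists or statistics, and will still guess wrong
some of the time.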

-- Adam Maass

