Java Forum / General / July 2007

convert char to codepoint

JR - 04 Jul 2007 21:18 GMT
I have some text files with Western characters in English, and
Japanese fonts in them.  I need to convert the appropriate pieces of
this file to their codepoint alternatives so it can be properly
processed.  Is there a method out there or sample code to convert a
file as mentioned kind of like the native2ascii app does in the JDK?

Thanks.

JR
Jeff Higgins - 04 Jul 2007 21:55 GMT
> I have some text files with Western characters in English, and
> Japanese fonts in them.  I need to convert the appropriate pieces of
[quoted text clipped - 3 lines]
>
> Thanks.

Maybe this can help.
<http://java.sun.com/docs/books/tutorial/i18n/text/convertintro.html>
Jeff Higgins - 05 Jul 2007 01:49 GMT
>> I have some text files with Western characters in English, and
>> Japanese fonts in them.  I need to convert the appropriate pieces of
[quoted text clipped - 6 lines]
> Maybe this can help.
> <http://java.sun.com/docs/books/tutorial/i18n/text/convertintro.html>

Learning the Japanese language.
<http://www.nihongoresources.com/general/about.html>
Greg R. Broderick - 05 Jul 2007 17:44 GMT
JR <jriker1@yahoo.com> wrote in news:1183580320.812796.115680
@w5g2000hsg.googlegroups.com:

> I have some text files with Western characters in English, and
> Japanese fonts in them.  I need to convert the appropriate pieces of
> this file to their codepoint alternatives so it can be properly
> processed.  Is there a method out there or sample code to convert a
> file as mentioned kind of like the native2ascii app does in the JDK?

First, I would recommend that you spend some time learning the difference
between character sets (e.g. Unicode), encodings (e.g. UTF-8) and fonts
(e.g. MS Mincho).  Several web pages that I've found useful for this include:

http://czyborra.com/
http://www.i18nguy.com/unicode/codepages.html
http://www.unicode.org/
http://www.faqs.org/rfcs/rfc2044.html
http://www.faqs.org/rfcs/rfc2781.html
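
To make the distinction concrete, here is a minimal sketch (the class name EncodingDemo and the sample character are illustrative assumptions, not taken from the pages above). It prints one character's Unicode code point and then the bytes that two different encodings produce for it; fonts play no part in any of this. It assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Formats a byte array as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Formats the code point at the start of the string as U+XXXX.
    static String codePointLabel(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        String s = "\u65e5"; // a single Japanese character
        System.out.println(codePointLabel(s));                       // the character set's number
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));    // bytes in one encoding
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE))); // bytes in another
    }
}
```

The code point stays the same in every case; only the byte sequence changes with the encoding chosen.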

Once you've done this, investigate the javadocs for java.lang.Character.  
There are methods in this class that will do what you want.
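
As a rough sketch of that approach, assuming the text has already been read into a String using the file's correct source encoding: the hypothetical class below walks a string by code point with String.codePointAt, Character.charCount and Character.toChars, and writes every non-ASCII character as backslash-u escapes, similar in spirit to native2ascii. The class and method names are made up for illustration, and this is not the JDK tool's actual implementation.

```java
public class CodePointEscaper {
    // Returns the text with each non-ASCII code point replaced by
    // backslash-u escapes, one escape per UTF-16 code unit.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp); // 1 for BMP, 2 for supplementary
            if (cp < 0x80) {
                out.append((char) cp);    // plain ASCII passes through
            } else {
                // Supplementary characters yield a surrogate pair here,
                // so they come out as two escapes.
                for (char unit : Character.toChars(cp)) {
                    out.append(String.format("\\u%04x", (int) unit));
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Java \u65e5\u672c\u8a9e"));
    }
}
```

With the sample string in main, this should print Java \u65e5\u672c\u8a9e. For a whole file, one option is to read it through an InputStreamReader constructed with the file's actual encoding and pass each line through escape.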

Cheers
GRB

---------------------------------------------------------------------
Greg R. Broderick                  usenet200705@blackholio.dyndns.org

A. Top posters.
Q. What is the most annoying thing on Usenet?
---------------------------------------------------------------------

Roedy Green - 05 Jul 2007 18:53 GMT
On Thu, 05 Jul 2007 11:44:00 -0500, "Greg R. Broderick"
<usenet200706@blackholio.dyndns.org> wrote, quoted or indirectly
quoted someone who said :

>http://www.faqs.org/rfcs/rfc2044.html

RFC 2044, the RFC for UTF-8, is obsolete. It was replaced by RFC 2279,
then again by RFC 3629.

Unfortunately the people who maintain RFCs never go back to insert
"OBSOLETE, replaced by RFC xxxx" on any RFCs.  Perhaps it is time to
sort this out and publish a set of properly cross-referenced RFCs.

See the student project at http://mindprod.com/projects/rfcconversion.html
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
Martin Gregorie - 05 Jul 2007 20:15 GMT
> RFC 2044, the RFC for UTF-8, is obsolete. It was replaced by RFC 2279,
> then again by RFC 3629.
>
> Unfortunately the people who maintain RFCs never go back to insert
> "OBSOLETE, replaced by RFC xxxx" on any RFCs.  Perhaps it is time to
> sort this out and publish a set of properly cross-referenced RFCs.

Use http://www.rfc-editor.org/rfcsearch.html to search for RFCs.

The search results have a column that shows which RFCs were replaced by
the target one you searched for and which have since replaced it.

martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |

Roedy Green - 05 Jul 2007 23:48 GMT
On Thu, 05 Jul 2007 20:15:00 +0100, Martin Gregorie
<martin@see.sig.for.address> wrote, quoted or indirectly quoted
someone who said :

>Use http://www.rfc-editor.org/rfcsearch.html to search for RFCs.
>
>The search results have a column that shows which RFCs were replaced by
>the target one you searched for and which have since replaced it.

Wonderful! It also shows obsoletes, updates, is obsoleted by ...

I just wish they would embed that info in the RFCs themselves in a
rigid format.

--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
Greg R. Broderick - 06 Jul 2007 18:45 GMT
> RFC 2044, the RFC for UTF-8, is obsolete. It was replaced by RFC 2279,
> then again by RFC 3629.

Thanks!  I've updated my own references (I use these links fairly frequently
myself since I'm presently involved in doing i18n of some software at
$WORKPLACE).

Cheers!
GRB

---------------------------------------------------------------------
Greg R. Broderick                  usenet200705@blackholio.dyndns.org

A. Top posters.
Q. What is the most annoying thing on Usenet?
---------------------------------------------------------------------
