...
...
I suggest mapping to a String rather than a char, to allow for e.g. two
letter expansions.
Patricia
>I suggest mapping to a String rather than a char, to allow for e.g. two
>letter expansions.
String also allows for 0-length transforms, to ignore a letter.
However, if you have no multi-char transforms the code will be faster
and considerably more compact using chars.
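
To make the String-valued mapping concrete, here is a minimal sketch, not a definitive implementation. The class name, the table entries, and the choice to pass unmapped characters through unchanged are all assumptions made for illustration; none of them come from the posts above.

```java
import java.util.HashMap;
import java.util.Map;

final class StringMapTranslator {
    // Hypothetical sample table: each input char maps to a String,
    // which may be longer than one char or empty.
    private static final Map<Character, String> TABLE = new HashMap<>();
    static {
        TABLE.put('\u00DF', "ss"); // sharp s (U+00DF): two-letter expansion
        TABLE.put('\u00E9', "e");  // e with acute (U+00E9): one-to-one
        TABLE.put('-', "");        // zero-length transform: the hyphen is ignored
    }

    static String translate(String old) {
        StringBuilder buff = new StringBuilder(old.length());
        for (int i = 0; i < old.length(); i++) {
            char c = old.charAt(i);
            String mapped = TABLE.get(c);
            if (mapped == null) {
                buff.append(c);      // assumption: unmapped chars pass through
            } else {
                buff.append(mapped); // may append 0, 1, or several chars
            }
        }
        return buff.toString();
    }
}
```

The cost Roedy mentions shows up here as one boxed map lookup per character, where a char[] table would need only an array index.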

Signature
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
Daniel Pitts - 30 Nov 2007 19:59 GMT
>> I suggest mapping to a String rather than a char, to allow for e.g. two
>> letter expansions.
>
> String also allows for 0-length transforms, to ignore a letter.
> However, if you have no multi-char transforms the code will be faster
> and considerably more compact using chars.
On the other hand, using String allows for code points that don't fit in
a single char (supplementary characters, which Java stores as a
surrogate pair of two chars).
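
A rough sketch of what handling such code points could look like, assuming the table is keyed by int code point rather than by char. The class name and the two sample entries are hypothetical; the iteration uses the standard String.codePointAt, Character.charCount, and StringBuilder.appendCodePoint methods.

```java
import java.util.HashMap;
import java.util.Map;

final class CodePointTranslator {
    // Hypothetical sample table keyed by code point, so that entries
    // outside the Basic Multilingual Plane are possible.
    private static final Map<Integer, String> TABLE = new HashMap<>();
    static {
        TABLE.put(0x1D400, "A"); // MATHEMATICAL BOLD CAPITAL A (two chars in UTF-16)
        TABLE.put(0x00E6, "ae"); // LATIN SMALL LETTER AE
    }

    static String translate(String old) {
        StringBuilder buff = new StringBuilder(old.length());
        for (int i = 0; i < old.length(); ) {
            int cp = old.codePointAt(i);  // reads a surrogate pair as one code point
            String mapped = TABLE.get(cp);
            if (mapped != null) {
                buff.append(mapped);
            } else {
                buff.appendCodePoint(cp); // assumption: unmapped input passes through
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return buff.toString();
    }
}
```

A loop over charAt would see the two halves of the surrogate pair separately and could never match the first table entry.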

Signature
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
Eric Sosman - 30 Nov 2007 23:25 GMT
>> I suggest mapping to a String rather than a char, to allow for e.g. two
>> letter expansions.
>
> String also allows for 0-length transforms, to ignore a letter.
> However, if you have no multi-char transforms the code will be faster
> and considerably more compact using chars.
... which suggests a hybrid approach: A char[] array for the
one-to-one mappings, with a special value like '\u0000' meaning
"I don't know; check for exceptional cases."
/* Untested, uncompiled, unscrutinized, un to the Nth: */
static char[] translateTable = new char[65536];
static { /* initialize it */ }
static Map<Character,String> weirdCasesMap = ...;
static { /* initialize it */ }

String translate(String old) {
    StringBuilder buff = new StringBuilder();
    for (int n = old.length(), i = 0; i < n; ++i) {
        char oldc = old.charAt(i);
        char newc = translateTable[oldc];
        if (newc != 0) {
            buff.append(newc);
        }
        else {
            String news = weirdCasesMap.get(
                Character.valueOf(oldc));
            if (news != null) // allows for deletions
                buff.append(news);
        }
    }
    return buff.toString();
}
If deletions (incoming characters that map to nothing in the
output) are common, consider using two special codes in translateTable:
one meaning "Check the map" and the other meaning "Ignore this."
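
The two-code variant might look roughly like the following sketch. The sentinel values, the class name, the identity default for the table, and the sample entries are all assumptions chosen for illustration; the only requirement is that neither sentinel is ever needed as a real output character.

```java
import java.util.HashMap;
import java.util.Map;

final class SentinelTranslator {
    // Hypothetical sentinels; assumed never to be legitimate output chars.
    static final char CHECK_MAP = '\u0000'; // "Check the map"
    static final char IGNORE = '\uFFFF';    // "Ignore this"

    static final char[] translateTable = new char[65536];
    static final Map<Character, String> weirdCasesMap = new HashMap<>();

    static {
        // Assumption: default every char to itself. As a side effect,
        // U+0000 falls through to the map and U+FFFF is dropped.
        for (int c = 0; c < translateTable.length; c++) {
            translateTable[c] = (char) c;
        }
        translateTable['\u00E9'] = 'e';       // one-to-one mapping
        translateTable['-'] = IGNORE;         // common deletion, no map lookup
        translateTable['\u00DF'] = CHECK_MAP; // rare case, handled by the map
        weirdCasesMap.put('\u00DF', "ss");
    }

    static String translate(String old) {
        StringBuilder buff = new StringBuilder(old.length());
        for (int n = old.length(), i = 0; i < n; ++i) {
            char oldc = old.charAt(i);
            char newc = translateTable[oldc];
            if (newc == IGNORE) {
                continue;                     // deletion decided by the array alone
            }
            if (newc != CHECK_MAP) {
                buff.append(newc);            // ordinary one-to-one case
                continue;
            }
            String news = weirdCasesMap.get(oldc);
            if (news != null) {               // no entry still means deletion
                buff.append(news);
            }
        }
        return buff.toString();
    }
}
```

With this layout the frequent deletions cost one array read and one comparison, and only the genuinely exceptional characters pay for boxing and a hash lookup.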

Signature
Eric.Sosman@sun.com