>> Hi,
>>
[quoted text clipped - 6 lines]
> file with every possible German word, the file would end up being several
> gigabytes.
While you're right that German words can be blended together to make bigger
words, I don't think anyone is expecting a dictionary to have every single
compound that can be made.
<Pedantic aside>I think the longest real German word I ever saw was in a
German textbook in a university course. It was something like:
Vierwaldstatterseedampfschiffgesellschaft. Despite this horrendously long
word, it is actually easy to break down:
Vier = four
wald = forest
stattersee = rest of the proper place name (the lake is the Vierwaldstättersee); "see" means "lake"
dampf = steam
schiff = ship
gesellschaft = company
If you put it all together, it meant something like:
Lake of the Four Woods Steamship Company
(There might have been another word in there too, something that meant
"travel" or "excursion" but I don't recall for sure.)
Basically, this is analogous to making "raincoat" from "rain" and "coat";
it's just that German does this more frequently than English.
</Pedantic aside>
> See http://j3e.de/ispell/igerman98/todo.html for more details and see
> http://hunspell.sourceforge.net/ for an example implementation of a German
> spellchecker.
>
> - Oliver
--
Rhino
Andrey Kuznetsov - 13 May 2006 10:03 GMT
> <Pedantic aside>I think the longest real German word I ever saw was in a
> German textbook in a university course. It was something like:
> Vierwaldstatterseedampfschiffgesellschaft. Despite this horrendously long
> word, it is actually easy to break down:
There are some words which can be split in different ways, with funny
results:
rohrohrzucker
roh|rohr|zucker:
roh=raw
rohr=cane
zucker=sugar
rohr|ohr|zucker
rohr=cane
ohr=ear
zucker=sugar
The second one does not make sense,
but how should a spellchecker know that?
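To make the ambiguity concrete, here is a rough sketch that lists every way a word can be cut into known parts. The class name CompoundSplitter and the four-word lexicon are made up for illustration; a real checker would need a full word list and some way to rank the candidates, which this does not attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class CompoundSplitter {

    /** Returns every way to cut word into parts that all appear in lexicon. */
    public static List<List<String>> splits(String word, Set<String> lexicon) {
        List<List<String>> results = new ArrayList<>();
        List<String> current = new ArrayList<>();
        collect(word, 0, lexicon, current, results);
        return results;
    }

    // Backtracking: try every known prefix starting at 'start', then recurse on the rest.
    private static void collect(String word, int start, Set<String> lexicon,
                                List<String> current, List<List<String>> results) {
        if (start == word.length()) {
            // The whole word has been consumed, so 'current' is a complete split.
            results.add(new ArrayList<>(current));
            return;
        }
        for (int end = start + 1; end <= word.length(); end++) {
            String part = word.substring(start, end);
            if (lexicon.contains(part)) {
                current.add(part);
                collect(word, end, lexicon, current, results);
                current.remove(current.size() - 1); // undo and try a longer prefix
            }
        }
    }

    public static void main(String[] args) {
        // Toy lexicon (an assumption): just the parts from the example above.
        Set<String> lexicon = Set.of("roh", "rohr", "ohr", "zucker");
        for (List<String> split : splits("rohrohrzucker", lexicon)) {
            System.out.println(String.join("|", split));
        }
    }
}
```

With this toy lexicon it finds both roh|rohr|zucker and rohr|ohr|zucker, and nothing in the lexicon itself says which one is the sensible reading.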
Andrey

Signature
http://uio.imagero.com Unified I/O for Java
http://reader.imagero.com Java image reader
http://jgui.imagero.com Java GUI components and utilities
Rhino - 13 May 2006 14:16 GMT
>> <Pedantic aside>I think the longest real German word I ever saw was in a
>> German textbook in a university course. It was something like:
[quoted text clipped - 17 lines]
> The second one does not make sense,
> but how should a spellchecker know that?
I really don't know how you can write a spellchecker to handle a case like
that :-)
I think this is just another case that a spellchecker will simply not handle
correctly. That's why spellcheckers typically have options to let the user
accept words that were flagged as errors.
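As a minimal sketch of that "accept" option, assuming nothing more than two in-memory sets (the class and method names here are invented for illustration, not taken from any real spellchecker):

```java
import java.util.HashSet;
import java.util.Set;

public class PersonalWordList {
    private final Set<String> dictionary;                  // words shipped with the checker
    private final Set<String> accepted = new HashSet<>();  // words the user has approved

    public PersonalWordList(Set<String> dictionary) {
        this.dictionary = new HashSet<>(dictionary);
    }

    /** A word passes if it is in the shipped dictionary or the user accepted it. */
    public boolean isKnown(String word) {
        return dictionary.contains(word) || accepted.contains(word);
    }

    /** Called when the user overrides a flagged word. */
    public void accept(String word) {
        accepted.add(word);
    }

    public static void main(String[] args) {
        PersonalWordList checker = new PersonalWordList(Set.of("software", "word"));
        System.out.println(checker.isKnown("Internet")); // flagged: not in the dictionary
        checker.accept("Internet");
        System.out.println(checker.isKnown("Internet")); // now passes
    }
}
```

The lookup is deliberately exact-match, since capitalization carries meaning in German; persisting the accepted words between sessions is left out.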
Let's face it: languages are constantly evolving and new words are joining
the language all the time. Fifty years ago, a spellchecker - whether a
software one or a human - would have rejected "Internet" since it wasn't a
word yet; today it is utterly commonplace and no spellchecker should ever
reject it. Any software spellchecker is always going to fail to recognize
the newest words.
It's just foolish to expect a perfect spellchecker.
--
Rhino
Thomas Fritsch - 15 May 2006 12:36 GMT
> Vierwaldstatterseedampfschiffgesellschaft.
>
> Lake of the Four Woods Steamship Company
>
> (There might have been another word in there too, something that meant
> "travel" or "excursion" but I don't recall for sure.)
Right! Actually it was
Vierwaldstätterseedampfschifffahrtgesellschaft
where
fahrt = travel
A fairly obvious extension is :-)
Vierwaldstätterseedampfschifffahrtgesellschaftskapitänsmütze
where
Kapitän = captain
Mütze = cap
==> Cap of the captain of the steamship travel company at the Four Woods
Lake Site

Signature
"Thomas:Fritsch$ops:de".replace(':','.').replace('$','@')
Thanks for the links.
But my problem is still not solved. I do not need to check compound
words! I just need some kind of "normal" German dictionary that I can
access from within my Java code. I guess the hunspell project does not
have a Java API?
So, does anybody have some more suggestions?
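If a plain lookup is all that is needed, one possible approach is to load a word list into a HashSet. This is only a sketch under some assumptions: the list is a UTF-8 text file with one word per line, the file name german-words.txt is a placeholder, and the class name WordListDictionary is invented. It does not read Hunspell or ispell affix files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WordListDictionary {
    private final Set<String> words = new HashSet<>();

    /** Builds the dictionary from lines of text, one word per line; blank lines are skipped. */
    public WordListDictionary(Collection<String> lines) {
        for (String line : lines) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
    }

    /** Loads a UTF-8 word list from disk (assumed format: one word per line). */
    public static WordListDictionary fromFile(Path file) throws IOException {
        return new WordListDictionary(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    /** Exact, case-sensitive lookup. */
    public boolean contains(String word) {
        return words.contains(word);
    }

    public int size() {
        return words.size();
    }

    public static void main(String[] args) throws IOException {
        // With a real list: WordListDictionary.fromFile(Path.of("german-words.txt"))
        WordListDictionary dict = new WordListDictionary(List.of("Zucker", "Rohr", "Schiff"));
        System.out.println(dict.contains("Zucker"));
        System.out.println(dict.contains("Zuker"));
    }
}
```

A set of a few hundred thousand words fits comfortably in memory, so this may be enough as long as compounds and inflected forms are either listed explicitly or not needed.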