Java Forum / General / May 2006

Free Dictionary German

HansHenning.Gabriel@gmail.com - 12 May 2006 20:18 GMT
Hi,

I'd like to implement a little spell-checking tool. Therefore I need some
kind of free dictionary API that contains all possible German words.
Does anybody know where to get something like this?

Thanks!
Oliver Wong - 12 May 2006 21:32 GMT
> Hi,
>
> I'd like to implement a little spell-checking tool. Therefore I need some
> kind of free dictionary API that contains all possible German words.
> Does anybody know where to get something like this?

   This is the wrong approach, as German words can be combined almost
without limit to form yet more words. If you were to actually have a
file with every possible German word, the file would end up being several
gigabytes.

   See http://j3e.de/ispell/igerman98/todo.html for more details and see
http://hunspell.sourceforge.net/ for an example implementation of a German
spellchecker.

   - Oliver
Rhino - 13 May 2006 00:31 GMT
>> Hi,
>>
[quoted text clipped - 6 lines]
> file with every possible German word, the file would end up being several
> gigabytes.

While you're right that German words can be blended together to make bigger
words, I don't think anyone is expecting a dictionary to have every single
compound that can be made.

<Pedantic aside>I think the longest real German word I ever saw was in a
German textbook in a university course. It was something like:
Vierwaldstatterseedampfschiffgesellschaft. Despite this horrendously long
word, it is actually easy to break down:

Vier = four
Waldstätter = of the forest cantons (Waldstätte = forest settlements; written with an umlaut)
See = lake (Vierwaldstättersee is the German name of Lake Lucerne)
dampf = steam
schiff = ship
gesellschaft = company

If you put it all together, it meant something like:

Lake of the Four Woods Steamship Company

(There might have been another word in there too, something that meant
"travel" or "excursion", but I don't recall for sure.)

Basically, this is analogous to making "raincoat" from "rain" and "coat";
it's just that German does this more frequently than English.

</Pedantic aside>

>    See http://j3e.de/ispell/igerman98/todo.html for more details and see
> http://hunspell.sourceforge.net/ for an example implementation of a German
> spellchecker.
>
>    - Oliver

--
Rhino
Andrey Kuznetsov - 13 May 2006 10:03 GMT
> <Pedantic aside>I think the longest real German word I ever saw was in a
> German textbook in a university course. It was something like:
> Vierwaldstatterseedampfschiffgesellschaft. Despite this horrendously long
> word, it is actually easy to break down:

There are some words which can be split in different ways, with funny
results:
rohrohrzucker

roh|rohr|zucker:
roh=raw
rohr=cane
zucker=sugar

rohr|ohr|zucker
rohr=cane
ohr=ear
zucker=sugar

The second one does not make sense,
but how should a spellchecker know that?
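
To make the ambiguity concrete, here is a minimal sketch that lists every way to cut a string into known words. The class name CompoundSplitter and the tiny word set are made up for illustration; this is not how hunspell or any other real spellchecker is implemented, and it ignores linking letters and capitalization rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only, not part of any spellchecking library.
final class CompoundSplitter {
    private final Set<String> words; // known base words, expected in lower case

    CompoundSplitter(Set<String> words) {
        this.words = words;
    }

    // Returns every segmentation of the input into known words, joined with '|'.
    List<String> splits(String compound) {
        List<String> results = new ArrayList<>();
        if (compound.isEmpty()) {
            return results; // nothing to split
        }
        collect(compound.toLowerCase(Locale.GERMAN), 0, new ArrayList<>(), results);
        return results;
    }

    // Backtracking: try every known word that starts at 'start', then recurse on the rest.
    private void collect(String text, int start, List<String> parts, List<String> results) {
        if (start == text.length()) {
            results.add(String.join("|", parts));
            return;
        }
        for (int end = start + 1; end <= text.length(); end++) {
            String candidate = text.substring(start, end);
            if (words.contains(candidate)) {
                parts.add(candidate);
                collect(text, end, parts, results);
                parts.remove(parts.size() - 1); // undo and try a longer candidate
            }
        }
    }
}
```

With the word set {roh, rohr, ohr, zucker}, splits("rohrohrzucker") returns both roh|rohr|zucker and rohr|ohr|zucker. The code can find the candidates, but picking the sensible one needs knowledge the word list does not contain.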

Andrey

Signature

http://uio.imagero.com Unified I/O for Java
http://reader.imagero.com Java image reader
http://jgui.imagero.com Java GUI components and utilities

Rhino - 13 May 2006 14:16 GMT
>> <Pedantic aside>I think the longest real German word I ever saw was in a
>> German textbook in a university course. It was something like:
[quoted text clipped - 17 lines]
> the second one does not make sense,
> but how spellchecker should know about it?

I really don't know how you can write a spellchecker to handle a case like
that :-)

I think this is just another case that a spellchecker will simply not handle
correctly. That's why spellcheckers typically have options to let the user
accept words that were flagged as errors.
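
A rough sketch of that "accept this word" option, assuming a base word list plus a separate set of user-approved words. The class PersonalDictionary and its method names are invented for this example, and the lookup is case-insensitive only to keep it short (real German spelling cares about capitalization).

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class for illustration; not taken from any existing spellchecker.
final class PersonalDictionary {
    private final Set<String> baseWords = new HashSet<>();
    private final Set<String> acceptedByUser = new HashSet<>();

    PersonalDictionary(Iterable<String> words) {
        for (String word : words) {
            baseWords.add(normalize(word));
        }
    }

    // A word is fine if it is in the shipped list or the user has approved it.
    boolean isKnown(String word) {
        String key = normalize(word);
        return baseWords.contains(key) || acceptedByUser.contains(key);
    }

    // Called when the user chooses to accept a word that was flagged as an error.
    void accept(String word) {
        acceptedByUser.add(normalize(word));
    }

    // Simplification: compare words case-insensitively.
    private static String normalize(String word) {
        return word.trim().toLowerCase(Locale.GERMAN);
    }
}
```

A word like "Internet" would be flagged on first sight and pass every check after the user has accepted it once.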

Let's face it: languages are constantly evolving and new words are joining
the language all the time. Fifty years ago, a spellchecker - whether a
software one or a human - would have rejected "Internet" since it wasn't a
word yet; today it is utterly commonplace and no spellchecker should ever
reject it. Any software spellchecker is always going to fail to recognize
the newest words.

It's just foolish to expect a perfect spellchecker.

--
Rhino
Thomas Fritsch - 15 May 2006 12:36 GMT
> Vierwaldstatterseedampfschiffgesellschaft.
>
> Lake of the Four Woods Steamship Company
>
> (There might have been another word in there too, something that meant
> "travel" or "excursion", but I don't recall for sure.)
Right! Actually it was
  Vierwaldstätterseedampfschifffahrtgesellschaft
where
  fahrt = travel

A fairly obvious extension is :-)
  Vierwaldstätterseedampfschifffahrtgesellschaftskapitänsmütze
where
  Kapitän = captain
  Mütze = cap
==> Cap of the captain of the steamship travel company at the Four Woods
Lake Site

Signature

"Thomas:Fritsch$ops:de".replace(':','.').replace('$','@')

HansHenning.Gabriel@gmail.com - 13 May 2006 17:44 GMT
Thanks for the links.
But my problem is still not solved. I do not need to check compound
words! I just need some kind of "normal" German dictionary that I can
access from within my Java code! I guess the hunspell project does not
have a Java API?!

So, does anybody have some more suggestions?
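
For the plain lookup described here, one possible sketch is to read a word list with one word per line into a HashSet. The class WordList and the one-word-per-line format are assumptions for this example; the thread does not name a concrete dictionary file, and the affix-based dictionaries used by ispell or hunspell would need extra expansion before they fit this format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class for illustration; assumes a text source with one word per line.
final class WordList {
    private final Set<String> words = new HashSet<>();

    // Reads all non-blank lines from the source and closes it afterwards.
    static WordList load(Reader source) {
        WordList list = new WordList();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    list.words.add(word.toLowerCase(Locale.GERMAN));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return list;
    }

    // Case-insensitive lookup, a simplification for this sketch.
    boolean contains(String word) {
        return words.contains(word.trim().toLowerCase(Locale.GERMAN));
    }

    int size() {
        return words.size();
    }
}
```

Given a UTF-8 file, it could be loaded with something like WordList.load(Files.newBufferedReader(Paths.get("german.txt"), StandardCharsets.UTF_8)), where german.txt is a placeholder name for whatever word list you end up using.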
