How is this "one file" organized? Is it a simple text file (I guess not
or else you wouldn't have posted)? Is it tightly integrated into the
application that uses it? Is it a compressed file? Is it a bit file? A
byte file?
If I were you, I'd first put that file on my Linux box, open it with
Java and print out the first 5 kB and see what comes out.
Play around with it and it might make sense.
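A minimal sketch of that first look, assuming Java 9 or later and that the file path is passed as the first argument. The class name HexPeek and the 16-bytes-per-row layout are just illustrative choices, not anything the dictionary format requires:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexPeek {
    /** Formats the first `length` bytes as rows of: offset, 16 hex bytes, printable ASCII. */
    static String hexDump(byte[] data, int length) {
        StringBuilder out = new StringBuilder();
        for (int row = 0; row < length; row += 16) {
            out.append(String.format("%08x  ", row));
            StringBuilder ascii = new StringBuilder();
            for (int i = row; i < row + 16; i++) {
                if (i < length) {
                    int b = data[i] & 0xff;               // treat the byte as unsigned
                    out.append(String.format("%02x ", b));
                    ascii.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
                } else {
                    out.append("   ");                    // pad a short final row
                }
            }
            out.append(' ').append(ascii).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] buf = new byte[5 * 1024];                  // only the first 5 kB
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.readNBytes(buf, 0, buf.length);        // Java 9+; may return fewer bytes
        }
        System.out.print(hexDump(buf, n));
    }
}
```

Things worth looking for in the output: a recognizable magic number at offset 0, readable words, or long stretches that look random (which usually hints at compression or encryption).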
Sashi
> hi,
>
[quoted text clipped - 3 lines]
> idea where to start from. The file itself is about 65MB. What is the way to
> understand the file structure in order to read valid data.
I'd use good old hexdump and analyze the file format.
Then, extract its pure text part.
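If the text really is stored uncompressed, something along these lines would pull it out, much like the Unix strings tool does. This is only a sketch: it assumes plain ASCII, so a dictionary stored in another encoding (UTF-16 or a Windows code page, say) would need a different test for "printable", and the names TextRuns and printableRuns are made up for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class TextRuns {
    /** Returns every run of printable ASCII bytes that is at least minLen long. */
    static List<String> printableRuns(byte[] data, int minLen) {
        List<String> runs = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (byte raw : data) {
            int b = raw & 0xff;
            if (b >= 0x20 && b < 0x7f) {
                cur.append((char) b);                 // still inside a text run
            } else {
                if (cur.length() >= minLen) {
                    runs.add(cur.toString());         // run ended and is long enough
                }
                cur.setLength(0);
            }
        }
        if (cur.length() >= minLen) {
            runs.add(cur.toString());                 // run that reaches end of data
        }
        return runs;
    }

    public static void main(String[] args) throws IOException {
        // A 65 MB file fits in memory comfortably on most machines.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String run : printableRuns(data, 4)) {
            System.out.println(run);
        }
    }
}
```

If this prints next to nothing, that is a sign the data is compressed or encoded rather than stored as plain text.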
On Tue, 17 Jan 2006 23:13:17 +0100, "Grzegorz Stasica"
<gstasica@poczta.onet.pl> wrote, quoted or indirectly quoted someone
who said :
>I've dictionary which apparently works only on Win. I'd like to use its
>database (one file) and write some application so I could use it from linux
>as well. The problem is that I've never "hacked" these things and have no
>idea where to start from. The file itself is about 65MB. What is the way to
>understand the file structure in order to read valid data.
Dictionaries typically use some quite sophisticated compression
techniques. Basically they are going to try to avoid storing a
duplicate of the string
"responsibil"
in responsible
and responsibility.
If you have software to build custom dictionaries, that will make it
easier to decode, since you can build dictionaries with only one or
two words to study.
Unlike ordinary compression, dictionaries have to be used in
compressed form.
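To illustrate the idea of not storing a shared stem twice, here is a sketch of front coding (prefix compression), one common scheme of this kind. Whether this particular dictionary uses it is purely an assumption; the class and field names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

class FrontCoding {
    /** One entry: how many leading chars to reuse from the previous word, plus the new tail. */
    static final class Entry {
        final int shared;
        final String suffix;

        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    /** Encodes a sorted word list so each word stores only what differs from its predecessor. */
    static List<Entry> encode(List<String> sortedWords) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String w : sortedWords) {
            int max = Math.min(prev.length(), w.length());
            int k = 0;
            while (k < max && prev.charAt(k) == w.charAt(k)) {
                k++;                                   // length of the common prefix
            }
            out.add(new Entry(k, w.substring(k)));
            prev = w;
        }
        return out;
    }

    /** Rebuilds the words by walking the entries in order. */
    static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            String w = prev.substring(0, e.shared) + e.suffix;
            out.add(w);
            prev = w;
        }
        return out;
    }

    public static void main(String[] args) {
        // Sorted order: "responsibility" comes before "responsible".
        List<String> words = List.of("responsibility", "responsible");
        for (Entry e : encode(words)) {
            System.out.println(e.shared + " + \"" + e.suffix + "\"");
        }
    }
}
```

Decoding any one word means walking forward from the start of its block, which is why real formats of this type tend to restart the prefix every so many entries so that a lookup does not have to unpack everything.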

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.
Gordon Beaton - 18 Jan 2006 07:49 GMT
> Basically they are going to try to avoid storing a
> duplicate of the string
> "responsibil"
> in responsible
> and responsibility.
I would certainly hope that any dictionary I use doesn't assume that
"responsibil" is part of "responsible", regardless of how they choose
to store the words.
/gordon

[ do not email me copies of your followups ]
g o r d o n + n e w s @ b a l d e r 1 3 . s e