Java Forum / General / September 2007
Properly encoding "Project Gutenburg 1913 Webster Unabridged Dictionary".
Daniel Pitts - 20 Sep 2007 06:03 GMT So, I've spent all day working on this. Funfun...
Back story: Project Gutenburg creates free ebooks from content that is now in the public domain, including the "1913 Webster Unabridged Dictionary". The problem with this particular work (pgw050*.txt) is that it uses a very "odd" character set and an almost-XML markup (it may be valid SGML, but I wouldn't bank on it).
It's part DOS extended ASCII, plus some proprietary character codes.
My goal: I'd like to get this into a form that is easily processed by a program. I think the best way to do this is to put it into a robust XML format. This would involve cleaning up the markup so that it is valid XML, as well as processing some of the character codes into nicer forms. I've already written a program that will read the original texts and re-encode the files as UTF-8, using appropriate character substitution when possible.
At this point, I'm not sure if I'd be better off converting their custom "entities" into the equivalent UTF-8 encoded characters, or if it would be better to convert all entities and non-standard characters into some sort of XML encoded entities.
Anyone have suggestions on what would be the most useful way to go?
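For readers who want something concrete, here is a minimal sketch of the kind of byte-by-byte substitution described above. It is not the poster's program. The class name, the table contents, and the "[?XX]" marker for unmapped bytes are all assumptions made for illustration; the only entry taken from the thread is that code 216 (0xD8) stands for the double vertical bar, and mapping it to U+2016 is one possible choice.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class WebsterRecoder {
    // Hypothetical substitution table: source byte value -> replacement text.
    private static final Map<Integer, String> SUBSTITUTIONS = new HashMap<>();
    static {
        SUBSTITUTIONS.put(0xD8, "\u2016"); // double vertical bar (assumed target: U+2016)
        SUBSTITUTIONS.put(0x82, "\u00E9"); // e-acute, as in DOS code page 437
    }

    // Turns raw source bytes into a Java string, one byte at a time.
    static String decode(byte[] raw) {
        StringBuilder out = new StringBuilder(raw.length);
        for (byte b : raw) {
            int code = b & 0xFF; // treat the byte as unsigned
            String replacement = SUBSTITUTIONS.get(code);
            if (replacement != null) {
                out.append(replacement);
            } else if (code < 0x80) {
                out.append((char) code); // plain ASCII passes through unchanged
            } else {
                // Unknown high byte: keep it visible instead of guessing.
                out.append(String.format("[?%02X]", code));
            }
        }
        return out.toString();
    }

    // Re-encodes the decoded text as UTF-8 bytes.
    static byte[] recode(byte[] raw) {
        return decode(raw).getBytes(StandardCharsets.UTF_8);
    }
}
```

Leaving unmapped bytes visible makes it easy to grep the output for codes that still need a decision.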
Hunter Gratzner - 20 Sep 2007 09:14 GMT > So, I've spent all day working on this. Funfun... > > Back story: Project Gutenburg It's Gutenberg, not Gutenburg.
> create free ebooks from content that is > now in the public domain, including the "1913 Webster Unabridged > Dictionary". The problem with this particular work (pgw050*.txt), is Thanks for not providing a link to the file, so we are saved from having to have a look at it.
Daniel Pitts - 20 Sep 2007 15:25 GMT > > So, I've spent all day working on this. Funfun... > > > Back story: Project Gutenburg > > It's Gutenberg, not Gutenburg. I actually knew that, but my fingers decided to do what they wanted, not what I wanted :-)
> > create free ebooks from content that is > > now in the public domain, including the "1913 Webster Unabridged > > Dictionary". The problem with this particular work (pgw050*.txt), is > > Thanks for not providing a link to the file, so we are saved from > having to have a look at it. Ah, indeed.
Thanks for the constructive response.
Jeff Higgins provided the link in a reply: <http://www.gutenberg.org/dirs/etext96/pgw050ab.txt> Thanks Jeff!
Thanks, Daniel.
Jeff Higgins - 20 Sep 2007 14:39 GMT > So, I've spent all day working on this. Funfun... > [quoted text clipped - 15 lines] > original texts, and re-encode the files as UTF-8, using appropriate > character substitution when possible. Whew. After a quick read of webfont.asc and tagset.web I can feel your pain. I think the main problem here is that the typesetter's /style/ conveys so much information. For instance:
216 d8 Ø <par/ double vertical bar (short length; the long length is the graphics character 186) This precedes words marked with a double vertical bar in the original dictionary, signifying that the word was adopted directly into English without modification of the spelling.
For myself, I suppose the question would be: Do I want my /program/ to understand and/or act upon the fact that character code 0xd8 signifies the above, or is it strictly for a /human/ reader's consumption? If the former, an XML tag would probably be appropriate; if the latter, maybe an appropriate glyph is sufficient.
<http://www.gutenberg.org/dirs/etext96/pgw050ab.txt>
> At this point, I'm not sure if I'd be better off converting their > custom "entities" into the equivalent UTF-8 encoded characters, or if > it would be better to convert all entities and non-standard characters > into some sort of XML encoded entities. > > Anyone have suggestions on what would be the most useful way to go?
Jeff Higgins - 20 Sep 2007 21:36 GMT >> So, I've spent all day working on this. Funfun... >> [quoted text clipped - 3 lines] >> that it uses a very "odd" character set, and an almost-xml markup (it >> may be valid SGML, but I wouldn't bank on it) Another thought strikes me. Have you looked at any of the many "dictionary markup" languages already out there? Have you seen the GNU CIDE? http://www.ibiblio.org/webster/
Daniel Pitts - 20 Sep 2007 22:23 GMT > >> So, I've spent all day working on this. Funfun... > [quoted text clipped - 7 lines] > "dictionary markup" languages already out there? Have you seen > the GNU CIDE? http://www.ibiblio.org/webster/ Heh, same source material, but it looks like more care was taken in the translation to *machine readable* format. I'll check it out. Thanks for the pointer. (Searching for Public Domain Dictionary doesn't turn up as many relevant hits as it should :-) )
Daniel Pitts - 20 Sep 2007 21:43 GMT > > So, I've spent all day working on this. Funfun... > [quoted text clipped - 32 lines] > consumption? If the former probably an XML tag would be appropriate, > if the latter maybe an appropriate glyph is sufficient. Thanks for the reply. My main goal is to retain as much semantic meaning as possible for the program to understand. So if I understand your point, I should convert it to XML tags to maintain that information...
This brings up a related point. In XML, can "&blah;" entities have semantic meaning associated with them? Or are they only replacements for otherwise difficult-to-represent characters? That makes a difference between using &directlyAdopted; and <directly-adopted/>
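On the entity-versus-element question, a small experiment may help. In XML, a general entity is replacement text: once the parser expands it, what remains is ordinary character data, while an element survives as a node a program can query. The sketch below shows this with the JDK's DOM parser. The class name, the sample headword "ennui", and the "||" replacement text are assumptions for illustration; the entity and element names are the ones floated in the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class EntityVersusElement {
    // Variant 1: the marker is an entity declared in an internal DTD subset.
    static final String ENTITY_FORM =
        "<!DOCTYPE entry [<!ENTITY directlyAdopted \"||\">]>"
        + "<entry>&directlyAdopted;ennui</entry>";

    // Variant 2: the marker is an empty element.
    static final String ELEMENT_FORM =
        "<entry><directly-adopted/>ennui</entry>";

    // Parses the sample and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // True if the entry still carries a queryable marker node after parsing.
    static boolean hasMarker(Element entry) {
        return entry.getElementsByTagName("directly-adopted").getLength() > 0;
    }
}
```

With the entity form, the text content becomes "||ennui" and hasMarker returns false; with the element form, the text stays "ennui" and hasMarker returns true. That suggests the element is the better carrier when a program needs to act on the information.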
> <http://www.gutenberg.org/dirs/etext96/pgw050ab.txt> > [quoted text clipped - 4 lines] > > > Anyone have suggestions on what would be the most useful way to go? Thanks, Daniel.
Jeff Higgins - 20 Sep 2007 22:11 GMT > Daniel Pitts wrote: > > So, I've spent all day working on this. Funfun... [quoted text clipped - 33 lines] > consumption? If the former probably an XML tag would be appropriate, > if the latter maybe an appropriate glyph is sufficient. > Thanks for the reply. My main goal is to retain as much semantic meaning as possible for the program to understand. So if I understand your point, I should convert it to XML tags to maintain that information...
> This brings up a related point. In XML, can "&blah;" entities have semantic meaning associated with them? Or are they only replacements for otherwise difficult-to-represent characters? That makes a difference between using &directlyAdopted; and <directly-adopted/>
Well, if you're asking me personally, I'd have to say I'm no XML expert and that the best I could do is to point you to the appropriate part of the spec, sorry.
<http://www.w3.org/TR/2006/REC-xml-20060816/#sec-physical-struct>
> <http://www.gutenberg.org/dirs/etext96/pgw050ab.txt> > [quoted text clipped - 4 lines] > > > Anyone have suggestions on what would be the most useful way to go? Thanks, Daniel.
Roedy Green - 20 Sep 2007 18:24 GMT On Thu, 20 Sep 2007 05:03:36 -0000, Daniel Pitts <googlegroupie@coloraura.com> wrote, quoted or indirectly quoted someone who said :
>At this point, I'm not sure if I'd be better off converting their >custom "entities" into the equivalent UTF-8 encoded characters, or if >it would be better to convert all entities and non-standard characters >into some sort of XML encoded entities. Perhaps the way to go is to devise a font that renders these odd characters correctly. Then the text could be easily manipulated programmatically with tiny mods to existing software. Then you could even publish it as a PDF document.
Your problem then becomes political, talking some skilled type designer into donating her skills in return for some exposure.
 Signature Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Roedy Green - 20 Sep 2007 18:29 GMT On Thu, 20 Sep 2007 17:24:26 GMT, Roedy Green <see_website@mindprod.com.invalid> wrote, quoted or indirectly quoted someone who said :
>Your problem then becomes political, talking some skilled type >designer into donating her skills in return for some exposure. If you have some high res scans of the original text, your job is not designing a font, but the much easier job of "stealing" the font from the original samples. I looked into a similar problem circa 1990 to "steal" Chinese fonts from hand painted fonts on mechanical optical typesetters. The tools were primitive -- interactively defining Bezier curves with Adobe tools.
There are people who will create you a font from a sample of your handwriting or printing for a nominal charge. Perhaps one of them has the tools and skills to solve your problem.
RedGrittyBrick - 21 Sep 2007 10:43 GMT > On Thu, 20 Sep 2007 05:03:36 -0000, Daniel Pitts > <googlegroupie@coloraura.com> wrote, quoted or indirectly quoted [quoted text clipped - 12 lines] > Your problem then becomes political, talking some skilled type > designer into donating her skills in return for some exposure. The purpose of a dictionary is semantic. The actual glyphs are comparatively unimportant. The intellectual accomplishment does not lie mainly in the choice of symbols.
If you want to reproduce the beautiful typography of the original, use high quality image scans.
Otherwise I'd translate the glyphs to something semantically or visually close in the Unicode character set.
I think I'd try for a purely semantic markup in XML. Then create a stylesheet that would render it in XHTML (say) and which would introduce glyphs and fonts as close to the original as possible. That way, if Unicode ever gets extended to include some of the odd characters used in the original, you only have to amend the stylesheet.
So I'd represent the "double vertical bar" as an attribute of a tag. e.g. <word spelling="adopted"> The stylesheet could insert a glyph visually close to "double vertical bar".
In particular, I'd translate markup like "<universbold>" into <exposition> or <shape-description> or something. I'm pretty sure Webster didn't compose his dictionary with LaserJet fonts in mind :-)
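The attribute-plus-stylesheet idea above can be tried out with the XSLT support that ships with the JDK. This is only a sketch: the class name, the sample headwords, the choice of plain-text output, and the use of U+2016 as the "visually close" glyph are assumptions, and a real stylesheet would presumably target XHTML as suggested.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AdoptedWordStylesheet {
    // Illustrative stylesheet: the glyph lives here, not in the data.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // Words flagged as adopted get the double vertical bar in front.
        + "<xsl:template match=\"word[@spelling='adopted']\">"
        + "<xsl:text>&#x2016;</xsl:text><xsl:value-of select='.'/>"
        + "</xsl:template>"
        // All other words are rendered as-is.
        + "<xsl:template match='word'><xsl:value-of select='.'/></xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to a small XML fragment and returns the text output.
    static String render(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

Calling render("<word spelling='adopted'>ennui</word>") puts the glyph in front of the headword, while an unflagged word passes through untouched. Swapping the glyph later means editing one template and leaving the dictionary data alone.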
Daniel Pitts - 21 Sep 2007 16:31 GMT On Sep 21, 2:43 am, RedGrittyBrick <redgrittybr...@spamweary.foo> wrote:
> > On Thu, 20 Sep 2007 05:03:36 -0000, Daniel Pitts > > <googlegrou...@coloraura.com> wrote, quoted or indirectly quoted [quoted text clipped - 36 lines] > <exposition> or <shape-description> or something. I'm pretty sure > Webster didn't compose his dictionary with LaserJet fonts in mind :-) Heh. He probably was using a BubbleJet :-)
But seriously. I'd like to keep the original intent (the transcriber's, not necessarily Webster's), and then in a later stage of the processing, convert it to the more semantic meaning, and probably ignore the rendering of that information. My personal use case actually only cares about the relationships between words, and the part of speech. For instance, I'd like to be able to recognize Ran, Run, and Runs as different tenses of the same word, and Leaf/Leaves as different inflections of the same word.
Actually, that's not quite my "ultimate" goal. The ultimate goal is to create an English Imperative Sentence parser to use in a text adventure game. I just figured I might as well do something useful for the community while I'm at it (in this case, semanticize the dictionary). Although it appears that gcide_xml may have done what I wanted to do already.
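The word-relationship use case described above (treating Ran, Run, and Runs as forms of one word) could be prototyped with a simple lookup from inflected form to headword. The sketch below is an assumption about one possible shape for that index, not anything taken from the dictionary files; the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class InflectionIndex {
    // Maps every known form (lower-cased) to its headword.
    private final Map<String, String> headwordOf = new HashMap<>();

    // Registers a headword together with its inflected forms.
    void add(String headword, String... forms) {
        String head = key(headword);
        headwordOf.put(head, head);
        for (String form : forms) {
            headwordOf.put(key(form), head);
        }
    }

    // Returns the headword for a form, or null if the form is unknown.
    String headword(String form) {
        return headwordOf.get(key(form));
    }

    // True if both forms are known and share a headword.
    boolean sameWord(String a, String b) {
        String head = headword(a);
        return head != null && head.equals(headword(b));
    }

    private static String key(String s) {
        return s.toLowerCase(Locale.ROOT);
    }
}
```

A one-to-one map is a deliberate simplification: "leaves" is a form of both "leaf" and "leave", so a fuller version would probably map each form to a set of headwords, each tagged with a part of speech.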
John W. Kennedy - 22 Sep 2007 04:10 GMT > Actually, thats not quite my "ultimate" goal. The ultimate goal is to > create an English Imperative Sentence parser to use in a text > adventure game. I cannot find that you have ever participated in rec.arts.int-fiction. Assuming this to be true, then it is highly likely you have no idea of what you are getting into. Most fundamentally, you can't do a useful I-F parser (assuming that, by "parser", you mean more than a mere lexer) unless it is integrated with the world model. And you're also going to have to create a descriptive language and a compiler for it.
Please study Inform 6, Inform 7 (they are completely different), TADS 2, TADS 3, Hugo, and Adrift, and then see if A) you really have anything new to contribute to the state of the art, and B) you have the time to produce it. I would estimate that any new system offering a significant improvement on existing tools should take about ten man-years to do from scratch. You'll also probably need at least two collaborators, a test writer, and a documentation writer. At a minimum, don't try to create your own tests; you need a dedicated adversary, because this problem domain is rife with edge and corner cases.
 Signature John W. Kennedy "The whole modern world has divided itself into Conservatives and Progressives. The business of Progressives is to go on making mistakes. The business of the Conservatives is to prevent the mistakes from being corrected." -- G. K. Chesterton
Daniel Pitts - 22 Sep 2007 05:26 GMT > > Actually, thats not quite my "ultimate" goal. The ultimate goal is to > > create an English Imperative Sentence parser to use in a text > > adventure game. > > I cannot find that you have ever participated in rec.arts.int-fiction. Indeed, I have not.
> Assuming this to be true, then it is highly likely you have no idea of > what you are getting into. Most fundamentally, you can't do a useful I-F > parser (assuming that, by "parser", you mean more than a mere lexer) > unless it is integrated with the world model. And you're also going to > have to create a descriptive language and a compiler for it. Actually, my plan is to describe the world model with Java objects (hence this being a Java group)
> Please study Inform 6, Inform 7 (they are completely different), TADS 2, > TADS 3, Hugo, and Adrift, and then see if A) you really have anything > new to contribute to the state of the art, and B) you have the time to > produce it. A) If I don't have anything worthwhile to contribute, at least I'll have gained knowledge. This isn't about bettering existing tools and platforms, but about bettering myself. I will take a look at those you suggested, but I'll probably continue on with my project anyway. I do have *some* experience working on a Lima M.U.D.
> I would estimate that any new system offering a significant > improvement on existing tools should take about ten man-years to do from > scratch. You'll also probably need at least two collaborators, a test > writer, and a documentation writer. At a minimum, don't try to create > your own tests; you need a dedicated adversary, because this problem > domain is rife with edge and corner cases. Agreed. The part that I find the most difficult to model, parse, and query is the complex relationships that can occur amongst several objects. It's easy enough to say that a bowl is on a table, but what about an apple between the banana and the orange in the bowl on the wooden table?
Every journey starts with but a footstep. It may take 10 man years to complete, but if I don't start on my own, I'll never know. I'm 26, so if this is a project that takes me until I'm 36, I'll still be young enough to enjoy the results. In any case, if this DOES get to a point where I think it might become something useful to the community, I'm sure I will be able to find plenty of collaborators.
Thanks for the pointers both to the existing projects, and to the raif group. I'm sure I will find it invaluable as I go on.
Cheers, Daniel.
Patricia Shanahan - 22 Sep 2007 06:10 GMT ...
> Agreed. The part that I find the most difficult to model, parse, and > query is the complex relationships that can occur amongst several > objects. It's easy enough to say that a bowl in on a table, but what > about an apple between the banana and the orange in the bowl on the > wooden table. I think there are far more basic issues. Here's a classic example of the context-sensitivity of the English language: "Time flies like an arrow.".
If it is advice from a senior researcher to a junior researcher in an entomology lab, "time" is a verb, "flies" is a noun, and "like an arrow" modifies how to go about timing flies.
If it is a comment on how fast time seems to go by, "time" is a noun, "flies" is a verb, and "like an arrow" modifies how time flies.
Patricia
RedGrittyBrick - 22 Sep 2007 12:56 GMT > ... >> Agreed. The part that I find the most difficult to model, parse, and [quoted text clipped - 12 lines] > If it is a comment on how fast time seems to go by, "time" is a noun, > "flies" is a verb, and "like an arrow" modifies how time flies. Time flies like an arrow. Fruit flies like a banana. - Groucho Marx
 Signature RGB
Daniel Pitts - 22 Sep 2007 18:32 GMT > ... > [quoted text clipped - 15 lines] > > Patricia I actually have a plan on how to handle context, but that particular sentence is not imperative in the second sense that you provided. Since I'm narrowing the scope of sentence types down to imperative, that helps eliminate _some_ ambiguous situations. Indeed, most languages (including programming) are somewhat sensitive to context.
For example, the Java "sentence": s+=10;
could mean "Increase the int 's' by 10.", or "append '10' to the String 's'". It could even be an error if "s" isn't numeric or a String.
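A compilable illustration of the two readings, added here as a sketch; the class and method names are made up. The same statement text is resolved purely by the declared type of s.

```java
class PlusEqualsContext {
    // With an int, += performs arithmetic addition.
    static int addToInt() {
        int s = 5;
        s += 10;
        return s; // 15
    }

    // With a String, += performs concatenation.
    static String appendToString() {
        String s = "5";
        s += 10;
        return s; // "510"
    }

    // With a boolean (for example) the same statement does not compile:
    //     boolean s = true;
    //     s += 10;   // compile-time error: bad operand types
}
```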
The only reason that isn't considered a problem in Java is that it's "easy" to determine the context of a statement (scoping rules are specific and well-defined). On the other hand, "Get the other key" depends on context that would be harder to model in a computer. Especially after a few interactions...
"You see a red key and a blue key."
Look at the red key
"The key is red."
Look at the other key
"The other key is blue."
Get the other key. <-- Does "other" point to the other other key, or to the original other key?
It's been my experience with interactive fictions that the sentence interpreters tend to need you to be very specific. I'm sure there are some out there that have forms of context handling, but I want to experiment on my own to see how I would go about it.
Originally, I think contextual information will have to be provided by the world-view designer, with a little help about the "obvious" context. Eventually, if the imperative sentence parser becomes good enough, I would consider expanding the scope of it so that the parser understood other types of sentences, and could glean information about the current context simply by the descriptions involved.
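The "other key" exchange above can be made concrete with a tiny referent tracker. The sketch below implements just one of the two readings: "other" always means "whichever of the two salient objects was not mentioned most recently", so a second "other" flips back to the red key. The class name and this flip policy are assumptions chosen for illustration; the opposite policy (keep pointing at the blue key) is equally defensible, which is exactly the ambiguity raised in the post.

```java
import java.util.List;

class OtherResolver {
    private final List<String> pair;  // the two objects currently in focus
    private String lastMentioned;     // most recent referent, or null

    OtherResolver(String first, String second) {
        this.pair = List.of(first, second);
    }

    // Records an explicit mention such as "the red key".
    String mention(String name) {
        lastMentioned = name;
        return name;
    }

    // Resolves "the other ..." relative to the most recent referent,
    // then makes the result the new most recent referent.
    String other() {
        if (lastMentioned == null) {
            throw new IllegalStateException("nothing has been mentioned yet");
        }
        String result = pair.get(0).equals(lastMentioned) ? pair.get(1) : pair.get(0);
        lastMentioned = result;
        return result;
    }
}
```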
John W. Kennedy - 22 Sep 2007 20:57 GMT > .... >> Agreed. The part that I find the most difficult to model, parse, and >> query is the complex relationships that can occur amongst several >> objects. It's easy enough to say that a bowl in on a table, but what >> about an apple between the banana and the orange in the bowl on the >> wooden table.
> I think there are far more basic issues. Here's a classic example of the > context-sensitivity of the English language: "Time flies like an arrow.".
> If it is advice from a senior researcher to a junior researcher in an > entymology lab, "time" is a verb, "flies" is a noun, and "like an arrow" > modifies how to go about timing flies.
> If it is a comment on how fast time seems to go by, "time" is a noun, > "flies" is a verb, and "like an arrow" modifies how time flies. And if it is an observation by a surrealist, "time" is an adjective, "flies" is a noun, "like" is a verb, and "an arrow" is the direct object.
Here's a worse one: "It's a pretty little girls school". I count six parsings.
 Signature John W. Kennedy "I want everybody to be smart. As smart as they can be. A world of ignorant people is too dangerous to live in." -- Garson Kanin. "Born Yesterday"
Stefan Ram - 22 Sep 2007 21:23 GMT >And if it is an observation by an surrealist, "time" is an adjective, >"flies" is a noun, "like" is a verb, and "an arrow" is the direct object. »[I]n an analysis of a set of 891 sentences ranging in length from 1 to 25 words, a team led by Kathryn Baker found an average of 27 possible ways to parse each sentence.«
http://scienceblogs.com/cognitivedaily/2006/12/machine_translation_taking_a_q.php
»"Time flies like an arrow" --
1. Time proceeds as quickly as an arrow proceeds. (the intended reading)
2. Measure the speed of flies in the same way that you measure the speed of an arrow.
3. Measure the speed of flies in the same way that an arrow measures the speed of flies.
4. Measure the speed of flies that resemble an arrow.
5. Flies of a particular kind, time-flies, are fond of an arrow.«
»The Language Instinct«, Steven Pinker
Lew - 22 Sep 2007 21:24 GMT > Here's a worse one: "It's a pretty little girls school". I count six > parsings. I trust none of them involve the possessive of "girl", singular or plural. That would involve the appropriate placement of apostrophe.
 Signature Lew
John W. Kennedy - 22 Sep 2007 22:29 GMT >> Here's a worse one: "It's a pretty little girls school". I count six >> parsings. > > I trust none of them involve the possessive of "girl", singular or > plural. That would involve the appropriate placement of apostrophe. No, I'm not counting that; if we were looking at the spoken form, however, we could, which would give even more readings.
 Signature John W. Kennedy "But now is a new thing which is very old-- that the rich make themselves richer and not poorer, which is the true Gospel, for the poor's sake." -- Charles Williams. "Judgement at Chelmsford"
Lew - 22 Sep 2007 22:45 GMT >>> Here's a worse one: "It's a pretty little girls school". I count six >>> parsings. [quoted text clipped - 4 lines] > No, I'm not counting that; if we were looking at the spoken form, > however, we could, which would give even more readings. <http://www.phrases.org.uk/bulletin_board/48/messages/808.html>
Given the messiness of human input, one might well have to disregard niceties of punctuation to arrive at the intended input.
 Signature Lew "The world needs a computer that does what we want instead of what we tell it to do."
John W. Kennedy - 22 Sep 2007 20:42 GMT > Agreed. The part that I find the most difficult to model, parse, and > query is the complex relationships that can occur amongst several > objects. It's easy enough to say that a bowl in on a table, but what > about an apple between the banana and the orange in the bowl on the > wooden table. You're still looking at the purely linguistic problems. But there's more to it than that. For example, what about a cabinet with a closed door, but which also has a flat surface on top? What if the door is made of glass? What if it's made of smoky glass, but there's a switch that can turn on an interior light? All these things have to be handled by the world model, but -- they also drag in your parser's disambiguator.
 Signature John W. Kennedy "Sweet, was Christ crucified to create this chat?" -- Charles Williams. "Judgement at Chelmsford"
Daniel Pitts - 23 Sep 2007 01:17 GMT > > Agreed. The part that I find the most difficult to model, parse, and > > query is the complex relationships that can occur amongst several [quoted text clipped - 13 lines] > "Sweet, was Christ crucified to create this chat?" > -- Charles Williams. "Judgement at Chelmsford" Actually, the parser can give a set of all possible parsings, and the model could determine which makes the most sense based on the current context.
Yes, the world model is an important part of the interactive fiction. It's also the easier part to handle in my opinion. The reason it's easier is that you can limit the world model in ways that you can't limit what the human will type (without giving them an express set of allowable inputs). When you come across an ambiguous statement, you can do one of several things, including asking for clarification or making a best guess based on current context.
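A rough sketch of the approach described above: the parser hands over every candidate reading, a scoring function supplied by the world model ranks them, and the game either acts on a single winner or asks for clarification on a tie. All names, the scoring interface, and the response strings are assumptions for illustration, not the poster's design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

class Disambiguator {
    // Returns every candidate that shares the highest plausibility score.
    // One result means "decided"; several mean the player must be asked.
    static <T> List<T> best(List<T> candidates, ToDoubleFunction<T> plausibility) {
        List<T> top = new ArrayList<>();
        double bestScore = Double.NEGATIVE_INFINITY;
        for (T candidate : candidates) {
            double score = plausibility.applyAsDouble(candidate);
            if (score > bestScore) {
                bestScore = score;
                top.clear();
                top.add(candidate);
            } else if (score == bestScore) {
                top.add(candidate);
            }
        }
        return top;
    }

    // Turns the ranking into a game response for a "take" command.
    static String respond(List<String> candidates, ToDoubleFunction<String> plausibility) {
        List<String> top = best(candidates, plausibility);
        if (top.isEmpty()) {
            return "I don't understand that.";
        }
        if (top.size() == 1) {
            return "Taken: " + top.get(0) + ".";
        }
        return "Which do you mean? " + String.join(" or ", top) + "?";
    }
}
```

Note that the scoring function is where the world model reaches into the parsing step, so the two parts stay coupled even in this small version.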
John W. Kennedy - 23 Sep 2007 03:04 GMT > Actually, the parser can give a set of all possible parsings, and the > model could determine which makes the most sense based on the current > context. Then you are ruling out the ability to do:
> Take the box.
Which box do you mean? The red box or the blue box?
...which has been regarded as bare-minimum practice for decades.
> Yes, the world model is an important part of the interactive fiction. > Its also the easier part to handle in my opinion. It, too, has nasty possibilities that I suspect you've not yet considered. Can the player, while seated in a vehicle, reach out and take an object from the surrounding environment? Have you complete insurance against putting A inside (or on top of) B while B is inside (or on top of) A? And don't forget combinatorial explosion.
> The reason its > easier is that you can limit the world model in ways that you can't > limit what the human will type (without given them an express set of > allowable inputs). Sure, but go too far, and you'll be damned for mimetic failure.
 Signature John W. Kennedy "The first effect of not believing in God is to believe in anything...." -- Emile Cammaerts, "The Laughing Prophet"
Daniel Pitts - 23 Sep 2007 17:21 GMT > > Actually, the parser can give a set of all possible parsings, and the > > model could determine which makes the most sense based on the current [quoted text clipped - 4 lines] > > Take the box. > Which box do you mean? The red box or the blue box? Are you sure I'm ruling that out? If there is an equal probability of the user meaning either the red box or the blue box, then I could easily present that question. If the user then replies with "The first one", or "Red", or "Either" (etc...), then the contextual information will give the interpreter enough information to figure out what the user really meant.
> ...which has been regarded as bare-minimum practice for decades. > [quoted text clipped - 6 lines] > insurance against putting A inside (or on top of) B while B is inside > (or on top of) A? And don't forget combinatorial explosion. I think I handled that by: if (a.inReachOfPlayer());
and in the "add(Relationship relationship, Thing thing)" method, I check to see if thing's relationship tree includes this already.
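A minimal sketch of what such a check might look like. Only the add(Relationship, Thing) signature comes from the post; the fields, the holds helper, and the exception choice are assumptions, and the real class surely tracks much more.

```java
import java.util.ArrayList;
import java.util.List;

class Thing {
    enum Relationship { IN, ON }

    final String name;
    private Thing holder;                  // what this thing is in or on, if anything
    private Relationship relationToHolder; // how it relates to its holder
    private final List<Thing> held = new ArrayList<>();

    Thing(String name) {
        this.name = name;
    }

    // True if 'other' is in or on this thing, directly or through a chain.
    boolean holds(Thing other) {
        for (Thing t : held) {
            if (t == other || t.holds(other)) {
                return true;
            }
        }
        return false;
    }

    // Places 'thing' in or on this thing, refusing anything that would
    // make an object end up (transitively) inside or on top of itself.
    void add(Relationship relationship, Thing thing) {
        if (thing == this || thing.holds(this)) {
            throw new IllegalArgumentException(
                "putting " + thing.name + " " + relationship + " " + name
                + " would create a cycle");
        }
        if (thing.holder != null) {
            thing.holder.held.remove(thing); // detach from the previous holder
        }
        thing.holder = this;
        thing.relationToHolder = relationship;
        held.add(thing);
    }
}
```

For example, after box.add(Thing.Relationship.IN, bag), a later bag.add(Thing.Relationship.ON, box) is rejected with an IllegalArgumentException.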
> > The reason its > > easier is that you can limit the world model in ways that you can't > > limit what the human will type (without given them an express set of > > allowable inputs). > > Sure, but go too far, and you'll be damned for mimetic failure. What is mimetic failure? I've never heard that term.
Anyway, why are you so convinced that I haven't got the engineering capability to come up with solutions for these problems? Have any of these concerns of yours been proven impossible to resolve, or just difficult to resolve? I'm not a junior programmer; I've been engineering software for 18 years. If this were my first project, I'd probably be doomed to failure as you've suggested, but it's much more (I concede, not 100%) likely to succeed given my experience.
And what's the harm in trying?
I do thank you for your interest in ensuring that I don't (waste my time? fail? why are you pointing these out?). I assure you that the whole thing is just for the learning experience anyway. Even if my effort produces naught, the project wouldn't have failed.
If you are interested in discussing the intricacies of text-based user interaction with me, I'd be pleased to continue our conversation, but I'd appreciate it if you try to alter your tone. It feels like you're assuming I couldn't have thought about things before you point them out to me.
Thanks, Daniel.
Lew - 23 Sep 2007 17:46 GMT > If you are interested in discussing the intricacies of text-based user
> interaction with me, I'd be pleased to continue our conversation, but > I'd appreciate it if you try to alter your tone. It feels like your > assuming I couldn't have thought about things before you point them > out to me. I wish people would stop being tone-of-voice police. This is Usenet, in a group that is designed for free-wheeling consideration of Java technical issues. JWK had some points to draw to your attention. You should jump off the high horse of personal aggrievement and consider his points simply on their merits. His points were on topic, technical and designed to elucidate issues introduced from your posts. That suffices. He owes you no more. He owes me no more, anyway.
I suggest that you get over it.
 Signature Lew
Daniel Pitts - 23 Sep 2007 18:04 GMT > > If you are interested in discussing the intricacies of text-based user > [quoted text clipped - 9 lines] > their merits. His points were on topic, technical and designed to elucidate > issues introduced from your posts. Hey, don't worry, I'm not Ed or Twisted, I don't get offended quite so easily. I responded to the technical aspect of his post without resorting to ad hominem. I'm just asking that he doesn't make assumptions about my abilities in his posts. Don't get me wrong, I do appreciate his bringing up the technical challenges that await me on this project.
In any case, I've asked once, and if the response isn't up to my "emotional standards" ;), I'll simply drop the thread.
> That suffices. He owes you no more. He > owes me no more, anyway. He might not owe you or me, but *I* owe it to myself to ask for a little respect. If I feel like I won't get that respect, I'll take your following suggestion to heart (as I intended to from the start).
> I suggest that you get over it. > > -- > Lew Thanks, Not a troll, Daniel.
Lew - 23 Sep 2007 18:13 GMT > Thanks, > Not a troll, > Daniel. No, you are most emphatically not a troll. You are truly one of the White Hats here.
As the subject myself of actual, direct /ad hominem/ attacks in these hallowed halls, I know how difficult it can be to put up with disrespect.
I support your right to ask for respect. I also point out that JWK isn't writing for you alone, but for all the viewers who might not have thought about all the implications that you have. JWK supports everyone when he elucidates those issues.
Also, don't expect him to be telepathic. How can he know of what you have thought? He exercises due diligence by bringing up points that he "/suspect[s]/ you've not yet considered." (Emphasis mine.)
I offer that asking for respect here is pointless. Ask for information, knowledge or guidance. Let self-respect suffice.
How about we all stop expecting people to coddle our namby-pambiness and just deal with the content of messages?
 Signature Lew
Daniel Pitts - 23 Sep 2007 19:28 GMT > > Thanks, > > Not a troll, [quoted text clipped - 23 lines] > -- > Lew F***ing idiot!
(Just kidding!)
I don't think that asking for respect is pointless, but expecting it might be. It's dangerous for one to confuse a request with an expectation. I requested respect, but I don't expect it.
Perhaps what I should have asked for was to continue this conversation with JWK out of public attention and that we discuss the topic at the level that I'm capable, rather than at the LCD of all of cljp. :-)
Although, I agree with you that answers here should be at the level that benefits the wider audience. It makes me wonder though, perhaps a more advanced-topic forum is desirable. Or perhaps cljp should be reclaimed, and all the basic->intermediate topics could be shifted to cljh.
Or, maybe my questions are more domain specific, and I need to move my conversation into the appropriate group. For this thread, perhaps, as JWK suggested, rec.arts.int-fiction, or perhaps rec.games.int-fiction would be more appropriate.
To J.W.K. Would you be interested in continuing this discussion through e-mail? (Don't use the address I have here, it won't go through)
Thanks, Daniel.
Andrew Thompson - 23 Sep 2007 19:56 GMT ..
>...perhaps cljp should be >reclaimed, and all the basic->intermediate topics could be shifted to >cljh. I agree fully. c.l.j.h. is a group well designed for beginners where (perhaps unproductively excessive) politeness is expected. I invite anybody that feels the slightest bit 'fragile' to post there, and stop wasting the bandwidth of c.l.j.p. posters with such dross.
 Signature Andrew Thompson http://www.athompson.info/andrew/
John W. Kennedy - 24 Sep 2007 03:56 GMT > To J.W.K. Would you be interested in continuing this discussion > through e-mail? (Don't use the address I have here, it won't go > through) Honestly, you'd be better off with real experts. I've been programming since 1965, my wife and I were beta testers for Infocom from 1984 on, and I've been involved in post-Infocom IF software since the early 90s (mainly after-the-fact OS/2 support for Infocom and a Java servlet that could execute most Infocom games on cellphones via WAP -- in case you don't know, Infocom games ran on a virtual machine), but there are people way more knowledgeable than I am, people that I look up to in this field the way that I look up to people like Jane Austen, Kálmán Imre, or Joe Straczynski in theirs.
I know enough to know that developing an IF parser is like herding cats; I don't claim to be a cat herder myself. I'm only getting involved in this because, as far as I know, I'm the only one in CLJP who's dipped a toe in this pool at all -- and I've seen people crash and burn.
 Signature John W. Kennedy "The bright critics assembled in this volume will doubtless show, in their sophisticated and ingenious new ways, that, just as /Pooh/ is suffused with humanism, our humanism itself, at this late date, has become full of /Pooh./" -- Frederick Crews. "Postmodern Pooh", Preface
Daniel Pitts - 24 Sep 2007 04:51 GMT > > To J.W.K. Would you be interested in continuing this discussion > > through e-mail? (Don't use the address I have here, it won't go [quoted text clipped - 21 lines] > become full of /Pooh./" > -- Frederick Crews. "Postmodern Pooh", Preface Indeed, you do seem to be the most knowledgeable on this topic in this group. Perhaps I should seek a mentor in raif then. I do have experience building parsers. As a matter of fact, I've created some sophisticated parsers by hand, rather than relying on a tool.
Anyway, enough about my random wanderings as a programmer. I downloaded Inform 7 today, and I've been playing with it all day. So far I'm impressed, but not overwhelmed. I find it easier to model my world with code rather than natural language, but I'm sure that I'll get the hang of this eventually.
Thanks for your help JWK. Daniel.
John W. Kennedy - 24 Sep 2007 20:20 GMT > I find it easier to model my > world with code rather than natural language, but I'm sure that I'll > get the hang of this eventually. I am myself not at all sure about the natural-language aspect of Inform 7 (horrid memories of supporting COBOL), but it embodies by far the most powerful "calculus of IF", so to speak, that I'm aware of.
Anyway, good luck!
 Signature John W. Kennedy If Bill Gates believes in "intelligent design", why can't he apply it to Windows?
John W. Kennedy - 23 Sep 2007 20:24 GMT >>> Actually, the parser can give a set of all possible parsings, and the >>> model could determine which makes the most sense based on the current [quoted text clipped - 11 lines] > interpreter enough information to figure out what the user really > meant. But now, you see, you've entangled the world model with the parser again. It really can't be avoided.
>> ...which has been regarded as bare-minimum practice for decades.
>>> Yes, the world model is an important part of the interactive fiction. >>> Its also the easier part to handle in my opinion.
>> It, too, has nasty possibilities that I suspect you've not yet >> considered. Can the player, while seated in a vehicle, reach out and >> take an object from the surrounding environment? Have you complete >> insurance against putting A inside (or on top of) B while B is inside >> (or on top of) A? And don't forget combinatorial explosion.
> I think I handled that by: > if (a.inReachOfPlayer()); That is only to say that you can solve the problem by solving it. How do you define inReachOfPlayer() when there may be arbitrary container objects surrounding a and/or the player? (And remember, by the way, that a modern system has to allow for player-ness to move from one character to another.)
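Editorial sketch of the reachability question raised above, assuming a simple model in which every object has at most one enclosing container and a closed container blocks reaching in either direction. The names Thing, Reach, reachCeiling and inReach are hypothetical and are not taken from either poster's code. The actor is passed in as a parameter, so "player-ness" can move from one character to another, and the loops assume the containment chain has no cycles.

```java
/** Small demo: an actor seated in a car, objects in the street around it. */
class ReachDemo {
    public static void main(String[] args) {
        Thing street = new Thing("street", false);
        Thing car = new Thing("car", false).in(street);      // open-topped car
        Thing player = new Thing("player", false).in(car);
        Thing coin = new Thing("coin", false).in(street);
        Thing box = new Thing("box", true).in(street);       // closed box
        Thing gem = new Thing("gem", false).in(box);

        System.out.println("coin in reach: " + Reach.inReach(player, coin));
        System.out.println("gem in reach:  " + Reach.inReach(player, gem));
        car.closed = true;                                   // roll up the windows
        System.out.println("coin in reach from closed car: " + Reach.inReach(player, coin));
    }
}

/** Hypothetical world object: anything that can sit inside something else. */
class Thing {
    final String name;
    boolean closed;   // a closed container blocks reaching in or out
    Thing parent;     // enclosing container or room; null at the top level

    Thing(String name, boolean closed) {
        this.name = name;
        this.closed = closed;
    }

    /** Places this thing inside the given container and returns this for chaining. */
    Thing in(Thing container) {
        this.parent = container;
        return this;
    }
}

class Reach {
    /** Outermost enclosure the actor can reach into without crossing a closed barrier. */
    static Thing reachCeiling(Thing actor) {
        Thing scope = actor.parent;
        while (scope != null && !scope.closed && scope.parent != null) {
            scope = scope.parent;
        }
        return scope;
    }

    /** True if the item lies under the actor's ceiling with no closed container in between. */
    static boolean inReach(Thing actor, Thing item) {
        Thing ceiling = reachCeiling(actor);
        for (Thing c = item.parent; c != null; c = c.parent) {
            if (c == ceiling) return true;   // arrived at shared, reachable scope
            if (c.closed) return false;      // item is sealed inside something
        }
        return false;
    }
}
```

Even this small version has to decide what "closed" means and who the actor is, which is the point being made above: the method name hides the real design work.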
> and in the "add(Relationship relationship, Thing thing)" method, I > check to see if thing's relationship tree includes this already. Nope. An object cannot contain another object that contains it, but an NPC can be friendly with another NPC that is friendly with it. And, on the other hand, you've forgotten the cabinet with a shelf on top.
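Editorial sketch of the distinction drawn above, assuming that only physical relationships (in, on) need a loop check while social ones (friendship) may be mutual. Relation, Entity and relate are hypothetical names, and relate is called on the object being placed, which may differ from the original add(Relationship, Thing) method. Recording how an object is held lets a cabinet carry one thing on top and another inside.

```java
import java.util.HashSet;
import java.util.Set;

/** Small demo: a cabinet with a shelf on top, and two mutually friendly NPCs. */
class RelationDemo {
    public static void main(String[] args) {
        Entity cabinet = new Entity("cabinet");
        Entity vase = new Entity("vase");
        Entity plate = new Entity("plate");
        vase.relate(Relation.ON, cabinet);    // on top of the cabinet
        plate.relate(Relation.IN, cabinet);   // inside the same cabinet

        try {
            cabinet.relate(Relation.IN, plate);   // would create a loop
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        Entity ann = new Entity("Ann");
        Entity bob = new Entity("Bob");
        ann.relate(Relation.FRIEND_OF, bob);
        bob.relate(Relation.FRIEND_OF, ann);      // mutual friendship is fine
        System.out.println(ann.friends.contains(bob) && bob.friends.contains(ann));
    }
}

/** Hypothetical relationship kinds; only the physical ones must stay acyclic. */
enum Relation {
    IN(true), ON(true), FRIEND_OF(false);

    final boolean physical;

    Relation(boolean physical) {
        this.physical = physical;
    }
}

/** Hypothetical world object with one physical holder and any number of friends. */
class Entity {
    final String name;
    Entity holder;        // what physically holds this entity, or null
    Relation howHeld;     // IN or ON when holder is non-null
    final Set<Entity> friends = new HashSet<>();

    Entity(String name) {
        this.name = name;
    }

    /** Relates this entity to target; rejects physical placements that would form a loop. */
    void relate(Relation relation, Entity target) {
        if (!relation.physical) {
            friends.add(target);              // cycles are allowed here
            return;
        }
        // Walk up from the target; meeting ourselves means target is (indirectly) held by us.
        for (Entity e = target; e != null; e = e.holder) {
            if (e == this) {
                throw new IllegalArgumentException(
                        name + " cannot be placed " + relation + " " + target.name);
            }
        }
        holder = target;
        howHeld = relation;
    }
}
```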
>>> The reason its >>> easier is that you can limit the world model in ways that you can't >>> limit what the human will type (without given them an express set of >>> allowable inputs).
>> Sure, but go too far, and you'll be damned for mimetic failure.
> What is mimetic failure? I've never heard that term. From the American Heritage Dictionary: mimesis, noun: The imitation or representation of aspects of the sensible world, especially human actions, in literature and art.
> Anyway, why are you so convinced that I haven't got the engineering > capability to come up with solutions for these problems? I'm not. I'm just warning you that you're tackling an intrinsically hard problem that experts have been working on for decades, and that if you don't familiarize yourself with the state of the art, you're going to lay a big, fat egg.
> ...
> It feels like your > assuming I couldn't have thought about things before you point them > out to me. I am assuming only that you are not prodigiously more gifted than anyone else who has ever tried this -- and that group includes the founders of Infocom, who were graduates of the MIT Artificial Intelligence Laboratory, and Graham Nelson, the leading contemporary theorist and the creator of Inform and Inform 7, who lectures on mathematics at Oxford University and is also a published poet.
I cannot recommend too strongly that you acquaint yourself with rec.arts.int-fiction and some modern IF development systems. Inform 7 (<URL:http://www.inform-fiction.org>) is still in beta, but is probably the most advanced.
 Signature John W. Kennedy "When a man contemplates forcing his own convictions down another man's throat, he is contemplating both an unchristian act and an act of treason to the United States." -- Joy Davidman, "Smoke on the Mountain"
Ed Kirwan - 24 Sep 2007 22:09 GMT snipski
>> I would estimate that any new system offering a significant >> improvement on existing tools should take about ten man-years to do from >> scratch. snip
> Every journey starts with but a footstep. It may take 10 man years to > complete, but if I don't start on my own, I'll never know. I like that, because I've been there: and I eventually found a stall selling big, "I've failed," tee-shirts, just like John W. said I would. I bought myself a nice, bright green one. (See, "Violentia," below.)
Almost every step towards this particular failure, however, was rewarding, and I carry both lessons and lesions with me still (if only all failures yielded such insights). I've often thought that IF is the perfect environment in which to cut one's OO-teeth (not that you are, Daniel) because you can get by with a little and add sophistication until the cows come home. In short, it's so damn extensible. It's worth doing as a code-structuring exercise alone, just don't ever expect to see a finish-line.
FWIW, you're going to meet the Visitor. I'm sure you've met him before, but in IF, he's the biggest, meanest bruiser you've ever seen. And he's in a bad mood.
Murderously bad ...
 Signature .ed
www.EdmundKirwan.com - Home of The Fractal Class Composition
Daniel Pitts - 24 Sep 2007 23:58 GMT > snipski > >> I would estimate that any new system offering a significant [quoted text clipped - 17 lines] > code-structuring exercise alone, just don't ever expect to see a > finish-line. No project or product is finished until it's end-of-lifed. And at that point, it's only finished in the sense of its mortality. :-) I'm glad someone else sees my point of view on this.
> FWIW, you're going to meet the Visitor. I'm sure you've met him before, but > in IF, he's the biggest, meanest bruiser you've ever seen. And he's in a > bad mood. Do you mean the Visitor pattern? Or is this some reference to something I don't yet know?
> Murderously bad ... *gulp* :-)
> -- > .ed > > www.EdmundKirwan.com- Home of The Fractal Class Composition Thanks, Daniel.
Ed Kirwan - 25 Sep 2007 06:15 GMT snip
> No project or product is finished until its end-of-lifed. And at that > point, its only finished in the sense of its mortality. :-) I'm glad [quoted text clipped - 4 lines] >> in a bad mood. > Do you mean the Visitor pattern? I do indeed.
At least if you take the simplistic Verb, Noun, Adverb, Adjective (etc.) approach that I took, because an action will depend on which type of each of these is involved.
For example, "Take sword," will have a different outcome from, "Take water," and it's the nature of the verb-object/noun-object interaction that defines this different outcome. I found my verbs constantly visiting my nouns to find out what to do next.
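Editorial sketch of the verb-visits-noun idea described above, using the classic Visitor double dispatch. The types Noun, Verb, Sword, Water and Take, and the response strings, are hypothetical illustrations rather than the poster's actual design.

```java
/** Small demo: the same verb produces different outcomes for different nouns. */
class VerbNounDemo {
    public static void main(String[] args) {
        Verb take = new Take();
        Noun[] nouns = { new Sword(), new Water() };
        for (Noun noun : nouns) {
            System.out.println(noun.accept(take));
        }
    }
}

/** A noun accepts a verb and calls back with its own concrete type. */
interface Noun {
    String accept(Verb verb);
}

/** A verb declares one overload per kind of noun it can act on. */
interface Verb {
    String visit(Sword sword);
    String visit(Water water);
}

class Sword implements Noun {
    @Override
    public String accept(Verb verb) {
        return verb.visit(this);   // resolves to visit(Sword) at compile time
    }
}

class Water implements Noun {
    @Override
    public String accept(Verb verb) {
        return verb.visit(this);   // resolves to visit(Water) at compile time
    }
}

class Take implements Verb {
    @Override
    public String visit(Sword sword) {
        return "You pick up the sword.";
    }

    @Override
    public String visit(Water water) {
        return "The water runs through your fingers.";
    }
}
```

The cost shows up as the vocabulary grows: every new noun type adds a method to every verb, which is one reason the pattern can feel like a bruiser in this domain.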
 Signature .ed
www.EdmundKirwan.com - Home of The Fractal Class Composition
Daniel Pitts - 25 Sep 2007 16:27 GMT > snip > > No project or product is finished until its end-of-lifed. And at that [quoted text clipped - 21 lines] > > www.EdmundKirwan.com- Home of The Fractal Class Composition My plan is actually to define a grammar that will parse the input sentence into its possible parse trees. Then figure out from the world model which of those parse trees makes the most sense (or if I'd have to ask for clarification).
I'll probably want to use the visitor pattern to visit the objects in my world model though.
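Editorial sketch of that plan, assuming each candidate parse reduces to a verb plus an object name and that the world model can report which object names are currently in scope. Parse, Disambiguator and choose are hypothetical names. An empty result stands for "ask the player for clarification", covering both the case where nothing fits and the case where more than one parse fits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Small demo: "take lamp oil" could mean the lamp or the lamp oil. */
class DisambiguationDemo {
    public static void main(String[] args) {
        List<Parse> candidates = Arrays.asList(
                new Parse("take", "lamp oil"),
                new Parse("take", "lamp"));
        Set<String> inScope = new HashSet<>(Arrays.asList("lamp", "sword"));

        Optional<Parse> chosen = Disambiguator.choose(candidates, inScope);
        System.out.println(chosen.isPresent()
                ? chosen.get().verb + " " + chosen.get().object
                : "Which do you mean?");
    }
}

/** Hypothetical, heavily simplified parse result: one verb and one object name. */
final class Parse {
    final String verb;
    final String object;

    Parse(String verb, String object) {
        this.verb = verb;
        this.object = object;
    }
}

class Disambiguator {
    /** Returns the single parse whose object is in scope, or empty if zero or several fit. */
    static Optional<Parse> choose(List<Parse> candidates, Set<String> inScope) {
        List<Parse> plausible = new ArrayList<>();
        for (Parse p : candidates) {
            if (inScope.contains(p.object)) {
                plausible.add(p);
            }
        }
        return plausible.size() == 1 ? Optional.of(plausible.get(0)) : Optional.empty();
    }
}
```

Note that choose takes world state as an argument, so the parser's result still depends on the world model, which is the entanglement discussed earlier in the thread.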
Jeff Higgins - 21 Sep 2007 15:16 GMT > So, ... I must thank you for posting this article. After having read your post I spent some time browsing the WWW on the subject and found a lot of interesting stuff. Here are links to two things that I found particularly interesting.
I rediscovered the Princeton University WordNet project. <http://wordnet.princeton.edu/>
And through that link discovered a most wonderful (free) dictionary utility program for the Windows platform:
WordWeb 5 for Windows <http://wordweb.info/free/>
This program allows me to place my mouse cursor over a word in any other program and with a CTRL + right click bring up a useful dictionary/thesaurus already opened to the word under the cursor!! How neat! I've tried it in my newsreader "Outlook" and in IE7 and OpenOffice Writer, even Eclipse. How's it do that?
Anyway, this is not a commercial advertisement; I am not in any way associated with the above-mentioned organizations.
Thanks, JH