
Java Forum / First Aid / April 2005

StringTokenizer/StreamTokenizer.

Anand Narasimhan - 26 Apr 2005 18:48 GMT
Hi,

I want to tokenize a string like '/Device/Interface?NAME=Serial1/0' into
 the following tokens.

Device
Interface?NAME=Serial1/0

If I use StringTokenizer with '/' as the separator character, I get the
following, which is not what I want.

Device
Interface?NAME=Serial1
0

I tried using StreamTokenizer.

StreamTokenizer st = new StreamTokenizer(
    new StringReader( "/Device/Interface?NAME=Serial1/0" ) );
st.whitespaceChars( '/', '/' );
st.nextToken();

The result is

Device
INTERFACE
null
NAME
null
Serial1
null

I tried calling resetSyntax first:

st.resetSyntax();
st.wordChars(0, 255);
st.whitespaceChars('/', '/');
st.quoteChar('"');
st.quoteChar('\'');
st.parseNumbers();

The result I got was

Device
INTERFACE?NAME=Serial1
null

I tried quoting 'Serial1/0' like /Device/Interface?NAME='Serial1/0' The
result I got was

Device
INTERFACE?NAME=
Serial1/0

Is there any way with StringTokenizer, StreamTokenizer or any other
means (without actually having to write a tokenizer on my own) to get the
result I want, which is

Device
Interface?NAME=Serial1/0

Thanks
Anand
Virgil Green - 26 Apr 2005 20:07 GMT
> Hi,
>
[quoted text clipped - 59 lines]
> Thanks
> Anand

Not without defining rules regarding when a '/' should be treated as a
separator and when it should be treated as an included character.

--
Virgil
Anand Narasimhan - 26 Apr 2005 20:17 GMT
Thanks.
Setting whitespaceChars to '/' seems to work, except that when the
tokenizer sees a quote character, it tokenizes everything within the quotes
as a separate token.

eg. /Device/Interface?NAME='Serial1/0' results in
Device
Interface?NAME=
Serial1/0

But I did not set the quote character as a whitespace character.

Anand
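
The quote behaviour described above can be reproduced with a small sketch. It assumes the same setup as the earlier snippet (resetSyntax, every character a word character, '/' as whitespace) and toggles only the quoteChar call. The class and method names are made up for illustration. It suggests that the extra token comes from the quote character being registered with quoteChar (the default syntax table also treats ' and " as quote characters), not from any whitespace setting.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class QuoteCharDemo {
    // Tokenizes input with '/' as whitespace; optionally registers ' as a quote character.
    static List<String> tokenize(String input, boolean singleQuoteIsQuoteChar) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.resetSyntax();              // start from a table where every character is ordinary
        st.wordChars(0, 255);          // treat everything as part of a word...
        st.whitespaceChars('/', '/');  // ...except '/', which separates tokens
        if (singleQuoteIsQuoteChar) {
            st.quoteChar('\'');        // quoted text is now returned as its own token
        }
        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // sval holds the text for word and quoted tokens; fall back to the raw character
                tokens.add(st.sval != null ? st.sval : String.valueOf((char) st.ttype));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "/Device/Interface?NAME='Serial1/0'";
        System.out.println(tokenize(input, true));   // [Device, Interface?NAME=, Serial1/0]
        System.out.println(tokenize(input, false));  // [Device, Interface?NAME='Serial1, 0']
    }
}
```

Without the quoteChar call the quote stays inside the word, but the '/' inside the quotes splits the token again, so neither configuration produces the two tokens asked for.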

>>Hi,
>>
[quoted text clipped - 65 lines]
> --
> Virgil
Virgil Green - 28 Apr 2005 18:20 GMT
> Thanks.
> Setting whitespaceChars to '/' seems to work, except that when the
[quoted text clipped - 9 lines]
>
> Anand

Still, no rules. What are the rules for when a '/' is considered a separator
and when it is considered a valid character?

Virgil
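
To make the point about rules concrete, here is a sketch of one possible rule. The rule itself is an assumption, not something stated in the thread: a '/' is a separator only before the first '?', and everything from the '?' onward belongs to the last token. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating one possible separator rule.
class QueryAwareSplitter {
    // Assumed rule: '/' separates tokens only in the part before the first '?'.
    static List<String> split(String path) {
        int q = path.indexOf('?');
        String head = (q >= 0) ? path.substring(0, q) : path;  // part where '/' separates
        String tail = (q >= 0) ? path.substring(q) : "";       // '?' and everything after it

        List<String> tokens = new ArrayList<>();
        for (String part : head.split("/")) {
            if (!part.isEmpty()) {
                tokens.add(part);  // skip the empty string produced by a leading '/'
            }
        }
        if (!tail.isEmpty()) {
            if (tokens.isEmpty() || head.endsWith("/")) {
                tokens.add(tail);  // nothing to attach to, so the tail is its own token
            } else {
                int last = tokens.size() - 1;
                tokens.set(last, tokens.get(last) + tail);  // glue the tail onto the last token
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("/Device/Interface?NAME=Serial1/0"));
        // [Device, Interface?NAME=Serial1/0]
    }
}
```

A different rule (for example, "split only on the first two slashes") would need different code, which is why the rule has to be pinned down first.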

Oscar Kind - 26 Apr 2005 21:40 GMT
> I want to tokenize a string like '/Device/Interface?NAME=Serial1/0' into
>  the following tokens.
[quoted text clipped - 8 lines]
> Interface?NAME=Serial1
> 0

[...]
> Is there any way with StringTokenizer, StreamTokenizer or any other
> means (without actually having to write a tokenizer on my own) to get the
> result I want which is
>
> Device
> Interface?NAME=Serial1/0

How is "/Device/Interface?NAME=Serial1/0".split("/", 3) insufficient?
I get {"", "Device", "Interface?NAME=Serial1/0"}, which is not exactly what
you want (there is an empty string in front), but quite close.
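
A minimal sketch of the split-with-limit idea, for anyone who wants to try it: with a limit, String.split stops splitting once the limit is reached, and the last element holds the rest of the input, including any later slashes. The class and helper names are made up for illustration. The helper assumes that only the first path segment needs to be separated out.

```java
import java.util.Arrays;

// Hypothetical demo class for String.split with a limit.
class SplitLimitDemo {
    // Returns the first path segment and everything after it.
    static String[] firstSegmentAndRest(String path) {
        // Drop a single leading '/' so split() does not yield an empty first element.
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        // A limit of 2 stops after the first '/', so later slashes stay in the last element.
        return trimmed.split("/", 2);
    }

    public static void main(String[] args) {
        String path = "/Device/Interface?NAME=Serial1/0";
        System.out.println(Arrays.toString(path.split("/", 3)));
        // [, Device, Interface?NAME=Serial1/0]
        System.out.println(Arrays.toString(firstSegmentAndRest(path)));
        // [Device, Interface?NAME=Serial1/0]
    }
}
```

This only works when the number of leading segments is known in advance; paths with a variable number of segments before the '?' need a different rule.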

Oscar Kind                                    http://home.hccnet.nl/okind/
Software Developer                    for contact information, see website

PGP Key fingerprint:    91F3 6C72 F465 5E98 C246  61D9 2C32 8E24 097B B4E2

Tor Iver Wilhelmsen - 27 Apr 2005 14:01 GMT
> I want to tokenize a string like '/Device/Interface?NAME=Serial1/0'
> into the following tokens.
[quoted text clipped - 8 lines]
> Interface?NAME=Serial1
> 0

You want to look into using regular expressions instead (present in
1.4 or later, separate install prior to that).

E.g.

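// needs java.util.regex.Pattern and java.util.regex.Matcher
Pattern p = Pattern.compile("/(\\w+)/(.*)");  // the backslash must be doubled in a Java string
Matcher m = p.matcher("/Device/Interface?NAME=Serial1/0");
String[] tokens = null;
if (m.matches()) {
  // group(1) is "Device", group(2) is "Interface?NAME=Serial1/0"
  tokens = new String[] { m.group(1), m.group(2) };
}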

> I tried using StreamTokenizer.

StreamTokenizer is a very basic C lexer. It, like StringTokenizer,
should be discarded in modern code in favor of regular
expressions or a lexer/parser (google for them, there are quite a few
variants).
Ross Bamford - 29 Apr 2005 14:07 GMT
> > I want to tokenize a string like '/Device/Interface?NAME=Serial1/0'
> > into the following tokens.
[quoted text clipped - 26 lines]
> expressions or a lexer/parser (google for them, there are quite a few
> variants).

Although I'd stick with the tokenizer for simple use cases or tight
code, like splitting into words - regexps are more expensive.

Ross
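
For the simple case mentioned above (splitting into words), a sketch with StringTokenizer could look like the following. The class name is hypothetical, and no timing comparison against regular expressions is implied by the example itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class for plain word splitting.
class WordSplitDemo {
    // Splits text on the default delimiters: space, tab, newline, carriage return, form feed.
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());  // consecutive delimiters never produce empty tokens
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("split  into\twords"));  // [split, into, words]
    }
}
```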

  [Ross A. Bamford]     [ross AT the.website.domain]
Roscopeco Open Tech ++ Open Source + Java + Apache + CMF
http://www.roscopec0.f9.co.uk/ + info@the.website.domain


