Java Forum / General / January 2006

find a string in line

puzzlecracker - 04 Jan 2006 02:13 GMT
You have a sentence and you need to find whether it contains a word. For
example, "he is great ".contains("is") returns true, but "his
great".contains("is") also returns true, and I want it to return false.

I tried a regexp, something like String
patterns="[^A-Za-z]+str+[^A-Za-z]",

but it didn't work.
zero - 04 Jan 2006 02:26 GMT
> You have a sentence and you need to find whether it contains a word. For
> example, "he is great ".contains("is") returns true, but "his
[quoted text clipped - 4 lines]
>
> but it didn't work.

Use the whitespace character class:

".*\\sis\\s.*"

will match zero or more characters, a whitespace, the string "is", a
whitespace, and zero or more characters.

Note the double backslashes.
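
A minimal sketch of how this pattern behaves with String.matches, which has to match the whole line (hence the leading and trailing ".*"). The class and method names are hypothetical, chosen only for illustration.

```java
final class WhitespaceMatchDemo {

    // Hypothetical helper: true if "is" occurs with whitespace on both sides.
    static boolean hasIsBetweenWhitespace(String line) {
        // In a Java string literal, "\\s" is the regex \s (one whitespace character).
        return line.matches(".*\\sis\\s.*");
    }

    public static void main(String[] args) {
        System.out.println(hasIsBetweenWhitespace("he is great"));       // true
        System.out.println(hasIsBetweenWhitespace("his great"));         // false
        System.out.println(hasIsBetweenWhitespace("what is, once was")); // false: a comma follows "is"
    }
}
```

The last call shows the limit of this approach: punctuation next to the word does not count as whitespace.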

Signature

Beware the False Authority Syndrome

puzzlecracker - 04 Jan 2006 02:33 GMT
> > You have a sentence and you need to find whether it contains a word. For
> > example, "he is great ".contains("is") returns true, but "his
[quoted text clipped - 16 lines]
> --
> Beware the False Authority Syndrome

But I also want it to match things such as ",is," or "is."
zero - 04 Jan 2006 02:46 GMT
>> > You have a sentence and you need to find whether it contains a word.
>> > For example, "he is great ".contains("is") returns true, but "his
[quoted text clipped - 11 lines]
>
> But I also want it to match things such as ",is," or "is."

Then maybe the \W construct can help.  This will match any non-word
character.

".*\\W+" + searchString + "\\W+.*"

Non-word characters are defined as anything other than an alphanumeric
character or an underscore.  So this would return true for "what is, once
was." but not for "his word".

There may be a problem if the search string is at the end or beginning
(or both) of the line you're searching, but you can check for that with
String.startsWith and String.endsWith.
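
A small sketch of the \W+ pattern above, including the start-of-line problem just mentioned. The class and method names are hypothetical, and the sketch assumes searchString contains no regex metacharacters.

```java
final class NonWordDelimiterDemo {

    // Hypothetical helper: requires at least one non-word character
    // on each side of searchString.
    static boolean matchesWithNonWordDelimiters(String line, String searchString) {
        return line.matches(".*\\W+" + searchString + "\\W+.*");
    }

    public static void main(String[] args) {
        System.out.println(matchesWithNonWordDelimiters("what is, once was.", "is")); // true
        System.out.println(matchesWithNonWordDelimiters("his word", "is"));           // false
        // The word is present, but nothing precedes it, so \W+ cannot match:
        System.out.println(matchesWithNonWordDelimiters("is it so", "is"));           // false
    }
}
```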

Chris Smith - 04 Jan 2006 03:46 GMT
> Then maybe the \W construct can help.  This will match any non-word
> character.
>
> ".*\\W+" + searchString + "\\W+.*"

Just to be paranoid, make that:

   ".*\\W+" + Pattern.quote(searchString) + "\\W+.*"

Note that Pattern.quote is only available as of Java 1.5.  Prior to Java
1.5, it's exceedingly difficult to search for arbitrary substrings using
regular expressions, and you'd be better off using String.indexOf(String)
and checking the surrounding characters on your own.
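
One way the indexOf approach could look, as a rough sketch rather than a definitive implementation. The class name, method names, and the choice of what counts as a "word character" (letter, digit, or underscore) are assumptions made for this example.

```java
final class IndexOfWordSearch {

    // Hypothetical helper: true if word occurs in line with no word
    // character directly before or after it.
    static boolean containsWord(String line, String word) {
        if (word.length() == 0) {
            return false; // assumption: an empty word never matches
        }
        int from = 0;
        while (true) {
            int start = line.indexOf(word, from);
            if (start < 0) {
                return false; // no further occurrences
            }
            int end = start + word.length();
            boolean boundaryBefore = start == 0 || !isWordChar(line.charAt(start - 1));
            boolean boundaryAfter = end == line.length() || !isWordChar(line.charAt(end));
            if (boundaryBefore && boundaryAfter) {
                return true;
            }
            from = start + 1; // keep looking after this occurrence
        }
    }

    // Assumption: letters, digits and underscore count as word characters.
    static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }
}
```

With this sketch, containsWord("this is it", "is") skips the "is" inside "this" and then finds the standalone word.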

> There may be a problem if the search string is at the end or beginning
> (or both) of the line you're searching, but you can check for that with
> String.startsWith and String.endsWith

Or, since you've got a Java regular expression anyway:

   "(^|.*\\W+)" + Pattern.quote(searchString) + "($|\\W+.*)"

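A short sketch of the anchored pattern above in use, assuming Java 1.5 or later for Pattern.quote. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class AnchoredWordMatch {

    // Hypothetical helper: the word may sit at the start or end of the
    // line, or be delimited by non-word characters.
    static boolean containsWord(String line, String searchString) {
        String regex = "(^|.*\\W+)" + Pattern.quote(searchString) + "($|\\W+.*)";
        return line.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(containsWord("is it so", "is"));           // true: word at the start
        System.out.println(containsWord("it is", "is"));              // true: word at the end
        System.out.println(containsWord("what is, once was.", "is")); // true
        System.out.println(containsWord("his word", "is"));           // false
    }
}
```
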
Signature

www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

puzzlecracker - 04 Jan 2006 05:21 GMT
> > Then maybe the \W construct can help.  This will match any non-word
> > character.
[quoted text clipped - 15 lines]
>
> Or, since you've got a Java regular expression anyway

That is slow with JFC... any old-fashioned ways to accomplish this task?

Thanks

>     "(^|.*\\W+)" + Pattern.quote(searchString) + "($|\\W+.*)"
>
[quoted text clipped - 4 lines]
> Chris Smith - Lead Software Developer/Technical Trainer
> MindIQ Corporation
puzzlecracker - 04 Jan 2006 06:01 GMT
> > Then maybe the \W construct can help.  This will match any non-word
> > character.
[quoted text clipped - 24 lines]
> Chris Smith - Lead Software Developer/Technical Trainer
> MindIQ Corporation

Regular expressions are quite slow with JFC... can anyone suggest a
quick indexOf variant or legacy variant?
Chris Smith - 04 Jan 2006 06:24 GMT
> Regular expressions are quite slow with JFC... can anyone suggest a
> quick indexOf variant or legacy variant?

Huh?  Is this some "jfc" that I'm not familiar with?  Regular
expressions are no slower or faster than normal with the Java Foundation
Classes (that is, Swing and some related APIs).  In fact, the two have
little to do with each other.

In any case, Noodles Jefferson already gave you a solution without using
regular expressions.  You seemed no happier with that, because it didn't
work precisely the way you want.  If you're that unable to write your
own code, perhaps it's time to think about why you're involved in
programming.  An if statement probably won't kill you.

Jaakko Kangasharju - 04 Jan 2006 08:15 GMT
> Then maybe the \W construct can help.  This will match any non-word
> character.
[quoted text clipped - 8 lines]
> (or both) of the line you're searching, but you can check for that with
> String.startsWith and String.endsWith

You can use boundary matches to overcome this problem.  The \b
construct matches a word boundary, so modifying your expression to
".*\\b" + searchString + "\\b.*" matches searchString between word
boundaries, including at the beginning or the end.
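
A minimal sketch of the \b version, with hypothetical class and method names. It assumes plain ASCII text and a searchString made up only of word characters.

```java
final class WordBoundaryDemo {

    // Hypothetical helper: \b matches between a word character and a
    // non-word character, and also at the start or end of the line.
    static boolean containsWord(String line, String searchString) {
        return line.matches(".*\\b" + searchString + "\\b.*");
    }

    public static void main(String[] args) {
        System.out.println(containsWord("he is great", "is")); // true
        System.out.println(containsWord("his great", "is"));   // false
        System.out.println(containsWord("is, it", "is"));      // true: word at the start
        System.out.println(containsWord("it is.", "is"));      // true: word before a full stop
    }
}
```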

Signature

Jaakko Kangasharju, Helsinki Institute for Information Technology
Will you be my friend, please?

Chris Smith - 04 Jan 2006 08:28 GMT
> You can use boundary matches to overcome this problem.  The \b
> construct matches a word boundary, so modifying your expression to
> ".*\\b" + searchString + "\\b.*" matches searchString between word
> boundaries, including at the beginning or the end.

So if that works, then the correct version can be written as:

   ".*\\b" + Pattern.quote(searchString) + "\\b.*"

The Pattern.quote could technically be omitted if searchString were
guaranteed to contain only word characters... but it would need to be
accompanied with copious documentation to explain that fact.
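
To illustrate why the quoting matters, here is a small comparison sketch. The class and method names are hypothetical; the search string "a.c" is just an example of input that contains a regex metacharacter.

```java
import java.util.regex.Pattern;

final class QuotedBoundaryDemo {

    // Hypothetical helper: searchString is treated as a regex fragment.
    static boolean unquoted(String line, String searchString) {
        return line.matches(".*\\b" + searchString + "\\b.*");
    }

    // Hypothetical helper: searchString is treated as literal text.
    static boolean quoted(String line, String searchString) {
        return line.matches(".*\\b" + Pattern.quote(searchString) + "\\b.*");
    }

    public static void main(String[] args) {
        // Unquoted, the "." matches any character, so "abc" is accepted.
        System.out.println(unquoted("abc", "a.c"));        // true
        System.out.println(quoted("abc", "a.c"));          // false
        System.out.println(quoted("see a.c here", "a.c")); // true
    }
}
```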

zero - 04 Jan 2006 13:11 GMT
>> You can use boundary matches to overcome this problem.  The \b
>> construct matches a word boundary, so modifying your expression to
[quoted text clipped - 8 lines]
> guaranteed to contain only word characters... but it would need to be
> accompanied with copious documentation to explain that fact.

It seems patterns are indeed a complex subject, with lots of near-identical
alternatives.  Btw, anyone know how this works with strings with
international content?  The Pattern JavaDoc states that a word character is
[a-zA-Z_0-9], so accented characters won't work here - and I'm not even
talking about non-Latin scripts.
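
One possible way around the ASCII-only definition, sketched under assumptions: instead of \w, \W or \b, use lookarounds built from the Unicode categories \p{L} (letters) and \p{N} (numbers), which the Pattern class supports. The class and method names are hypothetical, and treating "letter, number or underscore" as a word character is a choice made for this example. Java 7 and later also provide the Pattern.UNICODE_CHARACTER_CLASS flag, which is not used here.

```java
import java.util.regex.Pattern;

final class UnicodeWordSearch {

    // Hypothetical helper: true if word occurs with no Unicode letter,
    // number or underscore directly before or after it.
    static boolean containsWord(String line, String word) {
        String notWordBefore = "(?<![\\p{L}\\p{N}_])"; // negative lookbehind
        String notWordAfter = "(?![\\p{L}\\p{N}_])";   // negative lookahead
        Pattern p = Pattern.compile(notWordBefore + Pattern.quote(word) + notWordAfter);
        // find() searches anywhere in the line, so no ".*" padding is needed.
        return p.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "\u00e9l est\u00e1 aqu\u00ed"; // Spanish sample text with accented letters
        System.out.println(containsWord(line, "est\u00e1")); // true
        System.out.println(containsWord(line, "est"));       // false: an accented letter follows
    }
}
```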

puzzlecracker - 04 Jan 2006 02:42 GMT
> > you have a sentence and you need to find wether it contains a word; for
> > example "he is great " .containts("is") returns true, but  "his
[quoted text clipped - 16 lines]
> --
> Beware the False Authority Syndrome

pattern=".*[^a-z]" +"is"+"[^a-z].*";

line.matches(pattern);

returns few results.

Note: both line and "is" are lower case.
puzzlecracker - 04 Jan 2006 02:44 GMT
> Use the whitespace character class:
>
> ".*\\sis\\s.*"

doesn't find  cases such as ",is"  or  ",is."
Noodles Jefferson - 04 Jan 2006 02:57 GMT
In article <1136340802.254990.89700@g43g2000cwa.googlegroups.com>,
puzzlecracker took the hamburger, threw it on the grill, and I said "Oh
wow"...

> you have a sentence and you need to find wether it contains a word; for
> example "he is great " .containts("is") returns true, but  "his
[quoted text clipped - 4 lines]
>
> but it didnt work.

Not Tested:

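public boolean findAWord(String aString, String aWord) {

    boolean hasWord = false;
    int i = 0;

    // String.split requires a regex argument; split on runs of whitespace.
    String[] sa = aString.split("\\s+");

    while (i < sa.length) {

        // A token must equal the word exactly to count as a match.
        if (sa[i].equals(aWord)) {
            hasWord = true;
            break;
        }

        i++;
    }

    return hasWord;
}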

Signature

Noodles Jefferson
mhm31x9 Smeeter#29 WSD#30
sTaRShInE_mOOnBeAm aT HoTmAil dOt CoM

NP: "The Road to Chicago" -- Thomas Newman (Road to Perdition
Soundtrack)

"Our earth is degenerate in these latter days, bribery and corruption
are common, children no longer obey their parents and the end of the
world is evidently approaching."
--Assyrian clay tablet 2800 B.C.

puzzlecracker - 04 Jan 2006 03:02 GMT
> public boolean findAWord(String aString, String aWord) {
>
[quoted text clipped - 15 lines]
>
>     }

Not going to work, it will only find words delimited by spaces, such as
".. word  .." but not ",word"

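One way to address this limitation while keeping the split-based approach is to split on runs of non-word characters instead of whitespace. This is only a sketch; the class name is hypothetical, and it assumes aWord itself consists only of letters, digits and underscores (otherwise it can never equal a token).

```java
final class SplitOnNonWord {

    // Variant of findAWord that treats any run of non-word characters
    // (spaces, commas, full stops, ...) as a delimiter.
    static boolean findAWord(String aString, String aWord) {
        for (String token : aString.split("\\W+")) {
            if (token.equals(aWord)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(findAWord("he said,word.", "word")); // true
        System.out.println(findAWord(",word", "word"));         // true
        System.out.println(findAWord("swordfish", "word"));     // false
    }
}
```
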
>     return hasWord;
>
[quoted text clipped - 12 lines]
> world is evidently approaching."
> --Assyrian clay tablet 2800 B.C.
puzzlecracker - 04 Jan 2006 03:20 GMT
> > public boolean findAWord(String aString, String aWord) {
> >
[quoted text clipped - 35 lines]
> > world is evidently approaching."
> > --Assyrian clay tablet 2800 B.C.

This didn't work either:
pattern=".*\\W?"+s+"\\W.*?";

