Java Forum / First Aid / August 2005

Regular expression to read non-commented lines from a file

Jonny - 12 Aug 2005 10:13 GMT
Hi,

I would like to use a regular expression in Java to read those lines
from a file which are not comments and do not start with whitespace.
Commented lines start with #

Currently with grep, I am using the command:

grep -E "^[^#\ \t]" myfile

to get the lines I want, but I am having problems converting this
regular expression for use in Java.  I don't get any lines returned.

If I replace the above regular expression with ".*" in my Java code,
then all lines of myfile are returned, as you might expect, so it would
appear that the problem is only with the regular expression shown in the
above grep example.

Please can you help.

Thanks,
Jonny
Bastiaan - 12 Aug 2005 10:20 GMT
> Currently with grep, I am using the command:
>
[quoted text clipped - 7 lines]
> appear that the problem is only with the regular expression shown in the
> above grep example.

What does your Java code look like? You might have to escape the \'s and
other special characters. I am very new to Java myself, but escaping of
characters is done in most programming languages.

Bastiaan
Jonny - 12 Aug 2005 19:30 GMT
> > Currently with grep, I am using the command:
> >
[quoted text clipped - 11 lines]
> other special characters, I myself am very new to Java but escaping of
> characters is done in most programming languages.

Thanks for your reply Bastiaan.

I know I have to include \\ for each \

See my replies to the other responses.

Regards,
Jonny
Hemal  Pandya - 12 Aug 2005 10:43 GMT
> Hi,
>
[quoted text clipped - 8 lines]
> to get the lines I want, but I am having problems converting this
> regular expression for use in Java.  I don't get any lines returned.

Java's Matcher.matches() "Attempts to match the entire region against the
pattern." (from the javadocs). The search is anchored. So the only lines
that match the regular expression "^[^#\ \t]" are those that contain
exactly one character that is not pound, space or tab.

I am resisting the temptation to give you the exact pattern you are
looking for.
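
To make the anchoring point concrete, here is a minimal sketch comparing Matcher.matches() with Matcher.find() on the unchanged pattern. The class name, method names and sample lines are made up for illustration, and the character class is only my reading of what the grep command is aiming for.

```java
import java.util.regex.Pattern;

public class AnchoringDemo {
    // One character that is not '#', space, or tab, at the start of the input.
    static final Pattern FIRST_CHAR = Pattern.compile("^[^# \\t]");

    // matches(): the pattern has to consume the whole input.
    static boolean wholeLine(String line) {
        return FIRST_CHAR.matcher(line).matches();
    }

    // find(): the pattern only has to match somewhere in the input, as grep does.
    static boolean anywhere(String line) {
        return FIRST_CHAR.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] samples = {"x", "key=value", "# comment", "  indented"};
        for (String s : samples) {
            System.out.println("[" + s + "] matches=" + wholeLine(s) + " find=" + anywhere(s));
        }
    }
}
```

Only the one-character line passes matches(), while find() accepts every line whose first character is not '#', space or tab.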
Jonny - 12 Aug 2005 19:34 GMT
> > I would like to use a regular expression in Java to read those lines
> > from a file which are not comments and do not start with whitespace.
[quoted text clipped - 11 lines]
> match the regular expression "^[^#\ \t]" are those that contain exactly
> one character that is not pound, space or tab.

Thanks for your reply, Hemal.

I understand what you have said, so I have to use ^ and $ in Java.

See my response to Mario's reply.

Regards,
Jonny
Hemal  Pandya - 13 Aug 2005 02:52 GMT
> > > I would like to use a regular expression in Java to read those lines
> > > from a file which are not comments and do not start with whitespace.
[quoted text clipped - 15 lines]
>
> I understand what you have said, so I have to use ^ and $ in Java.

No, you do not need them, if you are comparing each line individually.
By anchored I meant that the pattern is interpreted as if it already
has ^ and $ around it.

Your pattern matches only the first character of a line. It needs to
match the entire line.
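
As a rough sketch of what "match the entire line" could look like when each line is tested on its own: the pattern below checks the first character and then lets ".*" consume the rest. The class and method names are hypothetical, and using \s (any whitespace) instead of just space and tab is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LineFilter {
    // First character is neither '#' nor whitespace; ".*" then covers the rest of the line.
    // No ^ or $ is needed because matches() already requires a full match.
    static final Pattern WANTED = Pattern.compile("[^#\\s].*");

    // Returns the lines that are neither comments, indented, nor empty.
    static List<String> keep(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (WANTED.matcher(line).matches()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keep(Arrays.asList("This", " is", "a", "#test", "")));
    }
}
```

Empty lines are dropped as well, because the character class needs at least one character to match.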

> See my response to Mario's reply.
>
> Regards,
> Jonny
Mario Winterer - 12 Aug 2005 12:45 GMT
Hi!

Here's the code snippet that prints all lines that are not comments and do not start with whitespace.
The input is a String (more general: CharSequence) containing the entire file content. DO NOT USE FOR LARGE FILES!

/* BEGIN */
String testLine = "This\n is\na\n#test";

Pattern pattern = Pattern.compile("^[^\\s#].*$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(testLine);
while (matcher.find()) {
  String l = matcher.group();
  System.out.println(l);
}
/* END */

In your case, it might be better to read lines using the BufferedReader's "readLine" method and just test if it starts with
whitespace or "#":

BufferedReader reader = new BufferedReader(new FileReader(yourFile));
try {
 String line = null;
 while ((line = reader.readLine()) != null) {
   if (line.length() == 0) continue; /* skip line in case it is empty (is this correct?) */

   char c = line.charAt(0);
   if ((c == '#') || Character.isWhitespace(c)) continue;
   System.out.println(line);
 }
} finally {
 reader.close();
}
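
The same readLine idea can be sketched with try-with-resources (Java 7 and later), which closes the reader without an explicit finally block. This is only one possible arrangement: the class and method names are made up, the lines are collected into a list instead of printed, and wrapping IOException in UncheckedIOException is a choice made here to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ReadLineFilter {
    // Collects the lines that are not empty and do not start with '#' or whitespace.
    static List<String> wantedLines(Reader source) {
        List<String> kept = new ArrayList<>();
        // The reader is closed automatically when the try block exits.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) continue;            // skip empty lines
                char c = line.charAt(0);
                if (c == '#' || Character.isWhitespace(c)) continue;
                kept.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        // A StringReader stands in for new FileReader(yourFile).
        System.out.println(wantedLines(new StringReader("This\n is\na\n#test")));
    }
}
```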

Best regards,
 Tex

> Hi,
>
[quoted text clipped - 18 lines]
> Thanks,
> Jonny
Jonny - 12 Aug 2005 19:40 GMT
> Here's the code snippet that prints all lines that are not comments and
> do not start with whitespace.
[quoted text clipped - 53 lines]
> >
> > Please can you help.

Thanks for a comprehensive response Mario.  It is much appreciated.

I can see that I needed to use ^ and $, and also \\s for whitespace.
These were the two problems I was having.

Incidentally, the file I am reading is very small, so I used the
following code to read the file:

String fileAsString = new Scanner(new File(myFile)).useDelimiter("\\A").next();

where myFile is the path to the file to be read.
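
A slightly more defensive sketch of the same Scanner approach follows. It closes the scanner and returns an empty string for an empty file, since next() throws NoSuchElementException when there is no token. The class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class Slurp {
    // "\\A" matches only at the beginning of the input, so the whole
    // input comes back as a single token.
    static String readAll(Scanner scanner) {
        try (Scanner s = scanner.useDelimiter("\\A")) {
            // hasNext() is false for empty input; next() would throw in that case.
            return s.hasNext() ? s.next() : "";
        }
    }

    // Convenience wrapper for a file path, as in the one-liner above.
    static String readFile(String path) throws FileNotFoundException {
        return readAll(new Scanner(new File(path)));
    }

    public static void main(String[] args) {
        // A String source stands in for a file here.
        System.out.println(readAll(new Scanner("This\n is\na\n#test")));
    }
}
```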

Regards,
Jonny

