Java Forum / General / December 2007

Removing Dates from Strings

Bertram Hurtig - 23 Dec 2007 11:04 GMT
Hi,

I have a big file - each line may start with a date and time (German date
formatting). To be able to sort and compare these lines, I want to
remove the date and time substrings.

For the date and time, I got this regex:
\d\d\.\d\d\.\d\d\d\d\s\d\d:\d\d:\d\d

I know there are classes like Pattern and Matcher, but this only tells
me if the String contains a date + time - but no idea how to also get
the relevant index positions to be able to remove the substrings found.

I know I could do this "manually" by writing my own parsing method,
but I would prefer to have it nice, short (and performance optimized) -
and I don't want to reinvent the wheel.... ;-)

Any ideas?

Thanks in advance,

Bertram
Christian - 23 Dec 2007 12:38 GMT
Bertram Hurtig wrote:
> Hi,
>
[quoted text clipped - 18 lines]
>
> Bertram
Use capturing groups and you are done: just surround the interesting parts
of the regex with (), then you can retrieve them from the Matcher.
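
A minimal sketch of the capturing-group idea, assuming each line holds at most one leading date and time. The class name GroupDemo and the method withoutDate are made up for illustration; the pattern is the regex from the question, wrapped in an optional group 1, with group 2 capturing the rest of the line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo
{
  // Group 1: optional leading date and time (dd.MM.yyyy HH:mm:ss).
  // Group 2: whatever follows on the line.
  private static final Pattern LINE = Pattern.compile(
    "^(\\d\\d\\.\\d\\d\\.\\d{4}\\s\\d\\d:\\d\\d:\\d\\d\\s+)?(.*)$");

  static String withoutDate(String line)
  {
    Matcher m = LINE.matcher(line);
    // Fall back to the unchanged line if the pattern does not match at all.
    return m.matches() ? m.group(2) : line;
  }

  public static void main(String[] args)
  {
    System.out.println(withoutDate("12.12.2007 10:39:59 abc def")); // abc def
    System.out.println(withoutDate("abc def"));                     // abc def
  }
}
```

Compiling the Pattern once and reusing it avoids recompiling the regex for every line of a big file.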
Jeff Higgins - 23 Dec 2007 13:17 GMT
> Hi,
>
[quoted text clipped - 14 lines]
>
> Any ideas?

Maybe java.text.SimpleDateFormat.parse(String text, ParsePosition pos).
Jeff Higgins - 24 Dec 2007 00:03 GMT
>> Hi,
>>
[quoted text clipped - 16 lines]
>>
> Maybe java.text.SimpleDateFormat.parse(String text, ParsePosition pos).

import java.text.ParsePosition;
import java.text.SimpleDateFormat;

public class Main
{
 public static void main(String[] args)
 {
   String[] source =
   {
     "12.12.2007 10:39:59 abc def",
     "abc def",
     "12.12.2007 10:39:59 abc def ghi",
     "12.12.2007 10:39:59 abc 12.12.2007 10:39:59",
     "12.12.2007 10:39:59 12.12.2007 10:39:59 abc",
     "abc def ghi 12.12.2007 10:39:59"
   };
   SimpleDateFormat format =
     new SimpleDateFormat("dd.MM.yyyy HH:mm:ss ");
   ParsePosition pos = new ParsePosition(0);
   for(String s : source)
   {
     if(format.parse(s, pos) != null)
       System.out.println(s.substring(pos.getIndex()));
     else
       System.out.println(s);
     pos.setIndex(0);
   }
 }
}

abc def
abc def
abc def ghi
abc 12.12.2007 10:39:59
12.12.2007 10:39:59 abc
abc def ghi 12.12.2007 10:39:59
Jeff Higgins - 24 Dec 2007 00:59 GMT
>>> Hi,
>>>
[quoted text clipped - 16 lines]
>>>
>> Maybe java.text.SimpleDateFormat.parse(String text, ParsePosition pos).

import java.text.NumberFormat;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Comparator;

public class Main
{
 public static void main(String[] args)
 {
   String[] source =
   {
     "12.12.2007 10:39:59 776854 3.1416e0",
     "144209 456332.987e3",
     "12.12.2007 10:39:59 567445 1e3",
     "12.12.2007 10:39:59 999222 19",
     "334978, 332.987e3",
     "999224 789.3e-15"
   };

   Arrays.sort(source, new IgnoreDateComparator());
   for(String s : source)
     System.out.println(s);
 }

 static class IgnoreDateComparator
 implements Comparator<String>
 {
   @Override
   public int compare(String s1, String s2)
   {
     // SimpleDateFormat is not thread-safe, so a fresh instance per call
     // is the safe (if slow) choice.
     SimpleDateFormat df =
       new SimpleDateFormat("dd.MM.yyyy HH:mm:ss ");
     NumberFormat nf = NumberFormat.getIntegerInstance();
     ParsePosition s1Pos = new ParsePosition(0);
     ParsePosition s2Pos = new ParsePosition(0);

     // Skip the leading date, if any; on failure the index stays at 0.
     df.parse(s1, s1Pos);
     df.parse(s2, s2Pos);
     // Parse the integer that follows; null means no number was found.
     Number n1 = nf.parse(s1, s1Pos);
     Number n2 = nf.parse(s2, s2Pos);
     long l1 = (n1 == null) ? Long.MIN_VALUE : n1.longValue();
     long l2 = (n2 == null) ? Long.MIN_VALUE : n2.longValue();
     return l1 < l2 ? -1 : (l1 > l2 ? 1 : 0);
   }
 }
}

144209 456332.987e3
334978, 332.987e3
12.12.2007 10:39:59 567445 1e3
12.12.2007 10:39:59 776854 3.1416e0
12.12.2007 10:39:59 999222 19
999224 789.3e-15
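
The comparator above orders lines by the integer that follows the date. If the remainder of the line should instead be compared as plain text, a regex-based variant is one option. This is only a sketch under that assumption; the class name TextAfterDateComparator and the helper strip are invented for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Pattern;

class TextAfterDateComparator implements Comparator<String>
{
  // Matches a date and time only at the very start of the line.
  private static final Pattern LEADING_DATE = Pattern.compile(
    "^\\d\\d\\.\\d\\d\\.\\d{4}\\s\\d\\d:\\d\\d:\\d\\d\\s+");

  // Returns the line without its leading date and time, if present.
  static String strip(String s)
  {
    return LEADING_DATE.matcher(s).replaceFirst("");
  }

  public int compare(String s1, String s2)
  {
    return strip(s1).compareTo(strip(s2));
  }

  public static void main(String[] args)
  {
    String[] lines =
    {
      "12.12.2007 10:39:59 pear",
      "apple",
      "13.12.2007 08:00:00 banana"
    };
    // The original lines are kept; only the comparison ignores the dates.
    Arrays.sort(lines, new TextAfterDateComparator());
    for (String s : lines)
      System.out.println(s);
  }
}
```

With the sample data this prints the "apple" line first, then the "banana" line, then the "pear" line.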
Stefan Ram - 23 Dec 2007 16:50 GMT
>formatting). To be able to sort and compare these lines, I want to
>remove the date and time substrings.

public class Main
{
 public static void main
 ( final java.lang.String[] args )
 {
   final java.util.Scanner source =new java.util.Scanner
   ( "12.12.2007 10:39:59 abc def\n" +
     "12.12.2007 10:39:59 abc def\n" );

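   // Assumes every line starts with a date token and a time token.
   // replaceFirst with ^ removes only that leading pair; replaceAll
   // would also eat later token pairs on longer lines.
   while( source.hasNextLine() )
   { java.lang.System.out.println
     ( source.nextLine().replaceFirst( "^(?:\\S+\\s+){2}", "" )); }}}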

abc def
abc def
Roedy Green - 28 Dec 2007 13:57 GMT
>I know there are classes like Pattern and Matcher, but this only tells
>me if the String contains a date + time - but no idea how to also get
>the relevant index positions to be able to remove the substrings found.

The key is what are called groups, which let you get at the pieces. See
http://mindprod.com/jgloss/regex.html

Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
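
On the question of index positions: after a successful find(), Matcher.start() and Matcher.end() report where the match begins and ends, and the same calls take a group number for individual groups. A small sketch, with IndexDemo and removeFirstDate as hypothetical names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndexDemo
{
  // Date and time plus any whitespace that follows it.
  private static final Pattern DATE_TIME = Pattern.compile(
    "\\d\\d\\.\\d\\d\\.\\d{4}\\s\\d\\d:\\d\\d:\\d\\d\\s*");

  static String removeFirstDate(String line)
  {
    Matcher m = DATE_TIME.matcher(line);
    if (!m.find())
      return line;
    // start() is the index of the first matched char,
    // end() is the index just past the last matched char.
    return line.substring(0, m.start()) + line.substring(m.end());
  }

  public static void main(String[] args)
  {
    String s = "12.12.2007 10:39:59 abc def";
    Matcher m = DATE_TIME.matcher(s);
    if (m.find())
      System.out.println(m.start() + ".." + m.end()); // 0..20
    System.out.println(removeFirstDate(s));           // abc def
  }
}
```

If the positions themselves are not needed, String.replaceFirst with the same regex and an empty replacement removes the first match in one call.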



