Java Forum / General / March 2006

Parsing "February 24th, 2006" to java.util.Date

stevengarcia@yahoo.com - 28 Mar 2006 19:40 GMT
How does one write a SimpleDateFormat pattern that takes into account the
"th" or the "nd" that might be present on any date?

March 1st, 2006
March 2nd, 2006
March 3rd, 2006
March 4th, 2006

I'm not sure how to write a mask that can take into account "st", "nd",
"rd", and "th".

Thanks for your help.
Dave Mandelin - 28 Mar 2006 20:08 GMT
I don't think SimpleDateFormat can do it. I'd use a regexp to remove
those characters.
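
A minimal sketch of the regexp approach, assuming English dates of the form shown in the question. The class and method names (StripOrdinalSuffix, stripSuffix, parse) are made up for illustration. The suffix is removed only when it directly follows a digit, so the "st" in "August" is left alone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class StripOrdinalSuffix {
    // Removes "st", "nd", "rd" or "th" only when it directly follows a digit.
    static String stripSuffix(String text) {
        return text.replaceAll("(?<=\\d)(st|nd|rd|th)\\b", "");
    }

    // Hypothetical helper: strips the suffix, then parses "MMMM d, yyyy".
    static Date parse(String text) {
        // A new formatter per call, because SimpleDateFormat is not thread-safe.
        SimpleDateFormat format = new SimpleDateFormat("MMMM d, yyyy", Locale.ENGLISH);
        try {
            return format.parse(stripSuffix(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripSuffix("February 24th, 2006"));
        System.out.println(parse("March 1st, 2006"));
    }
}
```

Wrapping the ParseException in an unchecked exception is only a convenience for the sketch; real code may prefer to let it propagate.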

--
Need to get from a Foo object to a Bar object in Java?
   Ask Prospector:                http://snobol.cs.berkeley.edu
Want to play tabletop RPGs over the internet?
   Check out Koboldsoft RPZen:    http://www.koboldsoft.com
stevengarcia@yahoo.com - 28 Mar 2006 22:46 GMT
Any other takers?
Roedy Green - 28 Mar 2006 23:25 GMT
>Any other takers?

// Parsing a Date of the form: "February 24th, 2006"

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class ParseDate
  {
  // Locale.ENGLISH so the month name parses regardless of the default locale.
  private static final SimpleDateFormat pattern =
     new SimpleDateFormat( "MMM dd'th', yyyy", Locale.ENGLISH );

  /**
   * test harness
   *
   * @param args not used
   */
  public static void main ( String[] args )
     {

     String dateString = "February 24th, 2006";
     int where;
     if ( (where = dateString.indexOf( "st," ) ) >= 0 )
        {
        dateString = dateString.substring( 0, where)
                     + "th,"
                     + dateString.substring( where + 3 );
        }
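     else if ( (where = dateString.indexOf( "nd," ) ) >= 0 )
        {
        dateString = dateString.substring( 0, where)
                     + "th,"
                     + dateString.substring( where + 3 );
        }
     else if ( (where = dateString.indexOf( "rd," ) ) >= 0 )
        {
        // "3rd," and "23rd," need the same treatment
        dateString = dateString.substring( 0, where)
                     + "th,"
                     + dateString.substring( where + 3 );
        }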
     Date d = null;
     try
        {
        d = pattern.parse( dateString );
        }
     catch ( ParseException e )
        {
        System.err.println( "oops:" + dateString );
        }

     System.out.println( d );

     }
  }

With JDK 1.5 you could use String.replace( "nd," ,"th," );

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

James McGill - 29 Mar 2006 01:23 GMT
> >Any other takers?

> With JDK 1.5 you could use String.replace( "nd," ,"th," );

Localizing it to handle e.g., "-ieme, -ere", or "-zig"... seems like
there's a case to be made for I18n-ized ordinal number parsing...
Hard-coding strings for "-st", "-nd", "-rd", "-th" just smells bad, in a
language that puts such emphasis on i18n.
Oliver Wong - 29 Mar 2006 21:40 GMT
>> >Any other takers?
>
[quoted text clipped - 4 lines]
> Hard-coding strings for "-st", "-nd", "-rd", "-th" just smells bad, in a
> language that puts such emphasis on i18n.

   In some languages, the entire word is changed when going from number to
ordinal, rather than just having a suffix added. It's like how the word
"one" changes to "first" in English (note that the two words have zero
letters in common).

   So yeah, this is a non-trivial problem, and it'd probably be a great
boon to programmers if a standardized i18n API call existed for this. But
the syntax wouldn't be as simple as "MMM dd[ordinal-suffix], yyyy", but
rather, something like "MMM
[pure-ordinal-or-number-followed-by-ordinal-suffix], yyyy".

   - Oliver
Twisted - 29 Mar 2006 22:18 GMT
Even though "first" is utterly different from "one", "1st" is just "1"
with a suffix.

RFE: add getSuffixFor(int) and getWordFor(int) to Locale? Typically,
there'll be some special cases for small enough integers (and an
illegal argument exception if argument <= 0?) and a simple algorithm
for larger integers. (In the case of English, starting at 20.) The
English algorithm for suffixes is especially simple, as it's just

if (arg % 100 > 10 && arg % 100 < 14) return "th";
switch (arg%10) {
   case 1:
       return "st";
   case 2:
       return "nd";
   case 3:
       return "rd";
   default:
       return "th";
}

(The only special cases are numbers ending in 11, 12, or 13, which take
"th" instead of "st", "nd", or "rd": 11th, 12th, 13th, 111th, and so on.)
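
For anyone who wants to try it, here is the suffix rule wrapped in a complete method. This is only a sketch for English. The class name OrdinalSuffix and method name suffixFor are hypothetical stand-ins for the proposed getSuffixFor(int), which does not exist on Locale. The check on the last two digits means 111, 112, and 113 also get "th".

```java
class OrdinalSuffix {
    // Hypothetical stand-in for the proposed Locale.getSuffixFor(int); English only.
    static String suffixFor(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive: " + n);
        }
        int lastTwo = n % 100;
        // 11, 12, 13 (and 111, 112, 113, ...) always take "th".
        if (lastTwo >= 11 && lastTwo <= 13) {
            return "th";
        }
        switch (n % 10) {
            case 1:  return "st";
            case 2:  return "nd";
            case 3:  return "rd";
            default: return "th";
        }
    }

    public static void main(String[] args) {
        // Print every day-of-month ordinal: 1st, 2nd, ..., 31st.
        for (int day = 1; day <= 31; day++) {
            System.out.println(day + suffixFor(day));
        }
    }
}
```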

The word one in English is similar -- you special-case 11, 12, and 13
("eleventh, twelfth, thirteenth" -- note you can't just add "th" or you
get "twelveth" for 12), and for the rest, you turn the LSD into an
ending "-first", "-second", "-third", or "-" + number's name + "th",
and the remaining digits into a beginning, e.g. "three hundred and
seventy", generating e.g. "three hundred and seventy-sixth".
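
A rough sketch of the word version for 1 to 999, English only. OrdinalWords and wordFor are hypothetical names standing in for the proposed getWordFor(int). It uses a lookup table for 1 to 19 rather than special-casing only 11 to 13, since "fifth", "eighth", and "ninth" are irregular too, and it handles round tens separately ("twenty" becomes "twentieth").

```java
class OrdinalWords {
    private static final String[] CARDINAL = {
        "", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
    };
    private static final String[] ORDINAL = {
        "", "first", "second", "third", "fourth", "fifth", "sixth", "seventh",
        "eighth", "ninth", "tenth", "eleventh", "twelfth", "thirteenth",
        "fourteenth", "fifteenth", "sixteenth", "seventeenth", "eighteenth",
        "nineteenth"
    };
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"
    };

    // Hypothetical stand-in for the proposed getWordFor(int); handles 1..999 only.
    static String wordFor(int n) {
        if (n < 1 || n > 999) {
            throw new IllegalArgumentException("supported range is 1..999: " + n);
        }
        StringBuilder sb = new StringBuilder();
        int hundreds = n / 100;
        int rest = n % 100;
        if (hundreds > 0) {
            sb.append(CARDINAL[hundreds]).append(" hundred");
            if (rest == 0) {
                return sb.append("th").toString();      // "three hundredth"
            }
            sb.append(" and ");
        }
        if (rest < 20) {
            return sb.append(ORDINAL[rest]).toString(); // table lookup for 1..19
        }
        String tens = TENS[rest / 10];
        int units = rest % 10;
        if (units == 0) {
            // Drop the trailing "y" and add "ieth": "twenty" -> "twentieth".
            return sb.append(tens, 0, tens.length() - 1).append("ieth").toString();
        }
        return sb.append(tens).append('-').append(ORDINAL[units]).toString();
    }

    public static void main(String[] args) {
        System.out.println(wordFor(376));
    }
}
```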

Doing this for other languages is left as an exercise for the reader.
:)

--
I am the terror that flaps in the net!
I am the leaky faucet in the kitchen of crime!
I am TWISTED!
Oliver Wong - 29 Mar 2006 22:32 GMT
> Even though "first" is utterly different from "one", "1st" is just "1"
> with a suffix.

   Yeah, my point was that English (and most Latin/European languages) have
this "feature" that you can add a suffix to an Arabic numeral (e.g. '1', '2',
'3') to turn them into ordinals (e.g. '1st', '2nd', '3rd'), but this is not
true for ALL languages.

   Then I tried to give an analogy, but took an example from English.
Admittedly, that might be confusing, but I felt if I had used any other
language, I could not expect most readers here to relate to the example.

> RFE: add getSuffixFor(int) and getWordFor(int) to Locale?

   Would getSuffixFor() return null, or throw an exception, for a Locale
for which these concepts of suffix don't exist?

   Also, with "getWordFor(int)", there exist some languages where "the
word for a number" changes depending on what you are counting. For example,
in French, you might say "un homme" to mean "one man", but "une femme" to
mean "one woman". The word varies depending on the gender of the thing you
are counting. In Japanese, you vary the word depending on whether you're
counting something round, something flat, something pointy, etc.

   - Oliver
James McGill - 29 Mar 2006 22:40 GMT
>     Also, with "getWordFor(int)", there exist some languages where
> "the
[quoted text clipped - 7 lines]
> you're
> counting something round, something flat, something pointy, etc.

Yes, this is the kind of stuff I think about whenever I notice that
people believe we've reached some sort of plateau in technology.  We
have MUCH further left to go than we've come.  I hope the comfortable
equilibrium compromise we're in right now doesn't destroy us with
complacency.  
Chris Uppal - 30 Mar 2006 10:23 GMT
>     Yeah, my point was that English (and most Latin/European languages)
> have this "feature" that you can add a suffix to an Arabic numeral (e.g.
> '1', '2', '3') to turn them into ordinals (e.g. '1st', '2nd', '3rd'), but
> this is not true for ALL languages.

And even in English the pattern isn't uniform.  I would feel very odd talking
about the "thousand and first dalmatian" -- at some point (at least in British
English) the pattern reverts to "zillion-and-oneth".

But for the case in question -- where we are talking about number names for
days in a month -- I don't see why the whole lot can't be hard-wired into the
language/calendar-specific localisation.

Maybe there are languages where the number names for days don't follow a
(feasibly) computable pattern, and don't fit into a table-driven approach
either, but they must surely be in the tiny minority.
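
One way to sketch the table-driven idea, assuming a JDK that ships java.time (Java 8 or later, so not the JDKs current in this thread): DateTimeFormatterBuilder.appendText accepts a lookup table from field value to text, and the resulting formatter both prints and parses through that table. The class name DayTableParser and the English-only table are illustrative assumptions; a localised version would supply a different table per language.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class DayTableParser {
    private static final DateTimeFormatter FORMAT = buildFormat();

    private static DateTimeFormatter buildFormat() {
        // Hard-wired table for the 31 possible days: 1 -> "1st", 2 -> "2nd", ...
        Map<Long, String> days = new HashMap<>();
        for (long day = 1; day <= 31; day++) {
            days.put(day, day + suffix(day));
        }
        return new DateTimeFormatterBuilder()
                .appendPattern("MMMM ")
                .appendText(ChronoField.DAY_OF_MONTH, days)
                .appendPattern(", uuuu")
                .toFormatter(Locale.ENGLISH);
    }

    // English suffix for a day of the month (1..31).
    private static String suffix(long day) {
        if (day >= 11 && day <= 13) {
            return "th";
        }
        switch ((int) (day % 10)) {
            case 1:  return "st";
            case 2:  return "nd";
            case 3:  return "rd";
            default: return "th";
        }
    }

    static LocalDate parse(String text) {
        return LocalDate.parse(text, FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(parse("February 24th, 2006"));
    }
}
```

Because the table only has to cover 31 entries, it could just as well be written out by hand for a language where no simple rule exists.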

   -- chris
Twisted - 31 Mar 2006 06:45 GMT
Bloody hell. Might as well just go ahead and solve the NLP first then.

--
I am the terror that flaps in the net!
I am the bent prong on the power cable of crime!
I am TWISTED!
Roedy Green - 29 Mar 2006 22:19 GMT
>    So yeah, this is a non-trivial problem, and it'd probably be a great
>boon to programmers if a standardized i18n API call existed for this. But
>the syntax wouldn't be as simple as "MMM dd[ordinal-suffix], yyyy", but
>rather, something like "MMM
>[pure-ordinal-or-number-followed-by-ordinal-suffix], yyyy".

This is related to the problem of expressing numbers in words.

See http://mindprod.com/applets/inwords.html

It handles English ordinals in words.

Ordinals are used much less frequently than I remember them being used
as a child. Perhaps the irregularity discouraged their use in
computers.


Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Twisted - 29 Mar 2006 00:48 GMT
S'funny you should pick my 30th birthday as your example date (in the
Subject)...

--
I am the terror that flaps in the net!
I am the broken software with the awful user interface that the boss
forces everyone to use!
I am TWISTED!

