Java Forum / First Aid / February 2006
regular expression: if this is impossible, how to proceed ?
Lion-O - 31 Jan 2006 21:38 GMT
Hi there,
I have a string holding a filename that may or may not have an extension. Because I need to create an extra (supporting) file which will contain basic certificates, I decided to add an extension .crt (or change the extension if there already is one) to denote the certificates.
After checking java.io and java.nio I found nothing to separate an extension, so I decided to use regular expressions since I'm fairly familiar with them on *nix. However, to my surprise the idea I had isn't working and I can't think of anything else (not yet anyway) to solve this puzzle...
public void change_name() {
    String filename = "filename.extension";
    String regexp = "\..*$";
    String new_ext = ".crt";

    System.out.println(filename.replaceAll(regexp, new_ext));
}
When you try to compile this, the compiler reports that the regexp uses an "illegal escape character". I can find a mention of this in the documentation; the page about regexps states (I quote): "It is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language."
Well, this *does* denote an escaped construct, since I'm trying to match a literal . followed by the rest of the string up to the end of the line. I tried other constructions like \Q and \E (literal / end literal) but those give me an error as well.
I've managed to come up with an alternative but it isn't pretty :-(. Could someone please point me to a more suitable solution or perhaps indicate something obvious which I missed ?
Solution so far:
public void change_name() {
    //String regexp = "\..*$";
    int index = 0;
    String filename = "filename.extension";
    String new_ext = "crt";
    StringBuffer new_name = new StringBuffer();

    // Determine where the . is located
    for (int i = filename.length() - 1; i > 0; i--) {
        if (filename.charAt(i) == '.') {
            index = i;
            break;
        }
    }
    // Construct the filename without extension (copies up to and including the dot)
    for (int i = 0; i <= index; i++) {
        new_name.append(filename.charAt(i));
    }
    // Add the new extension
    new_name.append(new_ext);

    System.out.println("Old filename: " + filename);
    System.out.println("New filename: " + new_name);
}
Thanks in advance.
 Signature Groetjes, Peter
.\\ PGP/GPG key: http://www.catslair.org/pubkey.asc
m.mckinley@gmail.com - 31 Jan 2006 23:29 GMT
Change \. to \\.
The \ is the escape character in Java strings, so to use a backslash in a string you need a double backslash.
Lion-O - 31 Jan 2006 23:41 GMT
> Change \. to \\.
huh?
> The \ is the escape character in Java strings, so to use a backslash in a
> string you need a double backslash.
I'm not in need of a backslash, I'm in need of a plain "." (no ""). And since that character is used in a regexp it has to be escaped.
m.mckinley@gmail.com - 31 Jan 2006 23:47 GMT
Java, not just the Regex interpreter, uses backslashes as an escape character. By using "\\." as your Regex, you're really just sending it "\.", which is what you want.
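To make the two layers of escaping concrete, here is a minimal sketch. The class and method names are made up for illustration and are not from the thread. The Java compiler turns the source literal "\\." into the two characters \. before the regex engine ever sees it, and the regex engine then reads \. as a literal dot. Pattern.quote from java.util.regex is a standard alternative that wraps the text in \Q...\E for you, so no hand-written backslashes are needed.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Replaces everything from the first dot to the end of the input.
    // The source literal "\\." is the two-character string \. at run time,
    // which the regex engine treats as a literal dot.
    static String replaceFromFirstDot(String input, String replacement) {
        return input.replaceAll("\\..*$", replacement);
    }

    // Same effect, but the dot is quoted by Pattern.quote instead of by hand.
    static String replaceFromFirstDotQuoted(String input, String replacement) {
        String regex = Pattern.quote(".") + ".*$";
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        // The literal "\\." really is only two characters long.
        System.out.println("\\.".length()); // prints 2
        System.out.println(replaceFromFirstDot("filename.extension", ".crt"));       // prints filename.crt
        System.out.println(replaceFromFirstDotQuoted("filename.extension", ".crt")); // prints filename.crt
    }
}
```

The same rule explains why \Q and \E failed to compile in the original post: inside a Java string literal they would have to be written as "\\Q" and "\\E".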
Lion-O - 31 Jan 2006 23:54 GMT
> Java, not just the Regex interpreter, uses backslashes as an escape
> character. By using "\\." as your Regex, you're really just sending it "\.",
> which is what you want.
Ayups, thanks a bunch for that one. This was a common case of writing first, thinking a bit later and trying it out WAY too late, sorry about that.
Thanks again, this was indeed exactly what I was after and I must have made a horrible mistake at the beginning of this endeavour.
This is what I have now (to show others who may be interested in the solution):
public class regexp {
    public static void main(String[] args) {
        String regex = "\\..*$";
        String new_name;
        String filename = "/home/peter/filename.extension";
        String new_extension = ".crt";

        new_name = filename.replaceAll(regex, new_extension);

        System.out.println("Old filename: " + filename);
        System.out.println("New filename: " + new_name);
    }
} // end of class
Cheers for the quick response, it's much appreciated.
Roedy Green - 01 Feb 2006 02:19 GMT
>After checking up on java.io and java.nio I found nothing to seperate an
>extension so I decided to utilize regular expressions since I'm fairly familiar
>with them on *nix. However, to my surprise the idea I had isn't working and I
>can't think of anything else (not yet anyway) to solve this puzzle...
You don't need a regex for this, only String.lastIndexOf( '.' )
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Lion-O - 02 Feb 2006 00:31 GMT
> You don't need a regex for this, only String.lastIndexOf( '.' )
Thanks for pointing it out, I totally overlooked this one. As if that wasn't obvious enough by looking at my (buggy) alternative routine. In this case I'm sticking to my regexp since it works decently well but I'll do some experiments with this one with regards to that "alternate routine" I knotted together.
Roedy Green - 02 Feb 2006 05:31 GMT
>Thanks for pointing it out, I totally overlooked this one. As if that wasn't
>obvious enough by looking at my (buggy) alternative routine. In this case I'm
>sticking to my regexp since it works decently well but I'll do some experiments
>with this one with regards to that "alternate routine" I knotted together.
Just make sure your code works for filenames with more than one dot in them. They are becoming ever more common.
// getting the extension of a filename
// This code is much faster than any regex technique.

// filename without the extension
String choppedFilename;

// extension without the dot
String ext;

// where the last dot is. There may be more than one.
int dotPlace = filename.lastIndexOf( '.' );

if ( dotPlace >= 0 ) {
    // possibly empty
    choppedFilename = filename.substring( 0, dotPlace );

    // possibly empty
    ext = filename.substring( dotPlace + 1 );
} else {
    // was no extension
    choppedFilename = filename;
    ext = "";
}
Roedy Green - 02 Feb 2006 07:08 GMT
On Thu, 02 Feb 2006 05:31:38 GMT, Roedy Green <my_email_is_posted_on_my_website@munged.invalid> wrote, quoted or indirectly quoted someone who said :
>// getting the extension of a filename
>// This code is much faster than any regex technique.
For future reference, that code is accessible via http://mindprod.com/jgloss/file.html or http://mindprod.com/jgloss/extension.html
Oliver Wong - 02 Feb 2006 21:13 GMT
>>Thanks for pointing it out, I totally overlooked this one. As if that
>>wasn't
[quoted text clipped - 6 lines]
> just make sure your code works for filenames with more than one dot in
> them. They are becoming ever more common.
Or, for that matter, filenames with zero dots in them.
- Oliver
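The two cautions above (more than one dot, or no dot at all) can be folded into one small helper built on String.lastIndexOf. This is only a sketch under stated assumptions: the class name ExtensionSwap and the method withExtension are hypothetical, and the check against the last path separator is an addition that nobody in the thread proposed. It is there so that a dot in a directory name is not mistaken for the start of an extension.

```java
class ExtensionSwap {
    // Hypothetical helper: returns filename with its extension replaced by
    // newExt (given WITHOUT the leading dot). If the last path component has
    // no dot, the extension is simply appended.
    static String withExtension(String filename, String newExt) {
        // Position of the last path separator, or -1 if there is none.
        int slash = Math.max(filename.lastIndexOf('/'), filename.lastIndexOf('\\'));
        // Position of the last dot, or -1 if there is none.
        int dot = filename.lastIndexOf('.');
        // Only a dot after the last separator starts an extension.
        String base = (dot > slash) ? filename.substring(0, dot) : filename;
        return base + "." + newExt;
    }

    public static void main(String[] args) {
        System.out.println(withExtension("filename.extension", "crt"));     // prints filename.crt
        System.out.println(withExtension("archive.tar.gz", "crt"));         // prints archive.tar.crt
        System.out.println(withExtension("README", "crt"));                 // prints README.crt
        System.out.println(withExtension("/home/peter.d/filename", "crt")); // prints /home/peter.d/filename.crt
    }
}
```

Note that a name consisting only of a leading dot plus text, such as ".bashrc", is treated here as having an empty base name, just as in the lastIndexOf code quoted earlier; whether that is desirable depends on the application.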
Roedy Green - 01 Feb 2006 02:20 GMT
> String filename = "filename.extension";
> String regexp = "\..*$";
> String new_ext = ".crt";
>
> System.out.println( filename.replaceAll(regexp, new_ext));
see http://mindprod.com/jgloss/string.html#REPLACE
and http://mindprod.com/jgloss/regex.html
But as I said, the practical solution is to avoid regexes. You can have multiple dots in names.
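For anyone who does want to stay with a regex, the multiple-dot concern can be illustrated with a short sketch. The class and method names are hypothetical. The first method uses the pattern from the thread, which matches from the first dot onward; the second is a variant, not taken from the thread, that only matches the last dot and whatever follows it.

```java
class MultiDotDemo {
    // Pattern from the thread: a literal dot, then everything up to the end.
    // The match starts at the FIRST dot in the string.
    static String fromFirstDot(String name, String ext) {
        return name.replaceAll("\\..*$", ext);
    }

    // Variant: a literal dot followed only by non-dot characters up to the
    // end, so the match can only start at the LAST dot.
    static String fromLastDot(String name, String ext) {
        return name.replaceAll("\\.[^.]*$", ext);
    }

    public static void main(String[] args) {
        System.out.println(fromFirstDot("archive.tar.gz", ".crt")); // prints archive.crt
        System.out.println(fromLastDot("archive.tar.gz", ".crt"));  // prints archive.tar.crt
        // With no dot there is nothing to match, so nothing is replaced.
        System.out.println(fromLastDot("README", ".crt"));          // prints README
    }
}
```

Neither pattern adds an extension to a name that has no dot, so that case still needs a separate check.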
Lion-O - 02 Feb 2006 00:42 GMT
>> String filename = "filename.extension";
>> String regexp = "\..*$";
>> String new_ext = ".crt";
>>
>> System.out.println( filename.replaceAll(regexp, new_ext));
> and http://mindprod.com/jgloss/regex.html
Thanks for the link, but I had already looked there, having seen your pointers several times throughout these newsgroups :-) And now that I'm on the subject I'd like to compliment you on the several sections; so far all of the ones I've read are very usable and good to read. Even in lynx, which I really appreciate.
My initial problem with the information on the website was not being open to the obvious. I knew \ to be an escape character in regexps on *nix systems, and to some extent in Java, but I never put the two together. Stupid, yes. So I totally skipped (/overlooked) the parts where you wrote about \ which has to be escaped as well.
> But as I said, the practical solution is to avoid regexes. you can have
> multiple dots in names.
True, but in this case all is well. This isn't the full code and I already wrote a method explicitly aimed at validating the entries. So in this case it will check if there actually is a . being used and if so replace everything after it with a new string (/extension).
Even so, I'll have a closer look at your examples tomorrow evening.
Roedy Green - 02 Feb 2006 05:47 GMT
> And now that I'm on the subject
>I'd like to compliment you with the several sections, so far all of the ones
>I've read are very usable and good to read. Even in lynx which I really
>appreciate.
The reason for that is I do everything with CSS. The markup itself is very vanilla, which Lynx can deal with. It can either ignore the CSS style sheet or use it for hints on how to render things in a texty way.
I also validate with HTMLValidator which makes me give alt="xxx" text descriptions to every image, which can help in text-only browsing.
Having all my tags balance and nest properly helps, too. IE is too forgiving. It renders crap. People test with only IE and think they are done, blaming other browsers when those fail to render their pages, when the problems are actually grammatical errors in the HTML.
Part of the reason the more intricate parts of the HTML are perfect is because I generate them with HTML static macros rather than hand coding them the way most other people do.
All this work is paying off. My site is much faster than it would be if I tried to do this with JSP.
Roedy Green - 02 Feb 2006 06:57 GMT
>. Stupid, yes. So I totally
>skipped (/overlooked) the parts where you wrote about \ which has to be escaped
>as well.
I have redone that section with a table and two hammerhead sharks. Hopefully those who come after you won't so easily ignore those warnings.
Roedy Green - 02 Feb 2006 06:58 GMT
On Thu, 02 Feb 2006 06:57:22 GMT, Roedy Green <my_email_is_posted_on_my_website@munged.invalid> wrote, quoted or indirectly quoted someone who said :
>I have redone that section with a table and two hammerhead sharks.
>Hopefully those who come after you won't so easily ignore those
>warnings.
If you want to see this, try http://mindprod.com/jgloss/regex.html#RECIPES