Java Forum / First Aid / February 2008
indexOf
Jean Pierre Daviau - 21 Feb 2008 11:17 GMT
Hi to everyone and all of you ,
How comes that the checkExtensions method does not work and the tedious if( temp.indexOf("mov") != -1 ) do?
======
public static boolean checkExtensions(String tmp){
    String ext[] = {"html", "htm", "lnk", "mov", "avi", "psd", "ai", "ps", "tif", "nws", "txt", "raw", "pdf"};
    for (int i=0; i<ext.length ;i++ ) {
        if(tmp.indexOf(ext[i]) != -1){
            return true;
        }
    }
    return false;
}

public static void walk(File file) {
    String temp = file.toString();
    /*
    if (checkExtensions(temp)){
        if(!noFiles) to_preHTML(temp, linkFiles);
    }
    */
    if( temp.indexOf("mov") != -1 || temp.indexOf("pdf") != -1 || temp.indexOf("lnk") != -1 || temp.indexOf("avi") != -1){
        if(!file.isFile()) to_preHTML(temp, true);
    }
=====
Thanks for your attention.
Jean Pierre Daviau -- windows Xp asus p4 s533/333/133 Intel(R) Celeron (R) CPU 2.00 GHz Processor Radeon7000 0x5159 agp
RedGrittyBrick - 21 Feb 2008 11:50 GMT
> Hi to everyone and all of you ,
> [quoted text clipped - 33 lines]
> =
> Thanks for your attention.
Have I missed something? It seems to work.
----------------------------------8<---------------------------------
import java.io.File;

public class TestIndexOf {
    public static void main(String[] args) {
        walk(new File("foo.mov"));
        walk(new File("bar.jpg"));
    }

    public static boolean checkExtensions(String tmp) {
        String ext[] = { "html", "htm", "lnk", "mov", "avi", "psd", "ai", "ps", "tif", "nws", "txt", "raw", "pdf" };
        for (int i = 0; i < ext.length; i++) {
            if (tmp.indexOf(ext[i]) != -1) {
                return true;
            }
        }
        return false;
    }

    public static void walk(File file) {
        String temp = file.toString();

        if (checkExtensions(temp)) {
            System.out.println("1. Valid extension for " + temp);
        } else {
            System.out.println("1. Bad extension for " + temp);
        }

        if (temp.indexOf("mov") != -1 //
                || temp.indexOf("pdf") != -1 //
                || temp.indexOf("lnk") != -1 //
                || temp.indexOf("avi") != -1) {
            System.out.println("2. Valid extension for " + temp);
        } else {
            System.out.println("2. Bad extension for " + temp);
        }
    }
}
----------------------------------8<---------------------------------
1. Valid extension for foo.mov
2. Valid extension for foo.mov
1. Bad extension for bar.jpg
2. Bad extension for bar.jpg

> Jean Pierre Daviau
> --
> windows Xp
> asus p4 s533/333/133
> Intel(R) Celeron (R) CPU 2.00 GHz
> Processor Radeon7000 0x5159 agp
SigSeparatorException: A valid separator is hyphen hyphen space newline.
SigLocationException: Normally the name appears below the separator.
 Signature RGB Cranium Hom.Sap. ~1.5 Kg grey matter.
Eric Sosman - 21 Feb 2008 13:33 GMT
> Hi to everyone and all of you ,
> [quoted text clipped - 29 lines]
> to_preHTML(temp, true);
> }
I see nothing wrong with the code. In what way does it fail to "work?"
Incidentally, both methods may give unintended results for certain files like "RavishingBeauty.jpeg" ...
 Signature Eric Sosman esosman@ieee-dot-org.invalid
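Eric Sosman's caveat is easy to reproduce. The following is a minimal sketch rather than code from the thread: the extension list is copied from the original post, while the class name `SubstringFalsePositive` and the method name `containsAny` are made up for illustration. It shows the indexOf test accepting a name whose real extension is not in the list, because "avi" happens to occur inside "RavishingBeauty".

```java
final class SubstringFalsePositive {
    // Extension list copied from the original post.
    private static final String[] EXT = {
        "html", "htm", "lnk", "mov", "avi", "psd", "ai",
        "ps", "tif", "nws", "txt", "raw", "pdf"
    };

    // Same logic as the poster's checkExtensions:
    // true if any entry occurs ANYWHERE in the name.
    static boolean containsAny(String name) {
        for (String ext : EXT) {
            if (name.indexOf(ext) != -1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAny("foo.mov"));              // true, as intended
        System.out.println(containsAny("RavishingBeauty.jpeg")); // true, only because of the "avi" in "Ravishing"
        System.out.println(containsAny("bar.jpg"));              // false
    }
}
```

Both the loop and the hand-written chain of indexOf calls share this behaviour, since they perform the same substring search.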
Roedy Green - 21 Feb 2008 16:08 GMT
On Thu, 21 Feb 2008 06:17:28 -0500, "Jean Pierre Daviau" <Once@WasEno.ugh> wrote, quoted or indirectly quoted someone who said
>public static boolean checkExtensions(String tmp){
> [quoted text clipped - 9 lines]
>return false;
>}
Some improvements:
1. your list of extensions can be static final. No need to build the list on every call.
2. you can use for:each syntax on your loop. see http://mindprod.com/jgloss/jcheat.html
3. I would compare >=0 rather than != -1. Comparing against 0 is faster than against other integers. It also guards against a freaky -2. It is also more conventional, and hence easier for other programmers to read.
4. you don't really want indexOf. file "external.doc" would pass your "ext" check. You want endsWith including a dot.
See http://mindprod.com/products1.html#FILTER. It has source code for an extension filter that does just this for you.
--
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
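Points 1, 2 and 4 from Roedy Green's list can be combined in a few lines. This is a sketch under assumptions, not his filter from mindprod.com: the class name `ExtensionCheck` is hypothetical, the table is the one from the original post, and the dot is prepended at comparison time.

```java
final class ExtensionCheck {
    // Point 1: the table is built once, when the class loads, not on every call.
    private static final String[] EXTENSIONS = {
        "html", "htm", "lnk", "mov", "avi", "psd", "ai",
        "ps", "tif", "nws", "txt", "raw", "pdf"
    };

    static boolean checkExtensions(String name) {
        // Point 2: for-each loop instead of an index variable.
        for (String ext : EXTENSIONS) {
            // Point 4: the name must END with a dot followed by the extension.
            if (name.endsWith("." + ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(checkExtensions("clip.mov"));  // true
        System.out.println(checkExtensions("movie.doc")); // false: "mov" appears, but not as the extension
        System.out.println(checkExtensions("CLIP.MOV"));  // false: endsWith is case-sensitive
    }
}
```

Note that endsWith compares case-sensitively, so upper-case extensions would need separate handling if they matter.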
Jean Pierre Daviau - 22 Feb 2008 20:19 GMT
> Hi to everyone and all of you ,
> [quoted text clipped - 30 lines]
> }
> ====
Na. It is no joke ;-)
Jean Pierre Daviau - 23 Feb 2008 19:23 GMT
public static boolean checkExtentions(String tmp){
    int debut;
    int fin = tmp.length();

    for (int i=0; i<extentions.length ;i++ ) {
        if(extentions[i] != null
                && (debut = tmp.lastIndexOf('.')) != -1
                && tmp.substring(debut+1, fin).equals(extentions[i])) {
            return true;
        }
    }
    return false;

}
Lew - 23 Feb 2008 20:09 GMT
> public static boolean checkExtentions(String tmp){
> [quoted text clipped - 12 lines]
>
> }
Roedy Green suggested:
>> 4. you don't really want indexOf. ... You want endsWith including a dot.
That would make your code a lot simpler.
<http://java.sun.com/javase/6/docs/api/java/lang/String.html#endsWith(java.lang.String)>
 Signature Lew
Eric Sosman - 23 Feb 2008 20:15 GMT
> public static boolean checkExtentions(String tmp){
> [quoted text clipped - 12 lines]
>
> }
You're working much too hard. First, observe that tmp.lastIndexOf('.') will return the same result every time you call it, so debut can be computed once before the loop starts rather than multiple times inside it. Second, tmp.substring(debut+1, tmp.length()) is the same as tmp.substring(debut+1), and easier to write. Third, you could use the endsWith() method:

public static boolean checkExtentions(String tmp) {
    if (tmp.lastIndexOf('.') < 0)
        return false;
    int fin = tmp.length() - 1;
    for (String ext : extentions) {
        if (ext != null && tmp.endsWith(ext)
                && tmp.charAt(fin - ext.length()) == '.')
            return true;
    }
    return false;
}

Finally, you can simplify things even more by setting up the table of extensions to hold ".html", ".mov" and so on instead of just "html", "mov". Do that, and the messy-looking test becomes straightforward:

public static boolean checkExtentions(String tmp) {
    for (String ext : extentions) {
        if (tmp.endsWith(ext))
            return true;
    }
    return false;
}

... where the test for null has also been omitted, because it's your own private array and you're probably smart enough not to put null elements in it.
 Signature Eric Sosman esosman@ieee-dot-org.invalid
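Eric Sosman's first two observations also stand on their own, without switching to endsWith(). A minimal sketch, assuming a shortened, hypothetical extension table and the made-up class name `ExtensionBySubstring`: the dot position is computed once before the loop, and the one-argument substring() extracts the extension.

```java
final class ExtensionBySubstring {
    // Shortened, hypothetical table; the poster's real array is not shown in the thread.
    private static final String[] EXTENSIONS = {"html", "mov", "pdf"};

    static boolean checkExtensions(String tmp) {
        // First point: lastIndexOf('.') does not change between iterations,
        // so compute it once, before the loop.
        int debut = tmp.lastIndexOf('.');
        if (debut < 0) {
            return false; // no dot, hence no extension
        }
        // Second point: substring(debut + 1) runs to the end of the string,
        // so the explicit end index is unnecessary.
        String found = tmp.substring(debut + 1);
        for (String ext : EXTENSIONS) {
            if (found.equals(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(checkExtensions("archive.old.mov")); // true: only the last dot counts
        System.out.println(checkExtensions("notes.movie"));     // false: "movie" is not in the table
        System.out.println(checkExtensions("mov"));             // false: no dot at all
    }
}
```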
Lew - 23 Feb 2008 20:33 GMT
> Finally, you can simplify things even more by setting up
> the table of extensions to hold ".html", ".mov" and so on
[quoted text clipped - 12 lines]
> it's your own private array and you're probably smart enough
> not to put null elements in it.
You do have to check that tmp != null, since it's a public method.
 Signature Lew
Eric Sosman - 23 Feb 2008 21:18 GMT
>> Finally, you can simplify things even more by setting up
>> the table of extensions to hold ".html", ".mov" and so on
[quoted text clipped - 14 lines]
>
> You do have to check that tmp != null, since it's a public method.
"Have to" seems too strong. There's certainly no Big Rule saying that all public methods must check for null arguments. Besides, what difference would it make? For a method like the one above, does it really matter whether a NullPointerException or an IllegalArgumentException is thrown?
If the tmp argument were being stashed in some other data structure where it might provoke an NPE at some far-removed point, that'd be another matter, and throwing IAE would be a good idea.
 Signature Eric Sosman esosman@ieee-dot-org.invalid
Lew - 23 Feb 2008 22:18 GMT
>>> Finally, you can simplify things even more by setting up
>>> the table of extensions to hold ".html", ".mov" and so on
[quoted text clipped - 25 lines]
> point, that'd be another matter, and throwing IAE would be a
> good idea.
You raise good points. I should have said, "should", rather than "have to", in the sense that it's often a best practice to have an explicit null check, but not always.
I propose a few rules of thumb:
Argument-verification tests should be explicit where feasible, in order to make code self-documenting. As Eric points out, sometimes it really doesn't matter and you skip it.
You wouldn't just toss or catch an exception if you can help it, given a strategy that checkExtentions() return 'false' for any invalid argument. A gram of prevention is worth a kilo of cure. One should check for null explicitly instead of using the NPE as a logic switch. Exceptions are interruptive, so preventing them when they aren't required clarifies code and improves performance.
A logging aspect would also benefit from detection of null arguments in order to log them, even if one then (re)throws a (Runtime)Exception.
 Signature Lew
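Lew's second rule of thumb, that a method whose policy is to return false for any invalid argument should test for null explicitly rather than rely on a NullPointerException, could look roughly like this. It is a sketch only: the class name `NullSafeExtensionCheck` and the shortened dotted table are assumptions for illustration.

```java
final class NullSafeExtensionCheck {
    // Shortened, hypothetical table with the dots already included.
    private static final String[] EXTENSIONS = {".html", ".mov", ".pdf"};

    static boolean checkExtensions(String tmp) {
        // Explicit, self-documenting policy: an invalid argument is simply "no match".
        // No exception is raised and none needs to be caught by the caller.
        if (tmp == null) {
            return false;
        }
        for (String ext : EXTENSIONS) {
            if (tmp.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(checkExtensions(null));      // false, by policy
        System.out.println(checkExtensions("foo.mov")); // true
    }
}
```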
Hendrik Maryns - 25 Feb 2008 12:29 GMT
Lew wrote:
>>>> Finally, you can simplify things even more by setting up
>>>> the table of extensions to hold ".html", ".mov" and so on
[quoted text clipped - 45 lines]
> A logging aspect would also benefit from detection of null arguments in
> order to log them, even if one then (re)throws a (Runtime)Exception.
Furthermore, when debugging where this exception is coming from, seeing a stacktrace which says the cause is checkExtension would make clear something is going wrong there, whereas if the check is not there, you will see the NPE is provoked in String.endsWith(), and you have to go back in the stacktrace to find that the cause is really in checkExtension (or even higher). I think it is a good idea to try to keep stacktraces as short as possible.
H.
 Signature Hendrik Maryns http://tcl.sfs.uni-tuebingen.de/~hendrik/ ================== http://aouw.org Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
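Hendrik Maryns's point about stack traces can be illustrated by throwing at the top of the method, so that the first frame of the trace is the method that received the bad argument rather than String.endsWith(). This is a sketch under assumptions: the class name `FailFastExtensionCheck`, the message text and the shortened table are made up, and IllegalArgumentException is chosen only because it was mentioned earlier in the thread.

```java
final class FailFastExtensionCheck {
    // Shortened, hypothetical table with the dots already included.
    private static final String[] EXTENSIONS = {".html", ".mov", ".pdf"};

    static boolean checkExtensions(String tmp) {
        // Thrown here, the exception's top stack frame is checkExtensions itself.
        if (tmp == null) {
            throw new IllegalArgumentException("tmp must not be null");
        }
        for (String ext : EXTENSIONS) {
            if (tmp.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        try {
            checkExtensions(null);
        } catch (IllegalArgumentException e) {
            // Report which method the exception originated in.
            StackTraceElement top = e.getStackTrace()[0];
            System.out.println(e.getMessage() + " (thrown in " + top.getMethodName() + ")");
        }
    }
}
```

Whether to throw, as here, or to return false is the policy question debated above; the sketch only shows where the exception originates.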
Jean Pierre Daviau - 25 Feb 2008 22:15 GMT
> if (tmp.endsWith(ext))
Thanks for endsWith()
Roedy Green - 23 Feb 2008 20:43 GMT
On Thu, 21 Feb 2008 06:17:28 -0500, "Jean Pierre Daviau" <Once@WasEno.ugh> wrote, quoted or indirectly quoted someone who said
>How comes that the checkExtensions method does not work and the
>tedious if( temp.indexOf("mov") != -1 ) do?
Here is the way I handled the problem of good, bad and iffy extensions:
/*
copyright (c) 2002-2008 Roedy Green, Canadian Mind Products
#101 - 2536 Wark Street
Victoria, BC Canada V8T 4G8
tel: (250) 361-9093
http://mindprod.com
Source and executables may be freely used for any purpose except military.

version history

version 1.0 initial
1.1 allow multiple files on the command line.
    trim leading and trailing blank lines.
    ensures consistent use of \r\n on Windows, or equivalent for platform.
    ensures file ends with exactly one \r\n
1.2 2005-07-27 add more bad extensions.
1.3 2005-07-16 add Javadoc
1.4 2006-03-05 reformat with IntelliJ, add Javadoc.
1.5 2007-06-07 add pad, icon.
*/
package com.mindprod.dedup;

import com.mindprod.common11.StringTools;

import java.awt.*;
import java.io.*;
/**
 * <pre>
 * Removes adjacent duplicate lines from a text file.
 * Trims trailing blanks on each line.
 * Trims leading and trailing blank lines.
 * If nothing changed, file date will not be disturbed.
 * Case sensitive compare, Only compares adjacent lines. Does not sort the file
 * first.
 * converts all Unix, DOS, or Mac line terminators to the platform style.
 * <p/>
 * usage: java com.mindprod.dedup.DeDup MySource.txt another.txt
 * or with JET:
 * dedup.exe MySource.txt another.txt
 * </pre>
 *
 * @author Roedy Green, Canadian Mind Products
 * @version 1.5, 2007-06-24
 */
public final class DeDup {
    // ------------------------------ FIELDS ------------------------------

    /**
     * which line end convention do we use
     */
    static boolean unix = false;

    /**
     * input "before" file name
     */
    static String inFilename;

    /**
     * output "after" file name, the temporary, later renamed to match the input
     */
    static String outFilename;

    private static final String RELEASEDATE = "2007-06-24";

    private static String TITLESTRING = "DeDup";

    private static final String VERSIONSTRING = "1.5";

    /**
     * input "before" reader
     */
    static BufferedReader inReader;

    /**
     * input "before" file
     */
    static File inFile;

    /**
     * output "after" file
     */
    static File outFile;

    /**
     * output "after" file writer
     */
    static PrintWriter outWriter;
    // don't need undisplayed copyright notice, since have banner.

    /**
     * extensions known unsafe to run DeDup on.
     */
    static final String[] badExtensions = {
            "ans", "asm", "bat", "batfrag", "blk", "bmp", "bod", "btm", "btmfrag",
            "c", "cfrag", "class", "cmd", "com", "cpp", "cppfrag", "css", "cssfrag",
            "csv", "csvfrag", "dat", "dll", "doc", "e", "exe", "gif", "h", "hfrag",
            "hpp", "hppfrag", "htm", "html", "htmlfrag", "ico", "ih", "ini", "jar",
            "java", "javafrag", "jnlp", "jnlpfrag", "jpg", "jsp", "jspfrag", "mac",
            "mbx", "mft", "obj", "p7b", "pas", "png", "policy", "prn", "properties",
            "ps", "rh", "seq", "ser", "sh", "site", "so", "sql", "sqlfrag", "sym",
            "tab", "toc", "use", "usg", "wiki", "xml", "xmlfrag", "zip", };
    /**
     * extensions known safe to run DeDup on.
     */
    static final String[] goodExtensions = { "ctl", "list", "log", "lst", "txt", };

    // -------------------------- STATIC METHODS --------------------------

    /**
     * display a banner about the author
     */
    static void banner() {
        /* Usually not displayed, just embedded. */

        System.out
                .println( TITLESTRING
                        + " "
                        + VERSIONSTRING
                        + "\n"
                        + "\nFreeware to remove adjacent duplicate lines."
                        + "\ncopyright (c) 2002-2008 Roedy Green, Canadian Mind Products"
                        + "\n#101 - 2536 Wark Street, Victoria, BC Canada V8T 4G8"
                        + "\nTelephone: (250) 361-9093 Internet:roedyg@mindprod.com"
                        + "\nMay be used freely for non-military use only\n"
                        + "released: "
                        + RELEASEDATE
                        + "\n\n" );
    }// end banner
    /**
     * Ask user to confirm that some action is ok.
     *
     * @param prompt Question to ask the user.
     *
     * @return true if the user answers, yes it is ok to proceed. Should redo this with a modal dialog so don't have to
     *         hit Y enter.
     */
    static boolean confirm( String prompt ) {
        /* just give a warning */
        System.out.print( prompt );

        System.out.print( " (Y)es (N)o " );
        while ( true ) {/* loop forever till user enters Y or N */
            honk();
            int response = '\033';// default esc

            try {
                // read single keystroke, even though user has to hit enter.
                response = System.in.read();// the console is a
                // fileInputReader
            } catch ( IOException e ) {
            }

            response = Character.toUpperCase( (char) response );

            switch ( response ) {
                case 'Y':
                    System.out.println( " Yes" );
                    return true;

                case 'N':
                    System.out.println( " No" );
                    return false;

                /* others, keep looping */
            }// end switch
        }// end while
    }// end confirm
    /**
     * Guts of the class. This is the dedup logic. copy inReader to outWriter, processing tabs and line ends Presume
     * files already open. Does not close them.
     *
     * @throws IOException
     */
    static void deDupFile() throws IOException {
        String prevLine = null;
        String thisLine;
        boolean inLeading = true;
        boolean pendingBlankLine = false;

        while ( ( thisLine = inReader.readLine() ) != null ) {
            thisLine = StringTools.trimTrailing( thisLine );
            if ( thisLine.length() == 0 ) {
                pendingBlankLine = true;
            } else if ( !thisLine.equals( prevLine ) ) {
                // deal first with and pending blank lines
                if ( inLeading ) {
                    // ignore leading blank lines.
                    inLeading = false;
                    pendingBlankLine = false;
                } else {
                    if ( pendingBlankLine ) {
                        // emit just one embedded blank line, collapse dup blank
                        // lines.
                        outWriter.println();
                        pendingBlankLine = false;
                    }
                }
                // deal with the unique line
                outWriter.println( thisLine );
                prevLine = thisLine;
            }
        }/* end while */
        // fall out the end with pendingBlankLine we just totally ignore.
        // that is how we trim trailing blanks.
    }// end deDupFile
    /**
     * abort the run, clean up as best as possible.
     */
    static void die() {
        honk();
        try {
            if ( inReader != null ) {
                inReader.close();
            }
            if ( outWriter != null ) {
                outWriter.close();
            }
        } catch ( IOException e ) {
        }
        System.exit( 1 );/* exit with errorlevel = 1 */
    }// end die
    /**
     * make sure the filename we are about to process has a safe extension.
     */
    static void ensureSafeFilename() {
        /*
         * Ensure appropriate file name extensions.
         * good =.txt etc - done without prompt
         * bad =.exe etc. - abort
         * warning =.doc & others - ask
         */

        String extension = "";
        int whereDot = inFilename.lastIndexOf( '.' );
        if ( whereDot >= 0 && whereDot <= inFilename.length() - 2 ) {
            extension = inFilename.substring( whereDot + 1 );
        }

        for ( int i = 0; i < goodExtensions.length; i++ ) {
            if ( extension.equalsIgnoreCase( goodExtensions[ i ] ) ) {/* match, it is Good */
                return;
            }
        }

        for ( int i = 0; i < badExtensions.length; i++ ) {
            if ( extension.equalsIgnoreCase( badExtensions[ i ] ) ) {/* match, it is bad */
                inFile = null;
                return;
            }
        }
        /* just give a warning */
        if ( !confirm( "\n Warning!\n"
                + " DeDup is not usually used on files such as "
                + inFilename
                + ".\n"
                + " Do you want to dedup anyway?" ) ) {
            inFile = null;
        }
    }// end ensureSafeFilename
    /**
     * make a noise
     */
    static void honk() {
        Toolkit.getDefaultToolkit().beep();
    }// end honk

    /**
     * open the input "before" file
     */
    static void openInReader() {
        try {
            inFile = new File( inFilename );
            if ( !inFile.exists() ) {
                banner();
                System.out.print( "Oops! Cannot find file " );
                System.out.println( inFilename );
                die();
            }
            // ignore directories, usually put there by wildcard expansion.
            if ( inFile.isDirectory() ) {
                inFile = null;
                // keep going
                return;
            }

            if ( !inFile.canRead() ) {
                banner();
                System.out
                        .print( "Oops! no permission to read (i.e. examine) the file " );
                System.out.println( inFilename );
                die();
            }
            if ( !inFile.canWrite() ) {
                banner();
                System.out
                        .print( "Oops! no permission to write (i.e. change) the file " );
                System.out.println( inFilename );
                die();
            }

            inReader = new BufferedReader( new FileReader( inFile ),
                    4096 /* buffsize */ );
        } catch ( FileNotFoundException e ) {
            banner();
            System.out.print( "Oops! Cannot open file " );
            System.out.println( inFilename );
            die();
        }
    }// end openInReader

    /**
     * open the output "after" file
     */
    static void openOutWriter() {
        try {
            // get a temporary file in the same directory as inFile.
            // outFile = getTempFile("DeDup", inFile);
            outFile = File.createTempFile( "dedup", "tmp", inFile
                    .getParentFile() );

            outWriter = new PrintWriter( new BufferedWriter( new FileWriter(
                    outFile ), 64 * 1024 /* buffsize */ ), false /* auto flush */ );
        } catch ( IOException e ) {
            System.out
                    .println( "Oops! Cannot create the temporary work file\n" );
            die();
        }
    }// end OpenOutWriter
    // --------------------------- main() method ---------------------------

    /**
     * Command line utility to remove adjacent duplicate lines.
     *
     * @param args list of filenames to dedup.
     */
    public static void main( String[] args ) {
        try {
            // process each file on command line, or expanded wild card.
            for ( int i = 0; i < args.length; i++ ) {
                inFilename = args[ i ];

                openInReader();/* Open input "before" file. */
                /* Make sure file exists before */
                /* song and dance about extension. */
                if ( inFile == null ) {
                    /* ignore */
                    System.out
                            .println( "- "
                                    + inFilename
                                    + " : could not open. Directory or unreadable file" );
                    continue;
                }
                ensureSafeFilename();/* make sure filename has sane extension */

                if ( inFile == null ) {
                    /* ignore */
                    System.out
                            .println( "- "
                                    + inFilename
                                    + " : bypassed based on extension" );
                    continue;
                }
                openOutWriter();/* open output "after" file */

                /*
                 * copy inReader to outWriter removing duplicate lines, trailing
                 * spaces, and lead/trailing blank lines
                 */
                deDupFile();

                /*
                 * if we trimmed, changed line ends, removed dups, file size
                 * should change. In a pathological case it would not, but then
                 * we do no damage.
                 */

                inReader.close();
                outWriter.close();
                if ( inFile.length() == outFile.length() ) {
                    // nothing changed
                    System.out
                            .println( "- "
                                    + inFilename
                                    + " : contained no duplicate lines. Left as is." );
                } else {
                    // file really did change.
                    System.out.println( "* " + inFilename + " : changed!" );
                    /* Rename output to input */
                    inFile.delete();
                    outFile.renameTo( inFile );
                    // don't delete outFile, it has been renamed to a real file
                }
            }// end for
        } catch ( IOException e ) {
            System.out.print( "Oops! IO failure. e.g. cannot find file.\n" );
            die();
        }
    }// end main
}
--
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
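The good, bad and iffy triage in ensureSafeFilename() can also be read as a small pure function, separate from the file handling and the console prompt. The following is a sketch under assumptions, not part of DeDup: the class `ExtensionTriage`, the enum `Verdict` and the heavily abbreviated bad-extension table are hypothetical, while the dot-position test and the case-insensitive comparison follow the listing.

```java
final class ExtensionTriage {
    // Hypothetical result type: the listing signals these cases through inFile and confirm().
    enum Verdict { GOOD, BAD, ASK }

    // Good list copied from the listing; bad list abbreviated for illustration.
    private static final String[] GOOD_EXTENSIONS = {"ctl", "list", "log", "lst", "txt"};
    private static final String[] BAD_EXTENSIONS = {"class", "exe", "zip"};

    // Text after the last dot, or "" when there is no dot or nothing follows it.
    static String extensionOf(String filename) {
        int whereDot = filename.lastIndexOf('.');
        if (whereDot >= 0 && whereDot <= filename.length() - 2) {
            return filename.substring(whereDot + 1);
        }
        return "";
    }

    static Verdict classify(String filename) {
        String extension = extensionOf(filename);
        for (String good : GOOD_EXTENSIONS) {
            if (extension.equalsIgnoreCase(good)) {
                return Verdict.GOOD; // process without asking
            }
        }
        for (String bad : BAD_EXTENSIONS) {
            if (extension.equalsIgnoreCase(bad)) {
                return Verdict.BAD; // skip the file
            }
        }
        return Verdict.ASK; // unknown extension: let the user decide
    }

    public static void main(String[] args) {
        System.out.println(classify("notes.TXT")); // GOOD
        System.out.println(classify("setup.exe")); // BAD
        System.out.println(classify("README"));    // ASK
    }
}
```

Because the comparison is against the whole extension rather than a substring, names such as "RavishingBeauty.jpeg" are not mistaken for a listed type.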