When I use "|" as a delimiter, my program seems to take each character
as a new token, instead of tokenizing at the bar. When I use ":" as the
delimiter, I get what I expect when reading "delim.txt".
Am I doing something incorrectly?
"delim.txt":
100:|first:|second:|third
200:|alpha:|beta:|gamma
300:|roy:|gee:|biv
Here is the program:
package scantext;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class TheClass {

    private static void readFile(String filename, String delim) {
        try {
            File file = new File(filename);
            Scanner scanner = new Scanner(file);
            scanner.useDelimiter(System.getProperty("line.separator"));
            while (scanner.hasNext()) {
                System.out.println();
                String theNext = scanner.next();
                System.out.println(theNext);
                parseline(theNext, delim);
            }
            scanner.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    public static void parseline(String line, String delim) {
        Scanner lineScanner = new Scanner(line);
        lineScanner.useDelimiter(delim);
        System.out.println("delimiter is >" + lineScanner.delimiter() + "<");
        int a = lineScanner.nextInt();
        String b = lineScanner.next();
        String c = lineScanner.next();
        String d = lineScanner.next();
        System.out.println(
            "a = " + a +
            ", b= " + b +
            ", c = " + c +
            ", d = " + d);
    }

    public static void main(String[] args) {
        readFile("delim.txt", "|");
        System.out.println();
        readFile("delim.txt", ":");
    }
}
The output is:
100:|first:|second:|third
delimiter is >|<
a = 1, b= 0, c = 0, d = :
200:|alpha:|beta:|gamma
delimiter is >|<
a = 2, b= 0, c = 0, d = :
300:|roy:|gee:|biv
delimiter is >|<
a = 3, b= 0, c = 0, d = :
100:|first:|second:|third
delimiter is >:<
a = 100, b= |first, c = |second, d = |third
200:|alpha:|beta:|gamma
delimiter is >:<
a = 200, b= |alpha, c = |beta, d = |gamma
300:|roy:|gee:|biv
delimiter is >:<
a = 300, b= |roy, c = |gee, d = |biv
cumin - 26 Jan 2007 17:13 GMT
> When I use "|" as a delimiter, my program seems to take each character
> as a new token, instead of tokenizing at the bar. When I use ":" as the
[quoted text clipped - 79 lines]
> delimiter is >:<
> a = 300, b= |roy, c = |gee, d = |biv
never mind...
Alex Hunsley - 28 Jan 2007 17:24 GMT
>>When I use "|" as a delimiter, my program seems to take each character
>>as a new token, instead of tokenizing at the bar. When I use ":" as the
>>delimiter, I get what I expect when reading "delim.txt".
>>
>>Am I doing something incorrectly?
[snip]
>>a = 300, b= |roy, c = |gee, d = |biv
>
> never mind...
Glad you got it sorted. It's often useful in these circumstances (where
you work out the problem yourself) to say what the problem was - it's
helpful to others (educational) - and shows you're interested in
contributing, rather than just getting a question answered.
lex
Andreas Leitgeb - 26 Jan 2007 17:18 GMT
> When I use "|" as a delimiter, my program seems to take each character
> as a new token, instead of tokenizing at the bar. When I use ":" as the
> delimiter, I get what I expect when reading "delim.txt".
The String you pass to useDelimiter() is compiled as a regex Pattern.
As a (regex-)Pattern, "|" means "empty or empty", which matches the
empty string. The empty string matches at every position in your
input, so each char becomes a token on its own.
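
To make the zero-length matches visible, here is a minimal sketch that lists every place the regex "|" matches in a short string. The class name EmptyMatchDemo and the helper matches() are made up for illustration; only java.util.regex.Pattern and Matcher are real library classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchDemo {

    // Hypothetical helper: returns the {start, end} offsets of every
    // match of regex in input, in the order Matcher.find() reports them.
    public static List<int[]> matches(String regex, String input) {
        List<int[]> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(new int[] { m.start(), m.end() });
        }
        return found;
    }

    public static void main(String[] args) {
        // "|" is an alternation of two empty branches, so every match
        // has start == end: it never consumes the pipe character itself.
        for (int[] span : matches("|", "1:|a")) {
            System.out.println("match at " + span[0] + ".." + span[1]);
        }
    }
}
```

Every reported match has the same start and end offset, which is why Scanner finds a "delimiter" between each pair of characters.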
Use "\\|" to specify a pattern that matches only the literal
pipe-char. (You need two backslashes because parsing the Java string
literal "eats" one of them, leaving the regex \| for the Pattern.)
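
A short sketch of the fix, run against the first line of "delim.txt". The class name PipeDelimiterDemo and the helper tokens() are hypothetical. Pattern.quote() is shown as an alternative to hand-escaping; it is a standard java.util.regex method, not something suggested in the thread. Since the sample data separates fields with ":|" rather than a bare "|", the last call quotes the whole two-character separator, which is an assumption about what the file format intends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class PipeDelimiterDemo {

    // Hypothetical helper: collects every token a Scanner yields for
    // the given delimiter regex.
    public static List<String> tokens(String line, String delimRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(line)) {
            sc.useDelimiter(delimRegex);
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "100:|first:|second:|third";

        // Escaped pipe: splits at each literal '|', colons stay attached.
        System.out.println(tokens(line, "\\|"));

        // Pattern.quote() builds the same literal match without manual escaping.
        System.out.println(tokens(line, Pattern.quote("|")));

        // Assumption: the real separator is ":|", so quote both characters.
        System.out.println(tokens(line, Pattern.quote(":|")));
    }
}
```

With "\\|" the tokens still end in ":" (for example "100:"), so nextInt() in the original parseline() would reject the first token; treating ":|" as the delimiter avoids that.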