regular expression match any character including new line
([.|/s]*)
Why doesn't this work?
MD
Arne Vajhøj - 16 Sep 2006 02:12 GMT
> regular expression match any character including new line
>
> ([.|/s]*)
>
> why this doesn't work?
I think it is common to achieve this functionality
via the Pattern.DOTALL flag.
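
A minimal sketch of the flag in use, assuming the goal is to capture the whole input in group 1. The class name DotAllDemo and the helper captureAll are made up for illustration; only Pattern.compile, Pattern.DOTALL and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical helper: returns what group 1 captured,
    // or null if the pattern does not match the entire input.
    static String captureAll(String input, int flags) {
        Matcher m = Pattern.compile("(.*)", flags).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line";

        // Without DOTALL, "." stops at the line terminator, so the full match fails.
        System.out.println(captureAll(text, 0) == null);                    // true

        // With DOTALL, "." also matches "\n", so group 1 holds the whole input.
        System.out.println(text.equals(captureAll(text, Pattern.DOTALL)));  // true
    }
}
```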
Arne
Jeffrey Schwab - 16 Sep 2006 03:00 GMT
> regular expression match any character including new line
>
> ([.|/s]*)
A . inside a character class matches only a literal dot; it loses its
"any character" meaning there. Outside a class, . matches any character
other than newline, and you can make it match newlines, too, by making
Pattern.DOTALL the second argument to Pattern.compile.
A | in a character class can match a literal |. The nature of the
character class effectively ORs its sub-expressions, so you do not need
to specify a pipe at all unless you want to match the character |.
The characters / and s inside a character class mean the class can match
either a literal / or literal s. To represent whitespace, use \s.
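
To see what the posted pattern actually accepts, here is a small sketch that compiles it unchanged and tries a few inputs. The class name CharClassDemo and the helper matchesWhole are illustrative names, not anything from the original post.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // The pattern from the original post, unchanged.
    static final Pattern ORIGINAL = Pattern.compile("([.|/s]*)");

    // Hypothetical helper: true if the entire input matches the pattern.
    static boolean matchesWhole(String input) {
        return ORIGINAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Only the literal characters '.', '|', '/' and 's' are in the class.
        System.out.println(matchesWhole("s/.|s"));  // true
        System.out.println(matchesWhole(""));       // true, because * allows zero repetitions
        System.out.println(matchesWhole("abc"));    // false
        System.out.println(matchesWhole("\n"));     // false, newline is not in the class
    }
}
```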
> why this doesn't work?
If nobody has answered your question yet, please post a complete
program, along with sample input.
Jeffrey Schwab - 16 Sep 2006 03:03 GMT
>> regular expression match any character including new line
>>
> If nobody has answered your question yet, please post a complete
> program, along with sample input.
I just realized what you were going for. I think you mean ((?:.|\s)*).
The better solution, as Arne suggested, is
Pattern.compile(".", Pattern.DOTALL).
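
For comparison, a sketch of the alternation form written as a Java string literal, where each backslash is doubled. AlternationDemo and captureAll are hypothetical names. As far as I know, java.util.regex handles alternation inside a repeated group recursively, so on very long inputs the DOTALL version is likely the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {
    // "." covers everything except line terminators, "\s" covers the rest
    // that matters here (newline and carriage return are whitespace).
    static final Pattern ALTERNATION = Pattern.compile("((?:.|\\s)*)");

    // Hypothetical helper: returns group 1 on a full match, otherwise null.
    static String captureAll(String input) {
        Matcher m = ALTERNATION.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "line one\r\nline two";
        System.out.println(text.equals(captureAll(text)));  // true
    }
}
```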
Lasse Reichstein Nielsen - 16 Sep 2006 11:24 GMT
> regular expression match any character including new line
>
> ([.|/s]*)
>
> why this doesn't work?
You are mixing notations. The "|" means either-or outside of a
character class, but it's just the "|" character inside a character
class. (A character class is what is defined by square brackets,
"["..."]"). Also, "/s" should probably be "\s".
Use either
[.\s]
or
.|\s (properly parenthesised: (?:.|\s) and remember that backslash
must be escaped inside string literals),
or better yet, use the DOTALL flag to specify that "." also matches
newlines.
/L

Signature
Lasse Reichstein Nielsen - lrn@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
Lasse Reichstein Nielsen - 16 Sep 2006 15:38 GMT
> [.\s]
This doesn't work, of course, since "." isn't significant inside
a character class. Instead you can use
[\s\S]
i.e., match any character that is either a whitespace or not a
whitespace.
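
A short sketch contrasting the two character classes, with backslashes doubled for the Java string literal. The class and field names are made up for the example.

```java
import java.util.regex.Pattern;

class AnyCharClassDemo {
    // Whitespace or non-whitespace: every character, line terminators included.
    static final Pattern ANY = Pattern.compile("([\\s\\S]*)");

    // The earlier suggestion: a literal dot or whitespace only.
    static final Pattern DOT_IN_CLASS = Pattern.compile("([.\\s]*)");

    public static void main(String[] args) {
        String text = "a\nb";
        System.out.println(ANY.matcher(text).matches());              // true
        System.out.println(DOT_IN_CLASS.matcher(text).matches());     // false, 'a' is neither '.' nor whitespace
        System.out.println(DOT_IN_CLASS.matcher(". \n.").matches());  // true, only dots and whitespace
    }
}
```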
/L
Jussi Piitulainen - 16 Sep 2006 11:57 GMT
> regular expression match any character including new line
>
> ([.|/s]*)
>
> why this doesn't work?
That pattern matches only sequences that consist of
the four literal characters inside the brackets. The
dot, in particular, "loses its special meaning inside
a character class".