Java regular expressions and multi-lines

Mar 16, 2009 19:02

When matching strings in Perl, you would quite often write something like:

if ($message =~ /value/) { print $message }

In Java, regular expressions used through String.matches have to match the whole string, so you would write:

if (message.matches(".*value.*")) { System.out.println(message); }

Now, what happens if you do this?

if ("foo\nsome value 5".matches(".*value.*")) { System.out.println("It matched!"); }
Well, it doesn't match. This is because . doesn't match newlines by default. So, in order to match this, you need a Pattern object with the right flag. The flag you need isn't the obvious Pattern.MULTILINE, no, that would be far too easy (MULTILINE only changes where ^ and $ match). Instead it's Pattern.DOTALL, and the code becomes:

// requires: import java.util.regex.Pattern;
Pattern pat = Pattern.compile(".*value.*", Pattern.DOTALL);
if (pat.matcher("foo\nsome value 5").matches()) { System.out.println("It matched!"); }
Eww, that doesn't read anywhere near as nicely.

At this point I thought that Java was being silly, and that Perl has a nice . which matches everything, like the documentation says. However, it turns out that . doesn't match newlines in Perl either (unless you add the /s modifier). It just works there because you tend to match partial strings, rather than having to match the whole string.
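
For comparison, Java can do the same kind of partial matching through Matcher.find(), which succeeds if the pattern occurs anywhere in the input. The following is only a small sketch of that idea; the class name FindVsMatches and the helper containsMatch are made up for illustration and are not part of any library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: contrasts partial matching with whole-string matching.
class FindVsMatches {
    // Partial match, similar in spirit to Perl's =~ operator:
    // true if the regex occurs anywhere in the input.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String message = "foo\nsome value 5";

        // find() only needs "value" to appear somewhere, so the newline is irrelevant.
        System.out.println(containsMatch("value", message)); // prints "true"

        // matches() must consume the whole string, and . stops at the newline.
        System.out.println(message.matches(".*value.*"));    // prints "false"
    }
}
```

With find() there is no need for the leading and trailing .* at all, which is why the newline problem never shows up in the Perl version.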

I later found that if you don't want to create the Pattern object, and don't mind making your Java look like the line noise of Perl, you can instead embed the flag in the expression itself with (?s):

if ("foo\nsome value".matches("(?s).*value.*")) { System.out.println("It matched!"); }
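
To see why MULTILINE was the wrong flag and DOTALL the right one, here is a rough side-by-side sketch. The class name DotallVsMultiline and the helper wholeMatch are invented for this example; only Pattern and its flags come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: compares what MULTILINE and DOTALL actually change.
class DotallVsMultiline {
    // Whole-string match with the given flags, like String.matches but with options.
    static boolean wholeMatch(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "foo\nsome value 5";

        // MULTILINE does not change what . matches, so this still fails.
        System.out.println(wholeMatch(".*value.*", Pattern.MULTILINE, input)); // prints "false"

        // DOTALL lets . match line terminators, so the whole string matches.
        System.out.println(wholeMatch(".*value.*", Pattern.DOTALL, input));    // prints "true"

        // What MULTILINE is for: ^ and $ also match at line boundaries.
        System.out.println(Pattern.compile("^some", Pattern.MULTILINE).matcher(input).find()); // prints "true"
        System.out.println(Pattern.compile("^some").matcher(input).find());                    // prints "false"
    }
}
```

The embedded (?s) form is just the inline spelling of Pattern.DOTALL, so it behaves the same as the second call here.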

java, regex, perl
