Here is a great tip for anyone who wants to practice their regular expression writing. Whether you are new to regular expressions and just want to play, or you are an old hand at regular expressions but faced with debugging a complicated expression, this can help.
- Point your browser at http://jakarta.apache.org/oro/demo.html
- Enter the regular expression that you want to test.
- Enter some sample data.
- Change the setting from case sensitive to case insensitive (if desired).
- Click OK.
- If the results are not what you expect, change the expression (or the settings), and try again.
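If you would rather run the same try-and-adjust loop from a script, here is a minimal sketch. It assumes Python's built-in `re` module rather than ORO itself, so treat it as an approximation of what the applet does. The helper name `try_pattern` and the tongue-twister sample text are made up for illustration.

```python
import re

def try_pattern(pattern, sample, ignore_case=False):
    """Return every non-overlapping match of pattern found in sample."""
    # Mirrors the applet's case sensitive / case insensitive setting.
    flags = re.IGNORECASE if ignore_case else 0
    return [m.group(0) for m in re.finditer(pattern, sample, flags)]

# Hypothetical sample data; substitute your own text.
sample = "Peter Piper picked a peck of pickled peppers"

print(try_pattern("p[a-z]*r", sample))                    # case sensitive
print(try_pattern("p[a-z]*r", sample, ignore_case=True))  # case insensitive
```

If the printed matches are not what you expect, edit the pattern or the flag and run it again, just as you would in the applet.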
About Jakarta ORO: In case you’re wondering, ORO is the name of an excellent open-source project that Java programmers can use to add regular expression processing to their software. The original intent of this applet page was to allow such programmers to see if ORO handles regular expressions the way they expect it to.
About the Example: For anyone who is new to regular expressions, here’s what the “p[a-z]*r” example means, as pictured above:
- The leading “p” means that the first letter of any match has to be p (or P, if the setting is case insensitive).
- The trailing “r” means that the last letter of any match has to be r (or R, if the setting is case insensitive).
- The “[a-z]” stands for any single lowercase letter of the alphabet (or a letter of either case, if the setting is case insensitive). (Here, the hyphen is a shortcut to keep from having to type out [abcdefghijklmnopqrstuvwxyz].)
- The asterisk after [a-z] means that there can be zero or more of whatever just came before the asterisk, in this case the [a-z].
So, taken all together, this pattern matches any string of letters that starts with a P and ends with an R, with any number of letters (even none) in between. (By the way, the asterisk is “greedy”, always matching as many characters as it can. So, given the choice of matching all of “Piper”, or just the “per”, it’ll match the longer string every time.)
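To see the greedy behavior outside the applet, here is a small sketch, again assuming Python's `re` module rather than ORO. The word “proper” is a made-up extra test case, and the `*?` form is Python's non-greedy variant of the asterisk, shown only for contrast.

```python
import re

# The pattern from the example, with case-insensitive matching turned on.
greedy = re.compile(r"p[a-z]*r", re.IGNORECASE)

print(greedy.search("Piper").group(0))   # the whole word, not just "per"
print(greedy.search("proper").group(0))  # runs on to the last r it can reach

# For contrast: a non-greedy asterisk stops at the first r it can reach.
lazy = re.compile(r"p[a-z]*?r", re.IGNORECASE)
print(lazy.search("proper").group(0))
```

The greedy pattern returns “proper”, while the non-greedy one is satisfied with “pr”.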
Go ahead and play. Try taking out the asterisk, and click OK. (You’ll see that it matches only the “per” in Piper and pepper.) Put the asterisk back in, and change the [a-z] to [te]. (Now, with the setting on case insensitive, it’ll only match strings that start with a P, end with an R, and have nothing but Ts and Es in the middle, such as all of “Peter”, plus the various “per”s.)
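The two experiments above can also be run as a script. This is a sketch under the same assumptions as before: Python's `re` module stands in for ORO, the sample sentence is invented, and `matches` is a hypothetical helper.

```python
import re

# Hypothetical sample data containing Peter, Piper and pepper.
sample = "Peter Piper picked a peck of pickled peppers"

def matches(pattern):
    """List all case-insensitive matches of pattern in the sample text."""
    return re.findall(pattern, sample, re.IGNORECASE)

# Asterisk removed: exactly one letter between the p and the r.
print(matches("p[a-z]r"))

# Asterisk restored, [a-z] narrowed to [te]: only t and e allowed in the middle.
print(matches("p[te]*r"))
```

The first call finds the “per” inside Piper and peppers; the second finds all of “Peter” plus those same two “per”s.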