How do I use regular expressions to match numbers?
You have some text, and you want to find the numbers in it. For example, you have the following:
Solution: use Regular Expressions
What does that actually do?
In a regular expression, \d means a digit and + means one or more of the preceding item, so \d+ ‘matches’ a run of one or more digits in the text. The parentheses capture whatever was matched into a numbered variable, e.g. $1.
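As a minimal sketch of the explanation above, the following Perl snippet applies such a pattern to a short string. The sample text and the variable names ($text, $number) are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample text (an assumption for illustration).
my $text = 'I have 37 balloons';

my $number;
# \d+ matches one or more digits; the parentheses capture them into $1.
if ($text =~ /(\d+)/) {
    $number = $1;
    print "Found the number $number\n";
}
```

Checking the result of the match before reading $1 matters, because $1 keeps its old value when a match fails.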
However, suppose your text mentioned 3 bananas before 37 balloons, and you wanted to know how many balloons you had. The above regular expression wouldn’t work: it would match the first number it found, which would be the 3 from ‘3 bananas’.
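To see the problem concretely, here is a small sketch using a hypothetical sentence that contains ‘3 bananas’ followed by a balloon count. The exact wording of the text and the variable names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical text with two numbers (an assumption for illustration).
my $text = 'I have 3 bananas and 37 balloons';

# In list context a match returns its captures, so $first receives $1.
my ($first) = $text =~ /(\d+)/;

# The regex engine reports the leftmost match, which is the banana count.
print "Captured: $first\n";   # prints "Captured: 3"
```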
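There are three different ways to get the number 37 out of the above text using a regular expression. You could get: the number that comes just before the word ‘balloon’, the second number in the text, or the last number in the text.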
If you chose the first case, your regular expression would become:
Which is: one or more digits, followed by zero or more spaces, followed by the word balloon.
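One way to write the pattern just described is sketched below. The sample text and variable names are hypothetical, and \s* is one possible reading of ‘zero or more spaces’ (it also accepts tabs and newlines; a literal space followed by * would be stricter).

```perl
use strict;
use warnings;

# Hypothetical sample text (an assumption for illustration).
my $text = 'I have 3 bananas and 37 balloons';

my $balloons;
# One or more digits (captured), optional whitespace, then the word "balloon".
if ($text =~ /(\d+)\s*balloon/) {
    $balloons = $1;
    print "Balloons: $balloons\n";   # prints "Balloons: 37"
}
```

The 3 is skipped because the digits there are followed by ‘bananas’, not ‘balloon’, so the match can only succeed at 37.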
The second case would be:
This regular expression is explained as: one or more digits, followed by zero or more non-digits, followed by one or more digits (which are captured in $1 because of the parentheses).
The reason this works is that by default Perl regular expressions are ‘greedy’. That is, each quantifier tries to match as many characters as possible, so the captured group takes all the digits of the second number rather than stopping after the first one.
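The effect of greediness can be sketched by comparing the pattern described above with a non-greedy variant. The sample text, the variable names, and the non-greedy pattern (using +?) are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text (an assumption for illustration).
my $text = 'I have 3 bananas and 37 balloons';

# Greedy: the captured \d+ takes every digit of the second number.
my ($greedy) = $text =~ /\d+\D*(\d+)/;

# Non-greedy: \d+? stops as soon as it has a single digit.
my ($lazy) = $text =~ /\d+\D*(\d+?)/;

print "greedy: $greedy, non-greedy: $lazy\n";   # prints "greedy: 37, non-greedy: 3"
```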
The final case would become:
The $ means the end of the text. So the regular expression means: match one or more digits (capture these digits in $1), followed by zero or more non-digits, followed by the end of the text.
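A minimal sketch of this last case follows. Both sample strings and the variable names are hypothetical; the second string is included only to show that the pattern still picks the last number when there are more than two.

```perl
use strict;
use warnings;

# Hypothetical sample texts (assumptions for illustration).
my $text  = 'I have 3 bananas and 37 balloons';
my $other = '1 apple, 22 pears, 333 plums';

# Digits (captured), then only non-digits until the end of the text.
my ($last)       = $text  =~ /(\d+)\D*$/;
my ($other_last) = $other =~ /(\d+)\D*$/;

print "$last\n";         # prints "37"
print "$other_last\n";   # prints "333"
```

Earlier numbers cannot match, because another digit appears between them and the end of the text, and \D* is not allowed to cross it.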