Regular Expressions I

A fast introduction to regular expressions.

There are many good online resources for learning regular expressions. If you are unhappy with one, search for another.

Essentials to know:

  • Grouping: ( ... ) makes a group. The parentheses are part of the pattern syntax; they do not appear in the text you are matching.
  • Choices: (A|B) matches either A or B.
  • Repeat the preceding item (a character, a character class, or a group): ? means 0 or 1 times; * means 0 or more times; + means 1 or more times.
  • Character classes:
    • [abc] matches exactly one character: a, b, or c.
    • [abc]+ matches any string of length 1 or more made up of the letters a, b, and/or c, such as aba, cba, and abcbca.
    • [a-z] matches any single lowercase letter from a through z.
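
The essentials above can be tried out directly. The following is a minimal sketch using Python's standard re module; the helper name matches is a made-up convenience for this example, and re.fullmatch is used so that the whole string has to match the pattern, not just part of it.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Grouping and choices: the parentheses are syntax, not characters to match.
assert matches(r"(A|B)", "A")
assert matches(r"(A|B)", "B")
assert not matches(r"(A|B)", "(A)")

# Repetition applies to the item just before the ?, *, or +.
assert matches(r"ab?", "a")          # ? allows zero or one b
assert matches(r"ab*", "abbb")       # * allows zero or more b
assert not matches(r"ab+", "a")      # + requires at least one b
assert matches(r"(ab)+", "ababab")   # a group counts as a single item

# Character classes match one character each unless repeated.
assert matches(r"[abc]", "b")
assert not matches(r"[abc]", "ab")   # two characters, but the class matches one
assert matches(r"[abc]+", "abcbca")
assert matches(r"[a-z]", "q")
assert not matches(r"[a-z]", "Q")    # uppercase is outside the range a-z
```

If every assert passes silently, the patterns behave as described. Other regex tools may differ in small details, so treat this as an illustration for Python only.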

Notes

  • By convention, a lowercase escape stands for a kind of character (\d matches a digit) and the uppercase version stands for any character not of that kind (\D matches a non-digit).
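
A short sketch of the lowercase/uppercase convention, again assuming Python's re module. The sample string "a1b22" is made up for illustration.

```python
import re

# \d matches a single digit; \D matches a single character that is not a digit.
assert re.fullmatch(r"\d", "7") is not None
assert re.fullmatch(r"\D", "7") is None
assert re.fullmatch(r"\D", "x") is not None

# findall collects every match in the string, in order.
digits = re.findall(r"\d", "a1b22")
non_digits = re.findall(r"\D", "a1b22")
print(digits)      # ['1', '2', '2']
print(non_digits)  # ['a', 'b']
```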

Exercises

For each item, either write a regular expression that matches it, or say which strings the given regex matches.

  1. MMMM…. NNN….

  2. MMMM….andM…..

  3. Give examples of what this recognizes:

    1. M*[and]M*
    2. M*[and]+M*
  4. (AB)+C

  5. (Mary had a little (sheep|lamb|goat))+

  6. ( [a-z] \+= 1; )+

  7. (statement)*

  8. Write a regexp to match positive integers.

  9. Write a regexp to match variable names. A variable name can contain lowercase and uppercase letters, digits, and underscores, but it cannot start with a digit.
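
One way to check an answer to the exercises is to run it against strings that should and should not match. The sketch below assumes Python's re module; the function check and the sample pattern (ha)+! are invented for illustration and are not taken from the exercises.

```python
import re

def check(pattern, should_match, should_not_match):
    """Return the strings for which the pattern gave the wrong verdict."""
    wrong = []
    for text in should_match:
        if re.fullmatch(pattern, text) is None:
            wrong.append(text)
    for text in should_not_match:
        if re.fullmatch(pattern, text) is not None:
            wrong.append(text)
    return wrong

# An empty list means the pattern agreed with every example.
failures = check(r"(ha)+!", ["ha!", "hahaha!"], ["!", "ha", "hah!"])
print(failures)  # []
```

Replacing the sample pattern and example lists with your own answer gives a quick sanity check, though passing a handful of examples does not prove a pattern is correct.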

Last modified March 15, 2024: Parsing unit. (015098c)