Accessing Match Groups in Python Regex Without Match Object

neha.jlly · January 21, 2025, 6:30pm

What is the best way to access regex match groups in Python without explicitly creating a match object first? Or is there a more elegant way to write the example below?

Here’s a Perl code snippet for reference:

if    ($statement =~ /I love (\w+)/) {
  print "He loves $1\n";
}
elsif ($statement =~ /Ich liebe (\w+)/) {
  print "Er liebt $1\n";
}
elsif ($statement =~ /Je t\'aime (\w+)/) {
  print "Il aime $1\n";
}

Translated into Python, this becomes:

import re

# Raw strings keep \w from being treated as a string escape sequence.
m = re.search(r"I love (\w+)", statement)
if m:
  print("He loves", m.group(1))
else:
  m = re.search(r"Ich liebe (\w+)", statement)
  if m:
    print("Er liebt", m.group(1))
  else:
    m = re.search(r"Je t'aime (\w+)", statement)
    if m:
      print("Il aime", m.group(1))

However, the nested if-else-cascade and the repeated creation of match objects seem awkward. What is a cleaner or more efficient way to handle this in Python?

vindhya.rddy · January 21, 2025, 6:30pm

I’ve worked with regex quite a bit, and I know how annoying it gets when you keep repeating match object creation. A simple class can make this much cleaner!

import re

class REMatcher:
    def __init__(self, matchstring):
        self.matchstring = matchstring

    def match(self, regexp):
        self.rematch = re.match(regexp, self.matchstring)
        return bool(self.rematch)

    def group(self, i):
        return self.rematch.group(i)

statements = ["I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"]

for statement in statements:
    m = REMatcher(statement)

    if m.match(r"I love (\w+)"):
        print("He loves", m.group(1))

    elif m.match(r"Ich liebe (\w+)"):
        print("Er liebt", m.group(1))

    elif m.match(r"Je t'aime (\w+)"):
        print("Il aime", m.group(1))

    else:
        print("???")

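The REMatcher class stores the most recent match object internally, so each branch can test a pattern and then read its groups without a separate assignment. That keeps the if/elif chain flat, and the class is reusable wherever you need the same pattern.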
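One caveat worth flagging: the class calls re.match, which only matches at the start of the string, whereas Perl's =~ and the re.search calls in the question scan the whole string. The sample sentences all begin with the phrase, so the output is the same here, but the two can differ. A minimal sketch of the difference (the sentence below is made up for illustration):

```python
import re

# Hypothetical input where the phrase is not at the start of the string.
statement = "Everyone knows I love Mary"

anchored = re.match(r"I love (\w+)", statement)     # only tries position 0
unanchored = re.search(r"I love (\w+)", statement)  # scans the whole string

print(anchored)             # None
print(unanchored.group(1))  # Mary
```

If you want the Perl behaviour, swapping re.match for re.search inside REMatcher.match should be enough.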

Priyadapanicker · January 21, 2025, 6:31pm

Okay, I see the class-based approach makes things cleaner, but what if we could do this even more concisely? Enter the walrus operator (:=), available since Python 3.8, which lets us assign and check the match in one go!

import re

statements = ["I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"]

for statement in statements:
    if m := re.match(r"I love (\w+)", statement):
        print("He loves", m.group(1))

    elif m := re.match(r"Ich liebe (\w+)", statement):
        print("Er liebt", m.group(1))

    elif m := re.match(r"Je t'aime (\w+)", statement):
        print("Il aime", m.group(1))

    else:
        print("???")

Why is this better?

No need for a separate class.
Each pattern is tried only if the earlier ones failed, and m is bound in the same expression that tests it.
The groups are read straight from the match object with m.group(1).
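To see the short-circuiting in action, here is a small sketch that logs which patterns actually get tried. The try_match helper, the tried list, and the describe function are made up for this illustration; they are not part of the re module.

```python
import re

tried = []  # hypothetical log of every pattern that was attempted


def try_match(pattern, text):
    """Record the pattern, then delegate to re.match."""
    tried.append(pattern)
    return re.match(pattern, text)


def describe(statement):
    # Each elif is evaluated only if the branches before it failed.
    if m := try_match(r"I love (\w+)", statement):
        return f"He loves {m.group(1)}"
    elif m := try_match(r"Ich liebe (\w+)", statement):
        return f"Er liebt {m.group(1)}"
    elif m := try_match(r"Je t'aime (\w+)", statement):
        return f"Il aime {m.group(1)}"
    return "???"


result = describe("Ich liebe Margot")
print(result)      # Er liebt Margot
print(len(tried))  # 2 (the French pattern was never tried)
```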

Ambikayache · February 6, 2025, 11:37am

I love the assignment expression trick! But what if we have a ton of patterns to match? Instead of manually checking each one, let’s use a dictionary to keep things DRY (Don’t Repeat Yourself)!

import re

patterns = {
    r"I love (\w+)": "He loves",
    r"Ich liebe (\w+)": "Er liebt",
    r"Je t'aime (\w+)": "Il aime"
}

statements = ["I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"]

for statement in statements:
    for pattern, phrase in patterns.items():
        if m := re.match(pattern, statement):
            print(phrase, m.group(1))
            break
    else:
        # for/else: this branch runs only when the loop finished without a break
        print("???")

Why is this great?

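Easy to extend: supporting another language is one more dictionary entry.
Removes the repetitive if-elif blocks.
Still uses an assignment expression to test and bind the match in one step.
Dictionaries preserve insertion order (Python 3.7+), so the patterns are tried in the order they are written.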
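If the pattern list grows, one possible variation is to compile the patterns once and wrap the loop in a function. This is only a sketch: RULES and report are names I made up, and I have used search instead of match so the behaviour is closer to the Perl original in the question.

```python
import re

# Compiled once at import time; a list of pairs makes the matching order explicit.
RULES = [
    (re.compile(r"I love (\w+)"), "He loves"),
    (re.compile(r"Ich liebe (\w+)"), "Er liebt"),
    (re.compile(r"Je t'aime (\w+)"), "Il aime"),
]


def report(statement, default="???"):
    """Return the phrase for the first rule that matches, else the default."""
    for regex, phrase in RULES:
        if m := regex.search(statement):
            return f"{phrase} {m.group(1)}"
    return default


for s in ["I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"]:
    print(report(s))
```

Returning from inside the loop replaces the break plus for/else pair, and returning a string instead of printing makes the function easier to test.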