How can I use Python to replace content using a regex pattern, and does .replace() support regex operations?

sndhu.rani · July 8, 2025, 6:30pm

I’m trying to remove everything after the </html> tag in a string. I attempted to use .replace('</html>.+', '</html>'), but it doesn’t seem to work. Does the .replace() method in Python support regular expressions, or is there a better way to do this with a proper Python regex replace?

mark-mazay · July 8, 2025, 6:30pm

Ah, I see what you’re going for! You’re right to suspect that .replace() doesn’t support regular expressions; it only handles exact string matches. For regex functionality, you’ll need the re module. In your case, here’s a quick fix using re.sub():

import re

cleaned = re.sub(r'</html>.*', '</html>', your_html_string, flags=re.DOTALL)

The .* matches everything after </html>, and re.DOTALL matters because it lets . match newline characters too; by default . stops at the end of a line. This approach works well when you’re sanitizing scraped HTML or just cleaning up strings!
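To make the difference concrete, here is a minimal sketch comparing the two calls side by side. The sample string `page` is made up for illustration; any string containing </html> would behave the same way.

```python
import re

# Hypothetical sample input, invented for this example.
page = "<html><body>Hi</body></html>\n<!-- tracking junk -->\nmore junk"

# str.replace() searches for the literal characters "</html>.+".
# That exact text is not in the string, so nothing changes.
literal_result = page.replace("</html>.+", "</html>")

# re.sub() treats the first argument as a regex pattern.
# re.DOTALL lets "." match newlines, so ".*" runs to the end of the string.
regex_result = re.sub(r"</html>.*", "</html>", page, flags=re.DOTALL)

print(literal_result == page)  # True
print(regex_result)            # <html><body>Hi</body></html>
```

Note that .replace() does not raise an error here. It simply finds no match and returns the original string, which is why the failure is easy to miss.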

emma-crepeau · July 8, 2025, 6:30pm

I ran into a similar issue when I tried to remove JavaScript comments and random garbage after closing tags. As mentioned earlier, .replace() isn’t the tool you need. I switched to re.sub() and honestly, I’ve never looked back. One thing to keep in mind though: always check whether your quantifiers are greedy or lazy, because that decides how much text a pattern swallows. With real HTML, edge cases like nested tags or multiple comments can throw you off. Running the pattern through an online regex tester helps a lot here!
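Here is a rough sketch of the greedy versus lazy difference, using HTML-style comments as the example. The input string is invented for illustration, and the patterns are only meant to show the quantifier behavior, not to be a robust HTML cleaner.

```python
import re

# Hypothetical input with two separate comments.
text = "keep<!-- one -->this<!-- two -->too"

# Greedy ".*" runs from the first "<!--" to the LAST "-->",
# so the text between the two comments is removed as well.
greedy = re.sub(r"<!--.*-->", "", text, flags=re.DOTALL)

# Lazy ".*?" stops at the FIRST "-->" after each "<!--",
# so each comment is removed on its own.
lazy = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)

print(greedy)  # keeptoo
print(lazy)    # keepthistoo
```

For the "drop everything after </html>" case, greedy matching is what you want. For removing several independent chunks, the lazy form is usually the safer choice.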

Rashmihasija · July 8, 2025, 6:30pm

Yeah, I also tried .replace('</html>.+', '</html>') and couldn’t understand why it didn’t work at first. It turns out .replace() isn’t doing pattern matching; it just replaces exact matches. As everyone mentioned, re.sub() is the way to go. Just don’t forget to import re and, like Mark said, be careful with .+ because it won’t match across multiple lines unless you set the right flag. Adding flags=re.DOTALL really saved me when I worked with multi-line HTML content. It’s the perfect Python regex replace solution when you’re cleaning up strings!
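For anyone curious what the flag actually changes, here is a small sketch with two made-up inputs. The variable names and sample strings are assumptions for illustration only.

```python
import re

pattern = r"</html>.+"

# Hypothetical inputs: junk on the same line, and junk starting on a new line.
same_line = "<html>ok</html>trailing\nsecond line"
next_line = "<html>ok</html>\njunk"

# Without re.DOTALL, "." does not match "\n".
# same_line: ".+" stops at the first newline, so later lines survive.
# next_line: ".+" needs at least one non-newline character, so there is no match.
partial = re.sub(pattern, "</html>", same_line)
unchanged = re.sub(pattern, "</html>", next_line)

# With re.DOTALL, "." matches newlines and everything after the tag is removed.
clean_a = re.sub(pattern, "</html>", same_line, flags=re.DOTALL)
clean_b = re.sub(pattern, "</html>", next_line, flags=re.DOTALL)

print(repr(partial))    # '<html>ok</html>\nsecond line'
print(repr(unchanged))  # '<html>ok</html>\njunk'
print(clean_a)          # <html>ok</html>
print(clean_b)          # <html>ok</html>
```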