Remove Quotes from Speech Recognition Output in Python

How can I remove the quotes from string results returned by a speech recognition script in Python?

I have a function that runs a shell script to recognize speech using the Google Speech-to-Text engine. The script returns the recognized text, but the output is enclosed in quotes, which causes issues when I try to use it for running commands. How can I remove these quotes directly in Python before processing the output further?

Here’s the function I use to call the shell script and capture its output:

import subprocess

def recog():
    p = subprocess.Popen(['./speech-recog.sh'], stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    global out, err
    out, err = p.communicate()
    print(out)

The shell script (speech-recog.sh) is adapted from a Voicecommand Program for Raspberry Pi. It captures audio, processes it through the Google Speech API, and outputs the recognized text. The issue is that the returned text has extra quotes around it, which need to be removed.

What’s the best way to handle this directly in Python so that the recognized text is clean and ready to use for further processing? Would string manipulation like strip() work in this case? If so, how should I incorporate it into the function to ensure the output is properly cleaned up?

I’ve dealt with this type of issue before, and a simple approach is to use Python’s str.strip() method. It’s a quick way to remove unwanted characters like quotes from the start and end of the string. Here’s how you can implement it:

def recog():
    p = subprocess.Popen(['./speech-recog.sh'], stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    global out, err
    out, err = p.communicate()
    clean_output = out.decode('utf-8').strip().strip('"')  # Drop surrounding whitespace/newline, then enclosing double quotes
    print(clean_output)

This works effectively when the quotes are at the edges of the string. It’s clean and straightforward!
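One detail worth checking: shell scripts usually end their output with a newline, and strip('"') stops at the first character that is not a quote, so a trailing newline would shield the closing quote. Here's a minimal sketch of the difference. The bytes in raw are a made-up sample standing in for whatever the script really returns, so treat the exact output format as an assumption:

```python
# Hypothetical sample standing in for the bytes returned by p.communicate()
raw = b'"turn on the lights"\n'

text = raw.decode('utf-8')

# strip('"') alone leaves the closing quote because the newline comes after it
only_quotes = text.strip('"')

# Stripping whitespace first exposes both quotes to the second strip()
cleaned = text.strip().strip('"')

print(repr(only_quotes))
print(repr(cleaned))
```

If the script never emits a trailing newline, the extra strip() is harmless.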

That’s a great starting point, Miro! From my experience, if you want to remove all occurrences of quotes (not just those at the edges), the str.replace() method is a better fit. It replaces every instance of the specified substring.

Here’s an updated version of your code:

def recog():
    p = subprocess.Popen(['./speech-recog.sh'], stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    global out, err
    out, err = p.communicate()
    clean_output = out.decode('utf-8').replace('"', '')  # Remove every double quote
    print(clean_output)

This approach is useful if there’s a chance of additional quotes within the string. It covers more cases than strip(), at the cost of also removing any quotes that legitimately belong to the text.
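To see how the two methods differ, here's a small comparison on a made-up string with quotes both at the edges and in the middle. The sample text is purely hypothetical, not actual recognizer output:

```python
# Hypothetical recognizer output with quotes at the edges and inside
sample = '"say "hello" to everyone"'

# strip() only touches the leading and trailing quotes
edges_only = sample.strip('"')

# replace() removes every double quote, including the inner ones
all_removed = sample.replace('"', '')

print(edges_only)
print(all_removed)
```

Which one you want depends on whether inner quotes carry meaning for the commands you run afterwards.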

Good points, both of you! However, in my experience with text cleaning, there are situations where quotes may have different patterns, like extra spaces or even single quotes. For more flexibility, I recommend using Python’s re module and the re.sub() method.

Here’s how you can handle it:

import re
import subprocess

def recog():
    p = subprocess.Popen(['./speech-recog.sh'], stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    global out, err
    out, err = p.communicate()
    clean_output = re.sub(r'^"|"$', '', out.decode('utf-8').strip())  # Regex for edge cases
    print(clean_output)

This regex removes quotes only at the start or end of the string, but it can easily be expanded to match other patterns, such as single quotes or stray whitespace. If your strings have complex quote issues, regex gives you the control to refine your solution.
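As one possible expansion, here's a rough sketch that trims any mix of whitespace, single quotes, and double quotes from both ends while leaving the middle of the string alone. The names EDGE_QUOTES and clean_edges are just ones I made up for illustration, and the sample inputs are hypothetical:

```python
import re

# Matches a leading or trailing run of whitespace and straight quotes
EDGE_QUOTES = re.compile(r'^[\s"\']+|[\s"\']+$')

def clean_edges(text):
    """Remove quotes and whitespace from both ends of text, keeping the middle intact."""
    return EDGE_QUOTES.sub('', text)

print(clean_edges(' "open the door" \n'))
print(clean_edges("'it's fine'"))
```

Because the pattern is anchored with ^ and $, an apostrophe inside a word survives; only the edges are cleaned. Curly quotes are not covered here and would need to be added to the character class if the script ever emits them.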