Hi Anusha,
It is indeed possible to perform Google searches programmatically in Python with just a few lines of code. The setup is not entirely straightforward, but once it is done you can search the entire web through the Google Custom Search API.
Here are the main steps involved:
Step 1: Get Google API Key
The old PyGoogle SOAP API is no longer supported, and the AJAX API is also unavailable. To proceed, you’ll need to obtain a Google API key. For simple experimental use, I recommend using a “server key.”
You can find more details here: Google API Keys
Step 2: Set Up a Custom Search Engine to Search the Entire Web
Google Custom Search is the best available solution for programmatic searching. To set up a Custom Search Engine (CSE) that searches the entire web:
- Go to Google Custom Search and click “Create a Custom Search Engine.”
- Give it a name and description.
- In the “Sites to Search” box, enter at least one valid URL (just enter www.anyurl.com to move forward for now).
- Choose the CSE edition, accept the Terms of Service, and click Next.
- After completing the setup, navigate to your Control Panel and click “Basics.”
- In the “Search Preferences” section, select “Search the entire web but emphasize included sites.”
- Save your changes and delete the site you entered earlier.
More details can be found here: Google Custom Search Help
Step 3: Install Google API Client for Python
You need to install the google-api-python-client package to interact with the Google Custom Search API. Use this command to install it:
pip install google-api-python-client
More information is available on the GitHub repository and in the official Google documentation.
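For context, the client library is a wrapper around a plain HTTPS endpoint, and it can help to see what a request looks like. The sketch below only builds the request URL and does not send anything. The helper name build_search_url is my own and not part of any Google library, and "KEY" and "CX" are placeholders for your API key and search engine ID.

```python
from urllib.parse import urlencode

# REST endpoint of the Custom Search JSON API
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(search_term, api_key, cse_id, **params):
    """Return the GET URL for one search request (hypothetical helper)."""
    query = {"key": api_key, "cx": cse_id, "q": search_term}
    query.update(params)  # optional parameters such as num or start
    return CSE_ENDPOINT + "?" + urlencode(query)


url = build_search_url("stack overflow", "KEY", "CX", num=5)
print(url)
```

The google-api-python-client package takes care of building and sending this request for you, which is why the rest of this answer uses it.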
Step 4 (Bonus): Perform the Search
After the setup, you can use the following Python code to perform searches:
from googleapiclient.discovery import build
import pprint

my_api_key = "Your Google API Key"
my_cse_id = "Your Custom Search Engine ID"

def google_search(search_term, api_key, cse_id, **kwargs):
    # Build a client for the Custom Search API (version v1)
    service = build("customsearch", "v1", developerKey=api_key)
    res = service.cse().list(q=search_term, cx=cse_id, **kwargs).execute()
    # 'items' is absent from the response when the query has no results
    return res.get('items', [])

results = google_search('stackoverflow site:en.wikipedia.org', my_api_key, my_cse_id, num=10)
for result in results:
    pprint.pprint(result)
This code should help you replicate the behavior of your original snippet. Note that num has an upper limit of 10, so for larger result sets you need to call the API in a loop, advancing the start parameter on each request.
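Here is a minimal sketch of such a loop. The function name google_search_paged and its total and page_size parameters are my own choices, not part of the Google client. It expects a service object created with build("customsearch", "v1", developerKey=...) as in the code above. As far as I know, the API does not return more than 100 results for a single query, so treat total as capped at around 100.

```python
def google_search_paged(service, search_term, cse_id, total=30, page_size=10):
    """Collect up to `total` results by requesting successive pages.

    `service` is assumed to be a Custom Search client object;
    `page_size` must not exceed the API limit of 10 per request.
    """
    items = []
    start = 1  # the API's `start` index is 1-based
    while len(items) < total:
        # Ask only for as many results as are still missing
        num = min(page_size, total - len(items))
        res = service.cse().list(
            q=search_term, cx=cse_id, num=num, start=start
        ).execute()
        page = res.get("items", [])
        if not page:
            break  # no further results available
        items.extend(page)
        start += len(page)
    return items


# Example usage (requires a real key and search engine ID):
# from googleapiclient.discovery import build
# service = build("customsearch", "v1", developerKey=my_api_key)
# for item in google_search_paged(service, "stackoverflow", my_cse_id, total=25):
#     print(item["title"], item["link"])
```

Each request counts against your daily API quota, so fetching 30 results this way costs three queries.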