Google Maps is one of the richest sources of data on the web, and the reviews it hosts can be useful in many ways. Here we look at how to scrape Google reviews using Python, with help from Selenium and Beautiful Soup for the process. So let's get started.
Initialize
To scrape Google reviews, we begin by initializing the web driver. The driver is easy to configure and accepts many combinations of options, but for this example we only set the browser language to English.
Here is the necessary code:
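from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--lang=en")  # ask Chrome for the English interface
driver = webdriver.Chrome(options=options)  # `chrome_options` is the deprecated spelling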
Initiate URL Input
In the next step, we point the driver at the page. We need a URL that leads to a landing page listing the Google reviews we want, and we have the driver open it. URLs of this kind are complicated, however, and cannot easily be built automatically. Copy the URL manually from the browser, store it in a variable, and then pass it to the driver.
Here is the necessary URL:
url = 'https://www.google.it/maps/place/Pantheon/@41.8986108,12.4746842,17z/data=!3m1!4b1!4m7!3m6!1s0x132f604f678640a9:0xcad165fa2036ce2c!8m2!3d41.8986108!4d12.4768729!9m1!1b1'
driver.get(url)
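Because the URL is copied by hand, it can help to sanity-check it before handing it to the driver. The sketch below uses only the standard library. The helper name describe_maps_url is hypothetical, and the path layout it expects (/maps/place/<name>/@<lat>,<lng>,...) is an assumption based on the example URL, not a documented Google format.

```python
import re
from urllib.parse import urlparse, unquote


def describe_maps_url(url):
    """Return the place name and map coordinates found in a Maps place URL.

    Hypothetical helper for illustration; not part of Selenium or any Google API.
    """
    # Remove whitespace that can sneak in while copying the URL by hand.
    url = "".join(url.split())
    parsed = urlparse(url)
    if parsed.scheme != "https" or "google." not in parsed.netloc:
        raise ValueError("not an https Google URL: %r" % url)
    # Assumed path layout: /maps/place/<name>/@<lat>,<lng>,<zoom>z/...
    match = re.search(r"/maps/place/([^/]+)/@(-?\d+\.\d+),(-?\d+\.\d+)", parsed.path)
    if match is None:
        raise ValueError("no place name or coordinates found in the URL path")
    name = unquote(match.group(1)).replace("+", " ")
    return {"name": name, "lat": float(match.group(2)), "lng": float(match.group(3))}


url = ("https://www.google.it/maps/place/Pantheon/@41.8986108,12.4746842,17z/"
       "data=!3m1!4b1!4m7!3m6!1s0x132f604f678640a9:0xcad165fa2036ce2c"
       "!8m2!3d41.8986108!4d12.4768729!9m1!1b1")
info = describe_maps_url(url)
print(info)
```

If the check raises ValueError, the copied URL is probably truncated or points to something other than a place page, which is easier to catch here than after the browser has opened.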
Hit The Menu Buttons
By default, Google's algorithm shows the reviews sorted as “most relevant.” To get the latest reviews instead, we look for the ‘Sort’ menu and click it, then find the ‘Newest’ option and click that as well.
These steps run sequentially: find the first button, click it, and move on to the next. For this we use Selenium and the XPath search it provides, which here is simpler than a typical CSS search.
We also use ChroPath, a browser extension that adds an XPath interpreter to the browser's developer tools. With it we can inspect the page, try out expressions, and highlight the elements that we actually require.
However, there is a minor bump on the way. Being able to highlight the right elements is not enough: like tons of websites in 2022, many portions of Google's pages load asynchronously.
As a result, many buttons are not yet available if Selenium tries to locate them as soon as the page has loaded, because Google's pages fill in their content with AJAX. The remedy is to wait explicitly until each button is clickable.
Here is the necessary code:
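import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# wait up to 10 seconds for the 'Sort' button to become clickable
wait = WebDriverWait(driver, 10)
menu_bt = wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@data-value=\'Sort\']')))
menu_bt.click()
# the second menu item ([1]) is the 'Newest' option
recent_rating_bt = driver.find_elements(By.XPATH, '//div[@role=\'menuitem\']')[1]
recent_rating_bt.click()
time.sleep(5)  # let the AJAX request reload the reviews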
Notice the added sleep call at the end of this block. Why? The click triggers an AJAX request that reloads the reviews, so we pause briefly before moving on to the next step.
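To make the difference between an explicit wait and a fixed sleep concrete, here is a minimal polling loop written with the standard library only. It is a sketch of the general idea, not Selenium's implementation: wait_until and fake_button_ready are hypothetical names, and LookupError stands in for the "element not found yet" error a real driver would raise.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except LookupError:
            # Stand-in for "element not found yet" in this stdlib-only sketch.
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Hypothetical stand-in for a button that becomes clickable on the third check.
attempts = {"count": 0}


def fake_button_ready():
    attempts["count"] += 1
    return "sort-button" if attempts["count"] >= 3 else None


found = wait_until(fake_button_ready, timeout=2.0, poll=0.01)
print(found, attempts["count"])  # prints: sort-button 3
```

A loop like this returns as soon as the condition holds, whereas a fixed sleep always costs the full delay and can still be too short on a slow connection.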
Extracting The Core, The Review Data
The next step is to get to the actual data we are looking for: the Google reviews. Here we bring in a very capable HTML parser, Beautiful Soup. It is an essential part of the process, as it helps us locate the right HTML tags, properties, and divs.
To begin with, we pinpoint the wrapper div of each review. The find_all method builds a list of the div elements that have the given properties. In this instance, the list contains the div of every review currently available on the page.
Here’s what it looks like:
from bs4 import BeautifulSoup

response = BeautifulSoup(driver.page_source, 'html.parser')  # parse the rendered page
rlist = response.find_all('div', class_='section-review-content')  # one div per review
We must then parse the data of each review: the text content, the date of the review, the review ID, the name of the reviewer, and so on. Some of this information may not always be available.
Here is the necessary code:
for r in rlist:  # r is the wrapper div of a single review
    id_r = r.find('button', class_='section-review-action-menu')['data-review-id']
    username = r.find('div', class_='section-review-title').find('span').text
    try:
        review_text = r.find('span', class_='section-review-text').text
    except Exception:
        # a review may contain a rating only, with no text
        review_text = None
    rating = r.find('span', class_='section-review-stars')['aria-label']
    rel_date = r.find('span', class_='section-review-publish-date').text
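The values pulled out this way are raw strings, and some may be missing. The sketch below shows one possible way to normalize them into a record. The Review class, parse_rating, and clean_text are hypothetical names, and the label format (' 5 stars ') is an assumption about what the aria-label attribute contains, so adjust the pattern to what the page actually returns.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """Hypothetical container for one scraped review."""
    review_id: str
    username: str
    rating: Optional[float]
    rel_date: str
    text: Optional[str] = None


def parse_rating(aria_label):
    """Pull the first number out of a label such as ' 5 stars ' (assumed format)."""
    match = re.search(r"\d+(?:[.,]\d+)?", aria_label or "")
    if match is None:
        return None
    return float(match.group(0).replace(",", "."))


def clean_text(raw):
    """Collapse runs of whitespace; map missing or empty text to None."""
    if raw is None:
        return None
    collapsed = " ".join(raw.split())
    return collapsed or None


# Made-up raw values standing in for what the parser returned for one review.
review = Review(
    review_id="abc123",
    username=clean_text("  Jane Doe "),
    rating=parse_rating(" 5 stars "),
    rel_date=clean_text("2 months ago"),
    text=clean_text(None),  # this review has a rating but no text
)
print(review)
```

Keeping missing fields as None, rather than as empty strings, makes it easy to tell later which reviews had no text at all.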
Now We Scroll
We are now at the final stage. We can already see some reviews, but given how many there are, not all of them are loaded on the page at once. The rest appear only as the page is scrolled.
The catch is that scrolling a container like this is not something Selenium's API offers directly. It does, however, let us execute JavaScript against a particular page element. We first identify the div that can be scrolled, then run a snippet of JavaScript that scrolls it down to the end.
Here is the necessary code:
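from selenium.webdriver.common.by import By

# the div that holds the review list and can be scrolled
scrollable_div = driver.find_element(
    By.CSS_SELECTOR,
    'div.section-layout.section-scrollbox.scrollable-y.scrollable-show'
)
# jump to the bottom of that div so the next batch of reviews loads
driver.execute_script(
    'arguments[0].scrollTop = arguments[0].scrollHeight',
    scrollable_div
)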
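A single scroll loads only one more batch of reviews, so in practice the scroll is repeated until nothing new arrives. The loop below sketches that idea with the standard library only. The names scroll_until_stable and FakePage are hypothetical, the stopping rule (stop when the count no longer grows) is an assumption rather than anything Google documents, and the fake page exists only so the loop can run without a browser.

```python
import time


def scroll_until_stable(scroll_once, count_items, max_rounds=50, pause=2.0):
    """Scroll repeatedly until a scroll loads no new items; return the final count."""
    seen = count_items()
    for _ in range(max_rounds):
        scroll_once()
        time.sleep(pause)  # give the asynchronous request time to add reviews
        current = count_items()
        if current <= seen:
            break  # nothing new arrived, so assume the end of the list
        seen = current
    return seen


class FakePage:
    """Hypothetical stand-in for the review panel: loads `batch` reviews per scroll."""

    def __init__(self, total, batch):
        self.total = total
        self.batch = batch
        self.loaded = min(batch, total)

    def scroll(self):
        self.loaded = min(self.total, self.loaded + self.batch)

    def count(self):
        return self.loaded


page = FakePage(total=25, batch=10)
final_count = scroll_until_stable(page.scroll, page.count, pause=0)
print(final_count)  # prints: 25
```

With a real driver, scroll_once would wrap the execute_script call on the scrollable div, and count_items would count the review divs currently on the page. The max_rounds cap keeps the loop from running forever on places with a very large number of reviews.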
Parting Thoughts
That wraps up this simple, step-by-step explanation of how to scrape Google reviews using Python. Similar techniques work on many other websites, and there is much more to learn about scraping Google reviews. Stay tuned for more!
If you are looking for web scraping services in the USA that can simplify the task of scraping Google reviews from Maps, feel free to contact Botscraper. We have years of experience providing top-notch web scraping services to clients in the USA and across the globe. Get in touch with us today!