Screen Sraping the Google Play Store

I was trying to fetch the average ratings and the download count of the top Android Apps from the Google Play Store but since they offer no API, screen scraping was the only solution.

As a first step, I wrote a Google search query that returns all Android apps that have been downloaded 500+ million times from the Play Store. The query looked something like this: "500,000,000 - 1,000,000,000"

The parameter num=100 was appended to the Google Search URL so that it returns 100 search results on the first page. I then created a new sheet in Google Docs and used the ImportXML function to extract all the Google Play hyperlinks into the Google Sheet (A1 is the Google URL).

=importXML(A1, "//h3/a/@href")

Once the Google Play URL of an App is known, the rating and count can be easily know using another importXML function (K3 is the Google Play url for any Android App).


Please note that a Google Sheet can have a maximum of 50 ImportXML functions.