Created December 16, 2018 11:35
Scrape Google search results - Title, URL by Location
function scrapeGoogle() {
  // Fetch up to 30 results for the query from google.co.uk.
  var searchResults = UrlFetchApp.fetch("https://www.google.co.uk/search?q=" + encodeURIComponent("keyword finder tool") + "&num=30", {muteHttpExceptions: true});
  var html = searchResults.getContentText();
  var titleExp = /<h3 class="r">([\s\S]*?)<\/h3>/gi;
  var urlExpression = /<h3 class="r">([\s\S]*?)&amp;/gi;
  // match() returns null when nothing matches, so fall back to an empty array.
  var titleResults = html.match(titleExp) || [];
  var urlResults = html.match(urlExpression) || [];
  // To get the actual title: trim whitespace and strip the HTML tags.
  for (var i = 0; i < titleResults.length; i++) {
    var actualTitle = titleResults[i].replace(/(^\s+)|(\s+$)/g, "").replace(/<\/?[^>]+>/gi, "");
    Logger.log(actualTitle);
  }
  // To get the actual URL: drop the leading markup and the trailing &amp; separator.
  for (var j = 0; j < urlResults.length; j++) {
    var actualURL = urlResults[j].replace('<h3 class="r"><a href="/url?q=', "").replace('&amp;', "");
    Logger.log(actualURL);
  }
}
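The title and URL steps can also be combined into one helper that works on a plain HTML string, which makes the parsing easy to try outside Apps Script. This is a minimal sketch, not the gist author's method: the name extractResults is hypothetical, and it assumes each result still uses the h3 class="r" heading with an a href="/url?q=..." link that the regular expressions in the gist target.

```javascript
// Hypothetical helper: pull {title, url} pairs out of a results page string.
// Assumes each result looks like:
//   <h3 class="r"><a href="/url?q=URL&amp;...">Title</a></h3>
function extractResults(html) {
  var resultExp = /<h3 class="r"><a href="\/url\?q=([\s\S]*?)&amp;[\s\S]*?>([\s\S]*?)<\/a><\/h3>/gi;
  var results = [];
  var match;
  // exec() with the g flag steps through every match and exposes the capture groups.
  while ((match = resultExp.exec(html)) !== null) {
    results.push({
      url: match[1],
      // Strip any nested tags (for example <b>) and trim whitespace.
      title: match[2].replace(/<\/?[^>]+>/g, "").replace(/(^\s+)|(\s+$)/g, "")
    });
  }
  return results; // an empty array when nothing matches
}
```

In Apps Script this could be called as extractResults(searchResults.getContentText()), with each entry passed to Logger.log. Because it returns an empty array rather than null, the caller does not need a separate null check.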
Is there any special configuration of Google Apps Script needed before running the script? I took your script and added one line (Logger.log(searchResults)) before the closing curly brace to print searchResults. After saving that text to an HTML file, it appears that only the search box itself was returned. As a result, the subsequent regex steps consistently produce null when I print, say, titleResults and urlResults.
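One way to narrow this down is to check what actually came back before applying the regular expressions. The sketch below is only a diagnostic aid under stated assumptions: the helper name describeResponse is hypothetical, and the only marker it looks for is the h3 class="r" heading that the gist's expressions rely on. In Apps Script, the two arguments would come from searchResults.getResponseCode() and searchResults.getContentText().

```javascript
// Hypothetical diagnostic: summarize a fetched page before parsing it.
function describeResponse(statusCode, html) {
  var marker = '<h3 class="r">';
  // Splitting on the marker yields one more piece than there are occurrences.
  var count = html.split(marker).length - 1;
  return {
    statusCode: statusCode,       // HTTP status of the fetch
    length: html.length,          // size of the returned page in characters
    resultHeadings: count,        // how many result headings the regexes could match
    parseable: statusCode === 200 && count > 0
  };
}
```

If resultHeadings is 0, the match() calls in the gist return null, which is consistent with the behaviour described above. That would suggest the returned markup differs from what the expressions expect, rather than a problem with the script's configuration.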