Created June 15, 2011 13:57
Classic ASP Scrape External URL
<%
'Fetch strInputURL over HTTP and store the response body in strPageText (passed ByRef)
Sub LoadThePage(ByRef strPageText, ByVal strInputURL)
    Dim objXMLHTTP
    Set objXMLHTTP = Server.CreateObject("MSXML2.ServerXMLHTTP")
    objXMLHTTP.Open "GET", strInputURL, False
    objXMLHTTP.Send
    strPageText = objXMLHTTP.responseText
    Set objXMLHTTP = Nothing
End Sub
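'A minimal sketch, not part of the original gist: a variant of the fetch above
'that sets timeouts and checks the HTTP status before using the response.
'The name FetchPageChecked and the timeout values are assumptions chosen for
'illustration. It returns True and fills strPageText only on a 200 response.
Function FetchPageChecked(ByRef strPageText, ByVal strInputURL)
    Dim objHTTP
    FetchPageChecked = False
    strPageText = ""
    Set objHTTP = Server.CreateObject("MSXML2.ServerXMLHTTP")
    'setTimeouts takes resolve, connect, send and receive timeouts in milliseconds
    objHTTP.setTimeouts 5000, 5000, 10000, 10000
    On Error Resume Next
    objHTTP.Open "GET", strInputURL, False
    objHTTP.Send
    If Err.Number = 0 Then
        If objHTTP.status = 200 Then
            strPageText = objHTTP.responseText
            FetchPageChecked = True
        End If
    End If
    On Error GoTo 0
    Set objHTTP = Nothing
End Function
'Possible usage: If FetchPageChecked(strPageText, strInputURL) Then ... End If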
'Return the text between strStart and strEnd in the global strPageText,
'preceded by the strStart marker itself. Returns "" if either marker is missing.
Function GrabTheContent(strStart, strEnd)
    Dim strStartPos, strEndPos, strLength, myContent
    GrabTheContent = ""
    'Find the position of the start marker
    strStartPos = InStr(strPageText, strStart)
    If strStartPos = 0 Then Exit Function
    'Find the end marker, searching forward from just after the start marker
    strEndPos = InStr(strStartPos + Len(strStart), strPageText, strEnd)
    If strEndPos = 0 Then Exit Function
    'Compute the length of the text between the start and end positions
    strLength = strEndPos - strStartPos
    'Extract the content; Trim removes leading and trailing spaces
    myContent = Trim(Mid(strPageText, strStartPos, Len(strStart))) & "<br/>" & vbCrLf
    myContent = myContent & Trim(Mid(strPageText, strStartPos + Len(strStart), strLength - Len(strStart))) & "<br/>" & vbCrLf
    GrabTheContent = myContent
End Function
'Declare the strings that hold the fetched page HTML and the extracted content
Dim strPageText, strHTML
'Declare the string that holds the input page URL
Dim strInputURL
'Refresh the cached content if it is at least an hour old
If DateDiff("h", Application("updated"), Now()) >= 1 Then
    strInputURL = "http://birdsinbackyards.net/species/Cacatua-roseicapilla"
    'Load the desired page into a string
    LoadThePage strPageText, strInputURL
    strHTML = GrabTheContent("<h5>Description</h5>", "<h5>Similar species</h5>")
    Application.Lock
    Application("content") = strHTML
    Application("updated") = Now()
    Application.Unlock
End If
'Serve the cached content
strHTML = Application("content")
Response.Write strHTML
%>
Hi Graham,
I'm a bit of a n00b when it comes to scraping and fetching via Classic ASP. I have to grab the "ASB PREMIERSHIP STANDINGS" table from this page: http://www.nzfootball.co.nz/asb-premiership/
I'm having trouble with the script - any chance you could assist?
Thanks,
Adam