Tuesday, November 1, 2011

Excel VBA: Download files from the Internet

There is no built-in function in Microsoft Excel which allows you to download contents from the Internet on the fly. To accomplish this task we need to use the API for WinInet. The use and explanation of API in VBA is for advanced users which have prior experience from either Visual Basic 6.0 or .NET.

Pitfalls
It is very important that all open Internet connections are closed as soon as the task is completed. WinInet only allows 2 concurrent connections to a given host. If you forget to shut down the connection after use, you will experience timeouts and misleading error messages. Please refer to the following website for more information related to the maximum allowed concurrent web connections:

Howto
The source code below should be pasted in a "Class Module" in Excel. If you are not sure how to open the VBA editor in Excel for your current Microsoft Office version, please refer to the following page:
  • Display the developer toolbar or ribbon in Excel

Create new class module:
  1. Open the Microsoft Visual Basic for Applications editor in Excel.
  2. Select Insert -> Class Module on the main menubar
  3. Rename the new class module to "WebClient"

Example
To use the code, you shold create a new instance of the class and any of the public methods:
  • DownloadFile - download a specific resource to a local file
  • UrlExists - check if a given URL exists

Dim objClient As New WebClient
Call objClient.DownloadFile("http://www.google.com", "c:\test.html")


Dependencies
The function "ReThrowError" is defined here:

Source Code


' API
Private Declare Function InternetOpen Lib "wininet.dll" Alias "InternetOpenA" (ByVal sAgent As String, ByVal lAccessType As Long, ByVal sProxyName As String, ByVal sProxyBypass As String, ByVal lFlags As Long) As Long
Private Declare Function InternetCloseHandle Lib "wininet.dll" (ByVal hInet As Long) As Integer
Private Declare Function InternetReadFile Lib "wininet.dll" (ByVal hFile As Long, ByVal sBuffer As String, ByVal lNumBytesToRead As Long, lNumberOfBytesRead As Long) As Integer
Private Declare Function InternetConnect Lib "wininet.dll" Alias "InternetConnectA" (ByVal hInternetSession As Long, ByVal sServerName As String, ByVal nServerPort As Integer, ByVal sUsername As String, ByVal sPassword As String, ByVal lService As Long, ByVal lFlags As Long, ByVal lContext As Long) As Long
Private Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" (ByVal hInternetSession As Long, ByVal lpszUrl As String, ByVal lpszHeaders As String, ByVal dwHeadersLength As Long, ByVal dwFlags As Long, ByVal dwContext As Long) As Long
Private Declare Function InternetGetLastResponseInfo Lib "wininet.dll" Alias "InternetGetLastResponseInfoA" (lpdwError As Long, ByVal lpszBuffer As String, lpdwBufferLength As Long) As Boolean

Private Enum EHttpQueryInfoLevel
    http_QUERY_CONTENT_TYPE = 1
    http_QUERY_CONTENT_LENGTH = 5
    http_QUERY_EXPIRES = 10
    http_QUERY_LAST_MODIFIED = 11
    http_QUERY_PRAGMA = 17
    http_QUERY_VERSION = 18
    http_QUERY_STATUS_CODE = 19
    http_QUERY_STATUS_TEXT = 20
    http_QUERY_RAW_HEADERS = 21
    http_QUERY_RAW_HEADERS_CRLF = 22
    http_QUERY_FORWARDED = 30
    http_QUERY_SERVER = 37
    http_QUERY_USER_AGENT = 39
    http_QUERY_SET_COOKIE = 43
    http_QUERY_REQUEST_METHOD = 45
    http_STATUS_DENIED = 401
    http_STATUS_PROXY_AUTH_REQ = 407
End Enum

Private Declare Function HttpQueryInfo Lib "wininet.dll" Alias "HttpQueryInfoA" (ByVal hhttpRequest As Long, ByVal lInfoLevel As Long, ByRef sBuffer As Any, ByRef lBufferLength As Long, ByRef lIndex As Long) As Integer
Private Declare Function HttpOpenRequest Lib "wininet.dll" Alias "HttpOpenRequestA" (ByVal hHttpSession As Long, ByVal sVerb As String, ByVal sObjectName As String, ByVal sVersion As String, ByVal sReferer As String, ByVal something As Long, ByVal lFlags As Long, ByVal lContext As Long) As Long
Private Declare Function HttpSendRequest Lib "wininet.dll" Alias "HttpSendRequestA" (ByVal hhttpRequest As Long, ByVal sHeaders As String, ByVal lHeadersLength As Long, sOptional As Any, ByVal lOptionalLength As Long) As Integer

' Constants
Private Const INTERNET_FLAG_NO_CACHE_WRITE As Long = &H4000000
Private Const INTERNET_FLAG_NO_UI As Long = &H200
Private Const INTERNET_FLAG_EXISTING_CONNECT As Long = &H20000000
Private Const INTERNET_FLAG_RELOAD As Long = &H80000000

Private Const INTERNET_OPEN_TYPE_PRECONFIG = 0
Private Const INTERNET_OPEN_TYPE_DIRECT = 1
Private Const INTERNET_OPEN_TYPE_PROXY = 3


' User Agent
Private Const USER_AGENT = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"



' Open
Private Function OpenSession()
    Dim hSession As Long

    ' Open internet connection
    hSession = InternetOpen(USER_AGENT, INTERNET_OPEN_TYPE_PRECONFIG, vbNullString, vbNullString, 0)

    ' Valid session?
    If (hSession = 0) Then
        ' Error
        Err.Raise 1234, , "Unable to open internet connection!"
        
        ' Finished
        Exit Function
    End If
    
    ' Get the value
    OpenSession = hSession
End Function

' Close Handle
Private Sub CloseHandle(ByRef hHandle As Long)
   ' Valid handle?
   If (hHandle <> 0) Then
        ' Close
        Call InternetCloseHandle(hHandle)
        
        ' Clear handle
        hHandle = 0
    End If
End Sub


' Open Url
Private Function OpenUrl(ByVal hSession As Long, ByVal strUrl As String, Optional ByVal bRaiseError = True) As Long
    Dim hConnection As Long
    
    ' Valid session?
    If (hSession = 0) Then
        Err.Raise 2345345, , "The session is not set!"
        Exit Function
    End If
    
    ' Open Url
    hConnection = InternetOpenUrl(hSession, strUrl, vbNullString, ByVal 0&, INTERNET_FLAG_EXISTING_CONNECT Or INTERNET_FLAG_RELOAD, ByVal 0&)

     ' Valid file?
    If (hConnection = 0) Then
        ' Error
        Call RaiseLastError
        
        ' Exit
        Exit Function
    End If

    ' Get the value
    OpenUrl = hConnection

End Function

' Raise Last Error
Private Sub RaiseLastError()
    Dim strErrorMessage As String
    Dim lngErrorNumber As Long

    ' Get the last error
    lngErrorNumber = Err.LastDllError
    
    ' Valid error?
    If (lngErrorNumber <> 0) Then
        ' Error
        Err.Raise lngErrorNumber, , "DLL Error: " & CStr(lngErrorNumber)
    Else
        ' Get the error
        If (GetLastResponseInfo(lngErrorNumber, strErrorMessage)) Then
            ' Raise error
            Err.Raise lngErrorNumber, , strErrorMessage
        End If
    End If
End Sub

' Get Last Response Info
Private Function GetLastResponseInfo(ByRef lngErrorNumber As Long, ByRef strErrorMessage As String) As Boolean
    Dim intResult As Integer
    Dim lngBufferLength As Long
    
    ' Get the required buffer size
    intResult = InternetGetLastResponseInfo(lngErrorNumber, strErrorMessage, lngBufferLength)
        
    ' Valid length?
    If (lngErrorNumber <> 0) Then
        ' Allcoate the buffer
        strErrorMessage = String(lngBufferLength, 0)
        
        ' Retrieve the last respons info
        intResult = InternetGetLastResponseInfo(lngErrorNumber, strErrorMessage, lngBufferLength)
    
        ' Get the error message
        GetLastResponseInfo = True
        Exit Function
    End If
    
    ' Not an error
    GetLastResponseInfo = False
End Function


' File Exists?
Public Function UrlExists(ByVal strUrl As String) As Boolean
    On Error GoTo ErrorHandler
    
    Const BUFFER_LENGTH As Long = 255
    
    Dim hSession As Long
    Dim hConnection As Long
    Dim strBuffer As String * BUFFER_LENGTH
    Dim intBufferLength As Long
    Dim intResult As Integer
    Dim lngIndex As Long
    Dim strStatusCode As String
    Dim intStatusCode As Integer
    
    ' Open Session
    hSession = OpenSession
    
    ' Open the file
    hConnection = OpenUrl(hSession, strUrl, False)
    
    ' Set the default bufferlength
    intBufferLength = BUFFER_LENGTH
    
    ' Get the status code
    intResult = HttpQueryInfo(hConnection, http_QUERY_STATUS_CODE, ByVal strBuffer, intBufferLength, lngIndex)
    
    ' Valid value?
    If (intResult <> 0) Then
        ' Get the status code string
        strStatusCode = Left(strBuffer, intBufferLength)
        
        ' Get the integer status code
        intStatusCode = CInt(strStatusCode)
        
        ' Check the status code
        UrlExists = (intStatusCode = 200)
    End If
    
    ' Close the connection
    Call CloseHandle(hConnection)
    Call CloseHandle(hSession)
    Exit Function
    
ErrorHandler:
    Call CloseHandle(hConnection)
    Call CloseHandle(hSession)
    
    ' Re-throw
    Call ReThrowError(Err)
End Function


' Download File
Public Sub DownloadFile(ByVal strUrl As String, ByVal strFilename As String)
    On Error GoTo ErrorHandling
    
    ' Buffer size
    Const BUFFER_SIZE As Integer = 4096
    
    Dim hSession As Long
    Dim hConnection As Long
    Dim strBuffer As String * BUFFER_SIZE
    Dim intFile As Integer
    Dim lngRead As Long
    Dim intResult As Integer

    ' Open session
    hSession = OpenSession()

    ' Open the file
    hConnection = OpenUrl(hSession, strUrl)
    
    ' Find free file
    intFile = FreeFile
    
    ' Create file
    Open strFilename For Binary As #intFile
    
        Do
            ' Read the data
            intResult = InternetReadFile(hConnection, strBuffer, BUFFER_SIZE, lngRead)
    
            ' Valid function?
            If (intResult <> 0) Then
            
                ' Valid number of bytes read?
                If (lngRead > 0) Then
                
                    ' Is less than buffer size?
                    If (lngRead < BUFFER_SIZE) Then
                    
                        ' Get only the relevant data
                        strBuffer = Left(strBuffer, lngRead)
                    End If
                
                    ' Write the data
                    Put #intFile, , strBuffer
                End If
            End If
            
        Loop While (lngRead > 0)
        
    ' Close the file
    Close #intFile
    
ExitMe:
    ' Close connection
    Call CloseHandle(hConnection)
    Call CloseHandle(hSession)
    Exit Sub
    
ErrorHandling:
    ' Close connection
    Call CloseHandle(hConnection)
    Call CloseHandle(hSession)
     
    ' Rethrow
    Call ReThrowError(Err)

End Sub
Olá! Se você ainda não assinou, assine nosso RSS feed e receba nossas atualizações por email, ou siga nos no Twitter.
Nome: Email:

10 kommentarer:

Oliver said... at March 13, 2012 at 12:39 AM

Hi,
the code is working fine, but if I download a ZIP file, it gets corrupted.
Do you have any idea?
Thanks Oliver

Ulf said... at March 14, 2012 at 9:12 AM

The corrupted binary file is most likely related to the following statements:

Dim strBuffer As String * BUFFER_SIZE
and
strBuffer = Left(strBuffer, lngRead)

All downloaded data is treated as string and not binary. For this reason the binary ZIP files are corrupted.

The problem can most likely be solved by replacing the string with a byte array. This is to ensure that there is no string conversion. However, I have not tested it.

Anonymous said... at August 29, 2012 at 8:01 PM

This code works and is very helpful. Do you have code to upload a file to the web using a similar method?

Ulf said... at September 5, 2012 at 10:01 AM

Uploading files to the webserver depends on the implementation of you web application. To upload files you most likely need to do a HTTP Post using "multipart/form-data" encoding. A full implementation of a form-data post is quite complex in Excel.

Anonymous said... at September 26, 2012 at 8:59 PM

This code works fine as long as I use it in a situation similar to the example, downloading the www.google.com startpage as a file test.html.

However, it does not seem to work in the following situation:
- A hyperlink on a page links to a location like this: http://www.pzrooster.nl/dag/infoweb/export.php?ref=5&id=2Ba&dag=1348441200
- Clicking the hyperlink brings up a dialog enabling you to open or save a file with a preset name 'mijnRooster.ics' (a calendar file)
- The file is dynamically generated from PHP

Running your code does not generate an error and it creates a file as requested, but it is zero byte and contains only one character: the number 1.

Any suggestions or help will be highly appreciated!

Thanks,
Hans

Anonymous said... at November 18, 2012 at 7:14 PM

Further to the above: I found out that the website sets a cookie with a unique code. This code has to be sent with every request. If not, it returns an empty file. I am able now to download the file, sending a valid cookie value along with the request.
However, I am still in the process of finding out how to fetch the cookie at the time of creation. Again,, any suggestions will be appreciated.

Thanks,
Hans

Anonymous said... at December 23, 2012 at 9:52 AM

I also faced the same problem...I think solution to such problem would really help!

Anonymous said... at March 13, 2013 at 1:56 PM

does not work on windows 7 64-bit

Anonymous said... at June 27, 2013 at 2:49 PM

Hi,

This code is working fine but I am trying to download a CSV file and it downloads the file of size 4 KB which is the buffer size. Please suggest whats going wrong.

Thanks a lot for your time and help

hterzian said... at September 24, 2013 at 11:04 PM

I got it to work with Windows 7 64-bit and also solve the corruption issues as follows:
- Replaced "Private Declare Function" with "Private Declare PtrSafe Function" in the API calls.
- used FileSystemObject to write the file in the DownloadFile subroutine.

Post a Comment