The problem: Needing to download hundreds of images instantly

At Balihoo we are changing providers for our blogging platform, and one of the challenges that came up was how to extract all the images (Over 600) we had uploaded to our existing software without having a built-in tool for pulling them off the server. While I was able to get a complete list of the URLs by using some scraping tools, clearly copying each URL and “save image as…” for 600 wasn’t very efficient.
the_problem

The Solution

We needed to download the entire list of images and get the file name of each image and associate it with the original post it was extracted from. Since I had promised to finish this whole project on Thursday afternoon and it was Friday at 9AM, I also needed this whole process to be fast. Since extracting images from an img src isn’t the kind of thing you need a browser to render code for, so instead of coding IE I set up VBA code that would make requests for images directly through Microsoft’s XML API.

The big picture was something like this:

For URL
  Make API Request for Image from URL
  Determine ImageName from URL
  Save Image to filestructure/ImageName
Next URL

I chose to program this in VBA (since the list I was given for URLs was in Excel) and using MSXML12 for VBA. The code (thanks to VBA-and-Excel.com for most of it) for a single extraction is:

Sub Save_image()
    Dim oHTTP As Object
    Dim sDestFolder As String
    Dim sSrcUrl As String
    Dim sImageFile As String
    
    sDestFolder = "C:\Users\adale\Desktop\Compendium Images\"
    sSrcUrl = ActiveCell.Value
    If Left(sSrcUrl, 2) = "//" Then
        sSrcUrl = "http:" & sSrcUrl
    End If
    
    sImageFile = Right(ActiveCell.Value, Len(ActiveCell.Value) - InStrRev(ActiveCell.Value, "/"))
    Debug.Print sImageFile
    ActiveCell.Offset(0, 2).Value = sImageFile

    
    Set oHTTP = CreateObject("msxml2.XMLHTTP")
    oHTTP.Open "GET", sSrcUrl, False
    oHTTP.send
    
    Set oStream = CreateObject("ADODB.Stream")
    Const adTypeBinary = 1
    Const adSaveCreateOverWrite = 2
    oStream.Type = adTypeBinary
    oStream.Open
    
    oStream.write oHTTP.responseBody
    oStream.savetofile sDestFolder & sImageFile, adSaveCreateOverWrite
    
    Set oStream = Nothing
    Set oHTTP = Nothing
End Sub

Breakdown:

This code works by selecting a cell with an img URL in it (and an image name two columns to the right) and looking up that image and saving it to my hard drive. In addition to the saving script, there will be a sweeper script needed, but coding a sweeper script for excel is something I’ve covered extensively before.

Sub Save_image()
    Dim oHTTP As Object
    Dim sDestFolder As String
    Dim sSrcUrl As String
    Dim sImageFile As String

Standard object and variable definitons. oHTTP will become out XML API request object.

    sDestFolder = "C:\Users\adale\Desktop\Compendium Images\"
    sSrcUrl = ActiveCell.Value
    If Left(sSrcUrl, 2) = "//" Then
        sSrcUrl = "http:" & sSrcUrl
    End If
    
    sImageFile = Right(ActiveCell.Value, Len(ActiveCell.Value) - InStrRev(ActiveCell.Value, "/"))
    Debug.Print sImageFile
    ActiveCell.Offset(0, 2).Value = sImageFile

This bit of code defines the save location for the images (“Compendium Images” folder on my Desktop) as well as extracting the file name from the URL and setting that name as the value two columns to the right.

    Set oHTTP = CreateObject("msxml2.XMLHTTP")
    oHTTP.Open "GET", sSrcUrl, False
    oHTTP.send
    
    Set oStream = CreateObject("ADODB.Stream")
    Const adTypeBinary = 1
    Const adSaveCreateOverWrite = 2
    oStream.Type = adTypeBinary
    oStream.Open
    
    oStream.write oHTTP.responseBody
    oStream.savetofile sDestFolder & sImageFile, adSaveCreateOverWrite

This is really the bulk of the code. In the first 3 lines VBA makes the request to the URL specified for the file it is looking for. The second chunk of code is initiating a place to store the API response, and the last two lines take the API response and write it to the specified file. If you are curious about the specific commands given in this section, Google can provide you with more details about the request, the parameters, and the outputs.

Besides the cleanup code at the bottom of the script, there really is nothing else to this code. While there are thousands of potential ways to utilize the MSXML packages, I found that to be a particularly helpful use of them. When all was said and done, I spend 20 minutes coding this and 5 minutes for it to make the request to 600 URLs and save the associated files instead of hours and hours copying and pasting and right clicking.

solution

Related posts

2 thoughts on “Use Excel to download hundreds of images instantly”

  1. Thomas

    Great tip! Thanks for posting this. I’ve searched everywhere for a solution to this problem and this is the first one that works for me, sort of. I have a couple of issues wondering if you know the solution. One, I programed this as a module in VBA but when I run it it only does it one cell at a time. So i had to make a macro and do it cell by cell, how can I get it to run and do all of the cells in a column? Two, it is saving the image name in a new column. How can I get it to save the image name in the existing cell? Three, I have some cells that don’t have an image name like: http://www.xxxxxxx.com/core/media/media.nl?id=130185&c=849850&h=db34577c43aa1bd2a71. Any way you know of I can get it to pull the image and save the actual image name in the cell?

  2. Literate Aspects

    Excellent post and share!

Leave a Comment