What is site searching and why do I need it?

If you ever find yourself looking at a list of URLs and are contemplating doing anything that involves copying and pasting each one into a browser and looking at the corresponding website, you are probably looking at something that could be automated. To that end, I’ve programmed a VBA script within Excel that uses HTTP requests to automatically scan large lists of sites and search source code looking for certain terms or tags, either in the rendered text of the site or in the html. Possible uses:

  • Looking for a particular technology used- We recently wanted to take lists of over 10,000 websites and figure out which of those sites used Marketo. While its fairly easy to do if you browse through the source code of a site whether or not it was created with Marketo (There will always be references to Marketo cookies for example) doing it by hand to a large number of sites can be time consuming.

  • Looking for a site feature/functionality- Its no secret that we are interested in finding companies with distributed networks as prospective clients. Almost all of these will have sites with have some sort of “find a dealer/location/store/place to buy/agent” on their website somewhere. Instead of looking for these by hand on websites, we needed a way to scan bulk lists of website for these features.

  • Blacklisting sites (or emails) that already have a solution- Websites that have shopping carts indicate an ecommerce presence that means they would probably not be a good fit for the Balihoo solution. Instead of wasting effort and time connecting with leads like this, we prefer to segment off emails and accounts which are associated with sites that have a shopping cart/basket/checkout.

Download the tool

In addition to the above 3 mentioned uses, there may be many more. The file is available below, and I’d recommend my intro post on using VBA in Excel to familiarize yourself with the basics of running a macro in excel.

Right-click and “save link as” to download: Site_Source_Searcher.xlsm

Using the site searcher

The file ONLY works in Excel. When you first open the file, there will be a security warning at the top with the option to “Enable Macros”. For the tool to work, you will need to click this and enable them.
Paste in the URLs into the “Site” list on the left of the first sheet.
Add terms to search for and the score given for each match of each term to the 2nd sheet of the Excel file. Scores can be positive or negative for each match so you can weigh positive matches against negative matches for a net score. The tool will give the points listed for each instance of a match, so terms you would expect to show up a lot, consider giving lower point values to.
Return to the first sheet and select any URLs you want to score, then click the button on the top right. Be aware that while this tool usually takes around a second to score a URL, it can take as long as 20 seconds per URL if the URLs are bad and Excel is unable to contact them (it will spit out an error code in the “notes” column if this is the case). So, if you put in 200 URLs and select them all to score, it will probably take 30 seconds to 3 or 4 minutes. Excel will look like it is “Not Responding” mode while it is doing this on longer lists, but do not panic. Let it do its thing.
When it has finished, you will have a score next to each URL as well as the matching terms that contributed to the score. Since we negatively score things like having a shopping cart or an ecomm store, sometimes sites will have a total score that is negative.
To turn the list into a useful tool for prospecting (or whatever other purposes you may have) consider sorting by score to find the sites of interest quickly.
And that’s it! If you’ve got any questions about the script, be sure to let me know or leave a note in the comments. If there is enough interest in how this script works, I’ll write a post describing the coding that drives this tool.

Related posts

1 thought on “Search source code of any sites with Excel”

  1. […] off pages and you’re not trying to set up the code yourself, I’d recommend this post: Search source code of any sites with Excel. You can just download the completed VBA file, and get some ideas about practical marketing […]

Leave a Comment