This post includes the second half of the scripting required for the web-scraping project described in last week's post. With a completed list of the blog posts and their corresponding URLs, the next step is to code a scraper that will visit each page, search out the blog content and any relevant tags, then…
At Balihoo, we recently encountered a problem. We needed to extract all the text and formatting from past blog posts we had written, but our service provider didn’t have a tool to do that. Since there were over 600 posts, this isn’t the kind of project you want to do by hand. With a bit…