Web Crawling

Web Scraping with Go

Web scraping (Wikipedia entry) is a handy tool to have in your arsenal. It can be useful in a variety of situations, like when a website does not provide an API, or you need to parse and extract web content programmatically. This tutorial walks through using the standard library to perform a variety of tasks like making requests, changing headers, setting cookies, using regular expressions, and parsing URLs. It also covers the basics of the goquery package (a jQuery like tool) to scrape information from an HTML web page on the internet.

If you need to reverse engineering a web application based on the network traffic, it may also be helpful to learn how to do packet capture, injection, and analysis with Gopacket.

If you are downloading and storing content from a site you scrape, you may be interested in working with files in Go.