Web Scraping with Go

Web scraping is a handy tool to have in your arsenal. It is useful when a website does not provide an API, or when you need to parse and extract web content programmatically. This tutorial walks through using the standard library to make requests, change headers, set cookies, use regular expressions, and parse URLs. It also covers the basics of the goquery package (a jQuery-like tool) for scraping information from an HTML web page.
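As a rough illustration of the standard-library side of this, here is a minimal sketch that sends a GET request with a custom User-Agent header and a cookie, then pulls the page title out with a regular expression. The helper names (fetch, extractTitle), the cookie name, and the User-Agent string are made up for this example, and a local httptest server stands in for a real site so the program runs without network access. A regular expression is enough for a title tag; for anything more structured, an HTML parser such as goquery is the better fit.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"regexp"
	"strings"
)

// titlePattern captures the text inside the first <title> element.
// (?is) makes the match case-insensitive and lets . match newlines.
var titlePattern = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// extractTitle returns the trimmed page title, or "" if none is found.
func extractTitle(html string) string {
	m := titlePattern.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

// fetch performs a GET request with a custom User-Agent header and a
// cookie, and returns the response body as a string.
// The cookie name and value are placeholders for this sketch.
func fetch(client *http.Client, url, userAgent string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("User-Agent", userAgent)
	req.AddCookie(&http.Cookie{Name: "session", Value: "example"})

	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// A local test server stands in for the site being scraped.
	// It echoes the caller's User-Agent back inside the page title.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "<html><head><title>Hello, %s</title></head><body></body></html>", r.UserAgent())
	}))
	defer srv.Close()

	body, err := fetch(srv.Client(), srv.URL, "example-scraper/1.0")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(extractTitle(body)) // prints "Hello, example-scraper/1.0"
}
```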

If you need to reverse engineer a web application based on its network traffic, it may also be helpful to learn how to do packet capture, injection, and analysis with Gopacket.

If you are downloading and storing content from a site you scrape, you may be interested in working with files in Go.

Security with Go - My book now published!

Check out Security with Go, a book I recently wrote, available from Packt Publishing. It covers secure development, red team and blue team topics and is useful for developers and infosec professionals like analysts, investigators, engineers, and pentesters. It's a great book if you want to get to know Go better or if you want to start using Go for security.

Working with Images in Go

The Image interface is at the core of image manipulation in Go. No matter what format you import from or export to, the data ultimately ends up as an Image. This is where the beauty of Go interfaces really shines. Go comes with support for the gif, jpeg, and png formats in the standard packages. These examples demonstrate how to programmatically generate, encode, decode, write to file, and base64 encode images. We will also cover a little bit about interfaces.
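A minimal sketch of the generate, encode, and decode round trip might look like the following. The function names (newGradient, encodePNGBase64, decodePNGBase64) and the gradient colors are invented for this example. Note that the encoder takes the image.Image interface, so any concrete image type can be passed in.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"image"
	"image/color"
	"image/png"
	"log"
)

// newGradient builds a w x h RGBA image whose red channel increases
// from left to right.
func newGradient(w, h int) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.Set(x, y, color.RGBA{R: uint8(x * 255 / w), G: 0, B: 128, A: 255})
		}
	}
	return img
}

// encodePNGBase64 accepts any image.Image, encodes it as PNG into an
// in-memory buffer, and returns the bytes as a base64 string.
func encodePNGBase64(img image.Image) (string, error) {
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decodePNGBase64 reverses encodePNGBase64 and returns the decoded image.
func decodePNGBase64(s string) (image.Image, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	return png.Decode(bytes.NewReader(raw))
}

func main() {
	encoded, err := encodePNGBase64(newGradient(16, 8))
	if err != nil {
		log.Fatal(err)
	}

	decoded, err := decodePNGBase64(encoded)
	if err != nil {
		log.Fatal(err)
	}
	b := decoded.Bounds()
	fmt.Println(b.Dx(), b.Dy()) // prints "16 8"
}
```

To write the image to disk instead, the same png.Encode call can be given an *os.File in place of the buffer, since it only needs an io.Writer.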

2D Arrays and Slices in Go

Two-dimensional arrays and slices are useful in many situations. When you create an array in Go, every element is automatically initialized to the zero value of its type, which is 0 for numeric types. An array's length is part of its type, so arrays cannot be expanded. Arrays are values: assigning one or passing it to a function copies every element. A slice is a small header holding a pointer to a backing array, a length, and a capacity, so passing a slice copies only the header and the callee shares the same elements. Slices require a little more initialization up front but are more versatile: you can take sub-slices, access elements directly, or grow them with append. Whether the underlying data ends up on the stack or the heap is decided by the compiler's escape analysis, not by whether it is an array or a slice.
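The difference in copy behavior is easiest to see side by side. The sketch below uses hypothetical helpers (newGrid, setArray, setSlice) to build a 2D slice, then mutates both an array and a slice inside a function to show which change is visible to the caller.

```go
package main

import "fmt"

// newGrid allocates a rows x cols slice of slices. Each inner slice
// must be created separately; its elements start at zero.
func newGrid(rows, cols int) [][]int {
	grid := make([][]int, rows)
	for i := range grid {
		grid[i] = make([]int, cols)
	}
	return grid
}

// setArray receives a copy of the whole array, so the caller's
// array is left unchanged.
func setArray(a [2][3]int) {
	a[0][0] = 99
}

// setSlice receives a copy of the slice header only; the elements
// are shared with the caller, so the write is visible outside.
func setSlice(s [][]int) {
	s[0][0] = 99
}

func main() {
	var arr [2][3]int // fixed size, zero-initialized
	grid := newGrid(2, 3)

	setArray(arr)
	setSlice(grid)
	fmt.Println(arr[0][0], grid[0][0]) // prints "0 99"

	// Slices can grow; arrays cannot.
	grid = append(grid, []int{1, 2, 3})
	fmt.Println(len(arr), len(grid)) // prints "2 3"

	// Sub-slicing a row returns a view onto the same elements.
	fmt.Println(grid[2][1:]) // prints "[2 3]"
}
```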

HTML Templates in Go

Here is a quick example of how to use Go's html/template package to loop through a slice of structs and print out the contents. Note that inside a template, . refers to the data passed in, and {{.}} prints it. Inside a {{range}} block, . is rebound to the current element.

type Person struct {
  Name string
  Age  int
}

Simple struct definition for our Person
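Putting the struct to work, a minimal sketch of the loop could look like this. The template text, the renderPeople helper, and the sample people are assumptions made for illustration; only the Person type comes from the excerpt above. Fields must be exported (capitalized) for the template to read them.

```go
package main

import (
	"fmt"
	"html/template"
	"log"
	"strings"
)

// Person is the struct rendered by the template. The fields are
// exported so that html/template can access them.
type Person struct {
	Name string
	Age  int
}

// peopleTmpl loops over the slice passed to Execute. Inside the
// range block, . refers to the current Person.
const peopleTmpl = `<ul>{{range .}}<li>{{.Name}} is {{.Age}}</li>{{end}}</ul>`

// renderPeople parses the template and executes it against the
// given slice, returning the generated HTML.
func renderPeople(people []Person) (string, error) {
	t, err := template.New("people").Parse(peopleTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, people); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	people := []Person{
		{Name: "Alice", Age: 30},
		{Name: "Bob", Age: 25},
	}
	out, err := renderPeople(people)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out) // prints "<ul><li>Alice is 30</li><li>Bob is 25</li></ul>"
}
```

Because this is html/template rather than text/template, values are escaped for the HTML context when they are printed, which matters when the data comes from an untrusted source.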