Limiting your goroutines

How to properly implement a goroutine pool in golang with input and output channel(s). This text expects knowledge of golang and of Go concurrency. That means I am not going to explain what a goroutine, a channel or defer actually is.

Intro

Goroutines are a kind of lightweight thread managed by the userspace golang runtime. They save memory and context switches. Therefore it is possible to spawn literally Millions! of them. However, as BenPar^W Voltaire said

With great power comes great responsibility
Source: linkedin

It is not always practical to run your task a million times at once. Take crawling web pages, for example: it's a good idea NOT to do it millions of times at the same time. On the other hand, this is EXACTLY the kind of task golang and its goroutines shine at. The way of solving it is called a pool, or workers, or a task queue, whatever you want. The idea is the same: we have a huge number N of tasks, which are distributed to a smaller number P of goroutines. Each goroutine reads the next input from a channel, does the work and returns the result on another channel. This way we can parameterize the number of concurrent tasks we run. We can measure the speed of execution for different values of P. Or we can ensure our code can deal with huge input without crashing the world :-)

Deadlocks everywhere

The situation looks simple. Golang is an advanced language with builtin support for goroutines and concurrency. One just needs to search the internet a bit to find an example. Knowing very little about golang, I naturally went to StackOverflow to get examples of READing data from a go channel. The code looked simple and printed all the numbers. That means we can distribute data from one channel and consume it from goroutines, and the golang runtime will do the magic behind the scenes to make this happen.
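A minimal sketch of such a read-only setup, to make the starting point concrete. This is not the StackOverflow code itself; the function name `consume`, the counter and the mutex are illustrative assumptions. P goroutines range over one shared channel, and main only writes to it.

```go
package main

import (
	"fmt"
	"sync"
)

// consume starts p goroutines that all read from one shared channel and
// returns how many values were received in total.
// The name and signature are hypothetical, chosen for this sketch only.
func consume(n, p int) int {
	input := make(chan int)
	var wg sync.WaitGroup
	var mu sync.Mutex
	count := 0

	for w := 0; w < p; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// The range loop ends once input is closed and drained.
			for v := range input {
				fmt.Printf("worker %d got %d\n", id, v)
				mu.Lock()
				count++
				mu.Unlock()
			}
		}(w)
	}

	// Main only writes; the workers only read, so nothing can get stuck.
	for i := 0; i < n; i++ {
		input <- i
	}
	close(input)
	wg.Wait()
	return count
}

func main() {
	fmt.Println("values consumed:", consume(10, 3))
}
```

Which worker receives which number varies from run to run, but every number is received exactly once.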

So let's add writing to another, output channel. We will read it from the main goroutine and everything should work! Something like Worker pools @gobyexample.com.

And BAM! The code deadlocks!

deadlock
Source: nikolar.com

The program crashed and never printed anything. Panic mode started. To make a long story short, here is the key

You MUST read from the channel you're writing into, or BAD things will happen.

My code was structured this way

  1. After some init, create P goroutines and execute them
  2. Write data to input channel
  3. Read results from output channel

But the program stopped in step 2. Reading of the results never happened, which blocked the send inside each goroutine, and goroutines stuck on a send stop reading input, which in turn blocked the write in main. There is no better word than deadlock to describe the situation.
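The three steps can be sketched as follows, assuming unbuffered channels and a trivial squaring task (both are assumptions, as are the names `brokenPool` and `stallTimeout`). Run directly in main, this ordering makes the Go runtime abort with "fatal error: all goroutines are asleep - deadlock!". So that the sketch terminates, the steps run in a helper goroutine here and a timeout reports the stall instead.

```go
package main

import (
	"fmt"
	"time"
)

// stallTimeout is an arbitrary value picked for this sketch.
const stallTimeout = 200 * time.Millisecond

// brokenPool performs the three steps strictly in order and reports
// whether they completed before the timeout. Hypothetical helper.
func brokenPool(n, p int, timeout time.Duration) bool {
	input := make(chan int)  // unbuffered
	output := make(chan int) // unbuffered
	done := make(chan struct{})

	go func() {
		// 1. Create P goroutines and execute them.
		for w := 0; w < p; w++ {
			go func() {
				for v := range input {
					output <- v * v // blocks: nobody reads output yet
				}
			}()
		}
		// 2. Write data to the input channel.
		for i := 0; i < n; i++ {
			input <- i // blocks once every worker is stuck on its send
		}
		close(input)
		// 3. Read results from the output channel (not reached if n > p).
		for i := 0; i < n; i++ {
			<-output
		}
		close(done)
	}()

	select {
	case <-done:
		return true
	case <-time.After(timeout):
		// The stuck goroutines are leaked; acceptable for a demo only.
		return false
	}
}

func main() {
	fmt.Println("n <= p finished:", brokenPool(2, 3, stallTimeout))  // true
	fmt.Println("n > p finished: ", brokenPool(10, 3, stallTimeout)) // false
}
```

With at most P tasks every worker holds one pending result and step 3 still gets its turn, which is why small tests can hide the problem. Buffered channels that are large enough hide it the same way.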

The solution

Fortunately golang provides a simple way to fix it: offload the second step, writing the input, to a goroutine. Then we can start reading from the output channel immediately in the main goroutine. Full code is available on https://play.golang.org or below
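A minimal sketch of this fix, not the author's playground code: the squaring task, the name `squarePool` and the `sync.WaitGroup` used to close the output channel are assumptions made for illustration. The input is written from its own goroutine, so main can read results right away.

```go
package main

import (
	"fmt"
	"sync"
)

// squarePool distributes the numbers 0..n-1 over p workers and returns
// the sum of their squares. Hypothetical helper for this sketch.
func squarePool(n, p int) int {
	input := make(chan int)
	output := make(chan int)

	// 1. Create P goroutines and execute them.
	var wg sync.WaitGroup
	for w := 0; w < p; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for v := range input {
				output <- v * v
			}
		}()
	}

	// 2. Write data to the input channel, now from a separate goroutine
	// so that main is free to read results.
	go func() {
		for i := 0; i < n; i++ {
			input <- i
		}
		close(input)
	}()

	// Close output once every worker has returned, so the range below ends.
	go func() {
		wg.Wait()
		close(output)
	}()

	// 3. Read results from the output channel in main.
	sum := 0
	for r := range output {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(squarePool(10, 3)) // prints 285
}
```

Results arrive in no particular order, so the sketch only aggregates them. If the order matters, each result would need to carry its input along with it.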

Logo by samthor@Flickr: [https://www.flickr.com/photos/samthor/5994939587]