Iterators in Go
Go has been a simple language since its inception. Perhaps too simple. I personally think of Go as a (successful) MVP of a programming language, with a garbage collector, channels and goroutines.
This simplicity is the reason why many developers, myself included, love Go. At the same time, it is why so many other developers hate the language.
Go getting better
While Go remains a simple language at its core, that does not prevent it from evolving. There have been big and interesting changes in the past:
- Context introduced in Go 1.7
- Modules introduced in Go 1.11, turned on by default in Go 1.16
- Embedding introduced in Go 1.16
- Generics introduced in Go 1.18, the biggest change to the language to date
Go 1.22 language changes
Go 1.22 brings three interesting iteration-related changes:
- for i := range foo creates a new variable on each iteration, so the i := i trick is no longer needed
- for i := range int (ranging over an integer) is permitted, see https://go.dev/ref/spec#For_range
- and last but not least, the rangefunc experiment
The problem
As explained in the motivation section of the proposal "spec: add range over int, range over func", the Go standard library already contains types that iterate over something. Examples are bufio.Scanner and sql.Rows. Each type has to implement its own distinct, ad-hoc way of iterating.
There have been some attempts to provide a better API. gubrak does a great job of showing that method chaining is possible in Go.
n := []string{"aa", "aaa", "aaaaaa", "a"}
result := gubrak.From(n).
	Filter(func(s string) bool { return len(s) <= 3 }).
	Map(func(s string) int { return len(s) }).
	Result()
fmt.Println(result)
// Output: [2 3 1]
The obvious downside is that it works on top of interface{}, so the compiler is not able to catch any problems, and in fact the API is much harder to use than the code example reveals.
.Result() interface{} // ==> description: returns the result after operation
.ResultAndError() (interface{}, error) // ==> description: returns the result after operation, and error object
.Error() error // ==> description: returns error object
.IsError() bool // ==> description: return `true` on error, otherwise `false`
So gubrak taught me an important lesson - generics are a must.
A more popular example is lo. It uses generics and works on slices, which makes it the most idiomatic library for Go to date. The obvious drawback is that working on slices does not allow the developer to combine several operations into a single one. And writing the body of a FilterMap callback is not a pleasant experience, especially if the filtering rules are not trivial.
n := []string{"aa", "aaa", "aaaaaa", "a"}
result := lo.FilterMap(n, func(s string, _ int) (int, bool) {
	if len(s) > 3 {
		return 0, false
	}
	return len(s), true
})
fmt.Println(result)
it
it is my answer. It builds on top of the rangefunc experiment, so it may become the idiomatic solution if the experiment is accepted. On top of that, it uses generics and provides a method-based API, which is (type) safe to use.
n := []string{"aa", "aaa", "aaaaaaa", "a"}
slice := it.NewMapable[string, int](it.FromSlice(n)).
	Map(func(s string) int { return len(s) }).
	Index().
	Filter2(func(index int, _ int) bool { return index <= 1 }).
	Values().
	Slice()
fmt.Println(slice)
// Output: [2 3]
The method-based API is only the second choice in Go, though; everything is implemented through generic functions first. Those do not have the limitations of struct methods.
n := []string{"aa", "aaa", "aaaaaaa", "a"}
// maps string->int->float32
s0 := it.FromSlice(n)
s1 := it.Map(s0, func(s string) int { return len(s) })
s2 := it.Map(s1, func(i int) float32 { return float32(i) })
s3 := it.Map(s2, func(f float32) string { return strconv.FormatFloat(float64(f), 'E', 4, 32) })
slice := it.AsSlice(s3)
fmt.Println(slice)
// Output: [2.0000E+00 3.0000E+00 7.0000E+00 1.0000E+00]
How the mapping works
The biggest limitation of the Go type system here is that, given type Foo[T any] struct, its methods can use only T and cannot introduce type parameters of their own. This is a challenge for a Map operation, which works by translating T into V.
Yet the example above has a Map method. How is this possible? The trick is surprisingly simple. The struct itself must accept two type parameters, T and V, and Map returns the struct with the type parameters swapped.
type Mapable[T, V any] struct {
	seq  iter.Seq[T]
	none V
}
func (g Mapable[T, V]) Map(mapFunc MapFunc[T, V]) Mapable[V, T] {}
Of course the trick only works well for two types. Using three or more would make the API hard to write and use. On the other hand mapping one type to another solves the majority of the cases. And anything more complicated is handled by simple functions and explicit variable passing.
How the Index works
One of my least favorite features of lo is that it forces the developer to pass the index parameter every time. Given the general verbosity of Go, especially the anonymous function syntax, this adds a lot of unnecessary code.
So it has been designed to avoid this. The index parameter does not exist by default and can be added via the Index function only when it is needed. The consequence is that iter.Seq[T] changes to iter.Seq2[int, T], so Filter2 and the other Seq2 methods must be used. As the compiler and gopls don’t allow the developer to use the wrong method, this is fine.
n := []string{"aa", "aaa", "aaaaaaa", "a"}
s0 := it.FromSlice(n)
for index, value := range it.Index(s0) {
	fmt.Println(index, value)
}
Maps with an error
Error handling was a pain point of gubrak (see ResultAndError, IsError and so on). Not in it. MapError is a function that returns an iter.Seq2[V, error] from an iter.Seq[T], so a mapping can fail and the error can be processed later on.
n := []string{"forty-two", "42"}
s0 := it.FromSlice(n)
s1 := it.MapError(s0, strconv.Atoi)
for value, err := range s1 {
	fmt.Println(value, err)
}
// Output:
// 0 strconv.Atoi: parsing "forty-two": invalid syntax
// 42 <nil>
Breaking the chain
One of the ideas I had was to break the chain. Imagine a situation where there’s a long chain of filters and some new, complicated logic needs to be added in the middle. And the assigned developer loves good old Go and hates this functional, chained nonsense.
chain := it.NewChain(it.FromSlice(n)).
	Filter(func(s string) bool { return true })
// break the chain
m := magic()
for s := range chain.Seq() {
	// here comes the new logic
	m.push(s)
}
// continue the chain
chain2 := it.NewChain(m.seq()).
	Filter(func(s string) bool { return len(s) > 2 })
slice := chain2.Slice()
fmt.Println(slice)
The GitHub repository has a working example that uses a channel and a goroutine in the background. I just can’t decide whether this is a good idea or a bad one.
I would love to hear any feedback about it and the features it provides.