Go Parser Combinator with Go Generics


This post may become outdated, because I'm actively developing peachcomb!

This post introduces peachcomb, a Go parser library that I'm developing. The library aims to reduce the overhead of dynamic dispatch and to report type mismatches among parsers. To achieve this, peachcomb uses Go generics, which were released in Go 1.18.


This post won't describe Go generics in detail. I recommend this tutorial for a brief introduction to Go generics.



To use peachcomb, you only need to follow two steps: initialization and calling. Before describing the details, I'll show the simplest sample to illustrate peachcomb's usage. The sample below parses numbers separated by |.


    package main

    import (
        "fmt"

        "github.com/Drumato/peachcomb/pkg/combinator"
        "github.com/Drumato/peachcomb/pkg/strparse"
    )

    func main() {
        element := strparse.Digit1()
        separator := strparse.Rune('|')
        p := combinator.Separated1(element, separator)
        i, o, err := p([]rune("123|456|789Drumato"))
        // i holds the unconsumed input; convert it to a string for printing.
        fmt.Println(string(i))
        fmt.Printf("%s\n", o)
        fmt.Println(err)
    }

    $ go run main.go
    Drumato
    [123 456 789]
    <nil>

i is the rest of the input, i.e. the part that parser p did not consume. o is p's output. In this case, o looks like []string{"123", "456", "789"}. That's all there is to using peachcomb.


I drew heavily on Geal/nom, a parser library in Rust. nom builds fast, generic parsers by constraining trait bounds.

Parser Signature

First of all, all parsers in peachcomb implement one signature, type Parser[E comparable, O parser.ParseOutput]. It's defined as below.

    type Parser[E comparable, O ParseOutput] func(input ParseInput[E]) (ParseInput[E], O, ParseError)

    type ParseInput[E comparable] []E

    type ParseOutput interface{}

    type ParseError interface {
        error
    }

I think designing the Parser signature this way has some merits. First, if users want a certain parser that peachcomb doesn't support, they can implement it in their own project and pass it to the generalized functions in package combinator (e.g. Map()). Second, almost all parsers can be implemented generically; we don't need to prepare a separate parser for each input type. Last, users only need to know the interface: initialization and calling.
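To illustrate the first merit, here is a minimal, self-contained sketch. It does not import peachcomb: the types are local copies of the definitions quoted above, Map is my simplified stand-in for a generic combinator (the real combinator.Map() may differ), and upper1 is a hypothetical user-defined parser that exists only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Local copies of the signature quoted above, so the sketch runs on its own.
type ParseInput[E comparable] []E
type ParseOutput interface{}
type ParseError interface{ error }
type Parser[E comparable, O ParseOutput] func(input ParseInput[E]) (ParseInput[E], O, ParseError)

// Map is a simplified stand-in (an assumption, not peachcomb's code):
// it runs sub and converts sub's output with fn.
func Map[E comparable, SO ParseOutput, O ParseOutput](sub Parser[E, SO], fn func(SO) (O, error)) Parser[E, O] {
	return func(input ParseInput[E]) (ParseInput[E], O, ParseError) {
		var zero O
		rest, so, err := sub(input)
		if err != nil {
			return input, zero, err
		}
		o, mapErr := fn(so)
		if mapErr != nil {
			return input, zero, mapErr
		}
		return rest, o, nil
	}
}

// upper1 is a hypothetical user-defined parser: it consumes one or more
// ASCII uppercase letters and returns them as a string.
func upper1() Parser[rune, string] {
	return func(input ParseInput[rune]) (ParseInput[rune], string, ParseError) {
		n := 0
		for n < len(input) && input[n] >= 'A' && input[n] <= 'Z' {
			n++
		}
		if n == 0 {
			return input, "", errors.New("expected at least one uppercase letter")
		}
		return input[n:], string(input[:n]), nil
	}
}

func main() {
	// Any function with the Parser signature can be handed to Map.
	p := Map(upper1(), func(s string) (string, error) {
		return strings.ToLower(s), nil
	})
	rest, out, err := p([]rune("ABCdef"))
	fmt.Println(string(rest), out, err) // prints: def abc <nil>
}
```

Because upper1 only has to match the Parser signature, the generic function accepts it without any registration step or interface wrapper.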

Type Resolving

Now let's see how types are resolved among peachcomb's parsers. A playable version of the sample code is available in the Go Playground.


    package main

    import (
        "fmt"

        "github.com/Drumato/peachcomb/pkg/combinator"
        "github.com/Drumato/peachcomb/pkg/parser"
        "github.com/Drumato/peachcomb/pkg/strparse"
    )

    func main() {
        var element parser.Parser[rune, string] = strparse.Digit1()
        var separator parser.Parser[rune, rune] = strparse.Rune('|')
        var p parser.Parser[rune, []string] = combinator.Separated1(element, separator)

        var i []rune
        var o []string
        var err error
        i, o, err = p([]rune("123|456|789Drumato"))

        fmt.Println(string(i))
        fmt.Printf("%d\n", len(o))
        fmt.Printf("%s %s %s\n", o[0], o[1], o[2])
        fmt.Println(err)
    }

The actual function signatures look like this.

    func Digit1() parser.Parser[rune, string]
    func Rune(expected rune) parser.Parser[rune, rune]
    func Separated1[
        E comparable,
        EO parser.ParseOutput,
        SO parser.ParseOutput,
    ](
        element parser.Parser[E, EO],
        separator parser.Parser[E, SO]) parser.Parser[E, []EO]

Separated1()'s type parameters will be resolved to ...

So finally we know at compile time that p implements parser.Parser[rune, []string].
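The inference can be made visible by writing the type arguments out by hand. Reading the signatures above, element is a Parser[rune, string] and separator is a Parser[rune, rune], so the compiler infers E = rune, EO = string and SO = rune. The sketch below is self-contained and does not import peachcomb: digit1, char and separated1 are my own simplified stand-ins that only share the shape of the signatures above, not the library's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the signature from the "Parser Signature" section.
type ParseInput[E comparable] []E
type ParseOutput interface{}
type ParseError interface{ error }
type Parser[E comparable, O ParseOutput] func(input ParseInput[E]) (ParseInput[E], O, ParseError)

// digit1 (stand-in) consumes one or more ASCII digits.
func digit1() Parser[rune, string] {
	return func(input ParseInput[rune]) (ParseInput[rune], string, ParseError) {
		n := 0
		for n < len(input) && input[n] >= '0' && input[n] <= '9' {
			n++
		}
		if n == 0 {
			return input, "", errors.New("expected a digit")
		}
		return input[n:], string(input[:n]), nil
	}
}

// char (stand-in) consumes exactly one expected rune.
func char(expected rune) Parser[rune, rune] {
	return func(input ParseInput[rune]) (ParseInput[rune], rune, ParseError) {
		if len(input) == 0 || input[0] != expected {
			return input, 0, errors.New("unexpected rune")
		}
		return input[1:], expected, nil
	}
}

// separated1 (stand-in) parses: element (separator element)*.
func separated1[E comparable, EO ParseOutput, SO ParseOutput](
	element Parser[E, EO],
	separator Parser[E, SO],
) Parser[E, []EO] {
	return func(input ParseInput[E]) (ParseInput[E], []EO, ParseError) {
		rest, first, err := element(input)
		if err != nil {
			return input, nil, err
		}
		out := []EO{first}
		for {
			afterSep, _, err := separator(rest)
			if err != nil {
				return rest, out, nil // no more separators: stop successfully
			}
			afterElem, next, err := element(afterSep)
			if err != nil {
				return rest, out, nil // trailing separator is left unconsumed
			}
			out = append(out, next)
			rest = afterElem
		}
	}
}

func main() {
	// Type arguments inferred from the arguments: E=rune, EO=string, SO=rune.
	inferred := separated1(digit1(), char('|'))
	// The same instantiation, spelled out explicitly.
	var explicit Parser[rune, []string] = separated1[rune, string, rune](digit1(), char('|'))

	rest, out, err := inferred([]rune("123|456|789Drumato"))
	fmt.Println(string(rest), out, err) // prints: Drumato [123 456 789] <nil>
	_, out, err = explicit([]rune("1|2"))
	fmt.Println(out, err) // prints: [1 2] <nil>
}
```

Both variables have the same static type, which is why the explicit annotation in the sample above compiles without any conversion.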

The next example shows the constraints that peachcomb enforces.


    package main

    import (
        "fmt"

        "github.com/Drumato/peachcomb/pkg/byteparse"
        "github.com/Drumato/peachcomb/pkg/combinator"
    )

    func main() {
        sub := byteparse.UInt8()
        p := combinator.Many1(sub)
        i, o, err := p([]rune("aaaabaa"))

        fmt.Println(string(i))
        fmt.Println(string(o))
        fmt.Println(err)
    }

The parsers in the sample will be resolved to...

As you can see in the Playground, this sample fails to compile. The actual error message is below.

    ./prog.go:13:23: cannot use []rune("aaaabaa") (value of type []rune) as type parser.ParseInput[byte] in argument to p

In the sample above, the actual input type does not match the input type the parser expects. peachcomb can also detect inconsistencies among parsers.


    package main

    import (
        "fmt"

        "github.com/Drumato/peachcomb/pkg/combinator"
        "github.com/Drumato/peachcomb/pkg/strparse"
    )

    func main() {
        sub := strparse.Digit1()
        p := combinator.Map(sub, func(v byte) (bool, error) { return v == 0, nil })
        i, o, err := p([]byte("11112222abc"))

        fmt.Println(string(i))
        fmt.Println(o)
        fmt.Println(err)
    }

The actual error message is below.

    ./prog.go:12:27: type func(v byte) (bool, error) of func(v byte) (bool, error) {…} does not match inferred type func(string) (O, error) for func(SO) (O, error)

In this sample, the type parameters are resolved to E: rune and SO: string, so Map() requires a func(v string) (O, error) as its 2nd argument, but the actual argument has the form func(v byte) (bool, error).

Custom Input Types

Almost all parsers can receive custom input types to parse. If you want to know this mechanism in detail, please read the example below.
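As a rough illustration of the idea, here is a self-contained sketch that parses a slice of a custom element type instead of runes or bytes. It does not import peachcomb and is not taken from the project: token, satisfy and many1 are hypothetical names defined locally, and the only thing borrowed from the post is the Parser signature, which accepts any comparable element type.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the signature from the "Parser Signature" section.
type ParseInput[E comparable] []E
type ParseOutput interface{}
type ParseError interface{ error }
type Parser[E comparable, O ParseOutput] func(input ParseInput[E]) (ParseInput[E], O, ParseError)

// token is a hypothetical custom input element, e.g. the output of a lexer.
// A struct of strings is comparable, so it satisfies the E constraint.
type token struct {
	kind string
	text string
}

// satisfy (hypothetical) consumes one element for which pred returns true.
func satisfy[E comparable](pred func(E) bool) Parser[E, E] {
	return func(input ParseInput[E]) (ParseInput[E], E, ParseError) {
		if len(input) == 0 || !pred(input[0]) {
			var zero E
			return input, zero, errors.New("element did not satisfy predicate")
		}
		return input[1:], input[0], nil
	}
}

// many1 (stand-in) applies sub one or more times and collects the outputs.
// It assumes sub consumes input whenever it succeeds.
func many1[E comparable, SO ParseOutput](sub Parser[E, SO]) Parser[E, []SO] {
	return func(input ParseInput[E]) (ParseInput[E], []SO, ParseError) {
		rest, first, err := sub(input)
		if err != nil {
			return input, nil, err
		}
		out := []SO{first}
		for {
			next, o, err := sub(rest)
			if err != nil {
				return rest, out, nil
			}
			out = append(out, o)
			rest = next
		}
	}
}

func main() {
	isNumber := func(t token) bool { return t.kind == "number" }
	// E is inferred as token for both satisfy and many1.
	p := many1(satisfy(isNumber))

	input := []token{{"number", "1"}, {"number", "2"}, {"ident", "x"}}
	rest, out, err := p(input)
	fmt.Println(len(rest), len(out), err) // prints: 1 2 <nil>
}
```

The generic combinator is written once and works for token exactly as it would for rune or byte, because it never inspects the elements itself.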



Today I described a Go parser library called peachcomb. If you're interested in the project, please try it and send me feedback!