Phaiax edited this page Jan 19, 2019 · 9 revisions

What is combine?

combine is a parser combinator library. Let's explain that in two steps.

A parser is a thing that turns some input (for example a &str) into some output (for example (i32, Vec<i32>)) by applying an algorithm.
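To make that concrete, here is a minimal hand-written sketch of such a parser using only the standard library. The input format ("7: 1, 2, 3") and the function name `parse_head_and_list` are made up for illustration; they are not part of combine.

```rust
/// Hand-written parser for a hypothetical format such as "7: 1, 2, 3".
/// Returns `None` if the input does not match that format.
fn parse_head_and_list(input: &str) -> Option<(i32, Vec<i32>)> {
    // Split at the first ':' into the head and the rest.
    let (head, rest) = input.split_once(':')?;
    let head = head.trim().parse::<i32>().ok()?;

    // Parse every comma-separated item; a single bad item fails the whole parse.
    let items = rest
        .split(',')
        .map(|item| item.trim().parse::<i32>().ok())
        .collect::<Option<Vec<i32>>>()?;

    Some((head, items))
}

fn main() {
    let output = parse_head_and_list("7: 1, 2, 3");
    println!("{:?}", output);
}
```

Even for this tiny format, the algorithm is spelled out by hand with splitting and trimming, and it says nothing about where or why a parse failed.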

"Combinator" refers to a particular way of defining the exact algorithm for a parser. If you write a parser from scratch, you usually end up with a big state machine and lots of slice handling. In contrast, a parser combinator library lets you define the final parser by combining small building blocks. This is what it looks like:

use combine::parser::range::{range, take_while1};
use combine::parser::repeat::sep_by;
use combine::parser::Parser;

let input = "Hammer, Saw, Drill";

// a chain of alphabetic characters
let tool = take_while1(|c: char| c.is_alphabetic());

// many `tool`s, separated by ", "
let mut tools = sep_by(tool, range(", "));

let output: Vec<&str> = tools.easy_parse(input).unwrap().0;
// vec!["Hammer", "Saw", "Drill"]

Listing A-1 - 'Hello combine' example

take_while1, range and sep_by are building blocks from the combine library. tool and tools are self-made building blocks. The latter is also the final parser.

Note: From now on, I will no longer use the term 'building block', but instead call them 'parsers'. Parsers that have nested parsers are 'combinators'.
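To illustrate the distinction between a parser and a combinator, here is a toy sketch using only the standard library. It is not how combine is implemented: the names `word`, `literal` and `separated` are hypothetical, and a "parser" here is simply a function from the remaining input to an optional `(value, rest of input)` pair.

```rust
/// Parser: one or more alphabetic characters.
fn word(input: &str) -> Option<(&str, &str)> {
    // Byte index of the first non-alphabetic character (or the end of input).
    let end = input
        .find(|c: char| !c.is_alphabetic())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[..end], &input[end..]))
    }
}

/// Parser: matches exactly the given literal and yields `()`.
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> Option<((), &'a str)> {
    move |input| input.strip_prefix(expected).map(|rest| ((), rest))
}

/// Combinator: builds a new parser from an item parser and a separator parser.
fn separated<'a, T, S>(
    item: impl Fn(&'a str) -> Option<(T, &'a str)>,
    separator: impl Fn(&'a str) -> Option<(S, &'a str)>,
) -> impl Fn(&'a str) -> Option<(Vec<T>, &'a str)> {
    move |mut input| {
        let mut items = Vec::new();
        // Zero items is allowed, so a failing first item still succeeds.
        match item(input) {
            Some((value, rest)) => {
                items.push(value);
                input = rest;
            }
            None => return Some((items, input)),
        }
        // Repeat "separator, item" for as long as both match.
        while let Some((_, after_separator)) = separator(input) {
            match item(after_separator) {
                Some((value, rest)) => {
                    items.push(value);
                    input = rest;
                }
                // Trailing separator without an item: leave it unconsumed.
                None => break,
            }
        }
        Some((items, input))
    }
}

fn main() {
    let tools = separated(word, literal(", "));
    let (output, rest) = tools("Hammer, Saw, Drill").unwrap();
    println!("{:?} (left over: {:?})", output, rest);
}
```

`word` and the result of `literal` are plain parsers; `separated` is a combinator because it takes other parsers as arguments and returns a new parser.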

Tutorial

Learn combine with the not-so-quick Quickstart Tutorial.

Inner machinery

Every parser in every language needs roughly these four things to work:

It may also support one or more of these extra features:

  • Resumable parsing / streaming of input data
  • Reporting location information for input tokens (e.g. line and column for text input)
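As a rough idea of what location information means for text input, here is a minimal standard-library sketch that converts a byte offset into a line/column pair. The `Position` struct and `position_at` function are hypothetical names for illustration and do not show how combine tracks positions internally.

```rust
/// Line/column position in a text input (both 1-based).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Position {
    line: usize,
    column: usize,
}

/// Computes the position of the character starting at `byte_offset`
/// by walking over all characters in front of it.
/// Panics if `byte_offset` is not on a character boundary.
fn position_at(input: &str, byte_offset: usize) -> Position {
    let mut pos = Position { line: 1, column: 1 };
    for c in input[..byte_offset].chars() {
        if c == '\n' {
            pos.line += 1;
            pos.column = 1;
        } else {
            pos.column += 1;
        }
    }
    pos
}

fn main() {
    let input = "Hammer,\nSaw, 42";
    // Pretend a parser gave up at the first ASCII digit.
    let offset = input.find(|c: char| c.is_ascii_digit()).unwrap();
    println!("unexpected token at {:?}", position_at(input, offset));
}
```

Recomputing the position from the start of the input is simple but requires the whole input to be available, which is one reason position tracking and streaming interact.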

combine tries to be as generic as possible in all of these areas, which results in quite a few trait bounds all over the place.

The linked chapters describe how combine handles each of these things and why it is designed that way. Knowing this helps a lot when reading error messages and dealing with stumbling blocks.

Alternatives

For reference, here are some alternatives in the Rust ecosystem:

All parser libraries come with their own trade-offs, so choose wisely 😄.
