Skip to content
Advertisement

Pyparsing recursive type definition in DSL

I am parsing a domain specific language which has several type definitions. Primitive types have been relatively straight-forward to parse, but moving on to the complex types in the DSL have proven more challenging. I am looking for a strategy to recursively match a definition that can contain a reference to itself.

For example, the map data type can look like this at a high level:

JavaScript

Which can appear in the input as a recursive definition:

JavaScript

This map of maps can theoretically be of arbitrary depth.

To make matters more complicated, these nested complex types could also be other complex types as well.

JavaScript

I’m looking for a strategy to handle parsing this so that I get an accurate type containing the nested map type parsed out. I considered just using a regex, but this doesn’t feel optimal to me. I also know there exists Forward for recursive grammars, but I haven’t quite been able to figure out how to apply it to this scenario. If there are any good patterns or references for this type of parsing I would really appreciate any guidance!

Advertisement

Answer

Forward is the way to do this. Since type declarations may themselves contain type declarations, this is probably a good choice for where the Forward will go:

JavaScript

Looking at your examples, I can see we will need some punctuation, we can define some suppressed expressions for these so that they parse, but won’t junk up the parsed results:

JavaScript

Next we can do the simple types:

JavaScript

Now the two complex types, map and array. For these, we will use the Forward wherever a complex type might occur:

JavaScript

I use the ‘-‘ operator here since, if there is a syntax error after seeing a valid “map” or “array” keyword, pyparsing will give us the error within that expression.

At this point we have our complex and simple types, so we can “inject” them into the previously-defined Forward, using the <<= operator:

JavaScript

Check using runTests:

JavaScript

Gives:

JavaScript
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement