PPeg

A Python port of Lua's LPeg pattern matching library


Keywords
parsing, peg, grammar, regex
License
MIT
Install
pip install PPeg==0.9.4

Documentation

PPeg

PPeg is a pattern matching library for Python, based on Parsing Expression Grammars (PEGs). It's a port of the LPeg library from Lua.

Warning

PPeg is alpha software and is not ready for general use. There are bugs, and they will crash or core dump your Python process.

Warning

PPeg is experimental. The API and semantics are not stable. Future releases will break backward compatibility without warning.

Usage

Unlike the re module [1], PPeg patterns can handle balanced sequences:

>>> from _ppeg import Pattern as P
>>> pattern = P.Grammar('(' + ( (P(1)-P.Set('()')) | P.Var(0) )**0 + ')')
>>> pattern('(foo(bar()baz))').pos
15
>>> pattern('(foo(bar(baz)').pos
-1
>>> capture = P.Cap(pattern)
>>> capture('(foo(bar()baz))').captures
['(foo(bar()baz))']

This example corresponds roughly to the following LPeg example:

> lpeg = require "lpeg"
> pattern = lpeg.P{ "(" * ((1 - lpeg.S"()") + lpeg.V(1))^0 * ")" }
> pattern:match("(foo(bar()baz))") -- Lua indexes begin at 1
16
> pattern:match("(foo(bar(baz)")
nil
> capture = lpeg.C(pattern)
> capture:match("(foo(bar()baz))")
"(foo(bar()baz))"

[1] Some regular expression implementations (e.g. PCRE, regex) support recursive patterns, which can match balanced sequences.
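
For readers unfamiliar with PEGs, the recursive rule used in both examples can be sketched by hand in plain Python. This is only an illustration of what the grammar matches, not how PPeg or LPeg work internally. The function name match_balanced is hypothetical, and the return convention (end position on success, -1 on failure) is chosen to mirror the .pos values shown above.

```python
def match_balanced(text, start=0):
    """Match '(' followed by any mix of non-parenthesis characters and
    nested balanced groups, followed by ')'.

    Returns the index just past the closing ')' on success, or -1 on failure.
    Hand-written sketch; the name and return convention are assumptions.
    """
    # The rule must begin with an opening parenthesis.
    if start >= len(text) or text[start] != '(':
        return -1
    pos = start + 1
    while pos < len(text):
        ch = text[pos]
        if ch == ')':
            # Closing parenthesis of this group: the match ends here.
            return pos + 1
        if ch == '(':
            # Nested group: recurse, as the grammar refers back to itself.
            pos = match_balanced(text, pos)
            if pos == -1:
                return -1
        else:
            # Any other character is consumed one at a time.
            pos += 1
    # Ran out of input before finding the closing parenthesis.
    return -1


print(match_balanced('(foo(bar()baz))'))  # 15
print(match_balanced('(foo(bar(baz)'))    # -1
```

The recursive call plays the role of P.Var(0) in the PPeg grammar and lpeg.V(1) in the LPeg grammar: the rule refers to itself, which is what lets it track nesting depth.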

Limitations

  • PPeg only supports CPython 2.6 and 2.7.
  • PPeg doesn't support Unicode; only byte strings can be matched or searched. This is closely tied to how Lua and LPeg handle strings.
  • PPeg is untested on any platform except 64-bit Linux.
  • Bugs, lots of bugs.
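
One consequence of the byte-string limitation is that Unicode text has to be encoded before it can be matched, and any position then counts bytes rather than characters. The sketch below uses only the standard library and does not call PPeg. It assumes the caller picks UTF-8, and the helper name byte_to_char_offset is hypothetical.

```python
def byte_to_char_offset(data, byte_pos, encoding='utf-8'):
    """Convert an offset into an encoded byte string to a character offset.

    Hypothetical helper. Assumes byte_pos falls on a character boundary;
    otherwise decoding the prefix raises UnicodeDecodeError.
    """
    return len(data[:byte_pos].decode(encoding))


text = u'(caf\u00e9)'          # six characters, one of them non-ASCII
data = text.encode('utf-8')    # the byte string a byte-oriented matcher would see

print(len(text))                             # 6
print(len(data))                             # 7
print(byte_to_char_offset(data, len(data)))  # 6
```

A pattern that consumes one byte at a time can stop in the middle of a multi-byte character, so a conversion like this is only safe for positions known to lie on character boundaries.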