a simple parser combinator inspired by RelaxNG
Unlaxer is a parser combinator written in Java.
1.0.0
see build.gradle
No third-party libraries are required.
git clone git@bitbucket.org:opaopa6969/unlaxer.git unlaxer
cd unlaxer
./gradlew install
./gradlew test
./gradlew eclipse
This command generates the Eclipse .project file.
After launching Eclipse, go to File > Import > Existing Projects into Workspace, select the root folder that contains this project, check 'Search for nested projects', and click 'Finish'.
./gradlew idea
You can run the sample programs to understand how to use unlaxer.
Sample programs are stored in the src/test/java folder of each project.
For example, org.unlaxer.elementary.MappedSingleCharacterParserTest is stored in src/test/java/org/unlaxer/elementary/MappedSingleCharacterParserTest.java:
public class MappedSingleCharacterParserTest extends ParserTestBase {

  @Test
  public void testExcludes() {
    MappedSingleCharacterParser punctuationParserWithoutParenthesis =
        new PunctuationParser().newWithout("()");
    OneOrMore parser = new OneOrMore(punctuationParserWithoutParenthesis);
    testAllMatch(parser, "$%&");
    testPartialMatch(parser, "$%(&", "$%");
    testUnMatch(parser, "()");
  }
}
This test means the following.
The parser is defined to match one or more punctuation characters, excluding '(' and ')'.
The first call, testAllMatch(parser, "$%&");, means the parser accepts all three characters of "$%&" (an exact match).
The second call, testPartialMatch(parser, "$%(&", "$%");, means the parser accepts only part of the test string: it does not match '(' or ')', so it matches only "$%".
The third call, testUnMatch(parser, "()");, means the parser does not match the test string at all.
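The three outcomes can also be illustrated without the library. The following is a minimal sketch, not unlaxer code: the class MatchKindsSketch, its methods, and the exact punctuation set are assumptions made for illustration. It computes the longest leading run of punctuation (with '(' and ')' excluded) and classifies the input the way the three test helpers are described above.

```java
final class MatchKindsSketch {
    // ASCII punctuation with '(' and ')' removed (assumed character set).
    private static final String PUNCTUATION_WITHOUT_PARENS =
        "!\"#$%&'*+,-./:;<=>?@[\\]^_`{|}~";

    /** Returns the longest leading run of allowed punctuation characters. */
    static String consumedPrefix(String input) {
        int end = 0;
        while (end < input.length()
                && PUNCTUATION_WITHOUT_PARENS.indexOf(input.charAt(end)) >= 0) {
            end++;
        }
        return input.substring(0, end);
    }

    /** Classifies the outcome as "all", "partial", or "unmatch". */
    static String classify(String input) {
        String consumed = consumedPrefix(input);
        if (consumed.isEmpty()) {
            return "unmatch"; // one-or-more needs at least one character
        }
        return consumed.length() == input.length() ? "all" : "partial";
    }

    public static void main(String[] args) {
        for (String input : new String[] {"$%&", "$%(&", "()"}) {
            System.out.println(input + " -> " + classify(input)
                + " (consumed \"" + consumedPrefix(input) + "\")");
        }
    }
}
```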
After running the test, ParserTestBase writes parser logs into build/parserTest/org.unlaxer.elementary.MappedSingleCharacterParserTest:
testExcludes_testAllMatch_(1,L16).combined.log
testExcludes_testAllMatch_(1,L16).parse.log
testExcludes_testAllMatch_(1,L16).token.log
testExcludes_testAllMatch_(1,L16).transaction.log
testExcludes_testPartialMatch_(2,L17).combined.log
testExcludes_testPartialMatch_(2,L17).parse.log
testExcludes_testPartialMatch_(2,L17).token.log
testExcludes_testPartialMatch_(2,L17).transaction.log
testExcludes_testUnMatch_(3,L18).combined.log
testExcludes_testUnMatch_(3,L18).parse.log
testExcludes_testUnMatch_(3,L18).token.log
testExcludes_testUnMatch_(3,L18).transaction.log
Try reading and running the test programs in unlaxer-common/src/test/java.
This product is already published on Maven Central.
Add the dependency to build.gradle:
project(':application') {
  dependencies {
    compile ':unlaxer-common:'
  }
}
Usage001_createParserAndParse.java
The sample parser is built from the following EBNF-style grammar.
<Clause> ::= [0-9]+([-+*/][0-9]+)*
static Parser createDigitsAndOperatorsParser() {
  //<Clause> ::= [0-9]+([-+*/][0-9]+)*
  Chain clauseParser = new Chain(
    new OneOrMore(new DigitParser()),
    new ZeroOrMore(
      new Chain(
        new Choice(
          new PlusParser(),
          new MinusParser(),
          new MultipleParser(),
          new DivisionParser()
        ),
        new OneOrMore(new DigitParser())
      )
    )
  );
  return clauseParser;
}
ParseContext parseContext = new ParseContext(new StringSource("1+2+3"));
Parsed parsed = parser.parse(parseContext);
//get parsing status
System.out.format("parsed status: %s \n" , parsed.status);
got
parsed status: succeeded
//get rootToken
System.out.format("parsed Token: %s \n" , parsed.getRootToken());
got
parsed Token: '1+2+3' (0 - 5): org.unlaxer.PseudoRootParser
//get tokenTree representation
System.out.format("parsed TokenTree: %s \n" , TokenPrinter.get(parsed.getRootToken()));
got
parsed TokenTree:
'1+2+3' : org.unlaxer.PseudoRootParser
 '1' : org.unlaxer.posix.DigitParser
 '+' : org.unlaxer.ascii.PlusParser
 '2' : org.unlaxer.posix.DigitParser
 '+' : org.unlaxer.ascii.PlusParser
 '3' : org.unlaxer.posix.DigitParser
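As a quick cross-check of which inputs the <Clause> grammar accepts, the same expression can be tried with java.util.regex. This is only a sketch for exploring the grammar, not how unlaxer parses, and the class name ClauseGrammarCheck is made up for illustration.

```java
import java.util.regex.Pattern;

final class ClauseGrammarCheck {
    // The grammar from the text, written as a regular expression.
    private static final Pattern CLAUSE =
        Pattern.compile("[0-9]+([-+*/][0-9]+)*");

    /** Returns true when the whole input is a valid <Clause>. */
    static boolean isClause(String input) {
        return CLAUSE.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"1+2+3", "12*3/4", "1+", "+1"}) {
            System.out.println(input + " : " + isClause(input));
        }
    }
}
```

Unlike the regular expression, the combinator version also yields a token for every digit and operator, which is what the token tree above shows.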
Usage001_createParserAndParseWithNamedConcrete.java
static class SimpleExpression extends LazyChain {

  @Override
  public List<Parser> getLazyParsers() {
    return new Parsers(
      new NumberParser(),
      new ZeroOrMore(
        new OperatorAndOperandParser()
      )
    );
  }
}

static class NumberParser extends LazyOneOrMore {

  @Override
  public Parser getLazyParser() {
    return new DigitParser();
  }

  @Override
  public Optional<Parser> getLazyTerminatorParser() {
    return Optional.empty();
  }
}

static class OperatorParser extends LazyChoice {

  @Override
  public List<Parser> getLazyParsers() {
    return new Parsers(
      new PlusParser(),
      new MinusParser(),
      new MultipleParser(),
      new DivisionParser()
    );
  }
}

static class OperatorAndOperandParser extends LazyChain {

  @Override
  public List<Parser> getLazyParsers() {
    return new Parsers(
      new OperatorParser(),
      new NumberParser()
    );
  }
}
Parsing with these named parsers yields the following tree:
parsed TokenTree:
'1+2+3' : sample.Usage001_createParserAndParseWithNamedConcrete$SimpleExpression
 '1' : sample.Usage001_createParserAndParseWithNamedConcrete$NumberParser
  '1' : org.unlaxer.posix.DigitParser
 '+2' : sample.Usage001_createParserAndParseWithNamedConcrete$OperatorAndOperandParser
  '+' : sample.Usage001_createParserAndParseWithNamedConcrete$OperatorParser
   '+' : org.unlaxer.ascii.PlusParser
  '2' : sample.Usage001_createParserAndParseWithNamedConcrete$NumberParser
   '2' : org.unlaxer.posix.DigitParser
 '+3' : sample.Usage001_createParserAndParseWithNamedConcrete$OperatorAndOperandParser
  '+' : sample.Usage001_createParserAndParseWithNamedConcrete$OperatorParser
   '+' : org.unlaxer.ascii.PlusParser
  '3' : sample.Usage001_createParserAndParseWithNamedConcrete$NumberParser
   '3' : org.unlaxer.posix.DigitParser
//create parseContext with createMeta specifier
ParseContext parseContext =
  new ParseContext(
    new StringSource("1+2+3"),
    CreateMetaTokenSprcifier.createMetaOn // <- specify createMetaOn
  );
got token tree
'1+2+3' : org.unlaxer.combinator.Chain
 '1' : org.unlaxer.combinator.OneOrMore
  '1' : org.unlaxer.posix.DigitParser
 '+2+3' : org.unlaxer.combinator.ZeroOrMore
  '+2' : org.unlaxer.combinator.Chain
   '+' : org.unlaxer.combinator.Choice
    '+' : org.unlaxer.ascii.PlusParser
   '2' : org.unlaxer.combinator.OneOrMore
    '2' : org.unlaxer.posix.DigitParser
  '+3' : org.unlaxer.combinator.Chain
   '+' : org.unlaxer.combinator.Choice
    '+' : org.unlaxer.ascii.PlusParser
   '3' : org.unlaxer.combinator.OneOrMore
    '3' : org.unlaxer.posix.DigitPars
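Comparing this tree with the earlier default output suggests that createMetaOn keeps a token for every combinator, while the default tree lists only the terminal tokens. The relationship can be sketched as a leaf collection over a tree. The Node class and helper names below are hypothetical and are not unlaxer's Token API; the sample tree mirrors the combinator tree for the shorter input "1+2".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TokenTreeSketch {
    /** Hypothetical token node: matched text, parser name, child tokens. */
    static final class Node {
        final String text;
        final String parserName;
        final List<Node> children;

        Node(String text, String parserName, Node... children) {
            this.text = text;
            this.parserName = parserName;
            this.children = Arrays.asList(children);
        }
    }

    /** Builds a combinator-level tree for "1+2". */
    static Node sample() {
        return new Node("1+2", "Chain",
            new Node("1", "OneOrMore", new Node("1", "DigitParser")),
            new Node("+2", "ZeroOrMore",
                new Node("+2", "Chain",
                    new Node("+", "Choice", new Node("+", "PlusParser")),
                    new Node("2", "OneOrMore", new Node("2", "DigitParser")))));
    }

    /** Returns the parser names of the terminal (leaf) tokens in source order. */
    static String leafNames(Node root) {
        List<String> names = new ArrayList<>();
        collect(root, names);
        return String.join(",", names);
    }

    private static void collect(Node node, List<String> names) {
        if (node.children.isEmpty()) {
            names.add(node.parserName); // a terminal token
            return;
        }
        for (Node child : node.children) {
            collect(child, names);
        }
    }

    public static void main(String[] args) {
        System.out.println(leafNames(sample()));
    }
}
```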
A combinator parser accepts one or more child parsers. It returns a parsed result that aggregates the parsed results of its children.
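The idea of aggregating child results can be shown with a small hand-written combinator. This is a sketch under assumptions, not unlaxer's implementation: the MiniParser interface and the character and sequence helpers are hypothetical names. A sequence runs its children in order and, if all succeed, returns their tokens concatenated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class AggregatingCombinatorSketch {
    /** Hypothetical parser: tokens matched from pos, or empty on failure. */
    interface MiniParser {
        Optional<List<String>> parse(String input, int pos);
    }

    /** Matches exactly one character out of the allowed set. */
    static MiniParser character(final String allowed) {
        return (input, pos) -> {
            if (pos < input.length() && allowed.indexOf(input.charAt(pos)) >= 0) {
                List<String> token = new ArrayList<>();
                token.add(String.valueOf(input.charAt(pos)));
                return Optional.of(token);
            }
            return Optional.empty();
        };
    }

    /** Runs children in order; succeeds only if every child succeeds. */
    static MiniParser sequence(final MiniParser... children) {
        return (input, pos) -> {
            List<String> tokens = new ArrayList<>();
            int current = pos;
            for (MiniParser child : children) {
                Optional<List<String>> result = child.parse(input, current);
                if (!result.isPresent()) {
                    return Optional.empty(); // one failure fails the whole sequence
                }
                for (String token : result.get()) {
                    tokens.add(token);          // aggregate the child's tokens
                    current += token.length();  // advance past what it consumed
                }
            }
            return Optional.of(tokens);
        };
    }

    /** Digit, then sign, then digit. */
    static MiniParser digitSignDigit() {
        MiniParser digit = character("0123456789");
        MiniParser sign = character("+-");
        return sequence(digit, sign, digit);
    }

    public static void main(String[] args) {
        System.out.println(digitSignDigit().parse("1+2", 0));
        System.out.println(digitSignDigit().parse("1a2", 0));
    }
}
```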
Choice tries its child parsers in order and selects the first one that matches.
sample code:
Choice digitOrSign = new Choice(
  Singletons.get(DigitParser.class),
  Singletons.get(SignParser.class)
);
String[] tests = {"1", "a", "-"};
for (String test : tests) {
  // create a fresh source for each test string
  StringSource source = new StringSource(test);
  try (ParseContext parseContext = new ParseContext(source)) {
    Parsed parsed = digitOrSign.parse(parseContext);
    System.out.format("%s : match = %s\n", test, parsed.success);
  }
}
results:
1 : match = true
a : match = false
- : match = true
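The first-match behaviour of a choice can be sketched in a few lines of plain Java. The names below (ChoiceSketch, MiniParser, oneOf, choice) are hypothetical and do not come from unlaxer, and the sketch assumes that a sign is '+' or '-'.

```java
final class ChoiceSketch {
    /** Hypothetical parser: characters matched at pos, or -1 on failure. */
    interface MiniParser {
        int match(String input, int pos);
    }

    /** Matches exactly one character out of the allowed set. */
    static MiniParser oneOf(final String allowed) {
        return (input, pos) ->
            pos < input.length() && allowed.indexOf(input.charAt(pos)) >= 0 ? 1 : -1;
    }

    /** Ordered choice: returns the result of the first child that matches. */
    static MiniParser choice(final MiniParser... children) {
        return (input, pos) -> {
            for (MiniParser child : children) {
                int length = child.match(input, pos);
                if (length >= 0) {
                    return length; // later children are not tried
                }
            }
            return -1; // no child matched
        };
    }

    /** Digit or sign, tried in that order. */
    static boolean matches(String input) {
        MiniParser digit = oneOf("0123456789");
        MiniParser sign = oneOf("+-"); // assumption: a sign is '+' or '-'
        return choice(digit, sign).match(input, 0) >= 0;
    }

    public static void main(String[] args) {
        for (String test : new String[] {"1", "a", "-"}) {
            System.out.format("%s : match = %s%n", test, matches(test));
        }
    }
}
```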
Chain (org.unlaxer.combinator.Chain)
NonOrdered (org.unlaxer.combinator.NonOrdered)
prune syntax tree to get ast
ScopeTree
group + group reference
javadoc language set to english
Syntax tree reducer (e.g. cut filter MetaFunctionParser)
SuggestableParser
Think about the parent parser for TerminatorParser in Occurs.
Activate the ignored test in ChainTest
slice
Token stores consumed and matched each or Token stores TokenKind(matchOnly or consumed)
ReverseParser
FlattenParser
group + group reference
MIT