Simple DSL to generate training file for NLU engines


Keywords
chatbot, dsl, javascript, nlu, python, training-dataset
License
GPL-3.0
Install
npm install chatl@2.0.1

Documentation

chatl Build Status codecov npm version pypi version License

Tiny DSL used to generates training dataset for NLU engines (currently snips-nlu and rasa nlu). Heavily inspired by chatito.

DSL specifications

# The chatl syntax is easy to understand.
# With the `%` symbol, you define intents. Intent data can contains entity references
# with the `@` and optional (using `?` as a suffix) synonyms with `~`.
# Synonyms here will be used to generate variations of the given sentence.
%[get_weather]
  ~[greet?] what's the weather like in @[city] @[date]
  ~[greet] will it rain in @[city] at @[date#hour]

# With the `~`, you define synonyms used in intents and entities definitions.
~[greet]
  hi
  hey

~[new york]
  ny
  nyc
  the big apple

# With the `@`, you define entities.
@[city]
  paris
  rouen
  ~[new york]

# You can define properties on intents, entities and synonyms.
# Depending on the dataset you want to generate, some props may be used
# by the adapter to convey specific meanings.
@[date](type=datetime, another=`property backticked`)
  tomorrow
  this weekend
  this evening

# You can define entity variants with the `#` symbol. Variants will be merge with
# the entity it references but is used to provide different values when
# generating intent sentences.
@[date#hour]
  ten o'clock
  noon

Adapters

For now, only the snips adapter and rasa nlu has been done. Here is a list of adapters and their respective properties:

adapter snips rasa
type (1) ✔️ ✔️
extensible (2) ✔️
strictness (3) ✔️
regex (4) ️️❌️ ✔️
  1. Specific type of the entity to use (such as datetime, temperature and so on). For snips, if the given entity name could not be found in the chatl declaration, it will assume its a builtin one (supported types are listed here without the snips/ prefix). For rasa, it will use the referenced type as the slot name.
  2. Are values outside of training samples allowed?
  3. Parser threshold
  4. Regex pattern used to help the NLU to extract stuff

Installation

Choose your flavor and follow the lead: