reyaml

Reader of humane YAML files


Keywords
yaml, parser, configuration, config
License
BSD-3-Clause
Install
pip install reyaml==0.2.2

Documentation

reyaml

A humane YAML parser that tolerates comments, inline comments, whitespace; and catches some basic syntax errors for you.

The primary objective is to load configuration files. It is handy to leave comments inside the config file itself, to make administration tasks easier.

Setting it up is as easy as pip install reyaml.

Examples of use

There are two functions you need: load parses a string, while load_from_file takes a file path instead. They return a dictionary.

import reyaml

raw_config = """
log:
	# valid options: {debug, info, warning, error, critical}
	# if not specified, no logging takes place
	level: debug

	# full path to log file. If not specified, logging to STDOUT
	#path: taxisomatic.log 

	# this must be a valid Python log format string, reference:
	# https://docs.python.org/2/library/logging.html#logrecord-attributes 
	# If not specified, the default is `'%(asctime)-15s %(levelname)s %(message)s'`
	#format: "%(asctime)-15s %(levelname)s %(message)s"
	format: "%(asctime)-15s %(levelname)s %(filename)s %(funcName)s %(message)s"
"""

config = reyaml.load(raw_config)
print config
>>> {'log': {'level': 'debug', 'format': '%(asctime)-15s %(levelname)s %(filename)s %(funcName)s %(message)s'}}

# you can also load it from a file directly
# config = reyaml.load_from_file('system.conf')

Example configuration

Here is a YAML file that reyaml can absorb without a hitch:

# This is the main configuration file of the system, it is written in a
# relaxed flavour of YAML that gives you more freedom and makes the file
# more readable.

# As you have already noticed, a line that begins with '#' is a comment,
# it is ignored.

	#A comment can begin anywhere on the line, it is still ignored

# Empty lines are ignored as well.

# Strings don't have to be quoted, example:
#	test: the wise fox
#	url: http://myaddress.com

# However, sometimes you need quotes, for example if the string is a URL with 
# an anchor, the '#' sign will be treated as a comment, unless quoted.
#			trickyUrl: http://address.com/path#anchor  #WRONG
#			trickyUrl: "http://address.com/path#anchor"  #CORRECT, double-quote
#			trickyUrl: 'http://address.com/path#anchor'  #CORRECT, single-quote

# As you can see above, you can quote strings with double- and single-quotes,
# they are both valid.

# Lines can be indented with spaces or tabs, but you cannot mix them. An error
# will be shown if you do that.

# The configuration loader is designed to fail and make a lot of noise if something
# is wrong with the syntax. You will see an error and a chunk of the problematic
# line.
##############################################################################



# database settings go here; these will be used to connect to PostgreSQL
db:
	host: localhost # doesn't work with 127.0.0.1!!
	port: 60000
	db: ts_db
	user: jedi
	password: noAuthNeeded

# stuff related to the LDAP server
ldap:
	host: 192.135.1.7
	port: 389
	password: Cxronomat0peaea
	timeout: 3 # in seconds

	# the values below don't have to be quoted, but it gives one
	# psychological comfort when they have no doubts about how a
	# setting is interpreted, so there you go
	user: "cn=admin,o=Dekart"


# RabbitMQ connection settings
amqp:
	url: amqp://guest:guest@127.0.0.1



log:
	# valid options: {debug, info, warning, error, critical}
	# if not specified, no logging takes place
	level: debug

	# full path to log file. If not specified, logging to STDOUT
	#path: taxisomatic.log 

	# this must be a valid Python log format string, reference:
	# https://docs.python.org/2/library/logging.html#logrecord-attributes 
	# If not specified, the default is `'%(asctime)-15s %(levelname)s %(message)s'`
	#format: "%(asctime)-15s %(levelname)s %(message)s"
	format: "%(asctime)-15s %(levelname)s %(filename)s %(funcName)s %(message)s"

Rationale

Human errors can be roughly categorized as such:

  • Slip - your intention is right, but your action is wrong; e.g. accidentally press another button.
  • Mistake - your thought process was incorrect in the first place; e.g. you think that you can rename a file by selecting it and then pressing Shift+Del - the outcome is not what you wanted, even though the execution was flawless and you pressed exactly what you wanted.

YAML simplifies life by reducing the possibility of making slips in a configuration file. Below is an excerpt of a RabbitMQ configuration, it uses a different notation. You can see that it is easy to forget a bracket or miss a closing quote. As a result, you have to invest a lot of mental effort into counting symbols, when those problems could have easily been resolved by switching to YAML and leveraging indentation to express structure.

...
[
 {rabbit,
  [
   %% To listen on a specific interface, provide a tuple of {IpAddress, Port}.
   %% For example, to listen only on localhost for both IPv4 and IPv6:
   %% 
   {tcp_listeners, [{"127.0.0.1", 5672},
                    {"::1",       5672}]},
...

Added benefits

Reyaml takes that one step further and augments YAML by adding some additional features, here are a few:

  • allow whitespace, i.e. empty lines that separate sections logically
  • host: localhost # doesn't work with 127.0.0.1!! - inline comment
  • tab indentation
  • catch mixed tabs and spaces
  • tell you about # when it is not clear whether this is a comment or an anchor, e.g. https://docs.python.org/2/library/logging.html#logrecord-attributes
  • fail by making explicit remarks about what happened, instead of silently dropping #logrecord-attributes in the string above and moving on, leaving you clueless when you see that your system is doing something other than what is written in the configuration.