A parser for MediaWiki titles


Keywords
mediawiki, title, pagename, parser
License
GPL-3.0-only
Install
pip install mwtp==4.0.0

Documentation

MediaWikiTitleParser

MWTP is a parser for MediaWiki titles. Its logic is partly derived from mediawiki.Title, and hence it is licensed under the GNU GPL.

Basic usage looks like this:

from mwtp import TitleParser as Parser


parser = Parser(namespaces_data, namespace_aliases)
title = parser.parse(' _ FoO: this/is A__/talk page _ ')

print(repr(title))
# Title('Thảo luận:This/is A /talk page')

namespaces_data and namespace_aliases can be obtained by making a query to a wiki's API with action=query&meta=siteinfo&siprop=namespaces|namespacealiases:

namespaces_data = {
  '0': { 'id': 0, 'case': 'first-letter', 'name': '',          ...: ... },
  '1': { 'id': 1, 'case': 'first-letter', 'name': 'Thảo luận', ...: ... },
  ...: ...
}
namespace_aliases = [
  { 'id': 1, 'alias': 'Foo' },
  ...
]
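
As a rough illustration of the query described above, the two structures could be fetched with the standard library alone. This is only a sketch: the helper names (build_siteinfo_url, extract_namespace_info, fetch_namespace_info), the User-Agent string and the example API URL are assumptions made for this example and are not part of mwtp.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_siteinfo_url(api_url):
    """Build the siteinfo query URL (hypothetical helper, not part of mwtp)."""
    params = {
        'action': 'query',
        'meta': 'siteinfo',
        'siprop': 'namespaces|namespacealiases',
        'format': 'json',
        # formatversion=2 yields the 'name'/'alias' keys shown above.
        'formatversion': '2',
    }
    return f'{api_url}?{urlencode(params)}'


def extract_namespace_info(response):
    """Pull the two structures out of a decoded API response."""
    query = response['query']
    return query['namespaces'], query['namespacealiases']


def fetch_namespace_info(api_url):
    """Perform the request; the User-Agent value is a placeholder."""
    request = Request(
        build_siteinfo_url(api_url),
        headers={'User-Agent': 'mwtp-example/0.1'},
    )
    with urlopen(request) as resp:
        return extract_namespace_info(json.load(resp))
```

Assuming a wiki whose API lives at https://example.org/w/api.php, calling fetch_namespace_info('https://example.org/w/api.php') would return a (namespaces_data, namespace_aliases) pair that can be passed to the parser.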

Note that the following format (&formatversion=1) is not supported. Always use &formatversion=2 or &formatversion=latest.

namespaces_data = {
  '0': { 'id': 0, 'case': 'first-letter', '*': '',          ...: ... },
  '1': { 'id': 1, 'case': 'first-letter', '*': 'Thảo luận', ...: ... },
  ...: ...
}
namespace_aliases = [
  { 'id': 1, '*': 'Foo' },
  ...
]
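
Since the unsupported format differs only in its use of the '*' key, a small guard can catch it before the data reaches the parser. The function below is a hypothetical helper written for this example, not something mwtp provides, and it assumes both structures contain plain dictionaries as returned by the API.

```python
def uses_legacy_format(namespaces_data, namespace_aliases):
    """Return True if any entry carries the formatversion=1 '*' key.

    Hypothetical helper for illustration; not part of mwtp.
    """
    entries = list(namespaces_data.values()) + list(namespace_aliases)
    return any('*' in entry for entry in entries)
```

If it returns True, the query was most likely made without &formatversion=2 and should be repeated with that parameter.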

For more information, see the documentation.