flatxml

Transform XML blob, file, or OrderedDict to a flat CSV or dictionary.


Keywords
flat, xml, csv
License
MIT
Install
pip install flatxml==0.12

Documentation

Description

Flattens an XML structure by joining nested tags with an underscore. Returns a dictionary if no output file is provided.

Example output

"""
	<media>
		<songs>
			<pop>Thriller</pop>
			<metal>Master of Puppets</metal>
		</songs>
		<movies>
			<comedy>Wedding Crashers</comedy>
			<comedy>Superbad</comedy>
			<horror>Frozen</horror>
		</movies>
	</media>
"""

To CSV

media_songs_pop media_songs_metal media_movies_comedy_[0] media_movies_comedy_[1] media_movies_horror
Thriller Master of Puppets Wedding Crashers Superbad Frozen

As dictionary

{
    'media_songs_pop': 'Thriller',
    'media_songs_metal': 'Master of Puppets',
    'media_movies_comedy_[0]': 'Wedding Crashers',
    'media_movies_comedy_[1]': 'Superbad',
    'media_movies_horror': 'Frozen'
}

Usage

Using a file:

import flatxml

result = flatxml.parse_file('/path/to/input_file.xml')

Using a blob:

import flatxml

result = flatxml.parse_blob("""<a>Apple</a>""")

print(result['a']) # Apple

Using an xmltodict.OrderedDict object:

import flatxml
import xmltodict

xml = """
    <useless-tag1>
        <useless-tag2>
            <important-tag1>Cookies</important-tag1>
            <important-tag2>Butter</important-tag2>
        </useless-tag2>
    </useless-tag1>
"""

ordered_dict = xmltodict.parse(xml)
slim_ordered_dict = ordered_dict['useless-tag1']['useless-tag2']

result = flatxml.parse_ordered_dict(slim_ordered_dict)

print(result['important-tag1']) # Cookies

All parsers accept an output file path as the second argument to generate a CSV

flatxml.parse_blob('...', '/path/to/output.csv')

Installation

pip install flatxml