onenote_parser

A parser for Microsoft OneNote® files


Keywords
parser, onenote, onenote-files, onenote-revision-store, rust
License
MPL-2.0

Rust OneNote® File Parser

A parser for Microsoft OneNote® files implemented in Rust.

Status

Work in progress. The parser can currently read most of the contents of a OneNote file, but only if the file uses the FSSHTTP packaging format ([MS-ONESTORE] 2.8). OneNote files as created and stored by the OneNote 2016 desktop application are not yet supported.

Goals

  • Read OneNote files available through both the OneNote 2016 application and OneDrive download
  • Convert OneNote notebooks and sections into HTML (see the one2html project)

Non-Goals

  • The ability to write OneNote files

Architecture

The code organization and architecture follow the OneNote file format, which is built from several layers of encoding:

  • fsshttpb/: This implements the FSSHTTP binary packaging format as specified in [MS-FSSHTTPB]: Binary Requests for File Synchronization via SOAP Protocol. This is the lowest level of the file format and specifies how objects and their relationships are encoded in (and decoded from) a binary stream (in our case a file).
  • onestore/: This implements the OneStore format as specified in [MS-ONESTORE]: OneNote Revision Store File Format, which describes how a OneNote revision store file (also called OneStore) containing all OneNote objects is stored in an FSSHTTP binary packaging file. This covers the file header ([MS-ONESTORE] 2.8) and how the OneNote revision store is built from the FSSHTTP objects and revisions ([MS-ONESTORE] 2.7).
  • one/: This implements the OneNote file format as specified in [MS-ONE]: OneNote File Format. This specifies how objects in a OneNote file are parsed from a OneStore revision file.
  • onenote/: This finally implements an API that provides access to the data stored in a OneNote file. It parses the FSSHTTPB data and the revision store data, and then constructs the objects contained in the OneNote file. This includes resolving all references, e.g. looking up a page's paragraphs.
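
The layering above can be illustrated with a small, self-contained sketch. It does not use this crate's actual types or the real binary formats: the record layout, the names `RawObject`, `Node`, `Page`, `decode_objects`, `build_store`, `interpret` and `parse_pages`, and the kind codes are all hypothetical and chosen only to show how each layer might build on the one below it.

```rust
use std::collections::BTreeMap;

// Layer 1 (cf. fsshttpb/): decode objects and their references from bytes.
// Toy record layout, NOT the real format:
// id, kind, reference count, references, payload length, payload.
#[derive(Debug)]
struct RawObject {
    id: u8,
    kind: u8,
    refs: Vec<u8>,
    payload: Vec<u8>,
}

fn decode_objects(data: &[u8]) -> Result<Vec<RawObject>, String> {
    let mut objects = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        let header = data.get(pos..pos + 3).ok_or("truncated header")?;
        let (id, kind, n_refs) = (header[0], header[1], header[2] as usize);
        pos += 3;
        let refs = data.get(pos..pos + n_refs).ok_or("truncated refs")?.to_vec();
        pos += n_refs;
        let len = *data.get(pos).ok_or("missing payload length")? as usize;
        pos += 1;
        let payload = data.get(pos..pos + len).ok_or("truncated payload")?.to_vec();
        pos += len;
        objects.push(RawObject { id, kind, refs, payload });
    }
    Ok(objects)
}

// Layer 2 (cf. onestore/): index the decoded objects by id.
fn build_store(objects: Vec<RawObject>) -> BTreeMap<u8, RawObject> {
    objects.into_iter().map(|o| (o.id, o)).collect()
}

// Layer 3 (cf. one/): give raw objects a meaning based on their kind.
#[derive(Debug)]
enum Node {
    Page { title: String, paragraph_ids: Vec<u8> },
    Paragraph { text: String },
}

fn interpret(raw: &RawObject) -> Result<Node, String> {
    let text = String::from_utf8(raw.payload.clone()).map_err(|e| e.to_string())?;
    match raw.kind {
        1 => Ok(Node::Page { title: text, paragraph_ids: raw.refs.clone() }),
        2 => Ok(Node::Paragraph { text }),
        other => Err(format!("unknown object kind {}", other)),
    }
}

// Layer 4 (cf. onenote/): run all layers and resolve references,
// so callers get pages with their paragraphs already looked up.
#[derive(Debug)]
struct Page {
    title: String,
    paragraphs: Vec<String>,
}

fn parse_pages(data: &[u8]) -> Result<Vec<Page>, String> {
    let store = build_store(decode_objects(data)?);
    let mut pages = Vec::new();
    for raw in store.values() {
        if let Node::Page { title, paragraph_ids } = interpret(raw)? {
            let mut paragraphs = Vec::new();
            for id in paragraph_ids {
                let target = store
                    .get(&id)
                    .ok_or_else(|| format!("dangling reference {}", id))?;
                match interpret(target)? {
                    Node::Paragraph { text } => paragraphs.push(text),
                    _ => return Err(format!("object {} is not a paragraph", id)),
                }
            }
            pages.push(Page { title, paragraphs });
        }
    }
    Ok(pages)
}
```

In this sketch a caller only touches the top layer: `parse_pages(&bytes)` returns fully resolved pages or an error, while the lower layers stay independent of what the objects mean.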

Related Resources

Disclaimer

This project is neither related to nor endorsed by Microsoft in any way. The author does not have any affiliation with Microsoft.