Index Mapeo data in a SQLite database


License
MIT
Install
npm install @mapeo/sqlite-indexer@1.0.0-alpha.6

Documentation

@mapeo/sqlite-indexer

Node.js CI Coverage Status Npm package version

⚠️ This is an Alpha release and the API might change. Do not use in production. ⚠️

Index Mapeo data in a SQLite database.

Mapeo data is stored in multiple append-only logs (we use Hypercore). The data is structured as a Directed Acylclic Graph (DAG) for each document id: each edit of a particular document is stored as a new document that points to its "parent". This can result in "forks": the same parent can be edited in two different instances of Mapeo, resulting in two versions of the same document.

This indexer accepts batches of Mapeo documents of a particular type (namespace) and indexes the "head" document. If a document is forked then a "winner" is chosen deterministically, either by a timestamp or by comparing version ids. The documents heads are stored in a SQLite database, so that further querying and indexing of the documents is done within SQLite.

Any document that is indexed must have the following type:

type IndexableDocument = {
  id: string
  version: string
  links: string[]
  timestamp?: string | number
  [otherProp: string]: any
}

The SQLite database must include a table for storing these documents that must at a minimum include these columns, but can contain additional columns:

CREATE TABLE IF NOT EXISTS docs
  (
    id TEXT PRIMARY KEY NOT NULL,
    version TEXT NOT NULL,
    links TEXT NOT NULL,
    forks TEXT NOT NULL
  )

The database must also include a table for storing "backlinks" (used internally for indexing which documents are already linked):

  CREATE TABLE IF NOT EXISTS backlinks
    (version TEXT PRIMARY KEY NOT NULL)

For maximum performance, active Write-Ahead Logging and create the tables WITHOUT ROWID.

Table of Contents

Install

npm install @mapeo/sqlite-indexer

Usage

import Database from 'better-sqlite3'
import SQLiteIndexer from './index.js'

const db = new Database(':memory:')

db.pragma('journal_mode = WAL')

db.prepare(
  `CREATE TABLE IF NOT EXISTS docs
  (
    id TEXT PRIMARY KEY NOT NULL,
    version TEXT NOT NULL,
    links TEXT NOT NULL,
    forks TEXT NOT NULL
    ${extraColumns ? ', ' + extraColumns : ''}
  )
  WITHOUT ROWID`
).run()

db.prepare(
  `CREATE TABLE IF NOT EXISTS backlinks
  (version TEXT PRIMARY KEY NOT NULL)
  WITHOUT ROWID`
).run()

const docs = [
  { id: 'A', version: '1', links: [] },
  { id: 'A', version: '2', links: ['1'] },
  { id: 'A', version: '3', links: ['1'] },
  { id: 'A', version: '4', links: ['2', '3'] },
]

const indexer = new SQLiteIndexer(db, {
  docTableName: 'docs',
  backlinkTableName: 'backlinks',
})

indexer.batch(docs)

const A = db.prepare('SELECT * FROM docs WHERE id = A').run()
// { id: 'A', version: '4', links: '2,3', forks: null }

API

const indexer = new SQLiteIndexer(db, opts)

db

Required
Type: BetterSqlite3.Database

An instance of a better-sqlite3 database.

opts

Required
Type: object

opts.docTableName

Required
Type: string

The name of the table for storing the indexed documents.

opts.backlinkTableName

Required
Type: string

The name of the table for storing backlinks (used internally for indexing).

indexer.batch(docs)

Index an array of documents. Documents can be in any order. Documents must have an id property, a version property that is unique, and a links property which is an array of version ids for the documents parent(s).

indexer.deleteAll()

Delete all documents and backlinks. Useful if you want to reset the index.

docs

Requires
Type: Array<{ id: string, version: string, links: string[] }>

Additional properties will be ignored but included in the SQLite table. The document stored in SQLite will have a forks property which is an comma-separated string of version ids for other forks of the document id, if forks exist.

Maintainers

@digidem

Contributing

PRs accepted.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT © 2022 Digital Democracy