Get n-grams from text


n-gram

Get n-grams.

What is this?

This package gets you bigrams (or any n-gram, really).

When should I use this?

If you’re here, you’re probably dealing with natural language and already know you need this!

Install

This package is ESM only. In Node.js (version 12.20+, 14.14+, 16.0+), install with npm:

npm install n-gram

In Deno with esm.sh:

import {nGram} from 'https://esm.sh/n-gram@2'

In browsers with esm.sh:

<script type="module">
  import {nGram} from 'https://esm.sh/n-gram@2?bundle'
</script>

Use

import {bigram, trigram, nGram} from 'n-gram'

bigram('n-gram') // ['n-', '-g', 'gr', 'ra', 'am']
nGram(2)('n-gram') // ['n-', '-g', 'gr', 'ra', 'am']

trigram('n-gram') // ['n-g', '-gr', 'gra', 'ram']

nGram(6)('n-gram') // ['n-gram']
nGram(7)('n-gram') // []

// Anything with a `.length` and `.slice` works: arrays too.
bigram(['alpha', 'bravo', 'charlie']) // [['alpha', 'bravo'], ['bravo', 'charlie']]

API

This package exports the identifiers nGram, bigram, and trigram. There is no default export.

nGram(n)

Create a function that converts a given value to n-grams.
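To make the idea concrete, here is a minimal sketch of how such a factory could work. It assumes a sliding window built with `.slice`, which is consistent with the note in Use that anything with a `.length` and `.slice` works. The name `nGramSketch` and the integer check on `n` are hypothetical and for illustration only; this is not the package's actual source.

```javascript
// Illustrative sketch only: `nGramSketch` is a hypothetical name,
// not an export of this package.
function nGramSketch(n) {
  // Assumption: reject anything that is not a positive integer.
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError('`' + n + '` is not a valid argument for `n`')
  }

  // Return a function that slides a window of size `n` over `value`.
  return function (value) {
    const grams = []

    // Strings and arrays both expose `.length` and `.slice`.
    for (let index = 0; index + n <= value.length; index++) {
      grams.push(value.slice(index, index + n))
    }

    return grams
  }
}

const bigramSketch = nGramSketch(2)

console.log(bigramSketch('n-gram')) // ['n-', '-g', 'gr', 'ra', 'am']
console.log(nGramSketch(7)('n-gram')) // []
console.log(bigramSketch(['alpha', 'bravo', 'charlie'])) // [['alpha', 'bravo'], ['bravo', 'charlie']]
```

Because the window never runs past the end of the input, a value shorter than `n` simply yields an empty list.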

Want padding (to include partial matches)? Use something like the following:

nGram(2)(' ' + value + ' ')
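The effect of padding is easiest to see side by side. The sketch below uses a small local stand-in for `nGram` so that it runs on its own; with the package installed you would import the real `nGram` instead. The helper `paddedNGram`, and its choice of `n - 1` pad characters on each side, is a hypothetical generalization of the bigram case above and not part of this package.

```javascript
// Hypothetical stand-in for the package's `nGram`, so this snippet is
// self-contained. Replace it with `import {nGram} from 'n-gram'` in real code.
const nGram = (n) => (value) =>
  Array.from({length: Math.max(0, value.length - n + 1)}, (_, index) =>
    value.slice(index, index + n)
  )

// Hypothetical helper: pad a string with `n - 1` copies of `pad` on each side
// so that every character appears in `n` different n-grams.
function paddedNGram(n, value, pad = ' ') {
  const padding = pad.repeat(n - 1)
  return nGram(n)(padding + value + padding)
}

console.log(nGram(2)('gram')) // ['gr', 'ra', 'am']
console.log(nGram(2)(' ' + 'gram' + ' ')) // [' g', 'gr', 'ra', 'am', 'm ']
console.log(paddedNGram(3, 'gram')) // ['  g', ' gr', 'gra', 'ram', 'am ', 'm  ']
```

The padded results include the partial matches at the word boundaries (`' g'` and `'m '`), which the unpadded call leaves out.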

bigram(value)

Shortcut for nGram(2).

trigram(value)

Shortcut for nGram(3).

Types

This package is fully typed with TypeScript. It exports no additional types.

Compatibility

This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+, 16.0+, and 18.0+. It also works in Deno and modern browsers.

Contribute

Yes please! See How to Contribute to Open Source.

Security

This package is safe.

License

MIT © Titus Wormer