xml-cleaner
Release 2.0.4

Word and sentence tokenization.

Keywords: XML, natural-language-processing, python, text, text-analysis, tokenizer
License: MIT
Install: pip install xml-cleaner==2.0.4

Documentation

Ciseau

Word and sentence tokenization in Python.

Usage

Use this package to split strings at sentence and word boundaries. For instance, to break a string into tokens:

from ciseau import tokenize

tokenize("Joey was a great sailor.")
#=> ["Joey ", "was ", "a ", "great ", "sailor ", "."]

To also detect sentence boundaries:

from ciseau import sent_tokenize

sent_tokenize("Cat sat mat. Cat's named Cool.", keep_whitespace=True)
#=> [["Cat ", "sat ", "mat", ". "], ["Cat ", "'s ", "named ", "Cool", "."]]

sent_tokenize can keep the whitespace as-is with the flags keep_whitespace=True and normalize_ascii=False.
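
To illustrate what keeping whitespace as-is buys you, here is a minimal standard-library sketch. It is not Ciseau's implementation: simple_tokenize and simple_sent_tokenize are hypothetical helpers built on a plain regular expression, and they split contractions differently from the library (for example "Cat's" becomes "Cat", "'", "s "). The point is only that when every token carries its own trailing whitespace, joining the tokens gives back the original string.

```python
import re

# Hypothetical pattern, not taken from Ciseau: leading whitespace (if any),
# or a word / single punctuation mark followed by its trailing whitespace.
_TOKEN = re.compile(r"\s+|(?:\w+|[^\w\s])\s*")


def simple_tokenize(text):
    """Split text into tokens, keeping trailing whitespace on each token."""
    return _TOKEN.findall(text)


def simple_sent_tokenize(text):
    """Group tokens into sentences, closing a sentence after '.', '!' or '?'."""
    sentences, current = [], []
    for token in simple_tokenize(text):
        current.append(token)
        if token.strip() in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        # Keep any trailing tokens that were not closed by punctuation.
        sentences.append(current)
    return sentences


text = "Cat sat mat.  Cat's named Cool."
sentences = simple_sent_tokenize(text)

# Because whitespace is preserved, the tokens reassemble into the input.
rebuilt = "".join(token for sentence in sentences for token in sentence)
assert rebuilt == text
```

This naive splitter treats every period as a sentence end, so abbreviations such as "Dr." would be split incorrectly; it is only meant to show the round-trip property.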

Installation

pip3 install ciseau

Testing

Run nose2.

If you find this project useful for your work or research, here's how you can cite it:

@misc{RaimanCiseau2017,
  author = {Raiman, Jonathan},
  title = {Ciseau},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/jonathanraiman/ciseau}},
  commit = {fe88b9d7f131b88bcdd2ff361df60b6d1cc64c04}
}

Dependencies: 0
Dependent packages: 2
Dependent repositories: 2
Total releases: 27
Latest release: Dec 29, 2016
First release: Oct 27, 2014
Stars: 9
Forks: 3
Watchers: 2
Contributors: 1
Repository size: 39.1 KB
SourceRank: 9

Releases

2.0.4: Dec 29, 2016
2.0.3: Dec 16, 2016
2.0.2: Dec 10, 2016
2.0.1: Dec 7, 2016
2.0.0: Dec 5, 2016
1.0.21: Sep 26, 2015
1.0.20: Sep 22, 2015
1.0.19: Sep 22, 2015
1.0.18: Jun 26, 2015
1.0.17: Mar 9, 2015

Last synced: 2021-12-15 16:47:07 UTC
