fasttrainer
- Free software: MIT license
- Documentation: https://fst2.readthedocs.io.
Overview
Training an NLP deep learning model is not easy, especially for newcomers: you have to prepare data, wrap models with PyTorch or TensorFlow, and worry about overwhelming details like GPU settings and model configuration, all of which make the whole process tedious and boring.
The goal of fst2 is to make the whole training process comfortable and easy across different NLP tasks. Under the hood, fst2 leverages the convenient transformers package to make this happen; if you are new to transformers, check their homepage first.
Features
- Train your NLP models fast and easily. fst2 supports the following tasks:
  1. Named entity recognition.
  2. Text classification (or sentiment classification).
  3. Text generation (coming soon).
  And more to come.
Install
Installing fst2 is very easy, just run:
pip install fst2
Quick Start
The whole process of training an NLP model is as follows:
1 Prepare -> 2 Start -> 3 Use your model
1. Run `fst prepare --gen-config --gen-dir` in the shell to choose an NLP task interactively (typing the number of the task you want is all you need). The `--gen-config` flag generates a default config file, configs.yml, in the current working directory. The configs.yml looks like the following:
```yaml
pipeline:
  task: ner
  do_train: true
  ...
data:
  data_dir: fst/data_dir
  ...
model:
  model_name_or_path: fst/pretrained_model
  ...
tokenizer:
  tokenizer_name: ...
train:
  num_train_epochs: 3
  ...
```
Most settings are fine as they are and can be left alone, but feel free to change any of them if you need to.
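For example, here is a hypothetical edit that trains for more epochs and enables a test pass (this assumes `do_test` and `evaluate_during_training` live under the `pipeline` section like `do_train` in the generated file; only keys shown in the sample config above are used):

```yaml
pipeline:
  task: ner
  do_train: true
  do_test: true
  evaluate_during_training: true
train:
  num_train_epochs: 5
```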
The `--gen-dir` flag generates a set of default directories for you; the directory tree looks like this:

```
fst
├── data
├── output_dir
├── pretrained_model
└── result_dir
```

If you don't like the default parent name, fst, you can set a different one by passing a directory name to the `--parent-dirname` flag.
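For instance, a hypothetical invocation that puts everything under `myproject` instead of `fst` might look like this (the flag name is from the text above; the exact value syntax is an assumption):

```shell
# Generate configs.yml and the default directories under "myproject"
fst prepare --gen-config --gen-dir --parent-dirname myproject
```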
Each subdirectory and its purpose is explained below:
- data: Holds the necessary data files, e.g. a train file, a label file, and any other files required by your settings (such as a test file if you turn the `do_test` setting on in configs.yml). Take care of your data format and delimiter. For example, if you want to train a NER model and, in configs.yml, set `do_train` to `true`, `do_test` to `true`, `evaluate_during_training` to `true`, and `label_file` to `fst/data/labels.txt` (these are the defaults), then you should put data files like these under the data directory:

```
data
├── dev.txt
├── labels.txt
├── test.txt
└── train.txt
```

and set a correct delimiter in configs.yml. For named entity recognition, fst currently only supports data files with the .txt extension; see the Usage section of the official documentation for the details of other tasks. You can download a small data sample from example_datas under the root of this repo.
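The exact file layout isn't spelled out here, so as a hypothetical sketch: NER training data in transformers-style pipelines is commonly stored CoNLL-style, one token per line with the token and its label separated by a delimiter (a space below), and a blank line between sentences. The file names match the tree above; the format itself is an assumption, so check example_datas for the real one:

```shell
# Hypothetical CoNLL-style NER sample (format assumed, not taken from fst2 docs)
mkdir -p fst/data
cat > fst/data/train.txt <<'EOF'
我 O
在 O
北 B-LOC
京 I-LOC
EOF
cat > fst/data/labels.txt <<'EOF'
O
B-LOC
I-LOC
EOF
```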
- pretrained_model: Holds the pretrained model (by default fst loads the model from the pretrained_model directory, but you can also set `model_name_or_path` in configs.yml to one of the pretrained model names provided by Hugging Face). The directory will look like this:
```
pretrained_model
├── config.json
├── pytorch_model.bin
└── vocab.txt
```
For BERT models, you can download a pretrained checkpoint from google-research, then use transformers-cli to convert the TF checkpoint into a PyTorch model.
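A conversion might look like the following sketch (flags follow the transformers "Converting TensorFlow Checkpoints" guide; the checkpoint and config file names are placeholders for whatever you downloaded, and exact flag names can vary across transformers versions):

```shell
# Convert a google-research TF BERT checkpoint into a PyTorch model
# and drop the result where fst expects it
transformers-cli convert --model_type bert \
  --tf_checkpoint bert_model.ckpt \
  --config bert_config.json \
  --pytorch_dump_output fst/pretrained_model/pytorch_model.bin
```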
- output_dir: Holds models that you trained.
- result_dir: Holds performance reports and predictions based on your test file.
2. Now just run the following command to start training:

```
fst start
```
3. After training, you can use your trained model as the input model for another round of training, or serve it with the transformers-cli serve command. For example:

```
transformers-cli serve --task ner --model {your trained model path} --tokenizer {your trained model path}
```

The model will then be served on port 8888 of your localhost. Visit the SwaggerUI page to test your model (add a `--host` flag if you want to change the host).
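Besides the SwaggerUI page, you can query the server from the command line. As a sketch: transformers 2.x-era servers exposed a POST `/forward` route taking a JSON body with an `inputs` field, but the route and payload shape depend on your transformers version, so verify against the SwaggerUI page first:

```shell
# Send a Chinese sentence to the served NER model (route/payload assumed)
curl -X POST http://localhost:8888/forward \
  -H "Content-Type: application/json" \
  -d '{"inputs": "我在北京"}'
```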
Here is a demo: we use our freshly trained model to predict the location in a Chinese sentence, which is a common need in tasks like intent recognition.
And here is the result.
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.