mlimages

Gather image data and create training data for machine learning.


Keywords
imagenet, machine, learning, computer-vision, machine-learning
License
Other
Install
pip install mlimages==0.5

Documentation

mlimages

Gather images and create an image dataset for machine learning.

How to use

pip install mlimages

Alternatively, clone the repository, which lets you run the examples. For fine-tuning, download the pretrained model in examples/pretrained with Git LFS.

This tool depends on Python 3.5, which introduced the async/await syntax!
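
To illustrate why async/await matters when gathering many images, here is a minimal sketch of concurrent downloads. It is not mlimages' own implementation: the `fetch`, `fetch_all`, and `run` helpers and the example.com URLs are hypothetical, and the download is simulated with a short sleep so the sketch runs without network access.

```python
import asyncio
import sys

# The async/await syntax was introduced in Python 3.5.
assert sys.version_info >= (3, 5), "async/await requires Python 3.5 or later"


async def fetch(url, delay=0.01):
    """Hypothetical stand-in for one image download (sleeps, no network)."""
    await asyncio.sleep(delay)
    return "fetched {}".format(url)


async def fetch_all(urls):
    """Start every download at once and return results in input order."""
    return await asyncio.gather(*(fetch(u) for u in urls))


def run(urls):
    """Drive the coroutine to completion on a fresh event loop."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fetch_all(urls))
    finally:
        loop.close()


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(run(["http://example.com/1.jpg", "http://example.com/2.jpg"]))
```

Because the waits overlap, the total time stays close to one delay rather than the sum of all delays.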

Gather Images

Create a Python file in your project folder as follows.

import mlimages.scripts.gather_command as cmd

if __name__ == "__main__":
    ps = cmd.make_parser()
    args = ps.parse_args()
    cmd.main(args)

Imagenet

Look up the WordNet ID on the ImageNet site.

Then download the images for that ID.

python your_script_file.py -p path/to/data/folder -imagenet --wnid n11531193

Labeling

You can create training data from a folder of images.

Create a Python file (label.py in the command below) in your project folder as follows.

import mlimages.scripts.label_command as cmd

if __name__ == "__main__":
    ps = cmd.make_parser()
    args = ps.parse_args()
    cmd.main(args)

Then run it.

python label.py path/to/images/folder --out path/to/training_data.txt
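
For intuition about what a labeling step does, the sketch below walks an image folder and assigns one integer label per sub-folder. The output layout (one "relative/path label" line per image) is an assumption made for illustration; this document does not specify the format mlimages writes to training_data.txt, and `build_label_lines` is a hypothetical helper, not part of mlimages.

```python
import os


def build_label_lines(image_root, extensions=(".jpg", ".jpeg", ".png")):
    """Return "relative/path label" lines, one integer label per sub-folder.

    The line format is an assumed layout, not mlimages' documented one.
    """
    classes = sorted(
        d for d in os.listdir(image_root)
        if os.path.isdir(os.path.join(image_root, d))
    )
    lines = []
    for label, class_name in enumerate(classes):
        class_dir = os.path.join(image_root, class_name)
        for name in sorted(os.listdir(class_dir)):
            # Skip anything that does not look like an image file.
            if name.lower().endswith(extensions):
                rel_path = os.path.join(class_name, name)
                lines.append("{} {}".format(rel_path, label))
    return lines


if __name__ == "__main__":
    import sys

    # Usage: python this_sketch.py path/to/images/folder
    if len(sys.argv) > 1:
        print("\n".join(build_label_lines(sys.argv[1])))
```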

Training

Now you have images and training_data.txt. However, you still need some pre-processing before you can train your model. For example:

  • resize the images
  • normalize the images
  • sometimes convert the color (for example, to grayscale)

😭

Don't worry. mlimages supports you!

from mlimages.model import LabelFile, ImageProperty


lf = LabelFile("path/to/training_data.txt", img_root="path_to_your_image_folder")
prop = ImageProperty(width=32, gray_scale=True)

td = lf.to_training_data(prop)
td.make_mean_image("path/to/mean_image")  # store the mean image used for normalization

for d in td.generate():
    # d is a numpy array adjusted according to ImageProperty and normalized by the mean image!
    # all you have to do is train the model with it!
    print(d)

You can also restore an image from the data.

image = td.result_to_image(numpy_array, label)
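
To make the pre-processing steps from the Training section concrete (grayscale conversion, resizing, mean-image normalization, and restoring), here is a rough NumPy-only sketch. It is not how mlimages implements them: `to_gray`, `resize_nearest`, and `preprocess` are hypothetical helpers, the resize uses simple nearest-neighbour sampling, and the input is random data standing in for real images.

```python
import numpy as np


def to_gray(img):
    """Convert an H x W x 3 RGB array to H x W grayscale (BT.601 weights)."""
    return img[..., :3] @ np.array([0.299, 0.587, 0.114])


def resize_nearest(img, width, height):
    """Resize a 2-D array by nearest-neighbour sampling."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[rows[:, None], cols]


def preprocess(images, width=32, height=32):
    """Grayscale, resize, and mean-normalize a list of RGB arrays."""
    batch = np.stack(
        [resize_nearest(to_gray(img), width, height) for img in images]
    )
    mean_image = batch.mean(axis=0)  # per-pixel mean over the dataset
    return batch - mean_image, mean_image


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random arrays standing in for four 48 x 64 RGB images.
    images = [
        rng.integers(0, 256, size=(48, 64, 3)).astype(np.float64)
        for _ in range(4)
    ]
    data, mean_image = preprocess(images)
    restored = data[0] + mean_image  # undo the normalization for one sample
    print(data.shape, mean_image.shape, restored.shape)
```

Adding the mean image back to a normalized sample recovers the resized grayscale image, which is the idea behind restoring an image from the data.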