tartare on PyPI

Tartare: Make homebrew image dataset for machine learning.

You can create your own dataset for Keras and other libraries.

Tartare is a library that makes it easy to build a Keras dataset from JPEG images you have collected. Besides building the dataset, it also provides data augmentation functions.

This version only supports JPEG format.

Compatible with: Python 3.5, 3.6

Tested on:

macOS 10.13 High Sierra
Ubuntu 17.10 Artful Aardvark & Ubuntu 18.04 Bionic Beaver

Requirements

NumPy 1.14.5
SciPy >= 1.1.0
Pillow 5.2.0
Scikit-Learn 0.19.2

Usage

Precondition

Directory Structure

.
├── tutorial.py
├── apple/
│   ├── 0.jpg
│   ├── 1.jpg
│   ├── 2.jpg
│   ├── 3.jpg
│   └── 4.jpg
│
└── melon/
    ├── 0.jpg
    ├── 1.jpg
    ├── 2.jpg
    ├── 3.jpg
    └── 4.jpg

1, Data Augmentation

You can augment your images.

augment function

mirror=True : Flips the specified image horizontally.
flip=True : Flips the specified image vertically.
brightness=True : Randomly decreases the brightness of the specified image.
contrast=True : Randomly raises the contrast of the specified image.
mask=True : Masks a random position in the image. The mask size is 0.3 to 0.5 times the shorter side (height or width) of the input image. Creates 5 images.
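
To make the options above concrete, here is a minimal Pillow sketch of the same kinds of transforms. It is not Tartare's implementation: the function name `augment_sketch`, the random factor ranges, the square black mask, and the single masked copy (instead of 5) are all assumptions chosen for illustration.

```python
import random

from PIL import Image, ImageEnhance, ImageOps


def augment_sketch(img, rng):
    """Return a dict of augmented copies of a Pillow image (illustrative only)."""
    out = {}
    out["mirror"] = ImageOps.mirror(img)  # left-right flip
    out["flip"] = ImageOps.flip(img)      # top-bottom flip
    # A factor below 1.0 darkens the image; the 0.5-0.9 range is an assumption.
    out["brightness"] = ImageEnhance.Brightness(img).enhance(rng.uniform(0.5, 0.9))
    # A factor above 1.0 raises the contrast; the 1.1-1.5 range is an assumption.
    out["contrast"] = ImageEnhance.Contrast(img).enhance(rng.uniform(1.1, 1.5))
    # Mask side: 0.3 to 0.5 times the shorter image side, at a random position.
    side = int(min(img.size) * rng.uniform(0.3, 0.5))
    left = rng.randint(0, img.width - side)
    top = rng.randint(0, img.height - side)
    masked = img.copy()
    masked.paste((0, 0, 0), (left, top, left + side, top + side))
    out["mask"] = masked
    return out


# Demo on a synthetic image: left half red, right half blue.
demo = Image.new("RGB", (64, 48), (255, 0, 0))
demo.paste((0, 0, 255), (32, 0, 64, 48))
results = augment_sketch(demo, random.Random(0))
```

Passing a seeded `random.Random` keeps the random brightness, contrast, and mask choices reproducible while experimenting.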

scaling function

scale : Scaling rate (float). Creates a new directory and saves the scaled images in it.
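
As a rough picture of what scaling a directory of JPEGs involves, the sketch below resizes every JPEG in a source directory and writes the results to a new directory. The helper name `scale_directory`, the explicit destination directory argument, and the rounding of the new size are assumptions; how Tartare names its output directory is not specified here.

```python
import os

from PIL import Image


def scale_directory(src_dir, dst_dir, scale):
    """Resize each JPEG in src_dir by `scale` and save it under dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    saved = []
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith((".jpg", ".jpeg")):
            continue  # skip non-JPEG files such as .DS_Store
        with Image.open(os.path.join(src_dir, name)) as img:
            new_size = (
                max(1, round(img.width * scale)),
                max(1, round(img.height * scale)),
            )
            img.resize(new_size).save(os.path.join(dst_dir, name))
        saved.append(name)
    return saved
```

With `scale=1.5`, a 64x48 image would be saved as 96x72 in the destination directory.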

`tutorial.py`

from tartare.Vision import DataAugmentation

DataAugmentation("apple").init().augment(mirror=True,
                                         flip=True,
                                         brightness=True,
                                         contrast=True,
                                         mask=True)

DataAugmentation("melon").init().augment(mirror=True,
                                         flip=True,
                                         brightness=True,
                                         contrast=True,
                                         mask=True)
                                         
DataAugmentation("apple").init().scaling(scale=1.5)
DataAugmentation("melon").init().scaling(scale=1.5)

2, MakeCategory

MakeCategory produces an .npz file containing the image data and labels.

Caution: When creating an .npz file with MakeCategory(), make sure that the number of original image files is the same for all categories.

If you are a Mac user, remove any .DS_Store files from the image directories.
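
Both precautions can be checked before running MakeCategory. The following standard-library sketch deletes stray .DS_Store files and compares the JPEG count per category. The helper names `remove_ds_store` and `check_balanced` are hypothetical and not part of Tartare.

```python
import os


def remove_ds_store(root):
    """Delete every .DS_Store file under root and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".DS_Store" in filenames:
            path = os.path.join(dirpath, ".DS_Store")
            os.remove(path)
            removed.append(path)
    return removed


def check_balanced(category_dirs):
    """Return (is_balanced, counts), where counts maps directory to JPEG count."""
    counts = {}
    for directory in category_dirs:
        counts[directory] = sum(
            1
            for name in os.listdir(directory)
            if name.lower().endswith((".jpg", ".jpeg"))
        )
    return len(set(counts.values())) <= 1, counts
```

For the directory layout above, `check_balanced(["apple", "melon"])` should report five JPEGs in each directory.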

target_dir : Name of the directory where the images of the category are saved.
label : Correct-answer label for supervised learning.
size : Image size when saving, as a tuple (width, height).
mode : Either "RGB" or "gray".
filename : Output file name.
verbose : If True, the result is printed when the file is written successfully.

`tutorial.py`

from tartare.Vision import MakeCategory

MakeCategory(target_dir="apple").init(label=0, size=(64, 64), mode="RGB").export_category(filename="apple.npz", verbose=True)
MakeCategory(target_dir="melon").init(label=1, size=(64, 64), mode="RGB").export_category(filename="melon.npz", verbose=True)
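
For intuition about what a per-category .npz file might hold, here is a sketch that resizes a few images, stacks them into one array, pairs them with a constant label, and round-trips the result through `np.savez`. The array keys `images` and `labels`, the helper name `images_to_arrays`, and the in-memory buffer are assumptions for illustration; the keys Tartare actually writes are not documented here.

```python
import io

import numpy as np
from PIL import Image


def images_to_arrays(images, label, size, mode="RGB"):
    """Stack Pillow images into one array and build a matching label vector."""
    pil_mode = "RGB" if mode == "RGB" else "L"  # "L" is Pillow's grayscale mode
    data = [np.asarray(img.convert(pil_mode).resize(size)) for img in images]
    x = np.stack(data)
    y = np.full(len(images), label, dtype=np.int64)
    return x, y


# Five synthetic 100x80 images stand in for apple/0.jpg ... apple/4.jpg.
demo_images = [Image.new("RGB", (100, 80), (i * 40, 0, 0)) for i in range(5)]
x, y = images_to_arrays(demo_images, label=0, size=(64, 64), mode="RGB")

# Round-trip through the .npz format using an in-memory buffer.
buffer = io.BytesIO()
np.savez(buffer, images=x, labels=y)
buffer.seek(0)
loaded = np.load(buffer)
```

Note that Pillow sizes are (width, height), while the resulting NumPy arrays are indexed as (height, width, channels).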

3, BuildDataset

Creates a dataset from the .npz files created with MakeCategory().

Caution: When creating an .npz file with MakeCategory(), make sure that the number of original image files is the same for all categories.

BuildDataset(filename1, filename2, ...) : Specify the .npz file for each category.
filename : Output file name.
verbose : If True, the result is printed when the file is written successfully.

`tutorial.py`

from tartare.Vision import BuildDataset

BuildDataset("apple.npz", "melon.npz").export_dataset(filename="apple_melon.npz", verbose=True)
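
Conceptually, merging categories amounts to concatenating the per-category image and label arrays along the first axis. The sketch below shows that idea with plain NumPy; the function name `combine_categories` and the (images, labels) pair layout are assumptions, not Tartare's internal format.

```python
import numpy as np


def combine_categories(categories):
    """Concatenate a list of (images, labels) pairs into one dataset."""
    x = np.concatenate([images for images, _ in categories], axis=0)
    y = np.concatenate([labels for _, labels in categories], axis=0)
    return x, y


# Placeholder arrays standing in for the contents of apple.npz and melon.npz.
apple = (np.zeros((5, 64, 64, 3), dtype=np.uint8), np.zeros(5, dtype=np.int64))
melon = (np.ones((5, 64, 64, 3), dtype=np.uint8), np.ones(5, dtype=np.int64))
x, y = combine_categories([apple, melon])
```

Concatenation only works when every category uses the same image size and mode, which is why the same `size` and `mode` are passed to MakeCategory for both categories in the tutorial.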

4, ExpandImgData

load_data returns the images as a tensor and the labels as a two-dimensional array.

filename : Name of the dataset file.
test_size : Proportion of the dataset to include in the test split. (Default=0.3)
division : Whether to split the dataset into training and test sets. (Default=True)
shuffle : Whether to shuffle the dataset. (Default=True)

`tutorial.py`

from tartare.Vision import ExpandImgData

(x_train, y_train), (x_test, y_test) = ExpandImgData(filename="apple_melon.npz").load_data(
    test_size=0.3,
    division=True,
    shuffle=True,
)
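
The effect of `test_size` and `shuffle` can be illustrated with a small NumPy-only split. This is a sketch under assumptions: `split_dataset`, its `seed` parameter, and the rounding of the test-set size are hypothetical, and Tartare may well delegate this step to another library.

```python
import numpy as np


def split_dataset(x, y, test_size=0.3, shuffle=True, seed=None):
    """Split arrays into ((x_train, y_train), (x_test, y_test))."""
    n = len(x)
    indices = np.arange(n)
    if shuffle:
        indices = np.random.default_rng(seed).permutation(n)
    n_test = int(round(n * test_size))  # number of samples held out for testing
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (x[train_idx], y[train_idx]), (x[test_idx], y[test_idx])


# Ten placeholder images with labels stored as a two-dimensional (n, 1) array.
x = np.zeros((10, 64, 64, 3), dtype=np.uint8)
y = np.array([0] * 5 + [1] * 5).reshape(-1, 1)
(x_train, y_train), (x_test, y_test) = split_dataset(x, y, test_size=0.3, seed=0)
```

With ten samples and `test_size=0.3`, seven samples end up in the training split and three in the test split.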

5, Sample

`Sample_NN.py`

from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense
from tartare.Vision import ExpandImgData

def main():
    (x_train, y_train), (x_test, y_test) = ExpandImgData("apple_melon.npz").load_data(test_size=0.3,
                                                                                      division=True,
                                                                                      shuffle=True)

    print(x_train.shape, y_train.shape)
    print(x_test.shape, y_test.shape)
    input_shape = x_train.shape[1] * x_train.shape[2] * x_train.shape[3]

    x_train = x_train.reshape(x_train.shape[0], input_shape)  # Flatten each image array into a 1-D vector
    x_test = x_test.reshape(x_test.shape[0], input_shape)

    train_image = x_train.astype("float32")
    test_image = x_test.astype("float32")

    train_image /= 255.0
    test_image /= 255.0

    train_label = np_utils.to_categorical(y=y_train, num_classes=2)
    test_label = np_utils.to_categorical(y=y_test, num_classes=2)


    model = Sequential()
    model.add(Dense(512, activation="relu", input_shape=(input_shape,)))
    model.add(Dense(512, activation="relu"))
    model.add(Dense(512, activation="relu"))
    model.add(Dense(2, activation='softmax'))
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

    epoch_num = 20
    history = model.fit(x=train_image, y=train_label, batch_size=4, epochs=epoch_num, validation_split=0.1, verbose=1)

    model.summary()

    score = model.evaluate(x=test_image, y=test_label, verbose=0)
    print("Test Loss: {}".format(score[0]))
    print("Test Accuracy: {}".format(score[1]))

if __name__ == "__main__":
    main()
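
The preprocessing steps in the sample (flattening, scaling pixel values to the 0-1 range, and one-hot encoding the labels) can also be reproduced with NumPy alone, which is handy for checking shapes without Keras installed. The function name `preprocess` is hypothetical, and the one-hot step is a NumPy stand-in for `to_categorical`, not a call into Keras.

```python
import numpy as np


def preprocess(x, y, num_classes):
    """Flatten images, scale pixels to [0, 1], and one-hot encode labels."""
    flat = x.reshape(x.shape[0], -1).astype("float32") / 255.0
    # Index an identity matrix by label to build one-hot rows.
    one_hot = np.eye(num_classes, dtype="float32")[np.asarray(y).reshape(-1)]
    return flat, one_hot


# Four tiny 2x2 RGB images with two-dimensional (n, 1) labels.
x = np.full((4, 2, 2, 3), 255, dtype=np.uint8)
y = np.array([[0], [1], [1], [0]])
flat, one_hot = preprocess(x, y, num_classes=2)
```

Here `flat` has shape (4, 12) and `one_hot` has shape (4, 2), matching the input and output sizes a dense network like the one above would expect for such data.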

Install

Install Tartare from PyPI:

pip3 install tartare

License

MIT

Author

Hirotaka Kawashima

tartare
Release 0.0.1

