Unicode string ops for TensorFlow

tensorflow, unicode, string, op
pip install tfunicode==2.2.0



Infrastructure to build TensorFlow custom ops wheel for unicode string preprocessing. For more info about provided ops see package README

Develop environment

Install all dependencies including python headers. Do not use pyenv in MacOS X, otherwise tests mostly likely will fail.

Build PIP package manually

You can build the pip package with Bazel:

export PYTHON_BIN_PATH=`which python2.7`
$PYTHON_BIN_PATH -m pip install -U tensorflow  # Only if you did not install it yet
bazel clean --expunge
bazel test --test_output=errors //tfunicode/...
bazel build build_pip_pkg
bazel-bin/build_pip_pkg wheels

Build release with Linux docker container

docker run -it -v `pwd`:/tfunicode library/ubuntu:xenial /tfunicode/build_linux_release.sh

Install and test PIP package

Once the pip package has been built, you can install it with:

pip install wheels/*.whl

Now you can test out the pip package:

cd /
python -c "import tensorflow as tf;import tfunicode as tfu;print(tfu.transform_zero_digits('123').eval(session=tf.Session()))"

You should see the op zeroed out all nonzero digits in string "123":