clip-similarwords

Finding similar 1-token words of OpenAI's CLIP.



Documentation

clip_similarwords is an implementation for finding similar 1-token words of OpenAI's CLIP in less than one second.

OpenAI's CLIP is trained on text-image similarities, so its text-text similarities may also reflect the typical image associations of the text, unlike WordNet or other synonym dictionaries.

Note that, for speed and storage reasons (PyPI packages are limited to 60MB), words composed of two or more tokens are not supported.
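
If you want to check whether a particular word fits into a single CLIP token, one possible check (a sketch only, not part of this package; it needs OpenAI's clip package and PyTorch, which clip_similarwords itself does not require) is to count the tokens produced by CLIP's tokenizer:

import clip  # pip install git+https://github.com/openai/CLIP.git

def is_single_token(word):
    # clip.tokenize zero-pads to the context length; subtract the
    # start-of-text and end-of-text markers from the non-zero count.
    tokens = clip.tokenize(word)[0]
    return int((tokens != 0).sum()) - 2 == 1

print(is_single_token("cat"))  # True for words that map to one BPE token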

Installation

clip_similarwords is easily installable via pip:

pip install clip_similarwords

or

pip install git+https://github.com/nazodane/clip_similarwords.git

Usage of the command

~/.local/bin/clip-similarwords [ word_fragment | --all ]
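
For example, to list similar tokens for the word fragment "cat" (an illustrative argument; any word fragment works):

~/.local/bin/clip-similarwords cat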

Usage of the module

from clip_similarwords import CLIPTextSimilarWords
clipsim = CLIPTextSimilarWords()
for key_token, sim_token, cos_similarity in clipsim("cat"):
    print("%s -> %s ( cos_similarity: %.2f )"%(key_token, sim_token, cos_similarity))
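
As a further sketch (the helper and threshold below are illustrative and not part of the package; only the generator interface shown above is assumed), the same call can be used to keep just the closer matches, grouped by the key token that matched the fragment:

from clip_similarwords import CLIPTextSimilarWords

clipsim = CLIPTextSimilarWords()

def close_words(word_fragment, threshold=0.8):
    # Keep only pairs whose cosine similarity reaches the threshold,
    # grouped by the key token that matched the fragment.
    result = {}
    for key_token, sim_token, cos_similarity in clipsim(word_fragment):
        if cos_similarity >= threshold:
            result.setdefault(key_token, []).append((sim_token, cos_similarity))
    return result

print(close_words("cat"))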

Requirements for model use

  • Linux (should also work on other environments)

Neither PyTorch nor CUDA is required.

Requirements for model generation

  • Linux
  • Python 3.10 or later
  • PyTorch 1.13 or later
  • CUDA 11.7 or later
  • DRAM 16GB or higher
  • RTX 3060 12GB or higher

Patches and information on other environments are very welcome!

License

The code is under the MIT License. The model was converted under Japanese law.