KCM API are also available now:API Online Version
不知道要寫啥
Now three languages are available:
- Chinese
- English
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
Prerequisities
- OS:Ubuntu / OSX would be nice
- environment:need python3
sudo apt-get update; sudo apt-get install; python3 python3-dev
- (optional) 簡繁轉換:
sudo apt-get install -y opencc
- (optional) 英文版Wiki,請先下載nltk:
python3 -m nltk.downloader -d /usr/local/share/nltk_data all
- (optional) 用各論壇資料建立KCM模型:請先參照這篇產出輸入檔KCM Data Source Extractor
Installing
- Install By Docker:
- You can only run this command in directory which has Dockerfile:
sudo docker build -t kcm .
- You can only run this command in directory which has Dockerfile:
- Install manually:
- 使用虛擬環境 Use virtualenv is recommended:
virtualenv venv
- 啟動方法 How to activate virtualenv
1. for Linux:
. venv/bin/activate
2. for Windows:venv\Scripts\activate
- 安裝 Install:
pip install -e git://github.com/UDICatNCHU/KCM.git@master#egg=KCM
Running & Testing
Run
-
整體使用方法 Usage:
- KCM物件的參數:
- lang:cht
中文
、英文eng
- io_dir:讀取建立model用的input檔的路徑
- thread_count:輸入整數,為啟動的執行緒數量
- uri:Mongo連接網址,
mongodb://資料庫IP
- lang:cht
- Instance Method:
- main:建立模型的函式
- setLang:選擇語系的函式,build會根據設定的語言到相對應的資料夾讀取input檔
- removeDB:清空MongoDB
- get:到MongoDB查詢結果 * 執行方法(二選一):
- command line執行:
python3 src/kcm/KCM/__main__.py(或是你放kcm/__main__.py的路徑) -p 你放要建立模型輸入檔的地方
from KCM.__main__ import KCM k = KCM('cht', '含有輸入檔的資料夾路徑') k.removeDB() k.main() print(k.get('美國隊長', 10))
- KCM物件的參數:
Results
the result of querying KCM model with keyword "Captain America":
[
["電影",93],
["復仇者",78],
["戰士",55],
["英雄",55],
["鋼鐵",52],
["內戰",50],
["復仇者聯盟",44],
["漫畫",42],
["酷寒",39],
["巴基",36]
]
Break down into end to end tests
- 執行全部的測試 run whole test:
python3 run_tests.py
And coding style tests
目前沒有coding style tests...
Deployment
目前只是一般的python程式,pip install
即可
Built With
- python3.5
Versioning
For the versions available, see the tags on this repository.
Contributors
License
Acknowledgments
- WikiExtractor
- 感謝fxsjy的jieba斷詞系統 Thanks python library jieba from fxsjy
- 感謝Google工程師釋出的word2vec Thanks word2vec from Google