A Korean Keywords Extractor with Syntactic Analysis


License
MIT
Install
pip install kokex==0.0.3

Documentation

kokex

GitHub GitHub Workflow Status Documentation Status

ํ•œ๊ตญ์–ด ๋ฌธ์„œ ์ง‘ํ•ฉ์—์„œ ํ‚ค์›Œ๋“œ๋ฅผ ์ถ”์ถœํ•˜๋Š” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ์ž…๋‹ˆ๋‹ค. ๊ตฌ๋ฌธ๋ถ„์„์„ ํ†ตํ•ด ์‚ฌ์ „ ์—†์ด๋„ ๊ณ ์œ ๋ช…์‚ฌ, ๋ณตํ•ฉ๋ช…์‚ฌ, ์‹ ์กฐ์–ด์— ๊ฐ•๊ฑดํ•œ ํ‚ค์›Œ๋“œ ์ถ”์ถœ๊ธฐ๋ฅผ ๋งŒ๋“ค๊ณ ์ž ํ•ฉ๋‹ˆ๋‹ค.

๊ฐ„๋‹จํ•œ ์‚ฌ์šฉ์˜ˆ๋Š” ์•„๋ž˜์™€ ๊ฐ™์Šต๋‹ˆ๋‹ค.

import kokex

keywords = kokex.keywords([
    '์ฒซ ๋ฒˆ์งธ ๋ฌธ์„œ์ž…๋‹ˆ๋‹ค. ์—ฌ๋Ÿฌ ๋ฌธ์žฅ์„ ํฌํ•จํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.',
    '๋‘ ๋ฒˆ์งธ ๋ฌธ์„œ์ž…๋‹ˆ๋‹ค. ์—ฌ๋Ÿฌ ๋ฌธ์„œ๋ฅผ ํฌํ•จํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.'
])

print(keywords)  # {'๋ฒˆ์งธ': 2, '๋ฌธ์„œ': 3, '๋ฌธ์žฅ': 1, 'ํฌํ•จ': 2}

๋” ์ž์„ธํ•œ ์‚ฌ์šฉ๋ฒ•์€ ๋ฌธ์„œ ๋ฅผ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”.

์„ค์น˜

์„ค์น˜๋Š” pip ๋ฅผ ์ด์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•๊ณผ docker ๋ฅผ ์ด์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ์žˆ์Šต๋‹ˆ๋‹ค.

pip

pip install kokex

์ด ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋Š” konlpy ์™€ Mecab ํ˜•ํƒœ์†Œ ๋ถ„์„๊ธฐ์— ์˜์กด์„ฑ์ด ์žˆ์Šต๋‹ˆ๋‹ค. konlpy ์˜ ์„ค์น˜ ์•ˆ๋‚ด ์— ๋”ฐ๋ผ Mecab ์„ ์„ค์น˜ํ•ด์ฃผ์‹œ๊ธฐ ๋ฐ”๋ž๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, Mac ํ™˜๊ฒฝ์—์„œ๋Š” ์•„๋ž˜์˜ ๋ช…๋ น์„ ์ž…๋ ฅํ•˜์‹œ๊ธฐ ๋ฐ”๋ž๋‹ˆ๋‹ค.

$ bash <(curl -s https://raw.githubusercontent.com/konlpy/konlpy/master/scripts/mecab.sh)

docker

๋„์ปค๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค๋ฉด, ๋ชจ๋“  ์˜์กด์„ฑ์ด ์„ค์น˜๋œ ์ƒํƒœ๋กœ ๋กœ์ปฌ API ์„œ๋ฒ„๋ฅผ ๊ตฌ๋™ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

export KOKEX_PORT=80
export KOKEX_VERSION=0.0.11

docker pull kokex/server:${KOKEX_VERSION}
docker run -d -p ${KOKEX_PORT}:8081 --name kokex-server --rm kokex/server:${KOKEX_VERSION}

์„œ๋ฒ„๋ฅผ ์ข…๋ฃŒํ•  ๋•Œ๋Š” docker stop kokex-server ๋ฅผ ์ž…๋ ฅํ•˜์„ธ์š”.

์„œ๋ฒ„๋ฅผ ์‹คํ–‰ํ•˜๋ฉด ์•„๋ž˜์™€ ๊ฐ™์ด http ์š”์ฒญ์„ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

curl -X POST \
  http://localhost/keywords \
  -d '{
	"docs": [
		"์ฒซ ๋ฒˆ์งธ ๋ฌธ์„œ์ž…๋‹ˆ๋‹ค. ์—ฌ๋Ÿฌ ๋ฌธ์žฅ์„ ํฌํ•จํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.", 
		"๋‘ ๋ฒˆ์งธ ๋ฌธ์„œ์ž…๋‹ˆ๋‹ค. ์—ฌ๋Ÿฌ ๋ฌธ์„œ๋ฅผ ํฌํ•จํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค."
	]
}'

๊ทธ๋ฆฌ๊ณ  http://localhost/docs ์— ์ ‘์†ํ•˜๋ฉด API ๋ฌธ์„œ๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ฐธ์—ฌ

๋ชจ๋“  ๋…ผ์˜๋Š” ์ด์Šˆ๋ฅผ ํ†ตํ•ด ์ด๋ฃจ์–ด์ง€๋ฉด ์ข‹๊ฒ ์Šต๋‹ˆ๋‹ค.

API ์‚ฌ์šฉ์ž

kokex api ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ถ„๋“ค์ด๋ผ๋ฉด ์ƒˆ๋กœ์šด ๋ฌธ์žฅ์— ๋Œ€ํ•œ ๋ถ„์„ ์š”์ฒญ์„ ํ•˜์‹ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ํ…Œ์ŠคํŠธ ๋ฌธ์žฅ๊ณผ ํ•จ๊ป˜ ์›ํ•˜๋Š” ๋ถ„์„๊ฒฐ๊ณผ๋ฅผ ๋‹ด์•„ ์ด์Šˆ๋ฅผ ์ƒ์„ฑํ•˜์‹ค ์ˆ˜ ์žˆ๋„๋ก ํ…œํ”Œ๋ฆฟ(test_suggest)์„ ์ค€๋น„ํ•ด๋‘์—ˆ์œผ๋‹ˆ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”.

API ๊ฐœ๋ฐœ์ž

kokex api ๊ฐœ๋ฐœ์— ๊ด€์‹ฌ์ด ์žˆ์œผ์‹œ๋‹ค๋ฉด, ์ด์Šˆ๋ฅผ ์ƒ์„ฑ ํ›„ ๋…ผ์˜๋ฅผ ์ง„ํ–‰ํ•˜๊ณ , ์ˆ˜์ •ํ•œ ์‚ฌํ•ญ์„ PR ํ•ด์ฃผ์‹œ๋ฉด ๋ฉ๋‹ˆ๋‹ค. ์†Œ์Šค ์ฝ”๋“œ๋ฅผ ์ˆ˜์ •ํ•˜์‹ ๋‹ค๋ฉด, isort ์™€ black ์œผ๋กœ ํฌ๋งทํŒ…์„ ํ•ด์ฃผ์‹œ๊ธฐ ๋ฐ”๋ž๋‹ˆ๋‹ค. ์ €์žฅ์†Œ ๋ณต์ œ ํ›„ ์•„๋ž˜์™€ ๊ฐ™์ด ์ž…๋ ฅํ•˜์‹œ๋ฉด ์ด๋ฅผ ์ž๋™์œผ๋กœ ์ฒดํฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

pip install -r requirements.txt
pre-commit install

๊ทธ๋ฆฌ๊ณ  bin/run_test.sh ๋ฅผ ์‹คํ–‰ํ•˜์—ฌ ๊ธฐ์กด ํ…Œ์ŠคํŠธ ๋ฌธ์žฅ์˜ ๋ถ„์„ ๊ฒฐ๊ณผ๋ฅผ ํ•ด์น˜์ง€ ์•Š๋„๋ก ํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค. ํ…Œ์ŠคํŠธ ๋ฌธ์žฅ์€, test ๋””๋ ‰ํ† ๋ฆฌ์˜ test_{API_NAME}.py ํŒŒ์ผ์—์„œ ์ฐพ์•„๋ณด์‹ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.