Collocation Search for Korean
Find out which words are used together with (collocate with) a given Korean word.
Requirements
- Python >= 3.6
- whoosh
Example
- Download and extract the indexed files
>>> from search import Collocate
>>> c = Collocate()
>>> q = "먹"  # drop the final ending "-다" for verbs/adjectives.
>>> collocates = c(q)
>>> for pos, cols in collocates.items():
...     print(q + " as " + pos)
...     for pos2, cols2 in cols.items():
...         print(pos2, ", ".join(word + "(" + str(cnt) + ")" for word, cnt in cols2))
...
먹 as verb
noun ๊ฒ(39), ์(29), ์์(23), ๋ฑ(16), 고기(14), ..
verb ํ(33), ์(21), ์ด(17), ์ฆ๊ธฐ(11), ๊ตฝ(9), ..
adverb ๋ง์ด(10), ์ฃผ๋ก(7), ๋ค(5), ๊ฐ์ด(4), ์(4), ...
determiner ๋ค๋ฅธ(5), ๊ทธ(2), ์ฌ๋ฌ(1), ์ธ(1), ๋ช๋ช(1), ์(1)
adjective ์ถ(5), ์ด๋ฆฌ(1), ํธํ(1), ์(1), ์ข(1), ์์ฝ(1), ๋ชปํ(1)
먹 as noun
noun ๋ถ(3), 종이(2), ๋ฌ์ (1), ์ฒญ์(1), ์์ฅ๋(1), ์ ์กฐ(1), ..
verb ์ํ(1), ๊ทธ๋ฆฌ(1), ์ฐ(1), ์ฐจ(1), ๋์ด๋(1)
adverb ํ์ง๋ง(1)
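
Judging from the loop in the example, the result appears to be a nested mapping of the form {POS of the query: {POS of the collocate: [(word, count), ...]}}. Under that assumption, the sketch below merges all groups and returns the most frequent collocates overall. The helper `top_collocates` and the `sample` dictionary are hypothetical and are not part of the `search` module; `sample` is a small illustrative subset shaped like the output above.

```python
from collections import Counter


def top_collocates(collocates, n=3):
    """Sum counts over every POS group and return the n most frequent words.

    Assumes `collocates` has the shape
    {query_pos: {collocate_pos: [(word, count), ...]}}.
    """
    totals = Counter()
    for cols in collocates.values():      # one entry per POS of the query
        for pairs in cols.values():       # one entry per POS of the collocates
            for word, cnt in pairs:
                totals[word] += cnt
    return totals.most_common(n)


# Illustrative data only; a real result would come from Collocate()(q).
sample = {
    "verb": {
        "noun": [("고기", 14)],
        "adverb": [("주로", 7)],
    },
    "noun": {
        "noun": [("종이", 2)],
    },
}

print(top_collocates(sample, n=2))  # [('고기', 14), ('주로', 7)]
```

Summing across groups discards the part-of-speech distinction, so this is only useful when a single ranked list is wanted; keep the nested form if the verb and noun senses of the query should stay separate.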