nlp measurement package


Keywords
amazon, aws, cer, character-error-rate, computing-error-rates, evaluate, evaluation-functions, evaluation-metrics, korean, normalization, speech-analysis, speech-recognition, speech-to-text, test, text-digitisation, text-evaluation, transcribe, wer, word-error-rate
License
MIT
Install
pip install nlptutti==0.0.0.7

Documentation


A package of similarity metrics for evaluating Korean automatic speech recognition

This repository contains a simple Python package that computes the character error rate (CER) and word error rate (WER) of transcripts produced by Korean speech recognizers such as Amazon Transcribe. It computes the minimum edit distance between the ground-truth sentence and the hypothesis (transcribed) sentence returned by an STT (speech-to-text) API. The minimum edit distance is the Levenshtein distance, computed with dynamic programming.
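As an illustration of the dynamic-programming idea, here is a minimal sketch of the Levenshtein distance. The function name `levenshtein` is hypothetical and this is not necessarily how the package implements it. Because it works on any sequence, the same code covers characters (for CER) and lists of words (for WER).

```python
def levenshtein(ref, hyp):
    """Minimum number of substitutions, deletions and insertions turning ref into hyp."""
    # prev[j] is the distance between the ref items seen so far and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i ref items into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))              # 3 (character level)
print(levenshtein("a b c".split(), "a c".split()))   # 1 (word level)
```

Keeping only two rows of the table is enough when only the distance is needed; recovering the individual S, D and I counts requires the full table.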

Error rates (CER and WER) are common metrics of the performance of automatic speech recognition systems. CER is similar to WER (word error rate) but operates on characters instead of words. See the WER documentation for details.[1] The error rate is computed as follows.




CER(WER) = (S + D + I) / N = (S + D + I) / (S + D + C)
Normalized CER(WER) = (S + D + I) / (S + D + I + C)

  • S : substitution errors, the number of misspelled characters (uniliterals) or words
  • D : deletion errors, the number of missing characters or words
  • I : insertion errors, the number of extra characters or words
  • C : the number of correct characters or words (symbols) shared by the ground truth and the hypothesis, C = N - D - S
  • N : the number of characters or words in the reference (ground truth)
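The counts defined above can be recovered by filling the full edit-distance table and walking back through it. The sketch below is illustrative only; `edit_counts` is a hypothetical helper, not part of the package's API, and when several minimum-cost alignments exist it reports the counts of just one of them.

```python
def edit_counts(ref, hyp):
    """Return (S, D, I, C) for one minimum-cost alignment of ref and hyp."""
    n, m = len(ref), len(hyp)
    # dist[i][j] is the edit distance between ref[:i] and hyp[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution

    # Walk back from the bottom-right corner, classifying each step.
    subs = dels = inss = correct = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            correct += 1
            i -= 1
            j -= 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            subs += 1
            i -= 1
            j -= 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            inss += 1
            j -= 1
    return subs, dels, inss, correct


# Word-level example: two reference words were merged into one hypothesis word.
print(edit_counts("a b c d".split(), "a bc d".split()))  # (1, 1, 0, 2)
```

In the word-level example, N = S + D + C = 4, so the error rate is 2 / 4.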

The CER is not always a number between 0 and 1, in particular when there are many insertions. The value is often read as the percentage of characters that were predicted incorrectly. The lower the value, the better the ASR system performs; a CER of 0 is a perfect score. To keep insertions from pushing the value above 1, this package applies a normalized error rate.[2]

CER is useful for comparing models on tasks such as automatic speech recognition (ASR) and optical character recognition (OCR), especially on multilingual datasets where linguistic diversity makes WER unsuitable. CER gives no detail about the nature of the errors, so further work is needed to identify the main sources of error and to focus research effort. In some cases, instead of reporting the raw error rate, a normalized error rate is required: the number of mistakes divided by the sum of the number of edit operations (I + S + D) and C (the number of correct characters). This yields a CER value in the range 0-100%.
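A small worked example shows the difference between the raw and the normalized rate. The helper `error_rates` is hypothetical and simply applies the two definitions from the text to a set of counts; it assumes a non-empty reference.

```python
def error_rates(s, d, i, c):
    """Return (raw, normalized) error rates from edit-operation counts."""
    n = s + d + c           # reference length, N = S + D + C
    errors = s + d + i      # total number of edit operations
    raw = errors / n                    # can exceed 1 when insertions dominate
    normalized = errors / (errors + c)  # always within [0, 1]
    return raw, normalized


# One substitution and six insertions against a four-item reference.
print(error_rates(s=1, d=0, i=6, c=3))  # (1.75, 0.7)
```

With no insertions the two values coincide, for example `error_rates(1, 0, 0, 3)` gives 0.25 for both.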

Usage

The simplest use case is computing the edit distance between two strings.

pip install nlptutti

CER

import nlptutti as metrics

refs = "아키택트"
preds = "아키택쳐"
result = metrics.get_cer(refs, preds)
# prints: [cer, substitutions, deletions, insertions] -> [CER = 1 / 4, S = 1, D = 0, I = 0]

import nlptutti as metrics

refs = "제이 차 세계 대전은 인류 역사상 가장 많은 인명 피해와 재산 피해를 남긴 전쟁이었다."
preds = "제이차 세계대전은 인류 역사상 가장많은 인명피해와 재산피해를 남긴 전쟁이었다."
result = metrics.get_cer(refs, preds)
cer = result['cer']
substitutions = result['substitutions']
deletions = result['deletions']
insertions = result['insertions']
# prints: [cer, substitutions, deletions, insertions] -> [CER = 0 / 34, S = 0, D = 0, I = 0]

WER

import nlptutti as metrics

refs = "대한민국은 주권 국가 입니다."
preds = "대한민국은 주권국가 입니다."
result = metrics.get_wer(refs, preds)

wer = result['wer']
substitutions = result['substitutions']
deletions = result['deletions']
insertions = result['insertions']
# prints: [wer, substitutions, deletions, insertions] -> [WER =  2 / 4, S = 1, D = 1, I = 0]

CRR

import nlptutti as metrics

refs = "제이 차 세계 대전은 인류 역사상 가장 많은 인명 피해와 재산 피해를 남긴 전쟁이었다."
preds = "제이차 세계대전은 인류 역사상 가장많은 인명피해와 재산피해를 남긴 전쟁이었다."
result = metrics.get_cer(refs, preds)
crr = result['crr']
substitutions = result['substitutions']
deletions = result['deletions']
insertions = result['insertions']
# prints: [crr, substitutions, deletions, insertions] -> [CRR = 1 - (0 / 34), S = 0, D = 0, I = 0]

Preprocessing examples

Spacing

You may need to apply some preprocessing to the hypothesis or the reference text. Because word spacing in Korean sentences is ambiguous, whitespace is not counted in the CER calculation. Until the modern era, East Asian languages had no concept of word spacing. Korean orthography does define spacing rules, but a sentence can generally be understood in context even when they are not followed. The CER calculation therefore removes whitespace from its inputs. The whitespace characters are \t, \n, \r, \x0b, \x0c, and the space character.

ref = '또 다른 방법으로 데이터를 읽는 작업과 쓰는 작업을 분리합니다'
refs -> 또다른방법으로데이터를읽는작업과쓰는작업을분리합니다
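The whitespace removal described above can be reproduced with the standard library alone. This is a sketch under the assumption that every whitespace character is dropped before comparison; `remove_whitespace` is a hypothetical name, not the package's own function.

```python
def remove_whitespace(text):
    """Drop every whitespace character, including \\t, \\n, \\r, \\x0b, \\x0c and spaces."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces leaves only the non-whitespace characters.
    return "".join(text.split())


print(remove_whitespace("read  and\twrite\nare separated"))  # readandwriteareseparated
```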

Punctuation handling

Many STT recognizers do not output punctuation. Punctuation filtering of the inputs is controlled by a flag, rm_punctuation, which defaults to True. The punctuation characters are:

Punctuation filter -> '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
import nlptutti as metrics
refs = "또 다른 방법으로, 데이터를 읽는 작업과 쓰는 작업을 분리합니다!"
preds = "또! 다른 방법으로 데이터를 읽는 작업과 쓰는 작업을 분리합니다."
result = metrics.get_wer(refs, preds, rm_punctuation=True)

# prints: wer -> 0.0
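The punctuation set listed above is the same as Python's `string.punctuation`, so an equivalent filter can be sketched with `str.translate`. The helper name `strip_punctuation` is hypothetical, and the package may implement its filter differently.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(text):
    """Return text with all characters in string.punctuation removed."""
    return text.translate(_PUNCT_TABLE)


print(strip_punctuation("Hello, world!"))  # Hello world
```

Only ASCII punctuation is covered; full-width or other Unicode punctuation marks would pass through unchanged.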

References