com.hankcs.hanlp.restful:hanlp-restful

HanLP: Han Language Processing


Keywords
dependency-parser, hanlp, named-entity-recognition, natural-language-processing, nlp, pos-tagging, semantic-parsing, text-classification
License
Apache-2.0

Documentation

HanLP: Han Language Processing

A multilingual Natural Language Processing toolkit for production environments, powered by the dual engines of PyTorch and TensorFlow 2.x, with the goal of bringing state-of-the-art NLP techniques into real-world use. HanLP features complete functionality, high accuracy, high efficiency, up-to-date corpora, a clean architecture, and customizability.

demo

Powered by the world's largest multilingual corpus, HanLP 2.1 supports 10 joint tasks and several single tasks across 130 languages, including Simplified Chinese, Traditional Chinese, English, Japanese, Russian, French, and German. HanLP has pretrained dozens of models on more than a dozen tasks, and its corpora and models are continuously iterated.

Function | RESTful | Multi-task | Single-task | Model | Annotation standards
Tokenization | tutorial | tutorial | tutorial | tok | coarse, fine
Part-of-speech tagging | tutorial | tutorial | tutorial | pos | CTB, PKU, 863
Named entity recognition | tutorial | tutorial | tutorial | ner | PKU, MSRA, OntoNotes
Dependency parsing | tutorial | tutorial | tutorial | dep | SD, UD, PMT
Constituency parsing | tutorial | tutorial | tutorial | con | Chinese Tree Bank
Semantic dependency parsing | tutorial | tutorial | tutorial | sdp | CSDP
Semantic role labeling | tutorial | tutorial | tutorial | srl | Chinese Proposition Bank
Abstract meaning representation | tutorial | N/A | tutorial | amr | CAMR
Coreference resolution | tutorial | N/A | N/A | N/A | OntoNotes
Semantic textual similarity | tutorial | N/A | tutorial | sts | N/A
Text style transfer | tutorial | N/A | N/A | N/A | N/A
Keyphrase extraction | tutorial | N/A | N/A | N/A | N/A
Extractive summarization | tutorial | N/A | N/A | N/A | N/A
Abstractive summarization | tutorial | N/A | N/A | N/A | N/A
Grammatical error correction | tutorial | N/A | N/A | N/A | N/A
Text classification | tutorial | N/A | N/A | N/A | N/A
Sentiment analysis | tutorial | N/A | N/A | N/A | [-1,+1]
Language identification | tutorial | N/A | tutorial | N/A | ISO 639-1 codes

Tailored to fit: HanLP provides two APIs, RESTful and native, for lightweight and massive-scale scenarios respectively. Regardless of API or programming language, HanLP keeps its interfaces semantically consistent and its code open source. If you use HanLP in your research, please cite our EMNLP paper.

Lightweight RESTful API

Only a few KB in size, suited for agile development, mobile apps, and similar scenarios. Simple and easy to use: no GPU or environment setup is required, and installation takes seconds. Its corpora are larger, its models bigger, and its accuracy higher; strongly recommended. Server GPU compute is limited and anonymous users receive a small quota, so applying for a free public-service API key (auth) is recommended.

Python

pip install hanlp_restful

Create a client, filling in the server URL and your auth key:

from hanlp_restful import HanLPClient
HanLP = HanLPClient('https://www.hanlp.com/api', auth=None, language='zh') # leave auth as None for anonymous access; zh: Chinese, mul: multilingual

Golang

Install with go get -u github.com/hankcs/gohanlp@main, then create a client, filling in the server URL and your auth key:

HanLP := hanlp.HanLPClient(hanlp.WithAuth(""), hanlp.WithLanguage("zh")) // empty auth for anonymous access; zh: Chinese, mul: multilingual

Java

Add the dependency to pom.xml:

<dependency>
    <groupId>com.hankcs.hanlp.restful</groupId>
    <artifactId>hanlp-restful</artifactId>
    <version>0.0.12</version>
</dependency>

Create a client, filling in the server URL and your auth key:

HanLPClient HanLP = new HanLPClient("https://www.hanlp.com/api", null, "zh"); // null auth for anonymous access; zh: Chinese, mul: multilingual

Quick start

Whatever your programming language, call the parse interface with a document to get HanLP's accurate analysis:

HanLP.parse("2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。阿婆主来到北京立方庭参观自然语义科技公司。")

For more features, including semantic textual similarity, style transfer, and coreference resolution, see the documentation and test cases.

Massive-scale native API

Built on deep-learning frameworks such as PyTorch and TensorFlow, suited for professional NLP engineers, researchers, and local massive-data scenarios. Requires Python 3.6 to 3.10; Windows is supported, *nix recommended. Runs on CPU; GPU/TPU recommended. Install the PyTorch flavor:

pip install hanlp
  • Every HanLP release passes unit tests on Linux, macOS, and Windows with Python 3.6 to 3.10, so there are no installation issues.

HanLP's released models come in two kinds, multi-task and single-task: multi-task models are faster and use less memory, while single-task models are more accurate and more flexible.

Multi-task models

HanLP's workflow is to load a model and then call it as a function, for example the joint multi-task model below:

import hanlp
HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH) # trained on the world's largest Chinese corpus
HanLP(['2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', '阿婆主来到北京立方庭参观自然语义科技公司。'])

The native API takes sentences as input, so text must first be split with the multilingual sentence-splitting model or a rule-based sentence-splitting function. The RESTful and native APIs share an identical semantic design, so users can swap them seamlessly. The concise interface also supports flexible parameters; common tricks include:

  • Flexible tasks scheduling: the fewer the tasks, the faster the run; see the tutorial for details. In memory-constrained settings, users can also delete unneeded tasks to slim the model down.
  • An efficient trie-based custom dictionary, plus three kinds of rules: force, merge, and correct; see the demo and documentation. The effects of the rule system are applied seamlessly to the downstream statistical models, enabling fast adaptation to new domains.
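The merge behavior of a custom dictionary can be illustrated with a toy longest-match merge over an existing tokenization. This is a self-contained sketch for illustration only, not HanLP's trie implementation; the function name merge_dict is made up here:

```python
def merge_dict(tokens, dictionary):
    """Greedily merge adjacent tokens whose concatenation is a dictionary entry.

    A toy illustration of the 'merge' rule of a custom dictionary:
    existing token boundaries are only combined, never split.
    """
    out, i = [], 0
    while i < len(tokens):
        # Try the longest span starting at i whose joined text is in the dictionary
        for j in range(len(tokens), i, -1):
            if j - i > 1 and ''.join(tokens[i:j]) in dictionary:
                out.append(''.join(tokens[i:j]))
                i = j
                break
        else:  # no dictionary entry matched: keep the original token
            out.append(tokens[i])
            i += 1
    return out

print(merge_dict(['自然', '语义', '科技', '公司'], {'自然语义科技公司'}))
# ['自然语义科技公司']
```

Because the merge happens after tokenization, statistical token boundaries are respected unless the dictionary says otherwise, which is the spirit of the merge rule as opposed to force.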

Single-task models

According to our latest research, multi-task learning wins on speed and memory, but its accuracy often falls behind single-task models. HanLP therefore pretrains many single-task models and offers an elegant pipeline pattern to assemble them:

import hanlp
HanLP = hanlp.pipeline() \
    .append(hanlp.utils.rules.split_sentence, output_key='sentences') \
    .append(hanlp.load('FINE_ELECTRA_SMALL_ZH'), output_key='tok') \
    .append(hanlp.load('CTB9_POS_ELECTRA_SMALL'), output_key='pos') \
    .append(hanlp.load('MSRA_NER_ELECTRA_SMALL_ZH'), output_key='ner', input_key='tok') \
    .append(hanlp.load('CTB9_DEP_ELECTRA_SMALL', conll=0), output_key='dep', input_key='tok')\
    .append(hanlp.load('CTB9_CON_ELECTRA_SMALL'), output_key='con', input_key='tok')
HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。阿婆主来到北京立方庭参观自然语义科技公司。')

曎倚功胜请参考demo和文档了解曎倚暡型䞎甚法。

Output format

Regardless of the API, programming language, or natural language, HanLP's output is unified as json, a dict-compatible Document:

{
  "tok/fine": [
    ["2021幎", "HanLPv2.1", "䞺", "生产", "环境", "垊来", "次", "䞖代", "最", "先进", "的", "倚", "语种", "NLP", "技术", "。"],
    ["阿婆䞻", "来到", "北京", "立方庭", "参观", "自然", "语义", "科技", "公叞", "。"]
  ],
  "tok/coarse": [
    ["2021幎", "HanLPv2.1", "䞺", "生产", "环境", "垊来", "次䞖代", "最", "先进", "的", "倚语种", "NLP", "技术", "。"],
    ["阿婆䞻", "来到", "北京立方庭", "参观", "自然语义科技公叞", "。"]
  ],
  "pos/ctb": [
    ["NT", "NR", "P", "NN", "NN", "VV", "JJ", "NN", "AD", "JJ", "DEG", "CD", "NN", "NR", "NN", "PU"],
    ["NN", "VV", "NR", "NR", "VV", "NN", "NN", "NN", "NN", "PU"]
  ],
  "pos/pku": [
    ["t", "nx", "p", "vn", "n", "v", "b", "n", "d", "a", "u", "a", "n", "nx", "n", "w"],
    ["n", "v", "ns", "ns", "v", "n", "n", "n", "n", "w"]
  ],
  "pos/863": [
    ["nt", "w", "p", "v", "n", "v", "a", "nt", "d", "a", "u", "a", "n", "ws", "n", "w"],
    ["n", "v", "ns", "n", "v", "n", "n", "n", "n", "w"]
  ],
  "ner/pku": [
    [],
    [["北京立方庭", "ns", 2, 4], ["自然语义科技公叞", "nt", 5, 9]]
  ],
  "ner/msra": [
    [["2021幎", "DATE", 0, 1], ["HanLPv2.1", "ORGANIZATION", 1, 2]],
    [["北京", "LOCATION", 2, 3], ["立方庭", "LOCATION", 3, 4], ["自然语义科技公叞", "ORGANIZATION", 5, 9]]
  ],
  "ner/ontonotes": [
    [["2021幎", "DATE", 0, 1], ["HanLPv2.1", "ORG", 1, 2]],
    [["北京立方庭", "FAC", 2, 4], ["自然语义科技公叞", "ORG", 5, 9]]
  ],
  "srl": [
    [[["2021幎", "ARGM-TMP", 0, 1], ["HanLPv2.1", "ARG0", 1, 2], ["䞺生产环境", "ARG2", 2, 5], ["垊来", "PRED", 5, 6], ["次䞖代最先进的倚语种NLP技术", "ARG1", 6, 15]], [["最", "ARGM-ADV", 8, 9], ["先进", "PRED", 9, 10], ["技术", "ARG0", 14, 15]]],
    [[["阿婆䞻", "ARG0", 0, 1], ["来到", "PRED", 1, 2], ["北京立方庭", "ARG1", 2, 4]], [["阿婆䞻", "ARG0", 0, 1], ["参观", "PRED", 4, 5], ["自然语义科技公叞", "ARG1", 5, 9]]]
  ],
  "dep": [
    [[6, "tmod"], [6, "nsubj"], [6, "prep"], [5, "nn"], [3, "pobj"], [0, "root"], [8, "amod"], [15, "nn"], [10, "advmod"], [15, "rcmod"], [10, "assm"], [13, "nummod"], [15, "nn"], [15, "nn"], [6, "dobj"], [6, "punct"]],
    [[2, "nsubj"], [0, "root"], [4, "nn"], [2, "dobj"], [2, "conj"], [9, "nn"], [9, "nn"], [9, "nn"], [5, "dobj"], [2, "punct"]]
  ],
  "sdp": [
    [[[6, "Time"]], [[6, "Exp"]], [[5, "mPrep"]], [[5, "Desc"]], [[6, "Datv"]], [[13, "dDesc"]], [[0, "Root"], [8, "Desc"], [13, "Desc"]], [[15, "Time"]], [[10, "mDegr"]], [[15, "Desc"]], [[10, "mAux"]], [[8, "Quan"], [13, "Quan"]], [[15, "Desc"]], [[15, "Nmod"]], [[6, "Pat"]], [[6, "mPunc"]]],
    [[[2, "Agt"], [5, "Agt"]], [[0, "Root"]], [[4, "Loc"]], [[2, "Lfin"]], [[2, "ePurp"]], [[8, "Nmod"]], [[9, "Nmod"]], [[9, "Nmod"]], [[5, "Datv"]], [[5, "mPunc"]]]
  ],
  "con": [
    ["TOP", [["IP", [["NP", [["NT", ["2021幎"]]]], ["NP", [["NR", ["HanLPv2.1"]]]], ["VP", [["PP", [["P", ["䞺"]], ["NP", [["NN", ["生产"]], ["NN", ["环境"]]]]]], ["VP", [["VV", ["垊来"]], ["NP", [["ADJP", [["NP", [["ADJP", [["JJ", ["次"]]]], ["NP", [["NN", ["䞖代"]]]]]], ["ADVP", [["AD", ["最"]]]], ["VP", [["JJ", ["先进"]]]]]], ["DEG", ["的"]], ["NP", [["QP", [["CD", ["倚"]]]], ["NP", [["NN", ["语种"]]]]]], ["NP", [["NR", ["NLP"]], ["NN", ["技术"]]]]]]]]]], ["PU", ["。"]]]]]],
    ["TOP", [["IP", [["NP", [["NN", ["阿婆䞻"]]]], ["VP", [["VP", [["VV", ["来到"]], ["NP", [["NR", ["北京"]], ["NR", ["立方庭"]]]]]], ["VP", [["VV", ["参观"]], ["NP", [["NN", ["自然"]], ["NN", ["语义"]], ["NN", ["科技"]], ["NN", ["公叞"]]]]]]]], ["PU", ["。"]]]]]]
  ]
}
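Since the Document is a plain dict, its fields compose directly. For example, the last two numbers of each ner/msra span are token offsets into tok/fine; the following self-contained sketch uses them to recover each entity (the helper entities is illustrative, not part of HanLP's API):

```python
# A Document-style dict, abridged from the output above
doc = {
    'tok/fine': [
        ['阿婆主', '来到', '北京', '立方庭', '参观', '自然', '语义', '科技', '公司', '。'],
    ],
    'ner/msra': [
        [['北京', 'LOCATION', 2, 3], ['立方庭', 'LOCATION', 3, 4],
         ['自然语义科技公司', 'ORGANIZATION', 5, 9]],
    ],
}

def entities(doc, tok_key='tok/fine', ner_key='ner/msra'):
    """Pair each NER span with the tokens it covers, sentence by sentence."""
    result = []
    for tokens, spans in zip(doc[tok_key], doc[ner_key]):
        # each span is [surface, label, begin, end) in token offsets
        result.append([(''.join(tokens[b:e]), label) for _, label, b, e in spans])
    return result

print(entities(doc))
# [[('北京', 'LOCATION'), ('立方庭', 'LOCATION'), ('自然语义科技公司', 'ORGANIZATION')]]
```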

In particular, the Python RESTful and native APIs support visualization based on monospaced fonts, rendering linguistic structures right inside the console:

HanLP(['2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', '阿婆主来到北京立方庭参观自然语义科技公司。']).pretty_print()

Dep Tree    	Token    	Relati	PoS	Tok      	NER Type        	Tok      	SRL PA1     	Tok      	SRL PA2     	Tok      	PoS    3       4       5       6       7       8       9 
────────────	─────────	──────	───	─────────	────────────────	─────────	────────────	─────────	────────────	─────────	─────────────────────────────────────────────────────────
 ┌─────────►	2021年    	tmod  	NT 	2021年    	───►DATE        	2021年    	───►ARGM-TMP	2021年    	            	2021年    	NT ───────────────────────────────────────────►NP ───┐   
 │┌────────►	HanLPv2.1	nsubj 	NR 	HanLPv2.1	───►ORGANIZATION	HanLPv2.1	───►ARG0    	HanLPv2.1	            	HanLPv2.1	NR ───────────────────────────────────────────►NP─────   
 ││┌─►┌─────	为        	prep  	P  	为        	                	为        	◄─┐         	为        	            	为        	P ───────────┐                                       │   
 │││  │  ┌─►	生产       	nn    	NN 	生产       	                	生产       	  ├►ARG2    	生产       	            	生产       	NN ──┐       ├────────────────────────►PP ───┐       │   
 │││  └─►└──	环境       	pobj  	NN 	环境       	                	环境       	◄─┘         	环境       	            	环境       	NN ──┎►NP ───┘                               │       │   
┌┌┎┎────────	带来       	root  	VV 	带来       	                	带来       	╟──►PRED    	带来       	            	带来       	VV ──────────────────────────────────┐       │       │   
││       ┌─►	次        	amod  	JJ 	次        	                	次        	◄─┐         	次        	            	次        	JJ ───►ADJP──┐                       │       ├►VP─────   
││  ┌───►└──	世代       	nn    	NN 	世代       	                	世代       	  │         	世代       	            	世代       	NN ───►NP ───┎►NP ───┐               │       │       │   
││  │    ┌─►	最        	advmod	AD 	最        	                	最        	  │         	最        	───►ARGM-ADV	最        	AD ───────────►ADVP──┌►ADJP──┐       ├►VP ───┘       ├►IP
││  │┌──►├──	先进       	rcmod 	JJ 	先进       	                	先进       	  │         	先进       	╟──►PRED    	先进       	JJ ───────────►VP ───┘       │       │               │   
││  ││   └─►	的        	assm  	DEG	的        	                	的        	  ├►ARG1    	的        	            	的        	DEG───────────────────────────       │               │   
││  ││   ┌─►	多        	nummod	CD 	多        	                	多        	  │         	多        	            	多        	CD ───►QP ───┐               ├►NP ───┘               │   
││  ││┌─►└──	语种       	nn    	NN 	语种       	                	语种       	  │         	语种       	            	语种       	NN ───►NP ───┎────────►NP─────                       │   
││  │││  ┌─►	NLP      	nn    	NR 	NLP      	                	NLP      	  │         	NLP      	            	NLP      	NR ──┐                       │                       │   
│└─►└┎┎──┎──	技术       	dobj  	NN 	技术       	                	技术       	◄─┘         	技术       	───►ARG0    	技术       	NN ──┎────────────────►NP ───┘                       │   
└──────────►	。        	punct 	PU 	。        	                	。        	            	。        	            	。        	PU ──────────────────────────────────────────────────┘   

Dep Tree    	Tok	Relat	Po	Tok	NER Type        	Tok	SRL PA1 	Tok	SRL PA2 	Tok	Po    3       4       5       6 
────────────	───	─────	──	───	────────────────	───	────────	───	────────	───	────────────────────────────────
         ┌─►	阿婆主	nsubj	NN	阿婆主	                	阿婆主	───►ARG0	阿婆主	───►ARG0	阿婆主	NN───────────────────►NP ───┐   
┌┬────┬──┎──	来到 	root 	VV	来到 	                	来到 	╟──►PRED	来到 	        	来到 	VV──────────┐               │   
││    │  ┌─►	北京 	nn   	NR	北京 	───►LOCATION    	北京 	◄─┐     	北京 	        	北京 	NR──┐       ├►VP ───┐       │   
││    └─►└──	立方庭	dobj 	NR	立方庭	───►LOCATION    	立方庭	◄─┎►ARG1	立方庭	        	立方庭	NR──┎►NP ───┘       │       │   
│└─►┌───────	参观 	conj 	VV	参观 	                	参观 	        	参观 	╟──►PRED	参观 	VV──────────┐       ├►VP─────   
│   │  ┌───►	自然 	nn   	NN	自然 	◄─┐             	自然 	        	自然 	◄─┐     	自然 	NN──┐       │       │       ├►IP
│   │  │┌──►	语义 	nn   	NN	语义 	  │             	语义 	        	语义 	  │     	语义 	NN  │       ├►VP ───┘       │   
│   │  ││┌─►	科技 	nn   	NN	科技 	  ├►ORGANIZATION	科技 	        	科技 	  ├►ARG1	科技 	NN  ├►NP ───┘               │   
│   └─►└┎┎──	公司 	dobj 	NN	公司 	◄─┘             	公司 	        	公司 	◄─┘     	公司 	NN──┘                       │   
└──────────►	。  	punct	PU	。  	                	。  	        	。  	        	。  	PU──────────────────────────┘   

For the meaning of each tagset, see the linguistic annotation guidelines and the format specification. We purchased, annotated, or adopted the largest and most diverse corpora in the world for joint multilingual multi-task learning, so HanLP's tagsets also have the broadest coverage.

Train your own domain models

Writing a deep-learning model is not hard at all; what is hard is reproducing high accuracy. The code below shows how to train, in 6 minutes on the sighan2005 PKU corpus, a Chinese word-segmentation model that surpasses the academic state of the art:

from hanlp.common.dataset import SortingSamplerBuilder
from hanlp.components.tokenizers.transformer import TransformerTaggingTokenizer
from hanlp.datasets.tokenization.sighan2005.pku import SIGHAN2005_PKU_TRAIN_ALL, SIGHAN2005_PKU_TEST

tokenizer = TransformerTaggingTokenizer()
save_dir = 'data/model/cws/sighan2005_pku_bert_base_96.73'
tokenizer.fit(
    SIGHAN2005_PKU_TRAIN_ALL,
    SIGHAN2005_PKU_TEST,  # Conventionally, no devset is used. See Tian et al. (2020).
    save_dir,
    'bert-base-chinese',
    max_seq_len=300,
    char_level=True,
    hard_constraint=True,
    sampler_builder=SortingSamplerBuilder(batch_size=32),
    epochs=3,
    adam_epsilon=1e-6,
    warmup_steps=0.1,
    weight_decay=0.01,
    word_dropout=0.1,
    seed=1660853059,
)
tokenizer.evaluate(SIGHAN2005_PKU_TEST, save_dir)

其䞭由于指定了随机数种子结果䞀定是96.73。䞍同于那些虚假宣䌠的孊术论文或商䞚项目HanLP保证所有结果可倍现。劂果䜠有任䜕莚疑我们将圓䜜最高䌘先级的臎呜性bug第䞀时闎排查问题。

See the demo for more training scripts.

Performance

lang | corpora | model | tok fine | tok coarse | pos ctb | pos pku | pos 863 | pos ud | ner pku | ner msra | ner ontonotes | dep | con | srl | sdp SemEval16 | sdp DM | sdp PAS | sdp PSD | lem | fea | amr
mul | UD2.7 + OntoNotes5 | small | 98.62 | - | - | - | - | 93.23 | - | - | 74.42 | 79.10 | 76.85 | 70.63 | - | 91.19 | 93.67 | 85.34 | 87.71 | 84.51 | -
mul | UD2.7 + OntoNotes5 | base | 98.97 | - | - | - | - | 90.32 | - | - | 80.32 | 78.74 | 71.23 | 73.63 | - | 92.60 | 96.04 | 81.19 | 85.08 | 82.13 | -
zh | open | small | 97.25 | - | 96.66 | - | - | - | - | - | 95.00 | 84.57 | 87.62 | 73.40 | 84.57 | - | - | - | - | - | -
zh | open | base | 97.50 | - | 97.07 | - | - | - | - | - | 96.04 | 87.11 | 89.84 | 77.78 | 87.11 | - | - | - | - | - | -
zh | close | small | 96.70 | 95.93 | 96.87 | 97.56 | 95.05 | - | 96.22 | 95.74 | 76.79 | 84.44 | 88.13 | 75.81 | 74.28 | - | - | - | - | - | -
zh | close | base | 97.52 | 96.44 | 96.99 | 97.59 | 95.29 | - | 96.48 | 95.72 | 77.77 | 85.29 | 88.57 | 76.52 | 73.76 | - | - | - | - | - | -
zh | close | ernie | 96.95 | 97.29 | 96.76 | 97.64 | 95.22 | - | 97.31 | 96.47 | 77.95 | 85.67 | 89.17 | 78.51 | 74.10 | - | - | - | - | - | -
  • According to our latest research, single-task learning often outperforms multi-task learning. If accuracy matters more to you than speed, single-task models are recommended.

HanLP's data preprocessing and split ratios do not necessarily follow popular practice. For example, HanLP uses the complete MSRA named-entity-recognition corpus rather than the truncated version most people use; HanLP adopts the Stanford Dependencies standard, whose grammatical coverage is broader than the Zhang and Clark (2008) standard that academia inherits; and HanLP proposes an even split of CTB instead of the academic split, which is uneven and omits 51 gold files. HanLP open-sources the full set of corpus preprocessing scripts together with the corresponding corpora, in an effort to make Chinese NLP transparent.
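The idea of an even split can be sketched as round-robin assignment of files, so that every split samples the whole corpus uniformly. This is a toy illustration under assumed 8:1:1 ratios; the actual CTB split is defined by HanLP's preprocessing scripts, and the chtb file names below are merely placeholders:

```python
def even_split(files, ratios=(8, 1, 1)):
    """Assign files to (train, dev, test) round-robin, so each split draws
    uniformly from the whole corpus instead of from contiguous chunks."""
    total = sum(ratios)
    train, dev, test = [], [], []
    for i, f in enumerate(files):
        r = i % total
        if r < ratios[0]:
            train.append(f)
        elif r < ratios[0] + ratios[1]:
            dev.append(f)
        else:
            test.append(f)
    return train, dev, test

train, dev, test = even_split([f'chtb_{i:04d}' for i in range(1, 21)])
print(len(train), len(dev), len(test))  # 16 2 2
```

A contiguous split would instead take the first 80% of files as train, which biases each split toward whatever genres happen to be adjacent in the corpus.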

In short, HanLP only does what we believe is right and advanced, which is not necessarily what is popular or authoritative.

Citation

If you use HanLP in your research, please cite it as follows:

@inproceedings{he-choi-2021-stem,
    title = "The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders",
    author = "He, Han and Choi, Jinho D.",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.451",
    pages = "5555--5577",
    abstract = "Multi-task learning with transformer encoders (MTL) has emerged as a powerful technique to improve performance on closely-related tasks for both accuracy and efficiency while a question still remains whether or not it would perform as well on tasks that are distinct in nature. We first present MTL results on five NLP tasks, POS, NER, DEP, CON, and SRL, and depict its deficiency over single-task learning. We then conduct an extensive pruning analysis to show that a certain set of attention heads get claimed by most tasks during MTL, who interfere with one another to fine-tune those heads for their own objectives. Based on this finding, we propose the Stem Cell Hypothesis to reveal the existence of attention heads naturally talented for many tasks that cannot be jointly trained to create adequate embeddings for all of those tasks. Finally, we design novel parameter-free probes to justify our hypothesis and demonstrate how attention heads are transformed across the five tasks during MTL through label analysis.",
}

License

Source code

HanLP's source code is licensed under the Apache License 2.0 and may be used commercially free of charge. Please include a link to HanLP and its license in your product documentation. HanLP is protected by copyright law; infringement will be prosecuted.

Natural Semantics (Qingdao) Technology Co., Ltd.

Since v1.7, HanLP has operated independently, with Natural Semantics (Qingdao) Technology Co., Ltd. as the project owner leading the development of subsequent versions and holding their copyright.

大快搜索

HanLP v1.3 to v1.65 was developed under the leadership of 大快搜索, remains fully open source, and 大快搜索 holds the relevant copyright.

上海林原公司 (Shanghai Linyuan)

In its early days, HanLP received strong support from 上海林原公司, which holds the copyright of v1.28 and earlier versions; those versions were also released on the company's website.

Pretrained models

The licensing of machine-learning models has no settled legal status, but in the spirit of respecting the original licenses of the open-source corpora, unless otherwise stated, HanLP's multilingual models are licensed under CC BY-NC-SA 4.0, and its Chinese models are licensed for research and teaching use only.

References

https://hanlp.hankcs.com/docs/references.html