Some dict/zh data is from github.com/fxsjy/jieba

Update (2023-11-16):

Added two new dictionary files derived from github.com/GuocaiL/nlp_corpus.

They were generated from nlp_corpus/open_ner_data/boson/boson.txt, open_ner_data/people_daily/people_daily_ner.txt, open_ner_data/tianchi_yiyao/train.txt, and open_ner_data/ResumeNER/dev.txt:

  1. tf_idf.txt

The first column is the term, the second column is the word frequency of that term, and the third column is its inverse document frequency (IDF).

  2. tf_idf_origin.txt

The original corpus text.
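
The three-column layout of tf_idf.txt can be read with a few lines of code. The following is a minimal sketch, not part of this repository: it assumes the columns are separated by whitespace (the separator is not stated here), and `TermStats`, `parse_tf_idf`, and the sample rows are hypothetical names and made-up values used only for illustration.

```python
from typing import Dict, Iterable, NamedTuple


class TermStats(NamedTuple):
    """One row of tf_idf.txt (hypothetical helper type)."""
    term: str
    freq: float
    idf: float


def parse_tf_idf(lines: Iterable[str]) -> Dict[str, TermStats]:
    """Parse rows of the assumed form '<term> <frequency> <idf>'.

    Assumption: columns are whitespace-separated. Rows that do not
    have exactly three columns, or whose numbers fail to parse,
    are skipped.
    """
    stats: Dict[str, TermStats] = {}
    for raw in lines:
        parts = raw.split()
        if len(parts) != 3:
            continue  # blank or malformed row
        term, freq, idf = parts
        try:
            stats[term] = TermStats(term, float(freq), float(idf))
        except ValueError:
            continue  # non-numeric frequency or IDF
    return stats


# Made-up sample rows, not taken from the real file.
sample = ["北京 120 3.25", "", "大学 45 4.10", "broken-row 7"]
table = parse_tf_idf(sample)
```

To read the real file, pass an open file handle instead of the sample list, for example `parse_tf_idf(open("tf_idf.txt", encoding="utf-8"))`, and adjust the split logic if the file turns out to use a different separator.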