Some dict/zh data is from github.com/fxsjy/jieba
Update 2023-11-16:
Added two new dictionary files, derived from github.com/GuocaiL/nlp_corpus.
They were generated from nlp_corpus/open_ner_data/boson/boson.txt, open_ner_data/people_daily/people_daily_ner.txt, open_ner_data/tianchi_yiyao/train.txt, and open_ner_data/ResumeNER/dev.txt.
- tf_idf.txt
Each line has three columns: the first is the term, the second is the term's word frequency, and the third is the term's inverse document frequency.
- tf_idf_origin.txt
The original corpus text.
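
The three-column layout of tf_idf.txt can be read with a short Python sketch like the one below. It is only an illustration: it assumes the file is UTF-8 encoded, that columns are separated by whitespace, and that terms contain no spaces, none of which is confirmed here. The function names `parse_tf_idf_lines` and `load_tf_idf` are hypothetical and not part of this repository.

```python
from typing import Dict, Iterable, Tuple


def parse_tf_idf_lines(lines: Iterable[str]) -> Dict[str, Tuple[float, float]]:
    """Map each term to (word frequency, inverse document frequency).

    Assumption: columns are whitespace-separated and terms contain no spaces.
    """
    table: Dict[str, Tuple[float, float]] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        term, freq, idf = parts
        try:
            table[term] = (float(freq), float(idf))
        except ValueError:
            continue  # skip lines whose numeric columns do not parse
    return table


def load_tf_idf(path: str) -> Dict[str, Tuple[float, float]]:
    """Read a tf_idf.txt-style file (assumed UTF-8) into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return parse_tf_idf_lines(handle)
```

For example, `load_tf_idf("tf_idf.txt")` would return a dictionary keyed by term. If the actual file uses tabs or another delimiter, or stores terms that contain spaces, the `line.split()` call would need adjusting.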