fix: load GSE data files from the gse/dict directory
15
gse/dict/README.md
Normal file
@@ -0,0 +1,15 @@
Some dict/zh data is from [github.com/fxsjy/jieba](https://github.com/fxsjy/jieba)

Update, 2023-11-16:

Added two new dict documents, which come from [github.com/GuocaiL/nlp_corpus](https://github.com/GuocaiL/nlp_corpus).

They were generated from `nlp_corpus/open_ner_data/boson/boson.txt`, `open_ner_data/people_daily/people_daily_ner.txt`, `open_ner_data/tianchi_yiyao/train.txt`, and `open_ner_data/ResumeNER/dev.txt`.

1. tf_idf.txt

The first column of this document is the term, the second column is the word frequency of that term, and the third column is the inverse document frequency of that term.

2. tf_idf_origin.txt

The original corpus text.
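
As an illustration of the three-column layout of `tf_idf.txt` described above, here is a minimal Python sketch of a reader. It is not part of gse. The function name `parse_tf_idf`, the assumption that columns are separated by whitespace, and the sample rows are all hypothetical and should be checked against the actual file.

```python
from typing import Dict, Iterable, Tuple


def parse_tf_idf(lines: Iterable[str]) -> Dict[str, Tuple[float, float]]:
    """Map each term to (word frequency, inverse document frequency)."""
    table: Dict[str, Tuple[float, float]] = {}
    for raw in lines:
        # Assumption: the three columns are separated by whitespace.
        parts = raw.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        term, freq, idf = parts
        try:
            table[term] = (float(freq), float(idf))
        except ValueError:
            continue  # skip rows whose numeric columns do not parse
    return table


# Hypothetical sample rows, not taken from the real dictionary file.
sample = ["北京 120 3.25", "", "分词 45 5.5", "broken-row"]
table = parse_tf_idf(sample)
```

Because the function accepts any iterable of strings, it can also be given a file handle opened with `encoding="utf-8"`.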