gmq version

2026-04-21 10:27:39 +08:00
parent e051046f77
commit f5c4977851
12 changed files with 0 additions and 1886714 deletions


@@ -1,15 +0,0 @@
Some of the dict/zh data comes from [github.com/fxsjy/jieba](https://github.com/fxsjy/jieba).
Update (2023-11-16):
Added two new dictionary files derived from [github.com/GuocaiL/nlp_corpus](https://github.com/GuocaiL/nlp_corpus),
generated from `nlp_corpus/open_ner_data/boson/boson.txt`, `open_ner_data/people_daily/people_daily_ner.txt`, `open_ner_data/tianchi_yiyao/train.txt`, and `open_ner_data/ResumeNER/dev.txt`:
1. tf_idf.txt
Each line has three columns: the term, the term's word frequency, and the term's inverse document frequency.
2. tf_idf_origin.txt
The original corpus text.
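
The three-column layout described for tf_idf.txt can be read with a few lines of Python. This is a minimal sketch, not the project's own loader: it assumes the columns are whitespace-separated and that both numeric columns parse as floats, neither of which the README states. The function name `parse_tf_idf` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def parse_tf_idf(lines: Iterable[str]) -> Dict[str, Tuple[float, float]]:
    """Map each term to (word frequency, inverse document frequency).

    Assumption: columns are separated by whitespace. Lines that do not
    have exactly three columns, or whose numeric columns do not parse,
    are skipped rather than raising an error.
    """
    table: Dict[str, Tuple[float, float]] = {}
    for raw in lines:
        parts = raw.split()
        if len(parts) != 3:
            continue  # blank or malformed line
        term, freq, idf = parts
        try:
            table[term] = (float(freq), float(idf))
        except ValueError:
            continue  # non-numeric frequency or IDF column
    return table


# Hypothetical sample rows, not taken from the real tf_idf.txt.
sample = ["北京 1200 3.5", "", "malformed line", "大学 800 4.25"]
table = parse_tf_idf(sample)
```

To read the real file, pass an open file handle instead of the sample list, for example `parse_tf_idf(open("tf_idf.txt", encoding="utf-8"))`. If the file turns out to use a different delimiter, only the `split()` call needs to change.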


@@ -1 +0,0 @@
dict.txt was generated with an internal tool. Copyright 2017 ego authors. For commercial use or copying, please credit the source and copyright.

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -1,88 +0,0 @@
,
.
?
!
"
@
 
~
*
<
>
/
\
|
-
_
+
=
&
^
%
#
`
;
$
︿
哎呀
哎哟
俺们
按照
吧哒
罢了
本着
比方
比如
鄙人
彼此
别的
别说

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long