osanseviero/fasttext_nearest

Text Classification

osanseviero committed on Sep 23, 2021

Commit 07e68d7 • 1 Parent(s): bab71b5

Update README.md

Files changed (1)

README.md +2 -55

@@ -1,60 +1,7 @@
 ---
 tags:
-- feature-extraction
 library_name: generic
 ---
-# debate2vec
-Word-vectors created from a large corpus of competitive debate evidence, and data extraction / processing scripts
-#usage
-```
-import fasttext.util
-ft = fasttext.load_model('debate2vec.bin')
-ft.get_word_vector('dialectics')
-```
-# Download Link
-Github won't let me store large files in their repos.
-* [FastText Vectors Here](https://drive.google.com/file/d/1m-CwPcaIUun4qvg69Hx2gom9dMScuQwS/view?usp=sharing) (~260mb)
-# About
-Created from all publically available Cross Examination Competitive debate evidence posted by the community on [Open Evidence](https://openev.debatecoaches.org/) (From 2013-2020)
-Search through the original evidence by going to [debate.cards](http://debate.cards/)
-Stats about this corpus:
-* 222485 unique documents larger than 200 words (DebateSum plus some additional debate docs that weren't well-formed enough for inclusion into DebateSum)
-* 107555 unique words (showing up more than 10 times in the corpus)
-* 101 million total words
-Stats about debate2vec vectors:
-* 300 dimensions, minimum number of appearances of a word was 10, trained for 100 epochs with lr set to 0.10 using FastText
-* lowercased (will release cased)
-* No subword information
-The corpus includes the following topics
-* 2013-2014 Cuba/Mexico/Venezuela Economic Engagement
-* 2014-2015 Oceans
-* 2015-2016 Domestic Surveillance
-* 2016-2017 China
-* 2017-2018 Education
-* 2018-2019 Immigration
-* 2019-2020 Reducing Arms Sales
-Other topics that this word vector model will handle extremely well
-* Philosophy (Especially Left-Wing / Post-modernist)
-* Law
-* Government
-* Politics
-Initial release is of fasttext vectors without subword information. Future releases will include fine-tuned GPT-2 and other high end models as my GPU compute allows.
-# Screenshots
-![](https://github.com/Hellisotherpeople/debate2vec/blob/master/debate2vec.jpg)
-![](https://github.com/Hellisotherpeople/debate2vec/blob/master/debate2vec2.jpg)
-![](https://github.com/Hellisotherpeople/debate2vec/blob/master/debate2vec3.jpg)

 ---
 tags:
+- text-classification
 library_name: generic
 ---
+# Fasttext nearest neighbors