RapMinerz committed on
Commit
9473013
1 Parent(s): fa023db

update model

.DS_Store CHANGED
Binary files a/.DS_Store and b/.DS_Store differ
 
README.md CHANGED
@@ -4,6 +4,7 @@ language:
4
  tags:
5
  - music
6
  - rap
 
7
  - word2vec
8
  library_name: gensim
9
  ---
@@ -11,11 +12,11 @@ library_name: gensim
11
 
12
  ## Overview
13
 
14
- Word2Bezbar are Word2Vec models trained on a 323MB dataset of cleaned French rap lyrics sourced from Genius. The model captures the semantic relationships between words in the context of French rap, providing a useful tool for academic and research purposes in natural language processing (NLP) and linguistic studies.
15
 
16
  ## Model Details
17
 
18
- Size is __small__
19
 
20
  | Parameter | Value |
21
  |----------------|--------------|
@@ -24,7 +25,7 @@ Size is __small__
24
  | Epochs | 10 |
25
  | Algorithm | CBOW |
26
 
27
- ## Requirements
28
 
29
  | Requirement | Version |
30
  |----------------|--------------|
@@ -48,7 +49,7 @@ Size is __small__
48
  3. **Navigate to the Model Directory**:
49
 
50
  ```bash
51
- cd Word2Bezbar-large
52
  ```
53
 
54
  ## Loading the Model
@@ -59,16 +60,67 @@ To load the Word2Bezbar Word2Vec model, use the following Python code:
59
  import gensim
60
 
61
  # Load the Word2Vec model
62
- model = gensim.models.Word2Vec.load("Word2Bezbar-small/word2vec.model")
63
  ```
64
 
65
  ## Using the Model
66
 
67
- Once the model is loaded, you can use it to find words similar to a given word. For example, to find words similar to "kichta":
68
 
69
  ```python
70
- similar_words = model.wv.most_similar("kichta")
71
- print(similar_words)
 
72
  ```
73
 
74
  ## Purpose and Disclaimer
@@ -77,4 +129,4 @@ This model is designed for academic and research purposes only. It is not intend
77
 
78
  ## Contact
79
 
80
- For any questions or issues, please contact the repository owner at [email protected].
 
4
  tags:
5
  - music
6
  - rap
7
+ - lyrics
8
  - word2vec
9
  library_name: gensim
10
  ---
 
12
 
13
  ## Overview
14
 
15
+ Word2Bezbar are Word2Vec models trained on a 323MB dataset of cleaned French rap lyrics sourced from Genius. The models capture the semantic relationships between words in the context of French rap, providing a useful tool for academic and research purposes in natural language processing and linguistic studies related to music writing.
16
 
17
  ## Model Details
18
 
19
+ The size of this model is __small__.
20
 
21
  | Parameter | Value |
22
  |----------------|--------------|
 
25
  | Epochs | 10 |
26
  | Algorithm | CBOW |
27
 
28
+ ## Versions
29
 
30
  | Requirement | Version |
31
  |----------------|--------------|
 
49
  3. **Navigate to the Model Directory**:
50
 
51
  ```bash
52
+ cd Word2Bezbar-small
53
  ```
54
 
55
  ## Loading the Model
 
60
  import gensim
61
 
62
  # Load the Word2Vec model
63
+ model = gensim.models.Word2Vec.load("word2vec.model")
64
  ```
65
 
66
  ## Using the Model
67
 
68
+ Once the model is loaded, you can use it as shown below.
69
+
70
+ To get the words most similar to a given word:
71
+
72
+ ```python
73
+ model.wv.most_similar("bendo")
74
+ [('binks', 0.8920747637748718),
75
+ ('bando', 0.8460732698440552),
76
+ ('hood', 0.8299438953399658),
77
+ ('tieks', 0.8264378309249878),
78
+ ('hall', 0.817583441734314),
79
+ ('secteur', 0.8145656585693359),
80
+ ('barrio', 0.809047281742096),
81
+ ('block', 0.793493390083313),
82
+ ('bâtiment', 0.7826434969902039),
83
+ ('bloc', 0.7753982543945312)]
84
+
85
+ model.wv.most_similar("kichta")
86
+ [('liasse', 0.878665566444397),
87
+ ('sse-lia', 0.8552991151809692),
88
+ ('kishta', 0.8535938262939453),
89
+ ('kich', 0.7646669149398804),
90
+ ('skalape', 0.7576569318771362),
91
+ ('moula', 0.7466527223587036),
92
+ ('valise', 0.7429592609405518),
93
+ ('sacoche', 0.7324921488761902),
94
+ ('mallette', 0.7247079014778137),
95
+ ('re-pai', 0.7060815095901489)]
96
+ ```
97
+
98
+ To find the word that doesn't match in a list of words:
99
+
100
+ ```python
101
+ model.wv.doesnt_match(["racli","gow","gadji","fimbi","boug"])
102
+ 'boug'
103
+
104
+ model.wv.doesnt_match(["Zidane","Mbappé","Ronaldo","Messi","Jordan"])
105
+ 'Jordan'
106
+ ```
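As a rough illustration of the idea behind `doesnt_match`, the sketch below picks the word whose unit vector is least aligned with the mean of the group. This is believed to mirror what gensim does internally, but treat it as an assumption rather than the library's exact code. The function names and the two-dimensional toy vectors are hypothetical and are not taken from the model.

```python
import math

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def odd_one_out(vectors):
    """Return the key whose unit vector is least aligned with the group mean.

    `vectors` maps each word to its embedding (hypothetical toy data here).
    """
    units = {word: unit(vec) for word, vec in vectors.items()}
    dim = len(next(iter(units.values())))
    mean = unit([sum(u[i] for u in units.values()) / len(units) for i in range(dim)])
    # The dot product with the unit mean is the cosine similarity to the group centre.
    scores = {word: sum(a * b for a, b in zip(u, mean)) for word, u in units.items()}
    return min(scores, key=scores.get)

# Three vectors pointing roughly the same way, and one pointing elsewhere.
toy = {
    "a": [1.0, 0.1],
    "b": [0.9, 0.2],
    "c": [1.0, 0.0],
    "d": [0.0, 1.0],
}
print(odd_one_out(toy))  # prints: d
```

With real embeddings the same logic would explain why 'Jordan' stands apart from a list of footballers: its vector sits furthest from the group's average direction.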
107
+
108
+ To find the similarity between two words:
109
+
110
+ ```python
111
+ model.wv.similarity("kichta", "moula")
112
+ 0.7466528
113
+
114
+ model.wv.similarity("bonheur", "moula")
115
+ 0.16985293
116
+ ```
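The score returned by `similarity` is a cosine similarity, which is consistent with the output above: the value for "kichta" and "moula" matches the score listed for 'moula' in the `most_similar("kichta")` result. A minimal pure-Python sketch of that measure follows. The function name and the toy vectors are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal: 0.0
```

Values close to 1 indicate words used in similar contexts, while values near 0 indicate little contextual overlap.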
117
+
118
+ Or even get the vector representation of a word:
119
 
120
  ```python
121
+ model.wv['ekip']
122
+ array([ 1.4757039e-01, ... 1.1260221e+00],
123
+ dtype=float32)
124
  ```
125
 
126
  ## Purpose and Disclaimer
 
129
 
130
  ## Contact
131
 
132
+ For any questions or issues, please contact the repository owner, __RapMinerz__, at [email protected].
Word2Bezbar-small.model → word2vec.model RENAMED
File without changes
Word2Bezbar-small.model.syn1neg.npy → word2vec.model.syn1neg.npy RENAMED
File without changes
Word2Bezbar-small.model.wv.vectors.npy → word2vec.model.wv.vectors.npy RENAMED
File without changes
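The three renames keep a common `word2vec.model` prefix across the main file and its two `.npy` companions. gensim typically stores large arrays in such companion files and looks for them next to the main file when loading, so a partial rename would likely break `Word2Vec.load`. The sketch below checks that all three files are present before loading. The helper name `missing_model_files` is hypothetical, and the suffix list is taken from the files renamed in this commit.

```python
from pathlib import Path

# Suffixes of the companion files renamed in this commit.
SIDECAR_SUFFIXES = (".syn1neg.npy", ".wv.vectors.npy")

def missing_model_files(model_path):
    """Return the expected files that are absent for a given model path."""
    base = Path(model_path)
    expected = [base] + [base.with_name(base.name + suffix) for suffix in SIDECAR_SUFFIXES]
    return [str(p) for p in expected if not p.exists()]

missing = missing_model_files("word2vec.model")
if missing:
    print("Missing files:", missing)
else:
    print("All model files present")
```

Running this from inside the `Word2Bezbar-small` directory before calling `gensim.models.Word2Vec.load("word2vec.model")` gives a clearer error than a failed load.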