update model

Files changed (5)

.DS_Store +0 -0
README.md +61 -9
Word2Bezbar-small.model → word2vec.model +0 -0
Word2Bezbar-small.model.syn1neg.npy → word2vec.model.syn1neg.npy +0 -0
Word2Bezbar-small.model.wv.vectors.npy → word2vec.model.wv.vectors.npy +0 -0

.DS_Store CHANGED

Binary files a/.DS_Store and b/.DS_Store differ

README.md CHANGED

@@ -4,6 +4,7 @@ language:
 tags:
 - music
 - rap
 - word2vec
 library_name: gensim
 ---
@@ -11,11 +12,11 @@ library_name: gensim
 ## Overview
-Word2Bezbar are Word2Vec models trained on a 323MB dataset of cleaned French rap lyrics sourced from Genius. The model captures the semantic relationships between words in the context of French rap, providing a useful tool for academic and research purposes in natural language processing (NLP) and linguistic studies.
 ## Model Details
-Size is __small__
 | Parameter      | Value        |
 |----------------|--------------|
@@ -24,7 +25,7 @@ Size is __small__
 | Epochs         | 10           |
 | Algorithm      | CBOW         |
-## Requirements
 | Requirement    | Version      |
 |----------------|--------------|
@@ -48,7 +49,7 @@ Size is __small__
 3. **Navigate to the Model Directory**:
     ```bash
-    cd Word2Bezbar-large
     ```
 ## Loading the Model
@@ -59,16 +60,67 @@ To load the Word2Bezbar Word2Vec model, use the following Python code:
 import gensim
 # Load the Word2Vec model
-model = gensim.models.Word2Vec.load("Word2Bezbar-small/word2vec.model")
 ```
 ## Using the Model
-Once the model is loaded, you can use it to find words similar to a given word. For example, to find words similar to "kichta":
 ```python
-similar_words = model.wv.most_similar("kichta")
-print(similar_words)
 ```
 ## Purpose and Disclaimer
@@ -77,4 +129,4 @@ This model is designed for academic and research purposes only. It is not intend
 ## Contact
-For any questions or issues, please contact the repository owner at [email protected].

 tags:
 - music
 - rap
+- lyrics
 - word2vec
 library_name: gensim
 ---
 ## Overview
+Word2Bezbar are Word2Vec models trained on a 323MB dataset of cleaned French rap lyrics sourced from Genius. The model captures the semantic relationships between words in the context of French rap, providing a useful tool for academic and research purposes in natural language processing and in linguistic studies related to music writing.
 ## Model Details
+The size of this model is __small__.
 | Parameter      | Value        |
 |----------------|--------------|
 | Epochs         | 10           |
 | Algorithm      | CBOW         |
+## Versions
 | Requirement    | Version      |
 |----------------|--------------|
 3. **Navigate to the Model Directory**:
     ```bash
+    cd Word2Bezbar-small
     ```
 ## Loading the Model
 import gensim
 # Load the Word2Vec model
+model = gensim.models.Word2Vec.load("word2vec.model")
 ```
 ## Using the Model
+Once the model is loaded, you can use it as shown below.
+To get the words most similar to a given word:
+```python
+model.wv.most_similar("bendo")
+[('binks', 0.8920747637748718),
+ ('bando', 0.8460732698440552),
+ ('hood', 0.8299438953399658),
+ ('tieks', 0.8264378309249878),
+ ('hall', 0.817583441734314),
+ ('secteur', 0.8145656585693359),
+ ('barrio', 0.809047281742096),
+ ('block', 0.793493390083313),
+ ('bâtiment', 0.7826434969902039),
+ ('bloc', 0.7753982543945312)]
+model.wv.most_similar("kichta")
+[('liasse', 0.878665566444397),
+ ('sse-lia', 0.8552991151809692),
+ ('kishta', 0.8535938262939453),
+ ('kich', 0.7646669149398804),
+ ('skalape', 0.7576569318771362),
+ ('moula', 0.7466527223587036),
+ ('valise', 0.7429592609405518),
+ ('sacoche', 0.7324921488761902),
+ ('mallette', 0.7247079014778137),
+ ('re-pai', 0.7060815095901489)]
+```
+To find the word that doesn't match in a list of words:
+```python
+model.wv.doesnt_match(["racli","gow","gadji","fimbi","boug"])
+'boug'
+model.wv.doesnt_match(["Zidane","Mbappé","Ronaldo","Messi","Jordan"])
+'Jordan'
+```
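The idea behind `doesnt_match` can be sketched without gensim: normalize each word vector to unit length, average them, and return the word whose vector is least aligned with that average. This is a minimal illustration under stated assumptions, not the library's code. The function names `unit` and `odd_one_out` are hypothetical, and the two-dimensional vectors are made-up toy values rather than outputs of this model.

```python
import math

def unit(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def odd_one_out(vectors):
    """Return the key whose unit vector is least aligned with the mean vector."""
    units = {word: unit(vec) for word, vec in vectors.items()}
    dim = len(next(iter(units.values())))
    # Mean of the unit vectors, computed component by component.
    mean = [sum(u[i] for u in units.values()) / len(units) for i in range(dim)]
    # Dot product of each unit vector with the mean; lowest score is the outlier.
    scores = {word: sum(a * b for a, b in zip(u, mean)) for word, u in units.items()}
    return min(scores, key=scores.get)

# Hypothetical toy vectors: "d" points in a different direction from the rest.
toy = {
    "a": [1.0, 0.1],
    "b": [0.9, 0.2],
    "c": [1.0, 0.0],
    "d": [0.0, 1.0],
}
print(odd_one_out(toy))  # prints "d"
```

With real embeddings, the same reasoning explains why a word from a different semantic cluster is the one returned.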
+To find the similarity between two words:
+```python
+model.wv.similarity("kichta", "moula")
+0.7466528
+model.wv.similarity("bonheur", "moula")
+0.16985293
+```
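The score returned by `similarity` is a cosine similarity between the two word vectors. As a minimal sketch of that formula, assuming non-zero vectors and using made-up toy values (the helper name `cosine_similarity` is hypothetical and not part of gensim):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, not taken from the model.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal: 0.0
```

Values close to 1 indicate that two words appear in similar contexts, and values near 0 indicate little relation.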
+Or even get the vector representation of a word:
 ```python
+model.wv['ekip']
+array([ 1.4757039e-01,  ... 1.1260221e+00],
+      dtype=float32)
 ```
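Indexing `model.wv` with a word that is not in the vocabulary raises a `KeyError`, so a membership check before the lookup can be useful. The sketch below assumes only that the object supports `in` and `[]`, which `model.wv` does. The helper name `vector_or_none` is hypothetical, and a plain dict with made-up values stands in for the model so the example runs without the model files.

```python
def vector_or_none(keyed_vectors, word):
    """Return the vector for `word`, or None if the word is out of vocabulary."""
    if word in keyed_vectors:
        return keyed_vectors[word]
    return None

# A plain dict stands in for model.wv here; the values are toy numbers.
toy_wv = {"ekip": [0.1, 0.2]}
print(vector_or_none(toy_wv, "ekip"))
print(vector_or_none(toy_wv, "unknown-word"))
```

With the real model, the call would be `vector_or_none(model.wv, "ekip")`.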
 ## Purpose and Disclaimer
 ## Contact
+For any questions or issues, please contact the repository owner, __RapMinerz__, at [email protected].

Word2Bezbar-small.model → word2vec.model RENAMED

File without changes

Word2Bezbar-small.model.syn1neg.npy → word2vec.model.syn1neg.npy RENAMED

File without changes

Word2Bezbar-small.model.wv.vectors.npy → word2vec.model.wv.vectors.npy RENAMED

File without changes