vrashad committed on
Commit
3e3c87e
1 Parent(s): dd4e2f3

Update README.md

Files changed (1)
  1. README.md +31 -2
README.md CHANGED
@@ -42,11 +42,11 @@ pip install transformers
  ```
 
  ```python
- from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ from transformers import AutoModelForSequenceClassification, XLMRobertaTokenizer
  import torch
 
  # Load tokenizer and model
- tokenizer = AutoTokenizer.from_pretrained("LocalDoc/language_detection")
+ tokenizer = XLMRobertaTokenizer.from_pretrained("LocalDoc/language_detection")
  model = AutoModelForSequenceClassification.from_pretrained("LocalDoc/language_detection")
 
  # Prepare text
@@ -67,6 +67,35 @@ predicted_label = labels[predicted_class_index]
  print(f"Predicted Language: {predicted_label}")
  ```
 
+ ## Language Label Information
+
+ The model outputs a label for each prediction, corresponding to one of the languages listed below. Each label is associated with a specific language code as detailed in the following table:
+
+ | Label | Language Code | Language Name |
+ |-------|---------------|---------------|
+ | 0 | az | Azerbaijani |
+ | 1 | ar | Arabic |
+ | 2 | bg | Bulgarian |
+ | 3 | de | German |
+ | 4 | el | Greek |
+ | 5 | en | English |
+ | 6 | es | Spanish |
+ | 7 | fr | French |
+ | 8 | hi | Hindi |
+ | 9 | it | Italian |
+ | 10 | ja | Japanese |
+ | 11 | nl | Dutch |
+ | 12 | pl | Polish |
+ | 13 | pt | Portuguese |
+ | 14 | ru | Russian |
+ | 15 | sw | Swahili |
+ | 16 | th | Thai |
+ | 17 | tr | Turkish |
+ | 18 | ur | Urdu |
+ | 19 | vi | Vietnamese |
+ | 20 | zh | Chinese |
+
+ This mapping is utilized to decode the model's predictions into understandable language names, facilitating the interpretation of results for further processing or analysis.
 
 
  Training Performance
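
Note on the label table added in this commit: the sketch below shows one way the table might be used to turn a predicted class index into a language code and name. It is an illustration only. `LANGUAGE_LABELS`, `decode_label`, and `decode_logits` are hypothetical names that do not appear in the README, and the scores at the end are made up rather than real model output. It deliberately uses plain Python lists so it runs without `torch` or `transformers`.

```python
# Label table copied from the README diff: list index -> (language code, language name).
LANGUAGE_LABELS = [
    ("az", "Azerbaijani"), ("ar", "Arabic"), ("bg", "Bulgarian"),
    ("de", "German"), ("el", "Greek"), ("en", "English"),
    ("es", "Spanish"), ("fr", "French"), ("hi", "Hindi"),
    ("it", "Italian"), ("ja", "Japanese"), ("nl", "Dutch"),
    ("pl", "Polish"), ("pt", "Portuguese"), ("ru", "Russian"),
    ("sw", "Swahili"), ("th", "Thai"), ("tr", "Turkish"),
    ("ur", "Urdu"), ("vi", "Vietnamese"), ("zh", "Chinese"),
]


def decode_label(class_index):
    """Return (code, name) for a class index, rejecting values outside the table."""
    if not 0 <= class_index < len(LANGUAGE_LABELS):
        raise ValueError(f"class index {class_index} is outside 0..{len(LANGUAGE_LABELS) - 1}")
    return LANGUAGE_LABELS[class_index]


def decode_logits(logits):
    """Pick the highest-scoring class from a plain list of scores and decode it."""
    class_index = max(range(len(logits)), key=lambda i: logits[i])
    return decode_label(class_index)


# Made-up scores for illustration: class 17 is given the highest score.
example_logits = [0.0] * len(LANGUAGE_LABELS)
example_logits[17] = 5.0
code, name = decode_logits(example_logits)
print(f"Predicted Language: {name} ({code})")  # prints "Predicted Language: Turkish (tr)"
```

With the README snippet, `predicted_class_index` could presumably be passed straight to `decode_label`. If the model's configuration carries its own `id2label` mapping, that mapping should take precedence over a hand-copied table like this one.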