ryantwolf committed on
Commit
90fb34c
1 Parent(s): 279ac9a

Update README.md

Files changed (1)
  1. README.md +56 -9
README.md CHANGED
@@ -1,9 +1,56 @@
1
- ---
2
- tags:
3
- - pytorch_model_hub_mixin
4
- - model_hub_mixin
5
- ---
6
-
7
- This model has been pushed to the Hub using ****:
8
- - Repo: [More Information Needed]
9
- - Docs: [More Information Needed]
1
+ # Model Overview
2
+ This is a text classification model that assigns each document to one of 26 domain classes:
3
+
4
+ 'Adult', 'Arts_and_Entertainment', 'Autos_and_Vehicles', 'Beauty_and_Fitness', 'Books_and_Literature', 'Business_and_Industrial', 'Computers_and_Electronics', 'Finance', 'Food_and_Drink', 'Games', 'Health', 'Hobbies_and_Leisure', 'Home_and_Garden', 'Internet_and_Telecom', 'Jobs_and_Education', 'Law_and_Government', 'News', 'Online_Communities', 'People_and_Society', 'Pets_and_Animals', 'Real_Estate', 'Science', 'Sensitive_Subjects', 'Shopping', 'Sports', 'Travel_and_Transportation'
5
+
6
+ # Model Architecture
7
+ The model architecture is DeBERTa V3 Base.
8
+ The context length is 512 tokens.
9
+
10
+ # Training Details
11
+ ## Training data:
12
+ - 1 million Common Crawl samples, labeled using Google Cloud’s Natural Language API: https://cloud.google.com/natural-language/docs/classifying-text
13
+ - 500k Wikipedia articles, curated using Wikipedia-API: https://pypi.org/project/Wikipedia-API/
14
+
15
+ ## Training steps:
16
+ - Train a first model on Wikipedia data
17
+ - Randomly sample 1 million Common Crawl documents and label them using the Google Cloud Natural Language API
18
+ - Predict the domain of these 1 million samples using the first model
19
+ - Keep the roughly 500k samples on which Google’s labels and the first model’s predictions agree
20
+ - Split these 500k samples 80%/20%. Train the final model on the 80%, and evaluate on the 20%
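
The agreement filter and the 80%/20% split described in the training steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `keep_agreeing_samples` and `split_80_20`, the fixed random seed, and the toy data are all hypothetical, and the labels from the Google Cloud API and from the first model are assumed to be available as plain Python lists.

```python
import random


def keep_agreeing_samples(texts, api_labels, first_model_labels):
    """Keep only samples where the API label and the first model's prediction match."""
    return [
        (text, api_label)
        for text, api_label, model_label in zip(texts, api_labels, first_model_labels)
        if api_label == model_label
    ]


def split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and split it 80% train / 20% evaluation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for reproducibility
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Toy data (hypothetical): 10 documents, with the two labelers disagreeing on the last one.
texts = [f"doc {i}" for i in range(10)]
api_labels = ["Sports"] * 5 + ["Finance"] * 5
first_model_labels = ["Sports"] * 5 + ["Finance"] * 4 + ["News"]

agreed = keep_agreeing_samples(texts, api_labels, first_model_labels)
train_set, eval_set = split_80_20(agreed)
```

Only the agreed samples carry forward, so the final model never sees a document on which the two labelers disagreed.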
21
+
22
+ # How To Use This Model
23
+
24
+ ## Input
25
+ The model takes one or several paragraphs of text as input.
26
+
27
+ Example input:
28
+ q Directions
29
+
30
+ 1. Mix 2 flours and baking powder together
31
+ 2. Mix water and egg in a separate bowl. Add dry to wet little by little
32
+ 3. Heat frying pan on medium
33
+ 4. Pour batter into pan and then put blueberries on top before flipping
34
+ 5. Top with desired toppings!
35
+
36
+ ## Output
37
+ The model outputs one of the 26 domain classes as the predicted domain for each input sample.
38
+
39
+ Example output:
40
+ Food_and_Drink
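
To illustrate how a single class name such as `Food_and_Drink` might be obtained from the model's raw scores, here is a minimal sketch. It assumes the classifier emits one score per class and that the class indices follow the order in which the 26 classes are listed in the overview; neither assumption is confirmed by this README. The name `predict_domain` is hypothetical.

```python
# The 26 domain classes, in the order given in the model overview.
# ASSUMPTION: this order matches the model's output indices.
DOMAIN_CLASSES = [
    "Adult", "Arts_and_Entertainment", "Autos_and_Vehicles", "Beauty_and_Fitness",
    "Books_and_Literature", "Business_and_Industrial", "Computers_and_Electronics",
    "Finance", "Food_and_Drink", "Games", "Health", "Hobbies_and_Leisure",
    "Home_and_Garden", "Internet_and_Telecom", "Jobs_and_Education",
    "Law_and_Government", "News", "Online_Communities", "People_and_Society",
    "Pets_and_Animals", "Real_Estate", "Science", "Sensitive_Subjects",
    "Shopping", "Sports", "Travel_and_Transportation",
]


def predict_domain(scores):
    """Return the class name with the highest score (argmax over 26 scores)."""
    if len(scores) != len(DOMAIN_CLASSES):
        raise ValueError(f"expected {len(DOMAIN_CLASSES)} scores, got {len(scores)}")
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return DOMAIN_CLASSES[best_index]


# Toy scores (hypothetical): the highest score sits at the Food_and_Drink index.
scores = [0.0] * len(DOMAIN_CLASSES)
scores[DOMAIN_CLASSES.index("Food_and_Drink")] = 5.0
predicted = predict_domain(scores)
```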
41
+
42
+ # Evaluation Benchmarks
43
+ Accuracy on 500 human-annotated samples:
44
+ - Google API 77.5%
45
+ - Our model 77.9%
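
For reference, accuracy against human annotations is the fraction of samples whose predicted class equals the human label. The sketch below shows that computation on toy labels; the data is hypothetical and does not reproduce the figures above, and the helper name `accuracy` is an assumption.

```python
def accuracy(predicted, reference):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("reference must not be empty")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy example (hypothetical labels): 3 of 4 predictions match the human labels.
human_labels = ["Sports", "Finance", "Health", "News"]
model_labels = ["Sports", "Finance", "Health", "Games"]
toy_accuracy = accuracy(model_labels, human_labels)
```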
46
+
47
+ PR-AUC score on the evaluation set of 105k samples:
48
+ - 0.9873
49
+
50
+ # References
51
+ https://arxiv.org/abs/2111.09543
52
+ https://github.com/microsoft/DeBERTa
53
+
54
+
55
+ # License
56
+ Use of this model is covered by the Apache License 2.0. By downloading the public and release version of the model, you accept the terms and conditions of the Apache License 2.0.