RoboApocalypse committed on
Commit aa13ef7
1 Parent(s): c37485e

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -5,21 +5,21 @@ datasets:
   - wikimedia/wikipedia
  library_name: transformers
  ---
- # mini-mistral-327M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast
+ # mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast

- This repository contains the **mini-mistral-327M** model, a 327 million parameter version of the Mistral architecture, trained for a single epoch. The model was trained on a diverse dataset comprising Wikipedia articles and the OpenHermes dataset. While this model is still in its early stages and not particularly useful as of now, it serves as an experimental showcase of integrating the Grokfast algorithm into the training process.
+ This repository contains the **mini-mistral-360M** model, a 360 million parameter version of the Mistral architecture, trained for a single epoch. The model was trained on a diverse dataset comprising Wikipedia articles and the OpenHermes dataset. While this model is still in its early stages and not particularly useful as of now, it serves as an experimental showcase of integrating the Grokfast algorithm into the training process.

  ## Model Details

  - **Architecture**: Mistral
- - **Parameters**: 327 million
+ - **Parameters**: 360 million
  - **Training Duration**: 1 epoch
  - **Training Dataset**: Wikipedia articles and OpenHermes dataset
  - **Training Method**: Grokfast-enhanced Transformers

  ## Purpose

- The primary goal of this experiment was to observe the impact of the Grokfast algorithm on the training dynamics of a 327M parameter Mistral model. During training, it was noted that the evaluation loss followed the training loss closely, which is an intriguing behavior warranting further investigation.
+ The primary goal of this experiment was to observe the impact of the Grokfast algorithm on the training dynamics of a 360M parameter Mistral model. During training, it was noted that the evaluation loss followed the training loss closely, which is an intriguing behavior warranting further investigation.

  ## Usage

@@ -28,8 +28,8 @@ To use this model, you can load it with the `transformers` library from HuggingF
  ```python
  from transformers import AutoModel, AutoTokenizer

- tokenizer = AutoTokenizer.from_pretrained("RoboApocalypse/mini-mistral-327M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast")
- model = AutoModel.from_pretrained("RoboApocalypse/mini-mistral-327M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast")
+ tokenizer = AutoTokenizer.from_pretrained("RoboApocalypse/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast")
+ model = AutoModel.from_pretrained("RoboApocalypse/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast")

  # Example usage
  input_text = "Hello, world!"
@@ -59,4 +59,4 @@ This model is licensed under the OpenRAIL License.

  ---

- Feel free to check out the model and experiment with it [here](https://huggingface.co/RoboApocalypse/mini-mistral-327M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast). Your feedback and insights are welcome as I try and figure out wtf I'm doing.
+ Feel free to check out the model and experiment with it [here](https://huggingface.co/RoboApocalypse/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast). Your feedback and insights are welcome as I try and figure out wtf I'm doing.
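The "Grokfast-enhanced" training method listed under Model Details refers to filtering gradients with an exponential moving average so that the slow-varying gradient component is amplified before each optimizer step. Below is a minimal sketch of such an EMA gradient filter; the function, its hyperparameters (`alpha`, `lamb`), and its placement between `backward()` and `optimizer.step()` are illustrative assumptions, not the actual training code behind this checkpoint.

```python
import torch


def gradfilter_ema(model, grads=None, alpha=0.98, lamb=2.0):
    """Grokfast-style EMA gradient filter (sketch; alpha/lamb are assumed defaults).

    Keeps an exponential moving average of each parameter's gradient and adds it
    back, scaled by `lamb`, so the low-frequency ("slow") gradient component is
    amplified before the optimizer step.
    """
    if grads is None:
        # First call: initialize the EMA state from the current gradients.
        grads = {n: p.grad.detach().clone()
                 for n, p in model.named_parameters() if p.grad is not None}
    for n, p in model.named_parameters():
        if p.grad is not None:
            # Update the running average, then amplify the slow component.
            grads[n] = alpha * grads[n] + (1 - alpha) * p.grad.detach()
            p.grad = p.grad + lamb * grads[n]
    return grads


# Assumed placement in a training loop:
#   loss.backward()
#   ema_grads = gradfilter_ema(model, ema_grads)  # ema_grads starts as None
#   optimizer.step()
#   optimizer.zero_grad()
```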
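The Usage snippet in the README loads the checkpoint with `AutoModel`, which returns hidden states rather than logits. For generating text, a sketch along these lines may work instead; it assumes the checkpoint loads with `AutoModelForCausalLM` and that the tokenizer ships a ChatML chat template, neither of which is confirmed by the diff.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "RoboApocalypse/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)  # assumes a causal LM head is present

# Build a ChatML-style prompt; apply_chat_template only works if the tokenizer
# actually carries a chat template, which is an assumption here.
messages = [{"role": "user", "content": "Hello, world!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Sample a short continuation; quality will be limited given the single training epoch.
output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```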