Update README.md
README.md
CHANGED
@@ -34,7 +34,7 @@ This is based on the Quiet Star Reasoning Project : which was abandoned earlier
 
 Current update: UNDER TEST! Self Extend is still to be applied; this is just an early release of the StaR model. The tensors are cleaned up, but
 there is a problem with the cos cache? I will fix it tomorrow or this week here: I'm working on it!
-
+Also, I cannot run the training because it needs a lot of memory, and the A100 Colab does not seem to work (an accelerate issue).
 (I cannot make it a GGUF. How?) Unless maybe I hack the transformers library: reframe the model as the Mistral and replace the existing file (that's how they had done it in the past). Perhaps I will have to stay as a remote code model?
 
 # Introduction :