LeroyDyer committed on
Commit
cf7541a
1 Parent(s): b7640a4

Update README.md

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -32,11 +32,9 @@ pipeline_tag: text-generation
 
  This is based on the Quiet Star Reasoning Project, which was abandoned earlier in the year :)
 
- Current update: UNDER TEST! Self Extend is still to be applied. This is just an early release of the StaR model, with cleaned-up tensors, but:
- there is a problem with the cos cache. I will fix it tomorrow or later this week; I'm working on it!
- Also, I cannot run the training because it needs a lot of memory, and the A100 Colab does not seem to work (accelerate issue).
- (I cannot make it a GGUF. How?) Unless maybe I hack the transformers library (reframe the model as Mistral and replace the existing file; that is how it has been done in the past). Perhaps it will have to stay a remote-code model?
-
+ Current update: UNDER TEST! It currently loads in unsloth, but the cos-cache issue is still there, so the model will load as Mistral.
+ To load it as talking heads, you will need to copy modelling.py/configuration.py to the mistral folder of the transformers library and compile the latest version. It then loads fine, but as remote code it still has this issue!
+ I will FIX IT!!!
  # Introduction :
 
  ## STAR REASONERS !
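The workaround in the update note (copying modelling.py/configuration.py into the mistral folder of the transformers library) could be scripted roughly as follows. This is only a sketch under assumptions: the helper names `find_mistral_dir` and `install_custom_files` are hypothetical, and the mapping onto `modeling_mistral.py` and `configuration_mistral.py` is a guess at which stock files the custom ones are meant to replace. It keeps a `.bak` copy of each stock file so the change can be undone.

```python
import importlib.util
import shutil
from pathlib import Path

# Assumed mapping: custom file name in the model repo -> stock file it replaces.
FILE_MAP = {
    "modelling.py": "modeling_mistral.py",
    "configuration.py": "configuration_mistral.py",
}


def find_mistral_dir():
    """Return the folder of the installed transformers.models.mistral package."""
    # find_spec imports the parent packages, so transformers must be installed.
    spec = importlib.util.find_spec("transformers.models.mistral")
    if spec is None or not spec.submodule_search_locations:
        raise RuntimeError("transformers.models.mistral was not found")
    return Path(list(spec.submodule_search_locations)[0])


def install_custom_files(src_dir, mistral_dir, file_map):
    """Copy custom model files into mistral_dir, backing up any stock files."""
    src_dir, mistral_dir = Path(src_dir), Path(mistral_dir)
    copied = []
    for src_name, dst_name in file_map.items():
        src = src_dir / src_name
        dst = mistral_dir / dst_name
        if not src.is_file():
            raise FileNotFoundError(src)
        if dst.exists():
            backup = dst.with_name(dst.name + ".bak")
            # Keep only the first backup so the stock file is never lost.
            if not backup.exists():
                shutil.copy2(dst, backup)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call such as `install_custom_files("path/to/model_repo", find_mistral_dir(), FILE_MAP)` would apply it to the installed library (the source path here is a placeholder). Overwriting library files affects every Mistral model loaded in that environment, so a throwaway virtual environment is the safer place to try this.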