raincandy-u committed on
Commit 809e1a0
1 Parent(s): d80e895

Create README.md

Files changed (1)
  1. README.md +40 -0
README.md ADDED
@@ -0,0 +1,40 @@
+ ---
+ license: other
+ license_name: tongyi-qianwen
+ license_link: https://huggingface.co/Qwen/Qwen1.5-7B-Chat/blob/main/LICENSE
+ language:
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - code
+ datasets:
+ - reciprocate/dpo_ultra-capybara-code_filtered-best
+ ---
+
+ # Coder1.8-ORPO-TEST
+
+ ## Model Description
+
+ Test model for the ORPO fine-tuning method, trained on ~20k code examples for 1 epoch on 2 x A40 GPUs with 4-bit QLoRA (LoRA rank = LoRA alpha = 16).
+
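The description above names ORPO as the fine-tuning method. As a rough illustration of what that objective optimizes, here is a minimal sketch of the ORPO loss for a single preference pair: the negative log-likelihood of the chosen response plus a weighted odds-ratio penalty against the rejected one. The function names, the toy log-probabilities, and the weight `lam=0.1` are assumptions for illustration; the commit does not state the hyperparameters or the training code actually used.

```python
import math


def avg_log_prob(token_log_probs):
    """Length-normalized log-likelihood of one response (must be < 0)."""
    return sum(token_log_probs) / len(token_log_probs)


def log_odds(avg_logp):
    """log(p / (1 - p)) with p = exp(avg_logp)."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def orpo_loss(chosen_token_logps, rejected_token_logps, lam=0.1):
    """ORPO loss for one (chosen, rejected) pair.

    lam is the odds-ratio weight; 0.1 is an assumed value, not taken
    from this model card.
    """
    logp_w = avg_log_prob(chosen_token_logps)
    logp_l = avg_log_prob(rejected_token_logps)
    nll = -logp_w  # supervised fine-tuning term on the chosen response
    log_odds_ratio = log_odds(logp_w) - log_odds(logp_l)
    # -log(sigmoid(z)) is the same as softplus(-z)
    odds_ratio_term = softplus(-log_odds_ratio)
    return nll + lam * odds_ratio_term


# Toy per-token log-probabilities (hypothetical values).
chosen = [-0.1, -0.2]
rejected = [-1.0, -2.0]
print(orpo_loss(chosen, rejected))
```

Swapping the two arguments raises the loss, since the penalty grows when the rejected response is more likely than the chosen one. A real training run would compute these log-probabilities from the model over batches of preference pairs.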
+ ## Disclaimer
+
+ This is a test model and may generate incorrect responses. Use at your own risk.
+
+ ## Training Details
+
+ - Base: Qwen1.5-1.8B
+ - Training Data: ~20k [code examples](https://huggingface.co/datasets/reciprocate/dpo_ultra-capybara-code_filtered-best)
+ - Epochs: 1
+ - Method: ORPO
+ - Hardware: 2 x A40
+ - Quantization: 4-bit QLoRA
+ - LoRA Rank/Alpha: 16/16
+
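To make the rank and alpha setting of 16 listed above more concrete, the sketch below computes the LoRA scaling factor (alpha divided by rank) and the number of trainable parameters a rank-16 adapter adds to one weight matrix. The 2048 x 2048 matrix shape is a hypothetical example, and the helper names are illustrative; the commit does not say which modules were adapted.

```python
def lora_scaling(alpha: int, rank: int) -> float:
    """Factor applied to the low-rank update B @ A before it is added to W."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK = 16   # from the training details
ALPHA = 16  # from the training details
D_IN = D_OUT = 2048  # hypothetical layer shape, for illustration only

scaling = lora_scaling(ALPHA, RANK)
adapter_params = lora_param_count(D_IN, D_OUT, RANK)
full_params = D_IN * D_OUT

print(f"scaling factor: {scaling}")
print(f"adapter parameters: {adapter_params}")
print(f"share of the full matrix: {adapter_params / full_params:.4%}")
```

With rank equal to alpha, the scaling factor is 1, so the low-rank update is added to the frozen weights without rescaling.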
+ ## Limitations
+
+ The small training set (~20k examples, 1 epoch) and 4-bit quantization may limit output quality.
+
+ ## Join the Discussion
+
+ Have questions or feedback? Join our Discord server [here](https://discord.gg/KugcbJX5).