raincandy-u/Coder1.8-ORPO-TEST

raincandy-u committed on Apr 19

Commit 809e1a0 • 1 Parent(s): d80e895

Create README.md

Files changed (1)

README.md +40 -0

README.md ADDED

	@@ -0,0 +1,40 @@

+---
+license: other
+license_name: tongyi-qianwen
+license_link: https://huggingface.co/Qwen/Qwen1.5-7B-Chat/blob/main/LICENSE
+language:
+- en
+pipeline_tag: text-generation
+tags:
+- code
+datasets:
+- reciprocate/dpo_ultra-capybara-code_filtered-best
+---
+# Coder1.8-ORPO-TEST
+## Model Description
+A test model for the ORPO fine-tuning method. It was trained for 1 epoch on ~20k code examples using 2 x A40 GPUs with 4-bit QLoRA (LoRA rank = LoRA alpha = 16).
+## Disclaimer
+This is a test model and may generate incorrect responses. Use at your own risk.
+## Training Details
+- Base: Qwen1.5-1.8B
+- Training Data: ~20k [code examples](https://huggingface.co/datasets/reciprocate/dpo_ultra-capybara-code_filtered-best)
+- Epochs: 1
+- Method: ORPO
+- Hardware: 2 x A40
+- Quantization: 4-bit QLoRA
+- LoRA Rank/Alpha: 16
+## Limitations
+The small training set (~20k examples, 1 epoch) and 4-bit quantization during training may limit output quality.
+## Join the Discussion
+Have questions or feedback? Join our Discord server [here](https://discord.gg/KugcbJX5).
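
For readers unfamiliar with the ORPO method named in the README, the following is a minimal sketch of its per-example objective, not the training code used for this model. It assumes the commonly described formulation: a supervised negative log-likelihood term on the chosen response plus a weighted odds-ratio penalty comparing chosen and rejected responses. The function names, the `lam` weight of 0.1, and the use of length-averaged log-probabilities as inputs are illustrative assumptions, not details taken from this commit.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) where p = exp(avg_logp).

    avg_logp is a length-averaged log-probability and must be < 0.
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO objective for a single preference pair.

    chosen_logp / rejected_logp: average per-token log-probabilities the
    model assigns to the preferred and dispreferred responses.
    lam: weight of the odds-ratio term (hypothetical default).
    """
    # Log of the odds ratio between the chosen and rejected responses.
    z = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(z)), computed as a numerically stable softplus(-z).
    odds_ratio_term = max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    # Supervised fine-tuning term: NLL of the chosen response.
    sft_term = -chosen_logp
    return sft_term + lam * odds_ratio_term


if __name__ == "__main__":
    # The loss is lower when the model already prefers the chosen response.
    print(orpo_loss(-0.2, -1.5))
    print(orpo_loss(-1.5, -0.2))
```

Because both terms come from the same forward passes, this objective needs no separate reference model, which is one reason it pairs naturally with a memory-constrained QLoRA setup like the one described above.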