---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen1.5-7B-Chat/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- code
datasets:
- reciprocate/dpo_ultra-capybara-code_filtered-best
---

# Coder1.8-ORPO-TEST

## Model Description

A test model for the ORPO fine-tuning method, trained on ~20k code examples for 1 epoch on 2 x A40 cards with 4-bit QLoRA (LoRA rank = LoRA alpha = 16).

## Disclaimer

This is a test model and may generate incorrect responses. Use at your own risk.

## Train Details

- Base model: Qwen1.5-1.8B
- Training data: ~20k [code examples](https://huggingface.co/datasets/reciprocate/dpo_ultra-capybara-code_filtered-best)
- Epochs: 1
- Method: ORPO
- Hardware: 2 x A40
- Quantization: 4-bit QLoRA
- LoRA rank/alpha: 16

Hedged sketches of the training setup and of basic inference follow at the end of this card.

## Limitations

The small training set (~20k examples, 1 epoch) and 4-bit quantization may limit output quality and correctness.

## Join the Discussion

Have questions or feedback? Join our Discord server [here](https://discord.gg/KugcbJX5).
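
## Training Setup (Sketch)

A minimal sketch of the training configuration described above, using TRL's `ORPOTrainer` with 4-bit QLoRA via `bitsandbytes` and `peft`. Only the details listed in the card (base model, dataset, 1 epoch, 4-bit quantization, rank/alpha = 16) come from the training run; batch size, learning rate, and other hyperparameters are illustrative assumptions.

```python
# Sketch of ORPO fine-tuning with 4-bit QLoRA; not the exact training script.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

base_id = "Qwen/Qwen1.5-1.8B"

# 4-bit QLoRA quantization, as stated in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# LoRA rank and alpha of 16, as stated in the card.
peft_config = LoraConfig(r=16, lora_alpha=16, task_type="CAUSAL_LM")

# ORPOTrainer expects prompt/chosen/rejected columns; the dataset layout
# may need remapping before training.
dataset = load_dataset(
    "reciprocate/dpo_ultra-capybara-code_filtered-best", split="train"
)

args = ORPOConfig(
    output_dir="coder1.8-orpo-test",
    num_train_epochs=1,
    per_device_train_batch_size=2,   # assumed, not stated in the card
    gradient_accumulation_steps=8,   # assumed
    learning_rate=8e-6,              # assumed
)
trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,  # named `processing_class` in newer TRL versions
    peft_config=peft_config,
)
trainer.train()
```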
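
## Usage Example (Sketch)

A minimal inference sketch with Hugging Face `transformers`. The repo id is a placeholder for this model's hub path, and the sketch assumes the model follows Qwen1.5's chat template; adjust if the training run used a different prompt format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/Coder1.8-ORPO-TEST"  # placeholder hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Assumes Qwen1.5's chat template applies to this fine-tune.
messages = [
    {"role": "user", "content": "Write a Python function that reverses a string."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens and decode only the generated continuation.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```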