---
license: mit
datasets:
- IlyaGusev/ru_turbo_alpaca
- IlyaGusev/ru_turbo_alpaca_evol_instruct
- IlyaGusev/ru_turbo_saiga
- IlyaGusev/ru_sharegpt_cleaned
- IlyaGusev/oasst1_ru_main_branch
- IlyaGusev/gpt_roleplay_realm
- lksy/ru_instruct_gpt4
language:
- ru
- en
library_name: peft
pipeline_tag: conversational
tags:
- Saiga
- ruGPT-3.5
- 13B
- chat
- lora
- Peft
- adapter
---

# ruGPT-3.5 13B LoRA

This is an adapter-only version, based on [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B).

Training code is [here](https://github.com/EvilFreelancer/ruGPT-3.5-13B-lora)

> You may use ruGPT-3.5 13B fp16 base model instead.

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

### Framework versions

- PyTorch 2.1.0
- PEFT 0.5.0
- bitsandbytes 0.41.1
- transformers 4.34.0
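
## Usage sketch

The quantization settings listed under "Training procedure" can be reused when loading the base model and attaching this adapter. The following is a minimal sketch, not the author's reference code. It assumes that `transformers`, `peft`, `bitsandbytes`, and `torch` are installed, that the tokenizer is taken from the base model repository, and that `ADAPTER_ID` (a hypothetical placeholder) is replaced with the actual repo id or local path of this adapter. The helper name `load_model` is also made up for illustration.

```python
# Quantization settings copied from the "Training procedure" list.
# `quant_method` is omitted because it is not a constructor argument.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}

BASE_MODEL = "ai-forever/ruGPT-3.5-13B"
# Placeholder (assumption): replace with this adapter's repo id or local path.
ADAPTER_ID = "path-or-repo-id-of-this-adapter"


def load_model(adapter_id=ADAPTER_ID, base_model=BASE_MODEL):
    """Load the 8-bit base model and attach the LoRA adapter (hypothetical helper)."""
    # Heavy imports are kept local so the constants above can be inspected
    # without a GPU stack installed.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    kwargs = dict(BNB_KWARGS)
    # Convert the dtype name into an actual torch dtype.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    quant_config = BitsAndBytesConfig(**kwargs)

    # Assumption: the base model's tokenizer is the right one for the adapter.
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=quant_config,
        device_map="auto",
    )
    # Wrap the quantized base model with the LoRA weights.
    model = PeftModel.from_pretrained(model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_model("<adapter repo id>")` downloads the 13B base model, so it needs a GPU with enough memory for 8-bit weights. The prompt format expected by the adapter is not described in this card; the linked training code is the place to check it.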