Model Name: LoRA Fine-Tuned Model for Dialogue Summarization
Model Type: Seq2Seq with Low-Rank Adaptation (LoRA)
Base Model: google-t5/t5-base
Model Details
- Architecture: T5-base
- Fine-Tuning Technique: LoRA (Low-Rank Adaptation)
- PEFT Method: Parameter-Efficient Fine-Tuning
- Data: SAMSum dataset
- Metrics: Evaluated using ROUGE (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum)
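To make the ROUGE metrics concrete, here is a minimal pure-Python sketch of ROUGE-N and ROUGE-L F1 scores. It is a simplification for illustration only: it assumes lowercase whitespace tokenization with no stemming, and the helper names (`rouge_n`, `rouge_l`) are hypothetical. The scores reported for this model were presumably computed with a standard ROUGE library, not with this code.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, pred_total, ref_total):
    """F1 score from an overlap count and the two totals."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n):
    """Simplified ROUGE-N F1: clipped n-gram overlap (n=1 or n=2 here)."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # Counter intersection clips counts
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """Simplified ROUGE-L F1 based on the longest common subsequence."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(pred, ref), len(pred), len(ref))


if __name__ == "__main__":
    predicted = "amanda baked cookies"
    reference = "amanda baked cookies for jerry"
    print(rouge_n(predicted, reference, 1))
    print(rouge_n(predicted, reference, 2))
    print(rouge_l(predicted, reference))
```

ROUGE-Lsum applies the same longest-common-subsequence idea sentence by sentence, so for a single-sentence summary it coincides with ROUGE-L.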
Intended Use
This model is designed for summarizing dialogues, such as conversations between individuals in a chat or messaging context. It’s suitable for applications in:
- Customer Service: Summarizing chat logs for quality monitoring or training.
- Messaging Apps: Generating conversation summaries for user convenience.
- Content Creation: Assisting writers by summarizing character dialogues.
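A minimal inference sketch for these use cases is shown below. It rests on several assumptions that this card does not confirm: that the dnzblgn/Chat-Summarization repository hosts LoRA adapter weights loadable with `PeftModel.from_pretrained`, that inputs use the usual T5 "summarize: " prefix, and that a 512-token input limit and 64 new tokens are reasonable. The helper names `build_prompt` and `summarize` are hypothetical.

```python
BASE_MODEL_ID = "google-t5/t5-base"
ADAPTER_ID = "dnzblgn/Chat-Summarization"  # assumed to contain LoRA adapter weights


def build_prompt(dialogue: str) -> str:
    """Prepend the T5 summarization prefix (assumed to match training)."""
    return "summarize: " + dialogue.strip()


def summarize(dialogue: str, max_new_tokens: int = 64) -> str:
    """Load the base model plus adapter and generate one summary."""
    # Imported here so build_prompt can be used without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(
        build_prompt(dialogue),
        return_tensors="pt",
        truncation=True,
        max_length=512,  # assumption: not stated in this card
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    chat = "Amanda: I baked cookies. Do you want some?\nJerry: Sure!"
    print(summarize(chat))
```

If the repository instead contains fully merged weights, loading it directly with `AutoModelForSeq2SeqLM.from_pretrained` would be the simpler route.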
Training Process
Optimizer: AdamW with learning rate 3e-5
Batch Size: 4, with 2 gradient accumulation steps (effective batch size of 8)
Training Epochs: 2
Evaluation Metrics: ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum
Hardware: Trained on a single GPU with mixed precision to reduce memory use and training time.
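The hyperparameters listed above could map onto `Seq2SeqTrainingArguments` from transformers roughly as sketched here. The learning rate, batch size, accumulation steps, and epoch count come from this card. The output directory, the choice of `fp16` over `bf16`, and `predict_with_generate` are assumptions made for illustration.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer step."""
    return per_device * accumulation_steps * num_devices


TRAINING_KWARGS = dict(
    output_dir="lora-t5-samsum",       # hypothetical path
    optim="adamw_torch",               # AdamW optimizer
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    num_train_epochs=2,
    fp16=True,                         # assumption: mixed precision could also be bf16
    predict_with_generate=True,        # assumption: generate summaries for ROUGE evaluation
)


def build_training_args():
    """Create the arguments object (requires transformers and a GPU for fp16)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)


if __name__ == "__main__":
    print(effective_batch_size(
        TRAINING_KWARGS["per_device_train_batch_size"],
        TRAINING_KWARGS["gradient_accumulation_steps"],
    ))
```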
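The model was trained using the Seq2SeqTrainer class from transformers. LoRA adapters were applied to selected attention layers, so only a small set of low-rank weights is updated while the base model stays frozen. This reduces the number of trainable parameters and the training cost while aiming to preserve accuracy.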
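To illustrate why LoRA is cheap to train, the sketch below compares the parameter count of one full attention projection in T5-base (768 x 768) with that of a low-rank update. This card does not state the rank, scaling factor, dropout, or which projections were adapted, so `RANK`, `lora_alpha`, `lora_dropout`, and `target_modules=["q", "v"]` are assumed values chosen for illustration.

```python
D_MODEL = 768  # hidden size of T5-base
RANK = 8       # hypothetical LoRA rank, not stated in this card


def full_params(d_in: int, d_out: int) -> int:
    """Weights in a dense projection matrix."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def build_lora_config():
    """Create a PEFT LoRA config for a seq2seq model (requires peft)."""
    from peft import LoraConfig, TaskType

    return LoraConfig(
        task_type=TaskType.SEQ_2_SEQ_LM,
        r=RANK,
        lora_alpha=32,               # assumed scaling factor
        lora_dropout=0.05,           # assumed dropout
        target_modules=["q", "v"],   # assumed: T5 query and value projections
    )


if __name__ == "__main__":
    dense = full_params(D_MODEL, D_MODEL)
    low_rank = lora_params(D_MODEL, D_MODEL, RANK)
    print(dense, low_rank, f"{low_rank / dense:.2%}")
```

With these assumed values, each adapted projection trains 12,288 weights instead of 589,824, or about 2% of the dense matrix. The resulting config would be applied with `get_peft_model` from peft before the model is passed to the trainer.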
Model tree for dnzblgn/Chat-Summarization
Base model
google-t5/t5-base