Model Card for Quantized T5-Large
Licensing
Original Model
The base model, T5-Large, is licensed under the Apache 2.0 License. For more details, please refer to the T5-Large Model Card.
Quantized Model
This quantized version of T5-Large is licensed under the MIT License. The modifications include quantization and optimization for specific use cases.
License
This model is licensed under the MIT License. See the LICENSE file for details.
Compliance
- This model includes modifications to the original T5-Large model. The original Apache 2.0 license terms are respected, and the original license and notices are included in the distribution.
Model Details
Model Description: Quantized T5-Large is a 770-million-parameter version of the T5-Large model that has been quantized to reduce its memory footprint and speed up inference (an illustrative quantization sketch appears at the end of this section). The T5 model is designed to handle a wide range of NLP tasks by framing them all as text-to-text problems.
Model Type: Language model
Languages: English, French, Romanian, German
License: MIT
Related Models: All T5 Checkpoints
Resources for More Information:
- T5 Research Paper
- Original T5-Large Model Card
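The exact quantization recipe used to produce this checkpoint is not documented here. As an illustration only, the sketch below shows one common way to shrink a T5 model, PyTorch dynamic int8 quantization of the Linear layers; this is an assumption for illustration, not necessarily the procedure behind this release.

import torch
from transformers import T5ForConditionalGeneration

# Illustrative sketch only: dynamically quantize the Linear layers of T5-Large
# to int8. This is an assumed technique, not necessarily how this checkpoint
# was produced.
fp32_model = T5ForConditionalGeneration.from_pretrained("t5-large")
int8_model = torch.quantization.quantize_dynamic(
    fp32_model, {torch.nn.Linear}, dtype=torch.qint8
)
# int8_model can then be used as a drop-in replacement for CPU inference.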
Uses
Direct Use and Downstream Use: This model can be used for machine translation, document summarization, question answering, and classification tasks; each task is selected with a short text prefix on the input (see the example prefixes below).
Out-of-Scope Use: Use on tasks outside those listed above, or in sensitive or high-stakes settings, is out of scope without further evaluation; see Bias, Risks, and Limitations below.
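T5 selects a task through a short prefix on the input text. The prefixes below follow the conventions of the original T5 checkpoints and are shown for illustration only; whether each of them works equally well with this quantized version should be verified.

# Task prefixes from the original T5 checkpoints (illustrative examples).
examples = [
    "translate English to German: How old are you?",                                      # machine translation
    "summarize: state authorities dispatched emergency crews tuesday to survey damage",   # summarization
    "question: What is the capital of France? context: Paris is the capital of France.",  # question answering
    "cola sentence: The course is jumping well.",                                         # acceptability classification
]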
Bias, Risks, and Limitations
- Bias and Risks: The model may reflect biases present in the training data. Users should be aware of potential risks and limitations when applying the model to sensitive or high-stakes tasks.
Training Details
Training Data: The model is pre-trained on the Colossal Clean Crawled Corpus (C4), among other datasets.
Datasets Used:
- Unsupervised: C4, Wiki-DPR
- Supervised: Various datasets for tasks like sentiment analysis, question answering, etc.
Training Procedure: T5 is trained with a unified text-to-text framework in which every language problem is cast as generating target text from input text. For detailed information, see the T5 Research Paper.
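In this framing, every task reduces to mapping an input string to a target string, so a single sequence-to-sequence objective covers translation, classification, and regression alike. The pairs below are an illustrative sketch of that format, using prefixes and targets in the style of the original paper.

# Every task becomes an (input text, target text) pair (illustrative sketch).
text_to_text_pairs = [
    ("translate English to German: Thank you.", "Danke."),
    ("cola sentence: The course is jumping well.", "not acceptable"),
    ("stsb sentence1: The rhino grazed on the grass. sentence2: A rhino is grazing in a field.", "3.8"),
]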
Evaluation
- Testing Data, Factors & Metrics: The model was evaluated on 24 tasks. For detailed evaluation results, refer to the T5 Research Paper.
Environmental Impact
- Hardware Type: Google Cloud TPU Pods
- Hours Used: Not reported
- Cloud Provider: GCP
- Carbon Emitted: Not reported
Citation
- BibTeX:
@article{2020t5,
  author  = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
  title   = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {140},
  pages   = {1-67},
  url     = {http://jmlr.org/papers/v21/20-074.html}
}
How to Get Started With the Model
from transformers import T5Tokenizer, T5ForConditionalGeneration
# Load tokenizer and model
tokenizer = T5Tokenizer.from_pretrained("AlanaBF/abf_quantized_t5_large")
model = T5ForConditionalGeneration.from_pretrained("AlanaBF/abf_quantized_t5_large")
# Example usage
input_text = "Translate English to German: How are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
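The same loaded model can be pointed at other tasks by changing the input prefix. The snippet below is an illustrative follow-up for summarization; max_new_tokens is an arbitrary choice, and output quality depends on the quantized weights.

# Reuse the tokenizer and model loaded above for summarization.
article = ("summarize: The tower is 324 metres tall, about the same height as "
           "an 81-storey building, and the tallest structure in Paris.")
article_ids = tokenizer(article, return_tensors="pt").input_ids
summary_ids = model.generate(article_ids, max_new_tokens=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))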