metadata

license: mit
datasets:
  - agicorp/python_code_instructions_18k_alpaca
language:
  - en
base_model:
  - openai-community/gpt2
pipeline_tag: text-generation
library_name: transformers
tags:
  - code

Model Card

GPT2Coder is a language model built on OpenAI's GPT-2 architecture. It was pre-trained on several code datasets focused on Python, along with natural-language text in Spanish and English.

The model has been pre-trained on only a moderate amount of code, so using it as-is is not recommended. It is functional, however, and can serve as a base for fine-tuning and other tasks.
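As a starting point for experimenting with the model, here is a minimal sketch of prompt construction and generation. It assumes an Alpaca-style prompt template, inferred only from the name of the dataset listed in the metadata; the card does not confirm the format used in training. The helper names `build_prompt` and `generate_code` are hypothetical, and the Hub repository id must be supplied by the caller because this card does not state it.

```python
# Alpaca-style templates. ASSUMPTION: chosen because the dataset name ends in
# "_alpaca"; the card itself does not state the prompt format.
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional input) as an Alpaca-style prompt."""
    if input_text:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)


def generate_code(model_id: str, instruction: str, max_new_tokens: int = 128) -> str:
    """Generate a completion with the transformers text-generation pipeline.

    model_id is the Hub repository id of the model. This card does not state
    it, so the caller has to pass the real id.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(
        build_prompt(instruction),
        max_new_tokens=max_new_tokens,
        return_full_text=False,  # return only the newly generated text
    )
    return outputs[0]["generated_text"]
```

Calling `generate_code(model_id, "Write a function that reverses a string.")` with the model's actual repository id downloads the weights on first use. Generated code should be reviewed before it is run.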

Model Details

Developed by: BueormAI
Shared by: BueormLLC
Model type: Transformer
Language(s) (NLP): English (en), Spanish (es)
License: MIT
Finetuned from model: GPT2 Architecture

Bias, Risks, and Limitations

The model can generate unexpected code and other output, including offensive text and non-functional code.

Recommendations

We recommend using the model with caution and reviewing its outputs carefully, since they may include non-functional, harmful, or dangerous code.

Training Details

Training Hyperparameters

Training regime: fp16 mixed precision
Max length: 1024 tokens
Pretraining epochs: 1
Fine-tuning epochs: 2
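To illustrate what a 1024-token maximum length means for data preparation, the sketch below splits a long sequence of token ids into windows of at most 1024 tokens. This is one common approach and an assumption on our part; the card does not describe how training sequences were actually built. The function name `chunk_token_ids` is hypothetical.

```python
from typing import Iterator, List


def chunk_token_ids(token_ids: List[int], max_length: int = 1024) -> Iterator[List[int]]:
    """Yield consecutive windows of at most max_length token ids."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    for start in range(0, len(token_ids), max_length):
        yield token_ids[start:start + max_length]


# Stand-in ids; in practice these would come from the GPT-2 tokenizer.
example_ids = list(range(2500))
chunks = list(chunk_token_ids(example_ids))
print([len(c) for c in chunks])  # [1024, 1024, 452]
```

The last window is shorter than the rest; whether to pad it, drop it, or merge it with the next document is a design choice the card does not specify.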

Environmental Impact

Hardware Type: GPU P100
Hours used: 18 hours
Cloud Provider: Kaggle
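The figures above can be turned into a rough energy estimate. The sketch below assumes the GPU ran at a constant 250 W, the rated board power of the PCIe P100, which makes the result an upper bound rather than a measurement. The grid carbon intensity is left as a parameter because the card does not state the data center region. All names here are hypothetical.

```python
HOURS_USED = 18        # from this card
P100_TDP_WATTS = 250   # ASSUMPTION: rated power of the PCIe P100, used as an upper bound


def energy_kwh(hours: float, power_watts: float) -> float:
    """Energy in kWh drawn by a device running at power_watts for hours."""
    return hours * power_watts / 1000.0


def emissions_kg(kwh: float, kg_co2_per_kwh: float) -> float:
    """CO2-equivalent emissions for a given grid carbon intensity."""
    return kwh * kg_co2_per_kwh


upper_bound_kwh = energy_kwh(HOURS_USED, P100_TDP_WATTS)
print(upper_bound_kwh)  # 4.5
```

Passing `upper_bound_kwh` and the carbon intensity of the relevant region to `emissions_kg` gives an emissions estimate under the same assumptions.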

By Bueorm

Thanks to everyone who downloads and supports our projects and shares a vision of the future with AI. We hope you will keep supporting us so that we can continue advancing and releasing more models.