# gpt2-medium-finetuned-contract-gen

## Overview

`gpt2-medium-finetuned-contract-gen` is a model specialized in generating Solidity contract code. Derived from OpenAI's `gpt2-medium` model (`openai-community/gpt2-medium` on Hugging Face), it has been fine-tuned on an extensive set of Solidity contracts and patterns, making it apt for assisting in drafting or suggesting contract structures.
## Model Description

This model has been designed specifically for generating Solidity contracts. As a derivative of the `gpt2-medium` model, it retains the broader capabilities of the parent model while demonstrating particular proficiency in understanding and generating Solidity-centric text.
## Performance

The model achieves a loss of 0.3127 on the evaluation set.
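If this is the mean per-token cross-entropy loss (an assumption; the card does not state the loss type), it corresponds to a perplexity of roughly 1.37:

```python
import math

# Convert the reported evaluation loss to perplexity, assuming it is
# the mean per-token cross-entropy loss (not stated in the card).
eval_loss = 0.3127
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # Perplexity: 1.37
```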
## Intended Uses & Limitations

**Intended Uses:**
- Assist developers by auto-generating contract code snippets based on prompts.
- Help in understanding and drafting complex contract structures.
**Limitations:**
- Generated code must be reviewed for security and functional correctness before use (see the compile-check sketch after this list).
- The quality of the generated code largely depends on the specificity of the prompt.
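As a first-pass sanity check on generated output, the sketch below compiles a candidate contract with the third-party py-solc-x package. This library and workflow are assumptions on my part, not part of the model card, and passing compilation says nothing about the security or correctness of the contract's logic:

```python
# Minimal compile sanity check, assuming the third-party py-solc-x
# package (`pip install py-solc-x`); not part of this model card.
import solcx
from solcx.exceptions import SolcError

solcx.install_solc("0.8.19")  # pin a compiler version for reproducibility

def compiles(source: str) -> bool:
    """Return True if the generated Solidity source compiles."""
    try:
        solcx.compile_source(
            source, output_values=["abi", "bin"], solc_version="0.8.19"
        )
        return True
    except SolcError:
        return False

print(compiles("pragma solidity ^0.8.0;\ncontract MyToken {}"))  # True
```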
## Training Details

### Dataset

The model was fine-tuned on an undisclosed dataset comprising a range of Solidity contracts.
### Training Hyperparameters

- Learning rate: 5e-05
- Train batch size: 4
- Evaluation batch size: 4
- Seed: 42
- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
- Learning rate scheduler: cosine with restarts
- Warmup steps: 241
- Epochs: 4
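For reference, here is a minimal sketch of how these settings map onto Hugging Face `TrainingArguments` (assuming the 🤗 Trainer API was used, which the card does not confirm; the `output_dir` name is illustrative):

```python
from transformers import TrainingArguments

# Hyperparameters copied from this model card. Adam's betas=(0.9, 0.999)
# and epsilon=1e-08 match the TrainingArguments defaults, so they are
# not set explicitly here.
args = TrainingArguments(
    output_dir="gpt2-medium-finetuned-contract-gen",  # illustrative name
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="cosine_with_restarts",
    warmup_steps=241,
    num_train_epochs=4,
)
```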
### Training Results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.4744 | 0.21 | 1000 | 0.4736 |
| 0.467 | 0.41 | 2000 | 0.4146 |
| 0.4089 | 0.62 | 3000 | 0.3852 |
| 0.4018 | 0.83 | 4000 | 0.3688 |
| 0.3475 | 1.04 | 5000 | 0.3523 |
| 0.2751 | 1.24 | 6000 | 0.3434 |
| 0.2966 | 1.45 | 7000 | 0.3334 |
| 0.292 | 1.66 | 8000 | 0.3230 |
| 0.2899 | 1.87 | 9000 | 0.3200 |
| 0.2508 | 2.07 | 10000 | 0.3164 |
| 0.28 | 2.28 | 11000 | 0.3127 |
### Dependencies

- Transformers: 4.31.0
- PyTorch: 2.0.1+cu118
- Datasets: 2.14.2
- Tokenizers: 0.13.3
## How to Use

To generate Solidity contract code with this model, follow the steps below:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("ckandemir/gpt2-medium-finetuned-contract-gen")
model = AutoModelForCausalLM.from_pretrained("ckandemir/gpt2-medium-finetuned-contract-gen")

# Encode your code prompt
input_text = "contract MyToken"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Sample a completion (temperature controls randomness)
sample_output = model.generate(
    input_ids,
    do_sample=True,
    max_length=400,
    num_return_sequences=1,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)

# Decode and print the generated text
generated_text = tokenizer.decode(sample_output[0], skip_special_tokens=True)
print(generated_text)
```
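Equivalently, the Transformers `pipeline` API wraps model loading, generation, and decoding in one call:

```python
from transformers import pipeline

# The text-generation pipeline handles tokenization, generation, and
# decoding; the generation kwargs mirror the example above.
generator = pipeline(
    "text-generation",
    model="ckandemir/gpt2-medium-finetuned-contract-gen",
)
result = generator(
    "contract MyToken",
    do_sample=True,
    max_length=400,
    temperature=0.7,
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```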