
byt5-small-finetuned-English-to-BASH

Created by: Josh Shih, Alex Sha, and Kevin Um for EEP 596 - Natural Language Processing at the University of Washington (Seattle).

Model description

This model is a fine-tuned version of google/byt5-small on a more balanced iteration of the NL2BASH dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4850
  • Nl2bash M: 0.6376
  • Gen Len: 16.9946

Intended uses & limitations

Purpose: to generate Bash commands from natural-language input and to help people learn the Linux Bash shell. This is a proof-of-concept model that uses transfer learning to fine-tune an existing language model so that it produces structured code instead of natural language.
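Below is a minimal inference sketch using transformers; the repo id is a placeholder for this model's actual Hub path, and the example prompt is illustrative.

```python
# Minimal inference sketch. MODEL_ID is a placeholder; substitute this
# model's actual Hugging Face Hub repo id.
from transformers import AutoTokenizer, T5ForConditionalGeneration

MODEL_ID = "byt5-small-finetuned-English-to-BASH"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)

prompt = "list all files in the current directory sorted by size"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

As with any generative model, emitted commands should be reviewed before they are run in a shell.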

Training and evaluation data

This model was trained and evaluated on a custom iteration of NL2BASH. The original NL2BASH dataset has a large class imbalance: a disproportionate number of its Bash commands begin with 'find'.

A maximum per-command threshold was set: text/BASH pairs from over-represented commands were removed until each class fit under the threshold, and the GPT-3 API was used to generate additional text/BASH pairs for commands below it. A minimal sketch of the capping step is shown below.
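The sketch assumes the dataset is a list of (text, bash) tuples and that a pair's class is the first whitespace-separated token of its Bash command; the threshold value here is illustrative, not the one actually used.

```python
import random
from collections import defaultdict

def cap_command_classes(pairs, threshold=600, seed=42):
    """Downsample over-represented command classes.

    pairs: list of (text, bash) tuples; the class of a pair is the first
    token of its Bash command (e.g. 'find', 'ls').
    """
    by_command = defaultdict(list)
    for text, bash in pairs:
        by_command[bash.strip().split()[0]].append((text, bash))

    rng = random.Random(seed)
    balanced = []
    for group in by_command.values():
        if len(group) > threshold:
            group = rng.sample(group, threshold)  # cap the class at the threshold
        balanced.extend(group)
    return balanced
```

Downsampling over-represented classes while generating new pairs for under-represented ones keeps the per-command distribution closer to uniform.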

~5,500 original text/BASH pairs and ~5,700 generated text/BASH pairs were used, for a total of ~11,200 text/BASH pairs. The class distribution for the top-5 commands is shown below.

(Figure: class_balanced.png — class distribution for the top-5 command classes.)

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
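
These settings correspond roughly to the Seq2SeqTrainingArguments sketch below (the output path is a placeholder, and the Adam betas and epsilon match the transformers defaults, so they are not set explicitly):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="byt5-small-finetuned-English-to-BASH",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    evaluation_strategy="epoch",  # per-epoch validation, as in the results table
    predict_with_generate=True,   # generate text at eval time for Gen Len / Nl2bash M
)
```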

Training results

| Training Loss | Epoch | Step | Validation Loss | Nl2bash M | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------:|
| 1.6031        | 1.0   | 561  | 0.8678          | 0.2384    | 16.9411 |
| 0.9581        | 2.0   | 1122 | 0.6940          | 0.4089    | 17.2855 |
| 0.7882        | 3.0   | 1683 | 0.6043          | 0.4878    | 17.1481 |
| 0.7080        | 4.0   | 2244 | 0.5689          | 0.5439    | 17.1427 |
| 0.6459        | 5.0   | 2805 | 0.5368          | 0.5716    | 16.8930 |
| 0.5978        | 6.0   | 3366 | 0.5189          | 0.5903    | 17.1615 |
| 0.5562        | 7.0   | 3927 | 0.5053          | 0.6162    | 17.0571 |
| 0.5368        | 8.0   | 4488 | 0.4914          | 0.6220    | 17.0705 |
| 0.5012        | 9.0   | 5049 | 0.4880          | 0.6393    | 16.8359 |
| 0.4956        | 10.0  | 5610 | 0.4850          | 0.6376    | 16.9946 |

Framework versions

  • Transformers 4.27.0.dev0
  • Pytorch 1.13.1+cu116
  • Datasets 2.10.0
  • Tokenizers 0.13.2