
Jam_so

Jam_so is a GPT-2-like language model for research on fine-grained Java analysis. It is intended for analyzing Java source code at the level of methods, statements, and variables, and as a foundation for downstream tasks such as code completion, comment generation, and automated bug repair.


Jam_so Training Details

  • We trained the jam_so model using the training procedures from Daniel Grittner's NanoGPT-LoRA.

  • The model was trained on our own so13m dataset, processed from 13 million StackOverflow posts drawn from a Stack Exchange data dump covering January 2014 through December 2022.

  • We trained the model on the training set for 1 epoch, roughly 300,000 training iterations.

  • Our GitHub repo contains the code for re-training using the raw data.
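As a rough check on the scale of one epoch, the sketch below multiplies out the tokens processed per training iteration. It assumes nanoGPT-style accounting on a single GPU, where one iteration consumes batch size × accumulation steps sequences of block-size tokens. The batch size (4), accumulation steps (32), and block size (256) come from the hyperparameter table below; the function name is illustrative only, and the total is an upper-bound estimate rather than a reported figure.

```python
def tokens_per_iteration(batch_size: int, accumulation_steps: int, block_size: int) -> int:
    """Tokens consumed by one optimizer step on a single GPU (assumed accounting)."""
    return batch_size * accumulation_steps * block_size


# Values from the hyperparameter table: b = 4, a = 32, c = 256.
per_iter = tokens_per_iteration(batch_size=4, accumulation_steps=32, block_size=256)

# "Roughly 300,000 training iterations" for one epoch, as stated above.
approx_iterations = 300_000
total_tokens = per_iter * approx_iterations

print(f"tokens per iteration: {per_iter:,}")
print(f"approximate tokens in one epoch: {total_tokens:,}")
```

Under these assumptions, each iteration covers 32,768 tokens, so one epoch corresponds to roughly 9.8 billion tokens.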

Hyperparameter   Description                   Value
e                embedding dimensions          1024
L                number of layers              24
h                attention heads               16
c                block size / context length   256
b                batch size                    4
a                accumulation steps            32
d                dropout                       0.20
r                learning rate                 3e-5
y                weight decay                  1e-1
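To make the table easier to reuse, here is a minimal sketch that expresses it as a configuration dictionary and derives two quantities from it. The key names follow nanoGPT's usual configuration variables and are an assumption here; the names actually used in our training scripts may differ. The parameter count uses the common 12 · L · e² approximation for GPT-2-style transformer blocks, which excludes token and position embeddings, biases, and layer norms (the vocabulary size is not listed in this card).

```python
# Hyperparameters from the table above, under assumed nanoGPT-style key names.
config = {
    "n_embd": 1024,                     # e: embedding dimensions
    "n_layer": 24,                      # L: number of layers
    "n_head": 16,                       # h: attention heads
    "block_size": 256,                  # c: block size / context length
    "batch_size": 4,                    # b: batch size
    "gradient_accumulation_steps": 32,  # a: accumulation steps
    "dropout": 0.20,                    # d: dropout
    "learning_rate": 3e-5,              # r: learning rate
    "weight_decay": 1e-1,               # y: weight decay
}

# The embedding width must divide evenly across the attention heads.
assert config["n_embd"] % config["n_head"] == 0
head_dim = config["n_embd"] // config["n_head"]

# Approximate non-embedding parameter count: each block has about 4*e^2
# attention weights and 8*e^2 MLP weights, i.e. 12*e^2 per layer.
approx_params = 12 * config["n_layer"] * config["n_embd"] ** 2

print(f"dimension per attention head: {head_dim}")
print(f"approximate non-embedding parameters: {approx_params:,}")
```

With these values, each attention head has dimension 64 and the transformer blocks hold roughly 302 million parameters by this approximation.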

We trained our models on a single NVIDIA A5000 GPU.


Jam Projects

Current projects using the jam_so pre-trained model can be found in our GitHub repository:

https://github.com/apcl-research/jam

