Jam_sojm
Jam_sojm is a GPT2-like model for research on fine-grained analysis of Java source code at the level of methods, statements, and variables. It is intended as a foundation for downstream tasks such as code completion, comment generation, and automated bug repair.
Jam_sojm Training Details
We trained the jam_sojm model using the training procedure from Daniel Grittner's NanoGPT-LoRA.
The model is trained on our own so13m and jm52m datasets.
First, we train the model on the so13m training set for one epoch, roughly 300,000 training iterations.
We then reset the learning rate and weight decay and train on the jm52m training set for one more epoch, roughly 300,000 additional iterations, for a total of 600,000 iterations.
Our GitHub repo contains the code for re-training the model from the raw data.
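The sketch below illustrates this two-stage schedule. It is a minimal, hypothetical outline in PyTorch rather than our actual training script: the model object and the `get_batch` helper are stand-ins for the corresponding pieces of the NanoGPT-LoRA codebase.

```python
import torch

def pretrain_stage(model, get_batch, iters, lr=3e-5, weight_decay=1e-1,
                   batch_size=4, accum_steps=32, device="cuda"):
    # A fresh optimizer per stage is how "resetting the learning rate and
    # weight decay" between stages is modeled in this sketch.
    optim = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    model.train()
    for _ in range(iters):
        optim.zero_grad(set_to_none=True)
        for _ in range(accum_steps):
            x, y = get_batch(batch_size, device)   # hypothetical data helper
            _, loss = model(x, targets=y)          # GPT2-like forward returning (logits, loss)
            (loss / accum_steps).backward()        # gradient accumulation
        optim.step()

# Stage 1: so13m for one epoch (~300,000 iterations), then
# Stage 2: jm52m for one epoch (~300,000 more iterations).
# pretrain_stage(model, so13m_batches, iters=300_000)
# pretrain_stage(model, jm52m_batches, iters=300_000)
```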
| Hyperparameter | Description | Value |
|---|---|---|
| e | embedding dimensions | 1024 |
| L | number of layers | 24 |
| h | attention heads | 16 |
| c | block size / context length | 256 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| d | dropout | 0.20 |
| r | learning rate | 3e-5 |
| y | weight decay | 1e-1 |
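For readers reproducing the setup, the table maps roughly onto nanoGPT-style configuration variables as in the snippet below; the field names follow the public nanoGPT conventions and may not match our scripts exactly.

```python
# Hyperparameters from the table above in nanoGPT-style naming
# (an approximate mapping, not a dump of our actual config file).
n_embd = 1024                       # e: embedding dimensions
n_layer = 24                        # L: number of layers
n_head = 16                         # h: attention heads
block_size = 256                    # c: block size / context length
batch_size = 4                      # b: micro-batch size
gradient_accumulation_steps = 32    # a: accumulation steps
dropout = 0.20                      # d: dropout
learning_rate = 3e-5                # r: learning rate
weight_decay = 1e-1                 # y: weight decay
```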
We train our models on a single NVIDIA A5000 GPU.
Jam Projects
Current projects using the jam_sojm pre-trained model can be found in our GitHub repository: