metadata

license: bigscience-openrail-m
datasets:
  - apcl/jm52m

Jam

Jam is a GPT-2-like language model for research on Java source code. It is intended for fine-grained analysis at the level of methods, statements, and variables, and as a foundation for downstream tasks such as code completion, comment generation, and automated bug repair.

Jam Training Details

We trained the Jam model using the training procedures from Daniel Grittner's NanoGPT-LoRA.
The model was trained on our own jm52m dataset, which consists of the processed source code of 52 million Java methods.
We trained the model on the training set for 1 epoch, roughly 300,000 training iterations.
Our GitHub repo contains the code for re-training from the raw data.

Hyperparameter	Description	Value
e	embedding dimensions	1024
L	number of layers	24
h	attention heads	16
c	block size / context length	256
b	batch size	4
a	accumulation steps	32
d	dropout	0.20
r	learning rate	3e-5
y	weight decay	1e-1
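
As a quick sanity check on the table, the sketch below collects the values into a small Python config and derives a few quantities from them. The class name, field names, and helper methods are illustrative only and are not taken from the Jam code base. The parameter estimate uses the common 12 * L * e^2 rule of thumb for the transformer blocks and ignores embeddings, biases, and layer norms, because the vocabulary size is not stated here. The token count assumes that one training iteration processes b * a full-length sequences, which is an assumption about how iterations are counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JamConfig:
    """Hyperparameters copied from the table; field names are hypothetical."""
    n_embd: int = 1024            # e: embedding dimensions
    n_layer: int = 24             # L: number of layers
    n_head: int = 16              # h: attention heads
    block_size: int = 256         # c: block size / context length
    batch_size: int = 4           # b: batch size
    grad_accum_steps: int = 32    # a: accumulation steps
    dropout: float = 0.20         # d: dropout
    learning_rate: float = 3e-5   # r: learning rate
    weight_decay: float = 1e-1    # y: weight decay

    def head_dim(self) -> int:
        # Each attention head works on an equal slice of the embedding.
        return self.n_embd // self.n_head

    def effective_batch_size(self) -> int:
        # Sequences contributing to one optimizer step.
        return self.batch_size * self.grad_accum_steps

    def tokens_per_iteration(self) -> int:
        # Upper bound: assumes every sequence fills the whole context.
        return self.effective_batch_size() * self.block_size

    def approx_block_params(self) -> int:
        # Rule of thumb 12 * L * e^2; excludes embeddings, biases, layer norms.
        return 12 * self.n_layer * self.n_embd ** 2

    def tokens_seen(self, iterations: int) -> int:
        # Assumes one iteration equals one optimizer step over b * a sequences.
        return iterations * self.tokens_per_iteration()


cfg = JamConfig()
print(cfg.head_dim())              # 64
print(cfg.effective_batch_size())  # 128
print(cfg.tokens_per_iteration())  # 32768
print(cfg.approx_block_params())   # 301989888
print(cfg.tokens_seen(300_000))    # 9830400000
```

Under these assumptions, gradient accumulation lets a per-step batch of 4 sequences behave like a batch of 128, which is what makes this model size practical to train on a single GPU.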

We trained our models on a single NVIDIA A5000 GPU.

Jam Projects

Current projects using the Jam pre-trained model can be found in our GitHub repository:

https://github.com/apcl-research/jam