apcl/jam-contextsum · Hugging Face

Jam-Contextsum

Jam-Contextsum is a GPT-2-like model fine-tuned to generate summaries that explain why a method exists.

ckpt_pretrain is the pretrained checkpoint that we fine-tune to generate summaries of why a method exists.
Our GitHub repo contains the code for reproduction using the same data.

Hyperparameter	Description	Value
e	embedding dimensions	512
L	number of layers	4
h	attention heads	4
c	block size / context length	1,024
b	batch size	4
a	accumulation steps	32
d	dropout	0.20
r	learning rate	3e-5
y	iterations	1e-5
iter	number of iterations after pretraining	137,900
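
The table implies a few derived quantities that are not listed explicitly. The sketch below collects the tabulated values in a small config object and computes them. The class and field names are hypothetical, chosen for illustration rather than taken from the repository, and the token count assumes single-device training in which every sequence fills the full context length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JamContextsumConfig:
    """Hypothetical container for the hyperparameters in the table above."""

    n_embd: int = 512             # e: embedding dimensions
    n_layer: int = 4              # L: number of layers
    n_head: int = 4               # h: attention heads
    block_size: int = 1024        # c: block size / context length
    batch_size: int = 4           # b: batch size
    grad_accum_steps: int = 32    # a: accumulation steps
    dropout: float = 0.20         # d: dropout
    learning_rate: float = 3e-5   # r: learning rate

    @property
    def head_dim(self) -> int:
        """Width of each attention head (embedding size split evenly across heads)."""
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head

    @property
    def effective_batch_size(self) -> int:
        """Sequences contributing to one optimizer update with gradient accumulation."""
        return self.batch_size * self.grad_accum_steps

    @property
    def tokens_per_update(self) -> int:
        """Upper bound on tokens per update, assuming every sequence is full length."""
        return self.effective_batch_size * self.block_size


config = JamContextsumConfig()
print(config.head_dim)              # 128
print(config.effective_batch_size)  # 128
print(config.tokens_per_update)     # 131072
```

Under these assumptions, a batch size of 4 with 32 accumulation steps behaves like a batch of 128 sequences per update, or at most 131,072 tokens.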