MultiPL-T DeepSeekCoder-33b-Base

This repository holds a fine-tune of DeepSeekCoder-33b-base on MultiPL-T Racket. Examine the commit message to determine the language and checkpoint; there is a checkpoint for each training epoch.

For more information about the training process, see the MultiPL-T paper:

@misc{cassano:multipl-t,
      title={Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs}, 
      author={Federico Cassano and John Gouwar and Francesca Lucchetti and Claire Schlesinger and Anders Freeman and Carolyn Jane Anderson and Molly Q Feldman and Michael Greenberg and Abhinav Jangda and Arjun Guha},
      year={2024},
      eprint={2308.09895},
      archivePrefix={arXiv},
      primaryClass={cs.PL}
}

For usage instructions, see the model card for the original model. Replace the model name with the name of this repository, and set revision=COMMIT_HASH.
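As a concrete illustration of the instructions above, here is a minimal sketch of loading this repository with the Hugging Face transformers library, pinning a specific epoch checkpoint via the revision argument. The commit hash is a placeholder; substitute the hash of the checkpoint you want from the commit history.

```python
# Sketch: load the MultiPL-T fine-tune at a specific checkpoint.
# COMMIT_HASH is a placeholder; replace it with a real commit hash
# from this repository's history (one checkpoint per epoch).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nuprl/MultiPL-T-DeepSeekCoder_33b"
REVISION = "COMMIT_HASH"

def load_checkpoint(model_id: str = MODEL_ID, revision: str = REVISION):
    # revision= pins both tokenizer and weights to one commit,
    # so the epoch you evaluate is reproducible.
    tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_checkpoint()
```

Note that this is a 33B-parameter model stored in F32, so loading the full weights requires substantial GPU memory; quantized variants may be more practical on smaller hardware.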
