Dataset
#1
by
titan087
- opened
Hey,
What dataset did you use to fine-tune this model? I was looking for one to fine-tune CodeLlama 34B and haven't found one that looked good.
Thanks!
Same here. So I chose a benchmark dataset.
https://huggingface.co/datasets/codeparrot/xlcost-text-to-code
The JavaScript subsection has about 10K rows. I felt that was good enough for a fine-tune. Let me know your thoughts as well.
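For reference, here is a minimal sketch of how rows from a text-to-code dataset like that one could be turned into training strings for a causal-LM fine-tune. The field names (`text`, `code`) and the instruction/response template are assumptions, not something the dataset guarantees, so check the actual schema first:

```python
# Hypothetical sketch: format text-to-code pairs into single training
# strings for supervised fine-tuning. The "text" and "code" field names
# are assumptions about the dataset schema.

def format_example(example):
    """Turn one {text, code} row into one prompt/response training string."""
    return (
        "### Instruction:\n" + example["text"].strip() + "\n"
        "### Response:\n" + example["code"].strip()
    )

# Toy rows standing in for real dataset records
rows = [
    {"text": "add two numbers", "code": "function add(a, b) { return a + b; }"},
    {"text": "square a number", "code": "function sq(x) { return x * x; }"},
]

samples = [format_example(r) for r in rows]
for s in samples:
    print(s)
    print("---")
```

With ~10K rows this kind of simple template is usually enough for a first pass; the template itself matters less than using the same one consistently at train and inference time.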
It's worth a shot. For a basic test I can try training either Gemma or Llama 3, or potentially Phi-3, at least to start with. If it works well enough, then scale it up to one of the coding-based 34B models.