Is there anything about the training data that makes this specifically better at Java and C++?
Hi, just converting this model to GGUF format now and have a couple of questions.
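For anyone else converting locally, the flow I'm using is roughly the one below. This is just a sketch assuming llama.cpp's standard convert.py script and quantize binary; the exact script names, flags, and binary paths vary between llama.cpp versions, so check your own checkout.

```python
# Rough sketch of the llama.cpp GGUF conversion flow.
# Paths, script names, and flags here are assumptions -- they change
# between llama.cpp versions.
import subprocess

MODEL_DIR = "path/to/hf-checkpoint"   # local HF snapshot of the model
F16_GGUF = "model-f16.gguf"

# 1. Convert the Hugging Face checkpoint to an f16 GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert.py", MODEL_DIR,
     "--outtype", "f16", "--outfile", F16_GGUF],
    check=True,
)

# 2. Quantize, e.g. to Q4_K_M, with the compiled quantize binary.
subprocess.run(
    ["llama.cpp/quantize", F16_GGUF, "model-q4_k_m.gguf", "Q4_K_M"],
    check=True,
)
```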
From: https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard
humaneval-python = 76.83
java = 60.76
javascript = 66.46
cpp = 65.22
Is there anything about the training data that makes this specifically better at Java and C++? This seems to be the first recently fine-tuned coding model I've seen that isn't massively biased towards Python (to game the humaneval-python benchmarks, etc.). The recent WizardCoder-33B-V1.1, which is also fine-tuned from Deepseek-Coder-33B, is so over-trained on Python that it tries to convert any C++ or Java code it's given into Python, and is basically unusable for anything else!
I will give it a try and report back on how I get on.
Sadly I don't have enough upload bandwidth to upload the GGUF(s), but hopefully @TheBloke or @LoneStriker will convert it soon, as a non-Python-targeted fine-tune could be very useful to a lot of people.
I'm sorry for not responding sooner.
In training this model, we used unit-test-generation data covering Java/C++, as well as code-practice exercises (also covering Java/C++) that we constructed ourselves, following the approach of the Phi "Textbooks" work. We have published an article with more details on our WeChat official account; I apologize that it is written in Chinese: https://mp.weixin.qq.com/s/2Ddm7-aUJuEnsESSxkmkGg
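To give a rough sense of what the unit-test-generation data looks like, here is a simplified, purely illustrative sample I've sketched for this reply; the field names and content are hypothetical, not copied from our actual training set.

```python
# Hypothetical illustration of a Java unit-test-generation sample --
# the format and content are invented for this example, not taken
# from the real training data.
sample = {
    "instruction": (
        "Write JUnit tests for the following Java method "
        "(assume it lives in a class named Clamp):\n\n"
        "public static int clamp(int v, int lo, int hi) {\n"
        "    return Math.max(lo, Math.min(hi, v));\n"
        "}"
    ),
    "response": (
        "import org.junit.jupiter.api.Test;\n"
        "import static org.junit.jupiter.api.Assertions.assertEquals;\n\n"
        "class ClampTest {\n"
        "    @Test\n"
        "    void clampsBelowRange() { assertEquals(0, Clamp.clamp(-5, 0, 10)); }\n"
        "    @Test\n"
        "    void clampsAboveRange() { assertEquals(10, Clamp.clamp(42, 0, 10)); }\n"
        "    @Test\n"
        "    void passesThroughInRange() { assertEquals(7, Clamp.clamp(7, 0, 10)); }\n"
        "}"
    ),
}
```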