Questions regarding Stack v2 and StarCoder v2

#21
by aditya2211 - opened

Hi BigCoders,

I had a few questions around Stack v2 and StarCoder v2:
(a) When can we expect the remaining Stack v2 data (documentation, etc.) to be released?
(b) For StarCoder v2 pretraining, what policy was used for packing/chunking? Were documents chunked into multiple segments during pretraining to (i) pack more and (ii) accommodate longer documents?
