competitions/CryCeleb2023 · How is the data split between the public and private test sets?

dhoa

May 21, 2023

Could you explain how the data is divided between the public and private test sets? I can't find any information about this split (I'd like to know how much confidence to place in the public leaderboard). Thanks!

gorinars

May 23, 2023

The private and public sets are split at random, but the public set is much smaller (1024 pairs, 32 infants). The private set has 24576 pairs and 160 infants.
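To put those pair counts in perspective, here is a quick arithmetic sketch. The variable names are illustrative, and the noise estimate assumes pairs behave like independent samples, which is not stated in this thread and is probably optimistic because many pairs share the same infants.

```python
import math

# Pair counts quoted in the post above.
public_pairs = 1024
private_pairs = 24576

total_pairs = public_pairs + private_pairs
public_share = public_pairs / total_pairs

# Rough sampling-noise ratio, ASSUMING pairs were independent draws
# (they are not: pairs share infants, so the real noise is likely larger).
noise_ratio = math.sqrt(private_pairs / public_pairs)

print(f"public share of pairs: {public_share:.1%}")
print(f"public noise vs private (idealized): {noise_ratio:.1f}x")
```

Under that idealized assumption, the public leaderboard covers about 4% of all pairs, and a score measured on it would fluctuate roughly five times more than the same score measured on the private set.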

We will look more closely at the private set toward the end of the competition. If the differences are too large, we may consider opening the private leaderboard for a couple of submissions so that participants can select their best model.

At this point, I can say the private/public trend looks OK, but there are a few outliers.
We strongly encourage using the dev set and perhaps cross-validation.
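For anyone unsure how to set up cross-validation here, below is a minimal sketch that keeps all recordings of one infant in the same fold. Grouping by infant is my assumption, not something the organizers prescribed, and `infant_grouped_folds` and `infant_ids` are hypothetical names rather than part of any competition code.

```python
import random
from collections import defaultdict

def infant_grouped_folds(infant_ids, k=5, seed=0):
    """Split sample indices into k folds so that no infant spans two folds."""
    by_infant = defaultdict(list)
    for idx, infant in enumerate(infant_ids):
        by_infant[infant].append(idx)

    # Shuffle infants (not samples) so each infant lands in exactly one fold.
    infants = sorted(by_infant)
    random.Random(seed).shuffle(infants)

    folds = [[] for _ in range(k)]
    for i, infant in enumerate(infants):
        folds[i % k].extend(by_infant[infant])
    return folds

# Toy example: 6 hypothetical infants with 2 recordings each.
ids = ["a", "a", "b", "b", "c", "c", "d", "d", "e", "e", "f", "f"]
folds = infant_grouped_folds(ids, k=3)
for held_out, val_idx in enumerate(folds):
    train_idx = [i for j, f in enumerate(folds) if j != held_out for i in f]
    print(held_out, sorted(val_idx), len(train_idx))
```

Averaging the validation score over the folds, and looking at its spread, gives a rough idea of how much a score on a small set such as the public one can move by chance.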

fconti

Jun 19, 2023

I believe the public set is actually quite small, even smaller than the dev set. There might be significant surprises in the final leaderboard.

gorinars

Jun 19, 2023

@fconti Yes. This is why we decided to open the private set for a couple of days before the competition ends (see the timeline).