Commit a618bc2 by edugp (1 parent: 98c2b8e)

Add README

Files changed (1)
  1. README.md +5 -0
README.md ADDED
@@ -0,0 +1,5 @@
+ # Download datasets:
+ * Download and decompress the TSV file from here: https://github.com/google-research-datasets/wit/blob/main/DATA.md
+ * Use `prepare_wit.py` to download images from Wikipedia.
+ * Use `discard_incorrect_files` to filter out corrupt files. `TODO: Some corrupt files are still being kept.` `TODO: Make it a CLI.`
+ * Finally, use `run-clip.sh` to train.
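
The README steps can be approximated with a minimal standard-library sketch. This is not the code in `prepare_wit.py` or `discard_incorrect_files`; the function names below are hypothetical, and the `image_url` column name is an assumption about the TSV header. The signature check is only a cheap first pass: it catches empty or truncated JPEG/PNG downloads but does not decode the image, and it may reject valid files that carry trailing bytes.

```python
import csv
import gzip
import shutil
from pathlib import Path


def decompress_tsv(gz_path: Path) -> Path:
    """Decompress `name.tsv.gz` next to itself and return the `.tsv` path."""
    tsv_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(tsv_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return tsv_path


def iter_image_urls(tsv_path: Path, column: str = "image_url"):
    """Yield non-empty values of `column` (the column name is an assumption)."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            url = row.get(column)
            if url:
                yield url


def looks_like_valid_image(path: Path) -> bool:
    """Return True if the file has a JPEG or PNG signature and its end marker."""
    data = path.read_bytes()
    if data.startswith(b"\xff\xd8\xff"):  # JPEG start-of-image marker
        return data.endswith(b"\xff\xd9")  # JPEG end-of-image marker
    if data.startswith(b"\x89PNG\r\n\x1a\n"):  # PNG signature
        return data.endswith(b"IEND\xaeB`\x82")  # IEND chunk type plus its CRC
    return False
```

Files for which `looks_like_valid_image` returns `False` could then be deleted or skipped before training. A stricter filter would fully decode each image with an imaging library, which may be what the remaining TODO calls for.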