Commit History
More flexibility in specifying cache directory.
101aa18
meg-huggingface
committed on
Scripts to generate cache
db74ba9
meg-huggingface
committed on
Standardizing file naming a bit.
0803ab3
meg-huggingface
committed on
More modularizing; npmi and labels
a2ae370
meg-huggingface
committed on
Some additional modularizing and caching of the text lengths widget
335424f
meg-huggingface
committed on
Modularization and caching of text length widget
85cf91c
meg-huggingface
committed on
Removes extraneous debugging print statements
6a9c993
meg-huggingface
committed on
Missing a dependency; adding to requirements.txt
6557527
meg-huggingface
committed on
Begins modularizing so that each widget can be loaded independently, without depending on the ordering of load_or_preparing in app.py. Each function corresponding to a widget now checks whether the variables it depends on have been calculated yet; if not, it calls back to calculate them. Because passing the use_cache variable around made this messy, use_cache is now a global variable, set when the DatasetStatisticsCacheClass is initialized, and the use_cache arguments that appeared in nearly every function have been removed.
4b53042
meg-huggingface
committed on
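The dependency check described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the class `LazyWidgetStats`, its method names, and the whitespace tokenizer are hypothetical, and the sketch leaves out disk caching and the use_cache flag entirely.

```python
class LazyWidgetStats:
    """Hypothetical stand-in for a statistics class; not the real DatasetStatisticsCacheClass."""

    def __init__(self, texts):
        self.texts = texts
        self.tokens = None    # filled in on first request
        self.lengths = None   # depends on self.tokens
        self.computed = []    # records the order in which values were calculated

    def load_or_prepare_tokens(self):
        # Calculate the tokens only if no earlier call has done so.
        if self.tokens is None:
            self.tokens = [text.split() for text in self.texts]
            self.computed.append("tokens")
        return self.tokens

    def load_or_prepare_lengths(self):
        # This widget depends on the tokens; it calls back to get them
        # instead of assuming the caller prepared them first.
        if self.lengths is None:
            tokens = self.load_or_prepare_tokens()
            self.lengths = [len(t) for t in tokens]
            self.computed.append("lengths")
        return self.lengths


stats = LazyWidgetStats(["a b c", "d e"])
print(stats.load_or_prepare_lengths())
```

With this shape, the caller can request the widgets in any order; each value is calculated at most once.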
Removing the need to keep the base dataset around for the header widget; now saving only what is shown (the first n lines of the base dataset) as JSON, and loading it if it's cached.
66693d5
meg-huggingface
committed on
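One way to picture the header-widget change is the sketch below: keep only the first n rows, write them to a JSON file, and read that file back on later runs. The function name `load_or_prepare_head`, its parameters, and the cache path are assumptions made for illustration; the repository's actual function and file layout may differ.

```python
import json
import os
from itertools import islice


def load_or_prepare_head(rows, cache_path, n=5, use_cache=True):
    """Return the first n rows, reusing a JSON cache file when allowed (hypothetical helper)."""
    if use_cache and os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    # Only the first n rows are materialized; the rest of the dataset is never stored.
    head = list(islice(rows, n))
    os.makedirs(os.path.dirname(cache_path) or ".", exist_ok=True)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(head, f)
    return head
```

Once the cache file exists, the widget can be drawn without the base dataset being loaded at all.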
Removing any need for a dataframe in expander_general_stats; instead caching and loading the small amount of detail this widget needs. Note that I also moved a couple of functions around (same content, just moved) so that the code was easier for me to navigate. I also pulled a couple of sub-functions out of larger functions, again to make the code easier to work with and understand, and to further modularize so we can limit what needs to be cached.
e1f2cc3
meg-huggingface
committed on
Splitting prepare_dataset into two steps: preparing the base dataset and preparing the tokenized dataset. This gives us further control over caching and loading data, and should eventually let us stop storing the base dataset.
6af9ef6
meg-huggingface
committed on
Updating NLTK requirements due to a vulnerability in versions below 3.6.4, which contained an inefficient regular expression and were vulnerable to regular expression denial of service (ReDoS) attacks.
937841c
meg-huggingface
committed on
Continuing cache minimizing in new repository. Please see https://github.com/huggingface/DataMeasurements for full history
d8ab532
meg-huggingface
committed on
:art: add line to file to bump ci
7c5b4e0
yourusername
committed on
hate speech18 cache
976b82a
meg-huggingface
committed on
hate speech 18 pmi file cache
6508f0b
meg-huggingface
committed on
Test to push up a simple cache file
14dcacc
meg-huggingface
committed on
:construction_worker: update CI to rebase
6dec358
yourusername
committed on
:bug: filter_vocab -> filter_words
78cc3f9
yourusername
committed on
:bug: really make sure log_files/ exists
e1cd6af
yourusername
committed on
:bug: add log_files dir if not exists
c070f8c
yourusername
committed on
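The fix named in the two log_files commits above is commonly written with `os.makedirs(..., exist_ok=True)`, which creates the directory when it is missing and does nothing when it already exists. The helper name `ensure_log_dir` is hypothetical, and this is only a sketch of the general technique, not the code from these commits.

```python
import os


def ensure_log_dir(path="log_files"):
    """Create the log directory if it is missing; safe to call repeatedly (hypothetical helper)."""
    os.makedirs(path, exist_ok=True)
    return path
```

Calling this once at startup, before any log file is opened, avoids a FileNotFoundError on a fresh checkout.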
:rocket: add app
e88b792
yourusername
committed on
:rocket: add app and reqs
64a1ca0
yourusername
committed on
:construction_worker: add CI
3c3199f
yourusername
committed on
:bug: remove line added by CoPilot
3f4a261
yourusername
committed on
:memo: add README.md
07eebf0
yourusername
committed on
:tada: init
9b51db9
yourusername
committed on
Initial commit
b9430ed
Yacine Jernite
committed on