Cached Transformers: Improving Transformers with Differentiable Memory Cache
Paper • 2312.12742 • Published • 12
ProTIP: Progressive Tool Retrieval Improves Planning
Paper • 2312.10332 • Published • 7
Paloma: A Benchmark for Evaluating Language Model Fit
Paper • 2312.10523 • Published • 12
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper • 2406.17557 • Published • 86
daje kang
daje
AI & ML interests
None yet
Organizations
None yet
Collections
1
Models
16
daje/code-llama-7b-text-to-sql • Updated
daje/chapter5_code-llama3-8B-text-to-sql-ver0.1 • Updated
daje/chapter5_psychological_chatbots • Updated
daje/20240830_model • Updated
daje/meta-llama3.1-8B-qna-koalpaca-v1.1 • Text Generation • Updated • 10
daje/model_output • Updated
daje/chinese_results_20240729_021938 • Updated
daje/code-llama3-8B-text-to-sql-ver0.1 • Text Generation • Updated • 9
daje/code-llama3-8B-text-to-sql • Updated
daje/code-llama3-8b-kotext-to-sql • Updated
Datasets
8
daje/keyword_summary_3.2.1 • Viewer • Updated • 1k • 38
daje/kotext-to-sql-v1 • Viewer • Updated • 262k • 49
daje/mistral_tokenized_en_wiki • Viewer • Updated • 16.1M • 79
daje/mistral_tokenized_ko_wiki • Viewer • Updated • 1.7M • 39
daje/tokenized_enwiki • Viewer • Updated • 16.4M • 133
daje/tokenized_kowiki • Viewer • Updated • 1.71M • 46
daje/en_wiki • Viewer • Updated • 5.09M • 293
daje/ko_wiki • Viewer • Updated • 311k • 144 • 5