Biomedical Collection Models for biomedical research applications, such as radiology report generation and biomedical language understanding. • 9 items • Updated 20 days ago • 4
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI Paper • 2411.04872 • Published 14 days ago • 4
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated about 9 hours ago • 172
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 15 items • Updated 19 days ago • 76
inftyBench: Extending Long Context Evaluation Beyond 100K Tokens Paper • 2402.13718 • Published Feb 21 • 1
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations Paper • 2408.08459 • Published Aug 15 • 44
Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them? Paper • 2404.12691 • Published Apr 19 • 2
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI Paper • 2310.16787 • Published Oct 25, 2023 • 5
Consent in Crisis: The Rapid Decline of the AI Data Commons Paper • 2407.14933 • Published Jul 20 • 12
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • May 15 • 12
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 68
Bio Series Collection Embeddings and NLG related to biology / amino acid sequences • 10 items • Updated Sep 19 • 1
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24 • 59
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning Paper • 2402.03046 • Published Feb 5 • 6
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 20