Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published 13 days ago • 127
WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Paper • 2408.03837 • Published Aug 7 • 17
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 7 days ago • 585
Article Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing By Pclanglais • Jul 19 • 17
BM25S: Orders of magnitude faster lexical search via eager sparse scoring Paper • 2407.03618 • Published Jul 4 • 10
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28 • 93
Probably DPO datasets Collection A collection of datasets that probably support DPO • 146 items • Updated Jun 26 • 12
TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings Paper • 2406.15586 • Published Jun 21 • 2
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch Paper • 2406.14563 • Published Jun 20 • 30
∇²DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials Paper • 2406.14347 • Published Jun 20 • 99
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains Paper • 2406.12045 • Published Jun 17 • 5
Multimodal Models 🔀 Collection A collection of multimodal models developed by the Komorebi AI team • 2 items • Updated Jun 18 • 2
Aligning to Thousands of Preferences via System Message Generalization Paper • 2405.17977 • Published May 28 • 6
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • Apr 28 • 37
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • May 7 • 7
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 68
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Jun 4 • 69
Configurable Safety Tuning ⚙️ Collection CST allows for configurable inference-time control of LLM safety levels, so users can dictate model behavior based on the system prompt • 10 items • Updated Jul 24 • 2
Configurable Safety Tuning of Language Models with Synthetic Preference Data Paper • 2404.00495 • Published Mar 30 • 2
Quantized Models (GGUF, IQ, Imatrix) Collection Various quantizations of models in the GGUF format. Models with a "checkmark" are personal favorites. An "orange arrow" means it's being uploaded. • 86 items • Updated 4 days ago • 45
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs Paper • 2402.08005 • Published Feb 12 • 1
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models Paper • 2402.03749 • Published Feb 6 • 12
🛰️🌍 Geospatial Datasets Collection A curated collection of diverse geospatial and satellite imagery datasets. • 54 items • Updated Mar 6 • 14
Exotic Frankenmerges 🥨 Collection Merges of models of different architectures and sizes that end up working surprisingly well • 1 item • Updated Jun 13 • 1
Upscaled Models ⏫ Collection A collection of my frankenmerges, upscaling several models. All of them have the corresponding GGUF variants. • 4 items • Updated Jun 13 • 2
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 212
Distilled Self-Critique of LLMs with Synthetic Data: a Bayesian Perspective Paper • 2312.01957 • Published Dec 4, 2023 • 1
Optimised Translation Models 🌍 Collection A collection of optimised and quantised multilingual translation models • 6 items • Updated Nov 7, 2023 • 3
Fast Adaptation with Bradley-Terry Preference Models in Text-To-Image Classification and Generation Paper • 2308.07929 • Published Jul 15, 2023 • 1
Personalizing Text-to-Image Generation via Aesthetic Gradients Paper • 2209.12330 • Published Sep 25, 2022 • 1
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition Paper • 2307.13269 • Published Jul 25, 2023 • 31