PRefLexOR - a lamm-mit Collection

PRefLexOR

updated 23 days ago

PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking

Paper • 2410.12375 • Published Oct 16 • 2

Note: Paper on arXiv

lamm-mit/PRefLexOR_ORPO_DPO_EXO_10242024

Text Generation • Updated 23 days ago • 69

Note: The model produces thinking tokens before answering.

lamm-mit/PRefLexOR_ORPO_DPO_EXO_REFLECT_10222024

Text Generation • Updated 23 days ago • 20 • 1

Note: The model produces both thinking and reflection tokens before answering.

lamm-mit/meta-llama-Meta-Llama-3.2-3B-Instruct-Reasoning-Tokenizer

Updated 29 days ago

Note: Llama tokenizer with special tokens added (thinking, reflection, scratchpad, response).
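
Because the models listed here emit thinking (and, for the REFLECT variant, reflection) sections before the final answer, downstream code usually needs to separate those sections from the answer text. The following is a minimal sketch of that step in plain Python. The delimiter strings in `SECTION_TAGS` and the helper name `split_sections` are assumptions made for illustration; this listing does not state the exact token strings, so check the tokenizer's special tokens and substitute the real values.

```python
import re

# Hypothetical delimiter strings (an assumption, not taken from this listing).
# Replace them with the special tokens actually defined by the tokenizer.
SECTION_TAGS = {
    "thinking": ("<|thinking|>", "<|/thinking|>"),
    "reflection": ("<|reflect|>", "<|/reflect|>"),
}


def split_sections(text, tags=SECTION_TAGS):
    """Split generated text into tagged sections plus the remaining answer.

    Returns a dict with one key per tag name (None if that section is
    absent) and an "answer" key holding whatever text is left over.
    """
    sections = {}
    remainder = text
    for name, (open_tag, close_tag) in tags.items():
        # Non-greedy match so only the first open/close pair is captured.
        pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
        match = re.search(pattern, remainder, flags=re.DOTALL)
        if match:
            sections[name] = match.group(1).strip()
            # Cut the matched section out of the text that remains.
            remainder = remainder[:match.start()] + remainder[match.end():]
        else:
            sections[name] = None
    sections["answer"] = remainder.strip()
    return sections


sample = (
    "<|thinking|>step one, step two<|/thinking|>"
    "<|reflect|>double-check step two<|/reflect|>"
    "Final answer."
)
result = split_sections(sample)
print(result)
```

Passing the tag table as a parameter keeps the parsing logic independent of any particular token spelling, so the same helper works for the thinking-only model (where the reflection entry simply comes back as `None`).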