Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Abstract
We introduce Lumos, a framework for training language agents that uses a unified data format and a modular architecture built on open-source large language models (LLMs). Lumos consists of three modules: planning, grounding, and execution. The planning module breaks a task down into a series of high-level, tool-agnostic subgoals. The grounding module then converts each subgoal into a set of low-level actions, which the execution module carries out using a range of off-the-shelf tools and APIs. To train these modules effectively, high-quality annotations of subgoals and actions were collected and are made available for fine-tuning open-source LLMs on tasks such as complex question answering, web tasks, and math problems. With this unified data and modular design, Lumos achieves performance comparable or superior to current state-of-the-art agents and shows several key advantages: (1) Lumos surpasses GPT-4/3.5-based agents on complex question answering and web tasks, while matching the performance of significantly larger LLM agents on math tasks; (2) Lumos outperforms open-source agents produced by conventional training methods and by chain-of-thought training; and (3) Lumos generalizes effectively to unseen interactive tasks, outperforming larger LLM-based agents and even exceeding the performance of specialized agents.
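The flow through the three modules described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the names `Action`, `plan`, `ground`, `execute`, `TOOLS`, and `run_agent` are hypothetical and do not come from the Lumos codebase, and the planning and grounding functions return fixed toy outputs where the actual framework would call fine-tuned LLMs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    """A low-level action: which tool to call and with what argument (hypothetical format)."""
    tool: str
    argument: str


def plan(task: str) -> List[str]:
    """Stand-in for the planning module: task -> high-level, tool-agnostic subgoals."""
    # Assumption: a real planner would be a fine-tuned LLM; this returns a fixed toy plan.
    return [f"Find the facts needed for: {task}", "Compute the final answer"]


def ground(subgoal: str) -> List[Action]:
    """Stand-in for the grounding module: subgoal -> low-level actions."""
    # Assumption: toy rule-based mapping instead of a fine-tuned LLM.
    if subgoal.startswith("Find"):
        return [Action(tool="search", argument=subgoal)]
    return [Action(tool="calculator", argument="2 + 3")]


# Stand-in tool registry used by the execution module (toy tools, not real APIs).
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"<results for '{query}'>",
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}


def execute(action: Action) -> str:
    """Stand-in for the execution module: run one action with the matching tool."""
    return TOOLS[action.tool](action.argument)


def run_agent(task: str) -> List[str]:
    """Chain the modules: plan subgoals, ground each into actions, execute each action."""
    results: List[str] = []
    for subgoal in plan(task):
        for action in ground(subgoal):
            results.append(execute(action))
    return results
```

Calling `run_agent("toy task")` returns one result per executed action. Keeping the subgoals free of tool names is what lets the same planner output be paired with different tool sets at the grounding step.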
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- AgentTuning: Enabling Generalized Agent Abilities for LLMs (2023)
- Lemur: Harmonizing Natural Language and Code for Language Agents (2023)
- FireAct: Toward Language Agent Fine-tuning (2023)
- HeaP: Hierarchical Policies for Web Actions using LLMs (2023)
- Agent Instructs Large Language Models to be General Zero-Shot Reasoners (2023)
Building Better Language Agents: Lumos and Open-Source LLMs
Links:
Subscribe: https://www.youtube.com/@Arxflix
Twitter: https://x.com/arxflix
LMNT (Partner): https://lmnt.com/