---
license: mit
---

# GPT4Tools: Teaching LLM to Use Tools via Self-instruction

[Lin Song](http://linsong.info/), [Yanwei Li](https://yanwei-li.com/), [Rui Yang](https://github.com/Yangr116), Sijie Zhao, [Yixiao Ge](https://geyixiao.com/), [Ying Shan](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en)

GPT4Tools is a centralized system that can control multiple visual foundation models. It is built on Vicuna (LLaMA) and trained with 71K self-built instruction data. By analyzing the language content, GPT4Tools automatically decides which visual foundation models to use and controls them, allowing the user to interact with images during a conversation. With this approach, GPT4Tools provides a seamless and efficient solution for various image-related requirements in a conversation. Unlike previous work, we enable users to teach their own LLM to use tools through simple refinement via self-instruction and LoRA.

[![YouTube](https://badges.aleen42.com/src/youtube.svg)](https://youtu.be/Qrj94ibQIT8) [![arXiv](https://img.shields.io/badge/arXiv-Paper-.svg)](https://arxiv.org/abs/2305.18752)