Introducing Agents.js: Give tools to your LLMs using JavaScript
We have recently been working on Agents.js at huggingface.js. It's a new library for giving tool access to LLMs from JavaScript in either the browser or the server. It ships with a few multi-modal tools out of the box and can easily be extended with your own tools and language models.
Installation
Getting started is very easy; you can grab the library from npm with the following:
npm install @huggingface/agents
Usage
The library exposes the HfAgent class, which is the entry point to the library. You can instantiate it like this:
import { HfAgent } from "@huggingface/agents";
const HF_ACCESS_TOKEN = "hf_..."; // get your token at https://huggingface.co/settings/tokens
const agent = new HfAgent(HF_ACCESS_TOKEN);
Afterward, using the agent is easy. You give it a plain-text command and it will return some messages.
const code = await agent.generateCode(
"Draw a picture of a rubber duck with a top hat, then caption this picture."
);
which in this case generated the following code
// code generated by the LLM
async function generate() {
const output = await textToImage("rubber duck with a top hat");
message("We generate the duck picture", output);
const caption = await imageToText(output);
message("Now we caption the image", caption);
return output;
}
Then the code can be evaluated as such:
const messages = await agent.evaluateCode(code);
The messages returned by the agent are objects with the following shape:
export interface Update {
  message: string;
  data: undefined | string | Blob;
}
where message is an informational text and data can contain either a string or a blob. The blob can be used to display images or audio.
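For example, a minimal way to display these updates in the browser could look like this (a sketch that assumes the blobs hold images and that you are running in a DOM environment):

// sketch: log each update and display blob data, assuming the blobs are images
for (const update of messages) {
  console.log(update.message);
  if (update.data instanceof Blob) {
    const img = document.createElement("img");
    img.src = URL.createObjectURL(update.data); // turn the blob into a displayable URL
    document.body.appendChild(img);
  } else if (typeof update.data === "string") {
    console.log(update.data);
  }
}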
If you trust your environment (see the warning below), you can also run the code directly from the prompt with run:
const messages = await agent.run(
"Draw a picture of a rubber duck with a top hat, then caption this picture."
);
Usage warning
Currently, using this library means evaluating arbitrary code in the browser (or in Node). This is a security risk and should not be done in an untrusted environment. We recommend that you use generateCode and evaluateCode instead of run so that you can inspect the generated code before executing it.
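In practice that means generating the code, reviewing it yourself, and only then evaluating it, for example:

// generate the code, inspect it manually, then evaluate it
const code = await agent.generateCode("Draw a picture of a rubber duck with a top hat.");
console.log(code); // review the generated code before running it
const messages = await agent.evaluateCode(code);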
Custom LLMs 💬
By default HfAgent will use OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 hosted on the Inference API as the LLM. This can be customized, however. When instantiating your HfAgent you can pass a custom LLM. An LLM in this context is any async function that takes a string input and returns a promise for a string. For example, if you have an OpenAI API key you could make use of it like this:
import { Configuration, OpenAIApi } from "openai";
const HF_ACCESS_TOKEN = "hf_...";
const api = new OpenAIApi(new Configuration({ apiKey: "sk-..." }));
const llmOpenAI = async (prompt: string): Promise<string> => {
return (
(
await api.createCompletion({
model: "text-davinci-003",
prompt: prompt,
max_tokens: 1000,
})
).data.choices[0].text ?? ""
);
};
const agent = new HfAgent(HF_ACCESS_TOKEN, llmOpenAI);
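Since any async string-to-string function works, you could also wrap a Hub model through the @huggingface/inference client yourself. The sketch below is only an illustration: the model name and parameters are placeholders, and the library's own LLMFromHub helper (used further down) covers this case for you.

import { HfInference } from "@huggingface/inference";

const inference = new HfInference(HF_ACCESS_TOKEN);

// sketch: a custom LLM backed by a text-generation model on the Inference API
const llmFromInference = async (prompt: string): Promise<string> => {
  const output = await inference.textGeneration({
    model: "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5", // example model
    inputs: prompt,
    parameters: { max_new_tokens: 1000 },
  });
  return output.generated_text;
};

const agent = new HfAgent(HF_ACCESS_TOKEN, llmFromInference);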
Custom Tools 🛠️
Agents.js was designed to be easily expanded with custom tools & examples. For example, if you wanted to add a tool that translates text from English to German, you could do it like this:
import type { Tool } from "@huggingface/agents/src/types";
const englishToGermanTool: Tool = {
name: "englishToGerman",
description:
"Takes an input string in english and returns a german translation. ",
examples: [
{
prompt: "translate the string 'hello world' to german",
code: `const output = englishToGerman("hello world")`,
tools: ["englishToGerman"],
},
{
prompt:
"translate the string 'The quick brown fox jumps over the lazy dog` into german",
code: `const output = englishToGerman("The quick brown fox jumps over the lazy dog")`,
tools: ["englishToGerman"],
},
],
call: async (input, inference) => {
const data = await input;
if (typeof data !== "string") {
throw new Error("Input must be a string");
}
const result = await inference.translation({
model: "t5-base",
inputs: data,
});
return result.translation_text;
},
};
Now this tool can be added to the list of tools when initializing your agent.
import { HfAgent, LLMFromHub, defaultTools } from "@huggingface/agents";
const HF_ACCESS_TOKEN = "hf_...";
const agent = new HfAgent(HF_ACCESS_TOKEN, LLMFromHub("hf_..."), [
englishToGermanTool,
...defaultTools,
]);
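The agent can then pick up the new tool whenever a prompt calls for a translation, for example:

const messages = await agent.run("Translate the string 'Hello world' to german.");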
Passing input files to the agent 🖼️
The agent can also take input files to pass along to the tools. You can pass an optional FileList to generateCode and evaluateCode. For example, if you have the following HTML:
<input id="fileItem" type="file" />
Then you can do:
const agent = new HfAgent(HF_ACCESS_TOKEN);
const files = document.getElementById("fileItem").files; // FileList type
const code = await agent.generateCode(
"Caption the image and then read the text out loud.",
files
);
This generated the following code when passing an image:
// code generated by the LLM
async function generate(image) {
const caption = await imageToText(image);
message("First we caption the image", caption);
const output = await textToSpeech(caption);
message("Then we read the caption out loud", output);
return output;
}
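Since evaluateCode accepts the same optional FileList, you can then pass the files along when evaluating the generated code:

const messages = await agent.evaluateCode(code, files);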
Demo 🎉
We've been working on a demo for Agents.js that you can try out here. It's powered by the same Open Assistant 30B model that we use on HuggingChat and uses tools called from the hub.