In artificial intelligence, two terms dominate the conversation right now: LLM (Large Language Model) and AGI (Artificial General Intelligence). Together, they mark the leading edge of AI’s potential to transform the digital landscape.
LLM and T9: Cousins in Prediction
An LLM is an AI model built to process and generate natural language. It can produce human-like responses, translate between languages, and tackle many other language-related tasks. Interestingly, LLMs bear a striking resemblance to T9, the predictive text input method that once ruled the world of numeric keypads. Both share a common goal, predicting likely text from input: T9 guesses a word from the keys pressed, while an LLM predicts the next piece of text from everything written so far.
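To make the T9 half of that comparison concrete, here is a toy sketch of keypad prediction. The word list and its frequency counts are invented for illustration, and a real T9 dictionary is far larger. An LLM works with learned probabilities over tokens rather than a lookup table, so treat this only as an analogy for "rank the likely outputs for a given input".

```typescript
// Standard phone keypad layout: each digit stands for several letters.
const KEYPAD: Record<string, string> = {
  '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
  '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz',
};

// Reverse lookup table: letter -> digit.
const LETTER_TO_DIGIT: Record<string, string> = {};
Object.keys(KEYPAD).forEach((digit) => {
  KEYPAD[digit].split('').forEach((letter) => {
    LETTER_TO_DIGIT[letter] = digit;
  });
});

// Convert a word into the key sequence that types it ('?' for anything outside a-z).
const toKeys = (word: string): string =>
  word
    .toLowerCase()
    .split('')
    .map((ch) => LETTER_TO_DIGIT[ch] ?? '?')
    .join('');

// Hypothetical dictionary with made-up usage counts.
const WORD_COUNTS: Record<string, number> = {
  home: 50,
  good: 40,
  gone: 20,
  hood: 5,
  rum: 3,
};

// Return every dictionary word that matches the key sequence, most frequent first.
const predictT9 = (keys: string): string[] =>
  Object.keys(WORD_COUNTS)
    .filter((word) => toKeys(word) === keys)
    .sort((a, b) => WORD_COUNTS[b] - WORD_COUNTS[a]);

console.log(predictT9('4663').join(', ')); // prints "home, good, gone, hood"
```

The same four key presses are ambiguous, and the ranking is what turns them into a useful guess. An LLM does something comparable on a much larger scale, with the whole conversation as its input.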
AGI: AI’s Ultimate Frontier
AGI, the holy grail of artificial intelligence, represents a machine capable of learning and performing any intellectual task a human can do. While LLMs like ChatGPT aren’t quite there yet, they do exhibit tantalizing hints of AGI’s potential, such as generating human-like text and solving complex language problems.
Generative Agents: AI’s Creative Maestros
Generative agents are AI components that produce original content rather than simply analyzing existing data. ChatGPT plugins are a good example: they act as digital sidekicks that extend ChatGPT’s skillset. With plugins, users can enjoy a dynamic, customized experience tailored to their needs and interests.
Imagine ChatGPT plugins generating intricate 3D models or composing mesmerizing melodies—the possibilities are endless. These generative agents harmonize with ChatGPT, unleashing a symphony of innovation and progress.
Creating Your Own Generative Agent
To create your own generative agent using the GPT API, simply add an instruction like this to your GPT prompt:
If a user asks for a drawing, determine the requested subject, and enclose it within a <draw> tag. Optionally, include a witty description, ensuring there's only one <draw> tag per response. For instance, if the user requests a rum bottle sketch, reply with: <draw>a pirate holding a bottle of rum</draw>. Keep the entire description inside the tag.
When GPT decides the user wants an image, its reply includes the tag, as in this exchange:
Astartsky: hey matee
Pirate: Arrr, ahoy matee! How's life on the high seas treating ya, matee? Anything exciting happening in yer adventures?
Astartsky: heard you're a skilled painter, matee! mind drawing me a map of the island?
Pirate: Arrr, yer heard right, matee! This swine-pirate boy's got some mad painting skills! I'd be happy to draw ye a map of the island. Just give me a moment to grab me paints and me trusty paintbrush. \n\n<draw>a map of the island</draw>\n\nHere ye go, matee! A map of the island, complete with all the landmarks ye need to navigate yer way to treasure and adventure! Arrr!
And there you have it! All you need to do is parse the message and pass the draw instruction to the draw agent. This idea can be used for all sorts of integrations, even something as wild as opening fence gates for you!
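As a rough sketch of that parse-and-dispatch step, the snippet below pulls the first recognised tag out of a reply and hands its contents to a matching agent. Only the `<draw>` tag comes from the prompt above. The `gate` tag, the `extractAction` helper, and the stand-in agents are hypothetical names made up for this illustration.

```typescript
// A parsed instruction: which agent to call and what to pass it.
interface AgentAction {
  tag: string;
  payload: string;
}

// Tags the bot understands. 'gate' is a hypothetical second integration.
const KNOWN_TAGS = ['draw', 'gate'];

// Split a model reply into plain text plus the first recognised instruction, if any.
const extractAction = (
  reply: string,
): { text: string; action: AgentAction | null } => {
  for (const tag of KNOWN_TAGS) {
    // [\s\S]*? also matches line breaks and stops at the first closing tag
    const pattern = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`, 'i');
    const match = pattern.exec(reply);
    if (match) {
      const text = reply
        .replace(pattern, '')
        .replace(/\n{3,}/g, '\n\n') // close the gap left by the removed tag
        .trim();
      return { text, action: { tag, payload: match[1].trim() } };
    }
  }
  return { text: reply.trim(), action: null };
};

// Each agent is a function keyed by its tag name. These are stand-ins that
// only describe what a real agent would do.
const agents: Record<string, (payload: string) => string> = {
  draw: (payload) => `would generate an image of: ${payload}`,
  gate: (payload) => `would send "${payload}" to the gate controller`,
};

const parsed = extractAction(
  'Arrr!\n\n<draw>a map of the island</draw>\n\nHere ye go, matee!',
);
if (parsed.action) {
  console.log(agents[parsed.action.tag](parsed.action.payload));
}
console.log(parsed.text);
```

Keeping the agents in a lookup table means a new integration only needs a new tag in the prompt and a new entry in the table, while the parsing code stays the same.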
Example implementation
Here is my very simplified implementation of the idea for a Telegram chatbot:
import { openai } from './openai';
import { bot } from './bot';

// System message containing the drawing instruction for GPT
const drawingInstruction = {
  // The chat API expects lowercase role names: 'system', 'user' or 'assistant'
  role: 'system' as const,
  content: `If a user asks for a drawing, determine the requested subject, and enclose it within a <draw> tag. Optionally, include a witty description, ensuring there's only one <draw> tag per response. For instance, if the user requests a rum bottle sketch, reply with: <draw>a pirate holding a bottle of rum</draw>. Keep the entire description inside the tag.`,
};

// Function to parse a string based on a list of regular expressions
const parseFromTemplate = (
  input: string,
  regexList: RegExp[],
): string | null => {
  for (const regex of regexList) {
    const match = regex.exec(input);
    if (match && match[1]) {
      return match[1];
    }
  }
  return null;
};

// Matches a single <draw>...</draw> tag. [\s\S]*? also spans line breaks
// and stops at the first closing tag.
const drawTagPattern = /<draw>([\s\S]*?)<\/draw>/i;

// Function to handle user input, generate GPT response, and send it back
const handleResponse = async (
  chatId: number,
  input: string,
  messageId: number, // ID of the user's message, so the text reply can quote it
): Promise<void> => {
  // Generate GPT response
  const gptResponse = await openai.createChatCompletion({
    model: 'gpt-3.5-turbo',
    temperature: 0.7,
    messages: [drawingInstruction, { role: 'user', content: input }],
  });

  // Guard against a missing message or missing content
  let textResponse = gptResponse.data.choices[0].message?.content?.trim();
  if (!textResponse) {
    throw new Error('Empty response from GPT');
  }

  // Check for drawing description in GPT response
  const drawingDescription = parseFromTemplate(textResponse, [
    drawTagPattern,
  ])?.trim();

  // If there's a drawing description, generate image and send it
  if (drawingDescription) {
    const imageResponse = await openai.createImage({
      prompt: drawingDescription,
      n: 1,
      size: '256x256',
      response_format: 'b64_json',
      user: 'gpt',
    });
    const base64Image = imageResponse.data.data[0].b64_json;
    if (!base64Image) {
      throw new Error('No image data');
    }
    const imageBuffer = Buffer.from(base64Image, 'base64');

    // Remove the tag so it never shows up in the chat
    textResponse = textResponse.replace(drawTagPattern, '').trim();

    // Send image response
    await bot.sendPhoto(chatId, imageBuffer, { caption: drawingDescription });
  }

  // Send text response, unless the reply consisted of the tag alone
  if (textResponse) {
    await bot.sendMessage(chatId, textResponse, {
      reply_to_message_id: messageId,
    });
  }
};
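
A handler like this throws when GPT or the image call returns nothing, so whatever calls it should catch those errors. Below is one possible way to wire it to incoming messages. It is a sketch under assumptions: `IncomingMessage`, `Responder`, `Notifier`, and `makeMessageListener` are hypothetical names, the message fields follow the shape Telegram bot libraries such as node-telegram-bot-api pass to listeners, and the argument order (chat ID, text, message ID) is a guess at how the handler would be called.

```typescript
// Minimal shape of an incoming Telegram message: only the fields used here.
interface IncomingMessage {
  chat: { id: number };
  message_id: number;
  text?: string;
}

// Assumed handler signature: chat ID, user text, ID of the message to reply to.
type Responder = (
  chatId: number,
  input: string,
  messageId: number,
) => Promise<void>;

// Sends a plain text notice to a chat.
type Notifier = (chatId: number, text: string) => Promise<unknown>;

// Build a listener that forwards text messages to `respond` and tells the
// user when something fails, instead of leaving the rejection unhandled.
const makeMessageListener =
  (respond: Responder, notify: Notifier) =>
  async (msg: IncomingMessage): Promise<void> => {
    if (!msg.text) {
      return; // ignore stickers, photos and other non-text messages
    }
    try {
      await respond(msg.chat.id, msg.text, msg.message_id);
    } catch (error) {
      console.error('Failed to handle message', error);
      await notify(msg.chat.id, 'Sorry, something went wrong. Please try again.');
    }
  };
```

If `./bot` exposes an `on('message', ...)` method, which is an assumption about that module, the wiring could then be a single call along the lines of `bot.on('message', makeMessageListener(handleResponse, (id, text) => bot.sendMessage(id, text)))`.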