What Exactly Are the AI Agents That OpenAI’s CEO Is Talking About?

As one route for putting AI into practical use, the concept of AI agents has attracted attention from many enterprises and practitioners. So, what exactly are AI agents? And how do they work? Let’s take a look at the analysis and interpretation in this article.

Understanding AI agents is important because investors are now focusing heavily on AI agent companies, and leading AI companies like OpenAI are researching them.

AI agents are one route by which AI applications reach practical use.

Suppose you have a revolutionary idea but lack a programmer to implement it. In the future, AI agents may become the “programmers” who realize your forward-thinking ideas.

After reading this article, you will understand what AI agents are and how they work.

I. What Are AI Agents?

AI agents (artificial intelligence agents) are intelligent entities that can perceive their environment, make decisions, and act.

For example, an AI agent is like a version of Siri that lives on your phone or computer, is intelligent, and can observe what is going on around you.

When you say to it: “Siri, I’m not feeling well.”

By observing your condition, temperature, and activities over the past 24 hours, and cross-referencing them with recent updates from online health advisors, it runs through a dizzying array of analyses and, within a second, concludes: “You may have contracted a virus.”

Then, it proactively drafts a concise sick leave email based on company policy. You nod, and the email is sent to your manager.

Siri doesn’t stop there. It notices that your pantry lacks medicine and a thermometer, browses local delivery services, and prepares an item list. Once you confirm, the items will arrive within an hour.

In addition, it senses that you now need rest, so Siri dims the lights, adjusts the temperature for optimal comfort, and queues up a series of soothing ambient playlists to help you relax. If you need to see a doctor, it can schedule an Uber so you don’t have to worry about transportation.

This is the result of a series of agents working together.

II. How Does It Work So Well?

One picture tells you how AI agents work. This diagram describes how an intelligent entity processes, analyzes, and responds to external information.

If the diagram is unclear, let me break it down.

AI agents consist of 4 parts:

1. Perception

This is the first step of the process. The AI establishes perception of the outside world through sensors, cameras, microphones, etc.

Inputs: The perceived information is input into the system. In this example, the inputs are: “I’m not feeling well,” my temperature, mental state, sleep time, etc.

Environment: The system’s environment or context. For “I’m not feeling well,” this would include the weather and the surroundings (such as whether you are in a place with pollen allergens), etc.
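To make the split between inputs and environment concrete, here is a minimal sketch in Python. The class name `Perception`, the function `perceive`, and all field names are hypothetical and chosen only for this illustration; a real agent would read these values from actual sensors and services.

```python
from dataclasses import dataclass, field


@dataclass
class Perception:
    """One snapshot of what the agent has sensed (hypothetical structure)."""
    inputs: dict = field(default_factory=dict)       # direct signals about the user
    environment: dict = field(default_factory=dict)  # surrounding context


def perceive(utterance: str, temperature_c: float, sleep_hours: float,
             weather: str, pollen_level: str) -> Perception:
    """Bundle raw readings into inputs and environment."""
    return Perception(
        inputs={
            "utterance": utterance,
            "temperature_c": temperature_c,
            "sleep_hours": sleep_hours,
        },
        environment={"weather": weather, "pollen_level": pollen_level},
    )


# Example values are made up for the illustration.
snapshot = perceive("I'm not feeling well", 38.4, 5.0, "rainy", "low")
print(snapshot.inputs)
print(snapshot.environment)
```

Keeping inputs and environment in separate fields mirrors the distinction above: one describes the user, the other describes the context the user is in.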

2. Agent’s Brain (Information Processing)

This is a general-purpose large model combined with a large number of knowledge bases, which together process information. It contains the following systems:

1) Information Storage Related

Memory System: Includes Storage and Memory to store long-term and short-term data.

For example, long-term data includes my basic information, preferences, underlying conditions, etc.

Short-term data might be the fact that I only have 1 cold medicine pill left at home; once more medicine has been purchased, this memory can be deleted.

Knowledge Base: Includes medical knowledge bases, product catalogs, etc., used to diagnose my current condition and to determine subsequent treatment and life management needs.
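The storage side described above can be sketched as follows. This is only an illustration under simple assumptions: the `Memory` class, its method names, and the tiny `KNOWLEDGE_BASE` dictionary are hypothetical stand-ins, and a real system would likely use databases or vector stores instead of in-process dictionaries.

```python
class Memory:
    """Hypothetical memory system with long-term and short-term stores."""

    def __init__(self):
        self.long_term = {}   # stable facts: profile, preferences, conditions
        self.short_term = {}  # temporary facts that can be deleted later

    def remember(self, key, value, long_term=False):
        store = self.long_term if long_term else self.short_term
        store[key] = value

    def recall(self, key):
        # Prefer the most recent (short-term) fact, then fall back to long-term.
        if key in self.short_term:
            return self.short_term[key]
        return self.long_term.get(key)

    def forget(self, key):
        # Only short-term memories are deleted in this sketch.
        self.short_term.pop(key, None)


# A toy knowledge base; the entries are illustrative, not medical advice.
KNOWLEDGE_BASE = {"fever": "possible viral infection"}

memory = Memory()
memory.remember("underlying_conditions", ["asthma"], long_term=True)
memory.remember("cold_pills_left", 1)

print(memory.recall("cold_pills_left"))  # short-term fact
memory.forget("cold_pills_left")         # deleted after the purchase
print(memory.recall("cold_pills_left"))
```

The point of the separation is that deleting the short-term note about the last pill never touches the long-term profile.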

2) Large Model Processes Information

Based on the perceived information (input + environment), memories, knowledge bases, etc., the large model processes everything and draws a conclusion (Decision Making): “I’m unwell, it’s likely a viral infection.”

3) Then Formulate Next Steps (Planning)

Action/Reasoning refers to the specific actions the agent plans based on its decision; at this stage they have not yet been executed.

For example: draft a sick leave email, buy medicine, make the environment more comfortable, schedule an Uber, etc.
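The decision-making and planning steps can be sketched as two small functions. Note the big simplification: a few hand-written rules stand in for the large model here, and the 38.0 °C fever threshold, the function names, and the step names are all assumptions made up for the illustration.

```python
def decide(utterance: str, temperature_c: float) -> str:
    """Toy stand-in for the large model's Decision Making step."""
    feels_unwell = "not feeling well" in utterance.lower()
    if feels_unwell and temperature_c >= 38.0:  # assumed fever threshold
        return "likely viral infection"
    return "no clear diagnosis"


def plan(conclusion: str, cold_pills_left: int) -> list:
    """Turn a conclusion into an ordered list of steps, none executed yet."""
    steps = []
    if conclusion == "likely viral infection":
        steps.append("draft_sick_leave_email")
        if cold_pills_left <= 1:
            steps.append("order_medicine")
        steps.append("adjust_room_comfort")
    return steps


conclusion = decide("I'm not feeling well", 38.4)
print(conclusion)
print(plan(conclusion, cold_pills_left=1))
```

The plan is just data at this point: a list of step names that a later stage is responsible for carrying out.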

3. Action

After the Brain’s dizzying operations, the agent has reached its conclusions and formulated the next steps. Now it needs to execute them (Action).

The large model cannot complete these tasks by itself; it needs to invoke external tools.

At this point, it uses third-party tools (Tools and Calling APIs), interacting with other apps through interfaces or applications to achieve the final results.
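One common way to picture tool invocation is a registry that maps tool names to functions. The sketch below assumes that pattern; the tool functions are hypothetical placeholders that only return strings, whereas real tools would call an email service, a delivery app, or a smart-home API.

```python
from typing import Callable, Dict, List


def draft_sick_leave_email(reason: str) -> str:
    # Placeholder: a real tool would hand this draft to an email client.
    return f"Draft: I am unwell ({reason}) and request sick leave today."


def order_medicine(items: List[str]) -> str:
    # Placeholder: a real tool would call a delivery service API.
    return "Order prepared: " + ", ".join(items)


# Registry of available tools, keyed by the names used in the plan.
TOOLS: Dict[str, Callable[..., str]] = {
    "draft_sick_leave_email": draft_sick_leave_email,
    "order_medicine": order_medicine,
}


def call_tool(name: str, **kwargs) -> str:
    """Look up a tool by name and run it with the given arguments."""
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)


print(call_tool("draft_sick_leave_email", reason="likely viral infection"))
print(call_tool("order_medicine", items=["cold medicine", "thermometer"]))
```

Rejecting unknown tool names explicitly keeps the agent from silently skipping a step it has no way to perform.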

4. Output

After execution, an output channel is needed to tell you the results, like my Siri. It tells you: “You have contracted a virus, I’ve helped you draft the sick leave email.”

The above is how AI agents work.

This is a simplified model, showing how an AI agent can perceive information, go through internal processing and decision-making, and ultimately respond.
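Putting the four parts together, the whole loop can be sketched in a single function. This is a deliberately tiny illustration: a fixed rule replaces the large model, one lambda replaces the real tools, and `run_agent` and the 38.0 °C threshold are assumptions introduced only for this example.

```python
def run_agent(utterance: str, temperature_c: float) -> str:
    # 1. Perception: collect what the agent can sense.
    perceived = {"utterance": utterance, "temperature_c": temperature_c}

    # 2. Brain: decide and plan (a simple rule stands in for the large model).
    feverish = perceived["temperature_c"] >= 38.0  # assumed threshold
    if feverish:
        conclusion = "You may have contracted a virus."
        planned_steps = ["draft_sick_leave_email"]
    else:
        conclusion = "No obvious illness detected."
        planned_steps = []

    # 3. Action: execute each planned step through a tool.
    tools = {"draft_sick_leave_email": lambda: "I've drafted the sick leave email."}
    results = [tools[step]() for step in planned_steps]

    # 4. Output: report the conclusion and what was done.
    return " ".join([conclusion] + results)


print(run_agent("I'm not feeling well", 38.4))
print(run_agent("I feel fine", 36.6))
```

Each numbered comment corresponds to one of the four parts, so the function reads in the same order as the explanation.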

III. Conclusion

AI agents are one direction for AI development in the future.

An agent can be a personal or work assistant that amplifies your abilities, fills your gaps, and makes you a super individual.

If you want to learn more, see my book “ChatGPT’s Guide to AI Mastery,” Part II, Chapter 14, “IFTTT in the Field of Large Models.”