AI Agents Demystified: The Step-by-Step Guide for Non-Techies Using Real Life Examples

Written by Massa Medi
AI, AI, AI—everywhere you look, it's AI. There's a lot of noise out there—"agentic this," "AI agent that," and workflows that sound straight out of a sci-fi novel. Most explanations either deep-dive into technical jargon or skate along the surface so lightly that you end up more confused than when you started.
But what if you’re someone who has zero technical background, yet you’re using AI tools every day? Maybe you’re curious about AI agents, but don’t know where to begin. If that’s you, you’re in the right place. This article follows a simple, three-step journey: we’ll build on what you already know (think ChatGPT), then ramp up to AI workflows, and finally, crack open the world of AI agents. And don’t worry: we’ll use plenty of practical, real-world examples you’ll actually encounter.
And, seriously, all those intimidating acronyms and words you see like “RAG” (not a cleaning cloth!) or “React” (not the JavaScript framework!) are way simpler than you think. Ready for a plain-speak rundown? Let’s go.
Level 1: Large Language Models—The Chatbots You Already Know
Let’s start at basecamp: Large Language Models, or LLMs. These are the backbone behind popular AI chatbots like ChatGPT, Google Gemini, and Claude. Their specialty? Generating and editing text.
Imagine this:
You (the human) provide an input, and the LLM processes it, returning an output based on the extensive (but not infinite) data it was trained on.
For example: If you tell ChatGPT, “Draft an email requesting a coffee chat,” your initial text (“prompt”) is the input. ChatGPT works its magic and produces an email that’s probably way more polished (and polite) than your typical quick DM. That result? The output.
Simple, right? But what if you ask it, “When is my next coffee chat?” Without even waiting for a response, you and I both know ChatGPT will fail here. Why? Because it doesn’t have access to your personal calendar.
This reveals two important traits of LLMs:
- They have limited knowledge about proprietary or personal info. No access to your private stuff or company data unless you specifically provide it.
- They’re passive. These models wait for your prompt and then respond. They don’t leap into action themselves.
Hold on to these two facts as we move onward.
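To make those two traits concrete, here's a toy sketch in Python. This is *not* a real model—`toy_llm` and `TRAINING_KNOWLEDGE` are made-up stand-ins that only illustrate the idea: the "model" is passive (it only runs when you call it) and it only knows what it was "trained" on.

```python
# Toy stand-in for an LLM. Everything here is illustrative, not a real API.
TRAINING_KNOWLEDGE = {
    "draft an email requesting a coffee chat":
        "Hi! Would you be open to a 20-minute coffee chat next week? Best, Massa",
}

def toy_llm(prompt: str) -> str:
    """Passive: only responds when prompted, and only from its 'training data'."""
    answer = TRAINING_KNOWLEDGE.get(prompt.lower())
    if answer is None:
        # No access to your calendar or any other private data.
        return "Sorry, I don't have access to that information."
    return answer

print(toy_llm("Draft an email requesting a coffee chat"))  # a polished email
print(toy_llm("When is my next coffee chat?"))             # fails: private data
```

The second call fails for exactly the reason above: nothing connects the model to your calendar.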
Level 2: AI Workflows—Moving From Q&A to Automated Tasks
Let’s build up from our coffee chat scenario. What if we took things a step further:
Every time you ask your chatbot about a personal event, what if it automatically fetches data from your Google Calendar before answering?
With this logic, next time you ask, “When’s my chat with Elon Hussler?” the chatbot can tap into your calendar, find the answer, and—voilà!—give you the info you need.
But there’s a hitch. What if your very next question is: “What will the weather be like that day?” The workflow fails here, because our setup only tells the LLM to check your calendar—there’s no pathway to the weather report.
AI workflows follow pre-defined paths, known as “control logic.” They only do what you program them to do. If you want more steps—like checking the weather or using a text-to-audio model to read the answer aloud—you need to add those steps yourself. No matter how many steps you create, as long as a human (that’s you) is making the decisions and programming the route, it remains just an AI workflow—not an agent.
Pro Tip: What is RAG?
You’ll see “RAG” everywhere—it stands for Retrieval Augmented Generation. In plain English, this just means giving an AI model a way to “look things up” (e.g., fetching your calendar or the weather) before answering. RAG is simply an AI workflow that pulls in outside info as needed, not magic.
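Here's what that "look things up first" pattern might look like as a sketch. The calendar data and function names are hypothetical placeholders; a real setup would call the Google Calendar API and a real LLM, but the three-step shape—retrieve, augment, generate—is the whole trick.

```python
from datetime import date

# Hypothetical data standing in for a real calendar API.
CALENDAR = {"Elon Hussler": date(2025, 3, 14)}

def retrieve_calendar(question: str) -> str:
    """Step 1 (Retrieve): look up outside info before the model answers."""
    for name, when in CALENDAR.items():
        if name.lower() in question.lower():
            return f"Event with {name} on {when.isoformat()}"
    return "No matching event found."

def rag_answer(question: str) -> str:
    context = retrieve_calendar(question)              # 1. Retrieve
    augmented = f"Context: {context}\nQ: {question}"   # 2. Augment the prompt
    # 3. Generate: a real system would now pass `augmented` to an LLM.
    return augmented

print(rag_answer("When's my chat with Elon Hussler?"))
```

Notice the retrieval step is hard-coded: the workflow checks the calendar and nothing else, which is exactly why the weather question above breaks it.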
Speaking of practical workflows, here’s a real-world example, inspired by an awesome Helena Liu tutorial:
- Compile news article links in a Google Sheet—think of it as your personal news database.
- Use Perplexity to summarize those news articles, turning them into bite-size insights.
- Employ Claude (an AI language model) with a custom prompt to draft LinkedIn and Instagram posts based on those summaries.
- Automate it: Set the workflow to run daily at 8am, so your posts are ready with your morning coffee.
What you see here: each step is planned and programmed by a human. If you’re not happy with the output—say your latest LinkedIn post isn’t hilarious enough (and you know you’re naturally funny)—you have to manually go back and tweak the prompts, then test it all over again. In traditional workflows, iteration is human-driven.
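The four steps above can be sketched as a simple pipeline. Every function here is a placeholder (the real workflow uses Google Sheets, Perplexity, and Claude), but the key property is visible in the code itself: a human chose the steps and their order, and the program never deviates from them.

```python
# Hypothetical daily pipeline; names and services are illustrative stand-ins.
def fetch_links_from_sheet():
    # Real workflow: read the Google Sheet of article links.
    return ["https://example.com/news-1", "https://example.com/news-2"]

def summarize(link):
    # Real workflow: call Perplexity to summarize the article.
    return f"Summary of {link}"

def draft_post(summary):
    # Real workflow: call Claude with a custom prompt.
    return f"LinkedIn draft based on: {summary}"

def daily_workflow():
    """Fixed control logic: a human picked these steps and this order."""
    posts = []
    for link in fetch_links_from_sheet():
        posts.append(draft_post(summarize(link)))
    return posts

posts = daily_workflow()  # scheduled to run at 8am by an automation tool
```

If the drafts aren't funny enough, no code here fixes itself—you edit the prompt inside `draft_post` and rerun. That human-driven iteration is what keeps this a workflow rather than an agent.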
Level 3: AI Agents—When AI Becomes the Decision-Maker
Let’s take that workflow and ask: What’s different if we want to upgrade to a genuine AI agent?
So far, as a human, your roles were to:
- Reason about the best approach: Which articles should I pick? How to summarize them? How to write a catchy social post?
- Take action: Compile links, use summarization tools, draft posts, etc.
Now, for an AI workflow to become an AI agent, the LLM must replace you, the human decision-maker. In short:
The single biggest change: The AI agent reasons and acts autonomously, using tools and decision-making steps, without relying on a human to drive the process.
Imagine the agent thinking, “What’s the most efficient way to gather news articles? Copy and paste them all into Word? Nah—let’s stick to compiling links in a spreadsheet and then fetch content as needed.” It will decide, “Should I use Excel or Google Sheets? Well, the user’s already connected Google to make.com, so Sheets it is.”
Pro Tip: The 'ReAct' Framework for AI Agents
Not to be confused with Facebook's JavaScript UI library, "ReAct" in this context is short for Reason + Act—the essential loop behind any AI agent. Every agent must independently reason (plan) and act (do stuff).
There’s one more trick: iteration. Previously, if your LinkedIn post wasn’t funny enough, you’d adjust the prompt and rerun the workflow yourself. With an AI agent, it can critique its own work—drafting content, then having another LLM review it based on best practices, and automatically revising until the quality is high enough. That loop can happen, start to finish, without human involvement.
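Here's a toy version of that self-critique loop. The `writer` and `critic` functions are fake stand-ins (in a real agent, both would be LLM calls, and the critic's score would come from judging the draft against best practices), but the loop structure is the point: draft, evaluate, revise, repeat—no human in the middle.

```python
# Toy self-critique loop. Both roles are illustrative stand-ins for LLM calls.
def writer(topic, attempt):
    """Act: produce a draft. A real agent would call an LLM here."""
    return f"{topic} post, revision {attempt}"

def critic(draft):
    """Reason: score the draft. Here we pretend quality rises each revision."""
    return int(draft.rsplit(" ", 1)[-1])  # score = revision number

def agent_loop(topic, quality_bar=3, max_iters=10):
    for attempt in range(1, max_iters + 1):
        draft = writer(topic, attempt)   # Act
        score = critic(draft)            # Reason
        if score >= quality_bar:         # Decide: good enough to ship?
            return draft
    return draft  # give up after max_iters and return the best effort

print(agent_loop("LinkedIn"))
```

The loop stops on its own once the critic's score clears the bar—the automated version of you rereading your post and deciding it's finally funny enough.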
Real-World AI Agent In Action: The Andrew Ng Demo
To see an agent at work, look no further than a real-world demo by Andrew Ng, one of the big names in AI.
On this demo website, you input a keyword—say, “skier.” The AI vision agent reasons, “What does a skier look like?” (It imagines: a person on skis, whizzing down a snowy mountain.) Then, it acts by sifting through clips in a video, analyzing footage, and picking out scenes it thinks match your criteria. It indexes those clips and serves them up to you.
Here’s what’s special: this agent does it all automatically, no human needed to pre-tag footage or label scenes as “skier,” “mountain,” or “snow.” The tech under the hood is advanced, but from the user’s perspective, it’s seamless and simple—search, and results appear.
And that’s the magic: making high-tech workflows accessible to everyday users who just want things to work.
For those interested, I'm currently building a basic AI agent of my own using n8n. Drop a comment if you have an idea for what kind of AI agent you'd like a tutorial on!
The Three Levels of AI In a Nutshell
To wrap up, here’s a visual summary (described in words) of everything covered:
- Level 1: LLMs as Chatbots – You give an input, the LLM responds with an output. Direct and simple.
- Level 2: AI Workflows – You give an input, and a pre-determined set of steps unfolds. External info (calendars, weather, APIs) may get pulled in. But the human sets those steps and tweaks them as needed.
- Level 3: AI Agents – The LLM receives a broad goal, devises its own plan for how to achieve it, acts using tools, self-critiques and iterates, and returns a finished result—the agent is now the decision-maker.
If you found this guide helpful and want your AI skills to level up further, check out my tutorial on building a prompts database in Notion. See you next time—and until then, happy automating!