From Clicking to Chatting: Building Practical AI Agents with LangChain


Shashank Rajak

May 28, 2025

11 min read


In my last blog, I talked about different approaches to AI and how various people have followed different paths to solve problems with it. Out of all these approaches, there is one that currently outshines the rest: the Rational Agent.

If we look at the current AI landscape, we see a flood of "agents" all around us. So in this blog, I will talk a little bit about AI agents and how to start building your own.

Even before AI took center stage, the concept of an 'agent' was already present in our daily lives. A perfect example? Your vacuum cleaner.

When I was a child, vacuum cleaners were such fascinating products. Although I've never used one myself and don't think I will anytime soon, if we look at a modern robotic vacuum cleaner, we can certainly call it an agent: a good agent that cleans our house while we are doing other chores. It's smart and rational enough to detect which part of the floor is dirty and needs cleaning, it does its job, and it gets back to its charging station after cleaning.

If we analyze this whole process, we can see different components in this story.

First, we have a floor. Obviously, we need a floor where our vacuum cleaner will work. This floor is called the Environment, where our vacuum cleaner has to do its job. And so, we can call our vacuum cleaner an Agent that performs a task. To do its job, the vacuum cleaner needs some input, like which way to go, where the dirt is, and so on. All this input is provided to it via Sensors, like a camera or ultrasonic sensors. Once it gets the input, it does some computation as per its program (some software) and then implements the decision via wheels, a brush, etc., to move around and clean the floor. These are called Actuators.

So, with this simple example, we have a definition of an Agent, and an AI agent is no different.

An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.

Let's take a more relevant example for software engineers: a coding agent. When we give a prompt like, "develop this website to show pictures of dogs," it doesn't just instantly create it. Instead, it acts as an automated developer, breaking down your request.

Here's a concise look at how it works:

  1. Understand (Sensor Input): The agent's LLM receives your prompt, understanding the goal is a webpage with dog pictures (implying HTML, CSS, etc.).

  2. Plan & Act (Internal Program/Reasoning & Actuators):

    • Thought: "Where do I get dog pictures?"

    • Action (Tool Use): It uses a "web search" tool (an actuator) to find image sources.

    • Observation: It gets URLs for dog images.

    • Thought: "Now, build the webpage structure and add images."

    • Action (Tool Use): It uses a "code generation" tool (another actuator) to write the HTML and CSS, inserting the fetched image URLs.

  3. Deliver (Actuation): The agent then saves the index.html and style.css files (its final actuators), potentially even deploying the site if configured.

Essentially, an AI coding agent combines natural language understanding with various tools (like web search, code generation, file handling) and a powerful reasoning engine (the LLM) to turn your high-level request into tangible code.

What exactly is an AI Agent? Understanding different components

While the vacuum cleaner analogy is great, let's deepen our understanding of AI agents, especially in the context of Large Language Models (LLMs). At their core, AI agents are essentially sophisticated programs that can reason, plan, and act in an environment to achieve a specific goal. Think of them as intelligent decision-makers.

Put simply, agents are sophisticated systems that use Large Language Models (LLMs) to determine and order a set of actions to fulfill a user's query. These actions could be writing an email, debugging code, and so on.

Here's a breakdown of the key components of an AI agent, often powered by an LLM:

  • LLM (Large Language Model): This is the "core" of our agent. It's what allows the agent to understand natural language instructions, reason about problems, and generate human-like text. For our purposes, this will be our primary decision-making engine.

  • Tools: LLMs are incredibly powerful, but they have limitations. They might not have real-time information (like the current weather), and they can't interact with external systems (like sending emails or booking flights) on their own. Every model also comes with a knowledge cutoff date, meaning its knowledge is limited to what existed before that date. To overcome these shortcomings, "tools" come into the picture. Tools are simply functions or APIs that the LLM can choose to use to extend its capabilities. For example, a web search tool can provide real-time information, or an email tool can send an email. (A minimal code sketch of a tool follows this list.)

  • Memory: For an agent to be truly useful, it needs to remember past interactions and information. This allows for more coherent and complex conversations or task executions over time. Imagine trying to book a multi-leg flight without remembering the first leg!

  • Planning/Reasoning: This is the agent's ability to break down a complex task into smaller, manageable steps. It's about deciding what to do next and how to use the available tools to achieve the goal. Often, the LLM itself performs this reasoning process, but sometimes dedicated planning modules are used.
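
To make the Tools idea concrete, here is a minimal sketch of a custom tool defined with LangChain's @tool decorator. The get_stock_price function and its hardcoded return value are hypothetical placeholders; a real tool would call an actual API. What matters is the docstring: the LLM reads the tool's name and description to decide when to call it.

from langchain_core.tools import tool

@tool
def get_stock_price(ticker: str) -> str:
    """Return the latest stock price for the given ticker symbol."""
    # A real implementation would call a stock price API here.
    # This hardcoded value is a placeholder for illustration only.
    return f"The latest price of {ticker} is $175.32"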

So a usual workflow of an AI agent looks like this.

A user asks a question, e.g., "What's the stock price of Google today?" Obviously the AI model can't answer this from its training data alone, so in the usual case it would respond with some generic message. But since this is an AI agent, it has access to tools (e.g., Google Search or a stock price API), and the LLM is smart enough to recognize that it can call a relevant tool to get the answer. It defines its own input for the tool, calls it, examines the result, and then decides whether it has the right answer for the user's query. Accordingly, it plans its next action: call another tool, or simply respond to the user. This loop goes on until the model has found the required answer, and then it finally combines everything it has gathered and responds to the user.
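
Here is a minimal sketch of this loop in plain Python. The llm_decide and run_tool helpers are hypothetical stand-ins for the LLM call and the tool execution; frameworks like LangChain implement this loop for you, as we'll see shortly.

def agent_loop(user_query, tools, max_steps=5):
    """Simplified agent loop: think, act, observe, repeat."""
    history = [("user", user_query)]
    for _ in range(max_steps):
        # Ask the LLM what to do next, given the conversation so far.
        decision = llm_decide(history, tools)  # hypothetical LLM call
        if decision.action == "final_answer":
            return decision.content  # the model is confident; respond
        # Otherwise the model picked a tool and an input for it.
        observation = run_tool(decision.tool, decision.tool_input)  # hypothetical
        history.append(("tool", observation))  # feed the result back to the LLM
    return "Could not find an answer within the step limit."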

In essence, an AI agent takes your natural language request, the LLM processes it, decides if it needs any external tools to get the information or perform an action, uses those tools, and then gives you the final, intelligent response or performs the desired action.

There may be different reasons for this, but what I think is that AI agents represent a significant behavioral shift in how we are going to use digital applications. For example, let's say I want to check the weather today. What would I do? I would open a weather app and enter the location for which I want the weather data.

With an AI agent, I can just ask in natural language, "Tell me the weather for Delhi," and it will give me the response.

So, it's the way we get information that is changing now. It's about moving from a rigid, app-centric interaction to a more natural, conversational, and intelligent one. This shift has a huge benefit: it makes technology more accessible to everyone. Instead of learning how to navigate complex app interfaces or memorizing commands, you can simply speak or type your request as you would to another person. This opens up digital services to a much broader audience, including those who might find traditional apps challenging to use. AI agents empower us to achieve our goals with less friction and more intuitive commands.

Some More Use Cases of AI Agents:

  • Automated Research and Content Creation: Imagine an agent that can research a topic, summarize findings, and then draft an email or a report, all based on a simple prompt.

  • Personalized Assistants: Beyond just checking the weather, an agent could manage your calendar, send personalized reminders, or even suggest restaurants based on your preferences and current location.

  • Complex Task Automation: Think about an agent that can book flight tickets, reserve hotel rooms, and even arrange transportation for a whole trip, all by understanding your travel plans in natural language.

  • Customer Service Enhancement: Agents can handle complex customer queries, troubleshoot problems, and even escalate to human agents when necessary, improving efficiency and customer satisfaction.

How to Build an AI Agent? Introducing LangChain!

Now that we've learned a little bit about AI agents, let's see how we can build one.

Although we can build an AI agent from scratch by stitching together all the components we discussed earlier (LLM, tools, memory, planning), why start from zero? We have some excellent frameworks available that make it easy to work with agents. One such powerful framework is LangChain.

LangChain is a popular open-source framework that simplifies the development of applications powered by large language models. It provides a structured way to connect LLMs with external data sources and computational tools, making it ideal for building sophisticated AI agents. It handles the complexities of chaining together different components, managing memory, and orchestrating tool usage. It supports both Python and JavaScript so you can choose your preferred language.

I am going to use Python for this tutorial. To proceed, you will need Python, the LangChain packages, and an API key for Google Gemini (you can get one from Google AI Studio). I mostly use Gemini because getting an API key is easy, which makes it convenient to learn and try things out, and it has an excellent context window compared to other models available. You are free to choose your own model (like OpenAI's GPT models or others).

Let's Build a Simple Weather Agent with LangChain!

Now, we will build an AI agent that uses web search to get the current weather. An LLM obviously won't have knowledge this up to date; it can't tell us the current weather on its own. So we will use a "tool" to provide the answer to the LLM, and then it will return the final response.

You can check out the completed code in this repository file:

https://github.com/shashanksrajak/building-LLMs-for-production/blob/41cc935191bd5373617c10947c3de7ad9c0a7aa5/agents/weather-agent.ipynb

Before starting, create a new repository and set up a virtual environment for Python. I am using a Jupyter notebook; you can also use a Python script. Personally, I use uv a lot when working on Python projects. You can check out their docs to start using it.

Once your environment is ready, install these packages:

pip install duckduckgo-search langchain-community langchain[google-genai] langgraph python-dotenv

First, we need to load the environment variable for our model's API key. Make sure you have stored your API key in a .env file.

import os
from dotenv import load_dotenv
import getpass

# Load variables defined in the .env file into the environment
load_dotenv()

# Fall back to an interactive prompt if the key is not already set
if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass(
        "Enter API key for Google Gemini: ")

Next, we will initialize our AI model, Gemini, and test whether it is working with a simple prompt.

from langchain.chat_models import init_chat_model
model = init_chat_model(model="gemini-2.0-flash", model_provider="google_genai")
model.invoke("Hello Gemini")
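
The invoke call returns an AIMessage object. If you just want the generated text, you can print its content attribute (the exact reply will vary):

response = model.invoke("Hello Gemini")
print(response.content)  # e.g. "Hello! How can I help you today?"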

Next, we will initialize our tools. In this case, I am using DuckDuckGo search to look up the weather; you can use any other search engine or API. DuckDuckGo is easy to start with because it does not need an API key or account creation, so it's quite handy to use.

from langchain_community.tools import DuckDuckGoSearchRun

# A ready-made search tool; no API key or account needed
search = DuckDuckGoSearchRun()

search.invoke("What's the weather currently in Mumbai?")

This will give us some response after doing web search. We are good to go.

We will add this search engine as a tool in the code. tools is a list of the different tools we will pass to the agent.

tools = [search]

Next, we will play with our Gemini model by asking about the current weather, without any tools, to see what response it gives.

response = model.invoke("What's the weather now in Mumbai?")
response

The response:

I'm sorry, I'm unable to retrieve weather information. I can perform web searches, if you'd like me to search for the weather in Raipur.

For me it gave this response, which is expected. This is why we are going to use a tool to give our model some superpowers.

Next, we will initialize an agent using LangGraph.

from langgraph.prebuilt import create_react_agent

# A prebuilt ReAct-style agent that loops: reason, call a tool, observe
agent_executor = create_react_agent(model, tools)

Now we will invoke our agent with this tool. Note that different models can behave differently with tools; for example, I had to explicitly say in the prompt to use a tool if needed, while other models like OpenAI's GPT may use tools directly. So the prompt also matters a lot.

response = agent_executor.invoke({"messages": ["What's the weather now in Mumbai? Use DuckDuckGo search tool if needed."]})

response["messages"]

This will give us the expected weather information, as opposed to the earlier response without tool usage.

This demonstrates how the LLM, when faced with a knowledge gap, intelligently leverages the provided tools to gather the necessary information and then formulates a coherent response.
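
If you want to watch the agent's intermediate steps (the tool call and its observation) instead of just the final answer, you can stream the run. Here is a minimal sketch using LangGraph's streaming support; stream_mode="values" emits the full message list after each step.

for step in agent_executor.stream(
    {"messages": ["What's the weather now in Mumbai? Use DuckDuckGo search tool if needed."]},
    stream_mode="values",
):
    # Print the newest message at each step: the tool call, the tool
    # result, and finally the model's answer.
    step["messages"][-1].pretty_print()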

Beyond This Simple Example

This is just the tip of the iceberg! With LangChain, you can build much more sophisticated agents by:

  • Adding more tools: Integrate with APIs for email, calendar, databases, or even custom internal systems.

  • Implementing memory: Allow your agent to remember past conversations and context for more complex and ongoing interactions (a minimal sketch follows this list).

  • Using different agent types: LangChain offers various agent types beyond ReAct, suitable for different complexities and use cases.

  • Integrating with external data: Load documents, databases, or even entire websites as knowledge bases for your agent.
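
As a quick taste of the memory point above, giving our weather agent conversational memory takes only a few extra lines with LangGraph's built-in checkpointer. This is a minimal sketch; the thread_id value is arbitrary and simply identifies one ongoing conversation.

from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

# The checkpointer stores conversation state between invocations.
memory = MemorySaver()
agent_executor = create_react_agent(model, tools, checkpointer=memory)

# All calls sharing this thread_id share the same conversation history.
config = {"configurable": {"thread_id": "weather-chat-1"}}

agent_executor.invoke(
    {"messages": ["What's the weather now in Mumbai?"]}, config)
# The follow-up works because the agent remembers the earlier exchange.
agent_executor.invoke(
    {"messages": ["And how about tomorrow?"]}, config)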

AI agents are a powerful paradigm shift, enabling more intuitive and intelligent interactions with digital systems. By understanding the core concepts and leveraging frameworks like LangChain, you're well on your way to building truly smart applications. So, go ahead and experiment, build your own agents, and explore the exciting possibilities of this evolving field!

Thoughts or questions?

I'd love to hear your feedback on this article.