Build an AI Agent from Scratch
Build a research agent that combines OpenAI with Spider’s web search. The agent forms search queries, evaluates whether the results answer the question, refines its approach, and delivers a final answer.
Setup
First, let’s set up our environment and install the necessary dependencies.
Install Required Packages
Install the required packages using pip:
pip install python-dotenv openai spider-client colorama
- python-dotenv: Manages environment variables
- openai: Interfaces with OpenAI’s language models
- spider-client: Scraping, crawling, and web searching (all of Spider’s capabilities)
- colorama: Adds color to console output for better readability
Environment Variables
Create a .env file in your project root and add your API keys:
OPENAI_API_KEY=<your_openai_api_key_here>
SPIDER_API_KEY=<your_spider_api_key_here>
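A missing or placeholder key usually only surfaces later as an authentication error, so it can help to fail early. The sketch below is one way to do that; `require_env` is a hypothetical helper written for this guide, not part of python-dotenv.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for this guide, not a library call.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value or value.startswith("<"):
        # Unset, empty, or still the "<your_..._here>" placeholder from .env
        raise RuntimeError(f"{name} is missing; add it to your .env file")
    return value
```

Called after `load_dotenv()`, for example as `require_env("OPENAI_API_KEY")`, it stops the script before any request is made.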
Building the AI Research Agent
Let’s break down the process of building our AI agent into steps.
Step 1: Import Dependencies and Set Up
import os
from dotenv import load_dotenv
import openai
from spider import Spider
from typing import List, Dict, Any
from colorama import init, Fore
init(autoreset=True)
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
SPIDER_API_KEY = os.getenv("SPIDER_API_KEY")
Import libraries and load environment variables. colorama adds color to console output for readability.
Step 2: Create the AIResearchAgent Class
The AIResearchAgent class encapsulates all the functionality:
class AIResearchAgent:
    def __init__(self, openai_api_key: str, spider_api_key: str):
        self.openai_client = openai.OpenAI(api_key=openai_api_key)
        self.spider_client = Spider(spider_api_key)
This sets up connections to the OpenAI and Spider APIs.
Step 3: Implement Web Search Functionality
The agent searches the web using Spider’s API to fetch relevant, up-to-date information.
    def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
        """Perform a web search using Spider."""
        # Only result metadata is requested, not full page content
        params = {"limit": limit, "fetch_page_content": False}
        print(f"{Fore.GREEN}Searching for: {query}")
        results = self.spider_client.search(query, params)
        # The research loop reads the results from results["content"]
        return results
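The research loop later reads `search_results['content']` and each result's `description`. The sketch below assumes that response shape (inferred from how this guide uses the results, not from Spider's reference documentation) and tolerates missing fields. `extract_descriptions` is a hypothetical helper.

```python
from typing import Any, List


def extract_descriptions(search_results: Any) -> List[str]:
    """Collect non-empty 'description' strings from a search response.

    Assumed shape: {"content": [{"description": "..."}, ...]}, or a bare
    list of such result dicts. Anything else yields an empty list.
    """
    if isinstance(search_results, dict):
        items = search_results.get("content")
    else:
        items = search_results
    if not isinstance(items, list):
        items = []

    descriptions = []
    for item in items:
        if isinstance(item, dict):
            text = item.get("description")
            if isinstance(text, str) and text.strip():
                descriptions.append(text.strip())
    return descriptions
```

Joining the returned list with newlines gives the same kind of combined summary the loop builds, without raising a `KeyError` when a result has no description.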
Step 4: Implement OpenAI Request Helper
    def openai_request(self, system_content: str, user_content: str) -> str:
        """Helper method to make OpenAI API requests."""
        response = self.openai_client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_content},
                {"role": "user", "content": user_content}
            ]
        )
        return response.choices[0].message.content
A helper that wraps OpenAI API calls.
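Calls to a model API can fail transiently. As a sketch, a generic retry wrapper with exponential backoff could surround the request; `with_retries` and its parameters are hypothetical, and a production version would catch only the client's retryable error types rather than every `Exception`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next try
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Inside the helper, the request would be passed in as a zero-argument callable, for example a `lambda` wrapping the `chat.completions.create(...)` call.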
Step 5: Implement Text Summarization (optional)
This method isn’t used in the main loop below. To use it, call it in the research method on the search result text before that text is passed to evaluate as combined_summary.
    def summarize(self, text: str) -> str:
        """Summarize the given text using OpenAI."""
        print(f"{Fore.BLUE}Summarizing...", text)
        return self.openai_request(
            "You are a helpful assistant that summarizes text.",
            f"Summarize this text in 2-3 sentences: {text}"
        )
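One way to wire this optional step in is to summarize each result's description and join the summaries. The sketch takes the summarizer as a parameter so it runs without an API key; `summarize_results` is a hypothetical helper, and the `content` and `description` keys are the ones the research loop in this guide reads.

```python
from typing import Any, Callable, Dict


def summarize_results(
    search_results: Dict[str, Any],
    summarize: Callable[[str], str],
) -> str:
    """Summarize each result description and join the summaries."""
    summaries = []
    for result in search_results.get("content", []):
        description = result.get("description", "")
        if description:
            # One summarizer call per non-empty description
            summaries.append(summarize(description))
    return "\n".join(summaries)
```

In the research method this could replace the plain join, for example `combined_summary = summarize_results(search_results, self.summarize)`. Each description then costs one extra model call.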
Step 6: Implement Answer Evaluation
    def evaluate(self, question: str, summary: str) -> str:
        """Evaluate if the summary answers the question."""
        print(f"{Fore.MAGENTA}Evaluating...")
        evaluation = self.openai_request(
            "You are an AI research assistant. Your task is to evaluate if the given summary answers the user's question.",
            f"Question: {question}\n\nSummary:\n{summary}\n\nDoes this summary answer the question? If it does, write exactly: 'does answer the question'. If not, explain why."
        )
        print(f"{Fore.MAGENTA}Evaluation: {evaluation}")
        return evaluation
The agent evaluates whether the summary answers the original question. If not, it continues searching. This self-evaluation loop is what makes it a level 3 agent.
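The loop relies on the model returning an exact phrase, and models sometimes vary the capitalization, quoting, or spacing. A slightly more tolerant check might look like this; `is_answered` is a hypothetical helper, not part of the agent in this guide.

```python
import re

ANSWER_MARKER = "does answer the question"


def is_answered(evaluation: str) -> bool:
    """Return True if the evaluation contains the agreed success phrase."""
    # Lowercase, drop quote characters, and collapse runs of whitespace
    normalized = re.sub(r"['\"`’‘“”]", "", evaluation.lower())
    normalized = re.sub(r"\s+", " ", normalized).strip()
    # "does not answer the question" does not contain the marker
    return ANSWER_MARKER in normalized
```

A stricter alternative would be to ask the model for a single YES or NO token, at the cost of losing the explanation that the refinement step uses.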
Step 7: Implement Search Query Formation
The user’s query might not be an effective search query:
- User query: What is the weather in Boston?
- Search query: Boston weather
    def form_search_query(self, user_query: str) -> str:
        """Form a search query from the user's input."""
        search_query = self.openai_request(
            "You are an AI research assistant. Your task is to form an effective search query based on the user's question.",
            f"User's question: {user_query}\n\nPlease provide a concise and effective search query to find relevant information."
        )
        return search_query
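Models often wrap the query in quotes or prefix it with a label. A small cleanup step before calling Spider could look like this; `clean_query` and the label it strips are assumptions for illustration.

```python
import re


def clean_query(raw: str) -> str:
    """Reduce a model reply to a bare, single-line search query."""
    # Use the first non-empty line of the reply
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        return ""
    query = lines[0]
    # Drop a leading label such as "Search query:" (case-insensitive)
    query = re.sub(r"^(search\s+)?query\s*:\s*", "", query, flags=re.IGNORECASE)
    # Drop surrounding quotes or backticks
    return query.strip("\"'`").strip()
```

With this in place, a reply like `Search query: "Boston weather"` is reduced to `Boston weather` before it reaches the search call.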
Step 8: Implement Final Answer Formation
Once the agent has gathered and evaluated enough information, it forms a comprehensive answer:
    def form_final_answer(self, user_query: str, summary: str) -> str:
        """Form a final answer based on the user's query and the summary."""
        final_answer = self.openai_request(
            "You are an AI research assistant. Your task is to form a comprehensive answer to the user's question based on the provided summary.",
            f"User's question: {user_query}\n\nSummary of research:\n{summary}\n\nPlease provide a comprehensive answer to the user's question based on this information."
        )
        print(f"{Fore.GREEN}Formed final answer.")
        return final_answer
Step 9: Implement Question Refinement
    def refine_question(self, original_question: str, evaluation: str) -> str:
        """Refine the search question based on the evaluation."""
        print(f"{Fore.CYAN}Refining...")
        return self.openai_request(
            "You are an AI research assistant. Your task is to refine a search query based on the original question and the evaluation of previous search results.",
            f"Original question: {original_question}\n\nEvaluation of previous results: {evaluation}\n\nPlease provide a refined search query to find more relevant information."
        )
Refining questions based on previous results makes the agent iteratively converge on better answers.
Step 10: Implement the Main Research Loop
The main research loop ties everything together:
    def research(self, user_query: str, max_iterations: int = 5) -> str:
        """Perform research on the given question."""
        print(f"{Fore.BLUE}Starting research for: {user_query}")
        for iteration in range(max_iterations):
            print(f"{Fore.YELLOW}Iteration {iteration + 1}/{max_iterations}")
            search_query = self.form_search_query(user_query)
            search_results = self.search(search_query)
            # OPTIONAL: call the summarize method here to summarize the search results
            combined_summary = "\n".join([result['description'] for result in search_results['content']])
            evaluation = self.evaluate(user_query, combined_summary)
            if "does answer the question" in evaluation.lower():
                final_answer = self.form_final_answer(user_query, combined_summary)
                return f"{Fore.GREEN}Final Answer:\n{final_answer}\n\nBased on:\n{combined_summary}"
            user_query = self.refine_question(user_query, evaluation)
        return f"{Fore.RED}Couldn't find a satisfactory answer after {max_iterations} iterations. Last summary:\n{combined_summary}"
Each iteration:
- Forms a search query
- Searches the web with Spider
- Evaluates whether results answer the question
- Refines the query if not
- Synthesizes a final answer when satisfied
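The control flow is easier to see with the API calls stubbed out. The sketch below takes each step as a plain function so it runs offline; `run_research` and its parameter names are hypothetical. It also keeps the original question fixed and refines only the search query, which is a design variation on the loop above rather than this guide's exact behavior.

```python
from typing import Callable, Optional


def run_research(
    question: str,
    form_query: Callable[[str], str],
    search: Callable[[str], str],
    evaluate: Callable[[str, str], str],
    refine: Callable[[str, str], str],
    max_iterations: int = 5,
) -> Optional[str]:
    """Return the first summary judged sufficient, or None if none is."""
    query = form_query(question)
    for _ in range(max_iterations):
        summary = search(query)
        verdict = evaluate(question, summary)
        if "does answer the question" in verdict.lower():
            return summary
        # Refine the search query; the question itself stays unchanged
        query = refine(question, verdict)
    return None
```

Because every evaluation is made against the original question, a refined query cannot drift the success criterion away from what the user asked.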
Step 11: Implement the Main Function
An interactive loop for the agent:
def main():
    agent = AIResearchAgent(OPENAI_API_KEY, SPIDER_API_KEY)
    while True:
        user_input = input("What would you like to research? (Type 'exit' to quit): ")
        if user_input.lower() == 'exit':
            break
        result = agent.research(user_input)
        print(result)

if __name__ == "__main__":
    main()
Conclusion
The finished agent can:
- Search the web using Spider
- Evaluate whether results are sufficient
- Self-refine its search query when results fall short
- Form a final answer from gathered data
Complete Code
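The full script, assembled from the steps above (the request helper is named _openai_request here to mark it as internal):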
import os
from dotenv import load_dotenv
import openai
from spider import Spider
from typing import List, Dict, Any
from colorama import init, Fore
init(autoreset=True)
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
SPIDER_API_KEY = os.getenv("SPIDER_API_KEY")
class AIResearchAgent:
    def __init__(self, openai_api_key: str, spider_api_key: str):
        self.openai_client = openai.OpenAI(api_key=openai_api_key)
        self.spider_client = Spider(spider_api_key)

    def search(self, query: str, limit: int = 5) -> Dict[str, Any]:
        """Perform a web search using Spider."""
        params = {"limit": limit, "fetch_page_content": False}
        print(f"{Fore.GREEN}Searching for: {query}")
        results = self.spider_client.search(query, params)
        return results

    def _openai_request(self, system_content: str, user_content: str) -> str:
        """Helper method to make OpenAI API requests."""
        response = self.openai_client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_content},
                {"role": "user", "content": user_content}
            ]
        )
        return response.choices[0].message.content

    def summarize(self, text: str) -> str:
        """Summarize the given text using OpenAI."""
        print(f"{Fore.BLUE}Summarizing...")
        return self._openai_request(
            "You are a helpful assistant that summarizes text.",
            f"Summarize this text in 2-3 sentences: {text}"
        )

    def evaluate(self, question: str, summary: str) -> str:
        """Evaluate if the summary answers the question."""
        print(f"{Fore.MAGENTA}Evaluating...")
        evaluation = self._openai_request(
            "You are an AI research assistant. Your task is to evaluate if the given summary answers the user's question.",
            f"Question: {question}\n\nSummary:\n{summary}\n\nDoes this summary answer the question? If it does, write exactly: 'does answer the question'. If not, explain why."
        )
        return evaluation

    def form_search_query(self, user_query: str) -> str:
        """Form a search query from the user's input."""
        search_query = self._openai_request(
            "You are an AI research assistant. Your task is to form an effective search query based on the user's question.",
            f"User's question: {user_query}\n\nPlease provide a concise and effective search query to find relevant information."
        )
        return search_query

    def form_final_answer(self, user_query: str, summary: str) -> str:
        """Form a final answer based on the user's query and the summary."""
        final_answer = self._openai_request(
            "You are an AI research assistant. Your task is to form a comprehensive answer to the user's question based on the provided summary.",
            f"User's question: {user_query}\n\nSummary of research:\n{summary}\n\nPlease provide a comprehensive answer to the user's question based on this information."
        )
        print(f"{Fore.GREEN}Formed final answer.")
        return final_answer

    def refine_question(self, original_question: str, evaluation: str) -> str:
        """Refine the search question based on the evaluation."""
        print(f"{Fore.CYAN}Refining...")
        return self._openai_request(
            "You are an AI research assistant. Your task is to refine a search query based on the original question and the evaluation of previous search results.",
            f"Original question: {original_question}\n\nEvaluation of previous results: {evaluation}\n\nPlease provide a refined search query to find more relevant information."
        )

    def research(self, user_query: str, max_iterations: int = 5) -> str:
        """Perform research on the given question."""
        print(f"{Fore.BLUE}Starting research for: {user_query}")
        for iteration in range(max_iterations):
            print(f"{Fore.YELLOW}Iteration {iteration + 1}/{max_iterations}")
            search_query = self.form_search_query(user_query)
            search_results = self.search(search_query)
            # OPTIONAL: call the summarize method here to summarize the search results
            combined_summary = "\n".join([result['description'] for result in search_results['content']])
            evaluation = self.evaluate(user_query, combined_summary)
            if "does answer the question" in evaluation.lower():
                final_answer = self.form_final_answer(user_query, combined_summary)
                return f"{Fore.GREEN}Final Answer:\n{final_answer}\n\nBased on:\n{combined_summary}"
            user_query = self.refine_question(user_query, evaluation)
        return f"{Fore.RED}Couldn't find a satisfactory answer after {max_iterations} iterations. Last summary:\n{combined_summary}"

def main():
    agent = AIResearchAgent(OPENAI_API_KEY, SPIDER_API_KEY)
    while True:
        user_input = input("What would you like to research? (Type 'exit' to quit): ")
        if user_input.lower() == 'exit':
            break
        result = agent.research(user_input)
        print(result)

if __name__ == "__main__":
    main()
If you liked this guide, consider checking out Spider on Twitter and following me (the author):
- Spider Twitter: spider_rust
- William Espegren Twitter: @WilliamEspegren