PolarSPARC

Common LangChain Recipes


Bhaskar S *UPDATED* 11/15/2025


Overview

In the article on LangChain, we covered the basics of the LangChain framework and how to get started with it. In this article, we will provide some common code recipes for working with LangChain.


Installation and Setup

The installation and setup will be on an Ubuntu 24.04 LTS based Linux desktop. Ensure that Ollama is installed and set up on the desktop (see instructions).

In addition, ensure that the Python 3.1x programming language as well as the Jupyter Notebook package are installed and set up on the desktop.
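
To quickly verify the Python and Jupyter Notebook setup, the following commands can be executed in a terminal window (the reported versions will vary based on the desktop):


$ python3 --version
$ jupyter notebook --version
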

Finally, ensure that all the LangChain packages are properly set up on the desktop (see instructions).
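
For reference, the recipes in this article rely on the following Python packages; if they are not already present, they can be installed using pip (the exact package set may vary based on your setup):


$ pip install langchain langchain-classic langchain-community langchain-ollama langchain-chroma langgraph python-dotenv
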

Assuming that the IP address of the Linux desktop is 192.168.1.25, start the Ollama platform by executing the following command in the terminal window:


$ docker run --rm --name ollama -p 192.168.1.25:11434:11434 -v $HOME/.ollama:/root/.ollama ollama/ollama:0.12.9


If the Linux desktop has an Nvidia GPU with a decent amount of VRAM (at least 16 GB) and has been enabled for use with Docker (see instructions), then execute the following command instead to start Ollama:


$ docker run --rm --name ollama --gpus=all -p 192.168.1.25:11434:11434 -v $HOME/.ollama:/root/.ollama ollama/ollama:0.12.9


For the LLM models, we will be using the Microsoft Phi-4 Mini model for chat interactions, the IBM Granite 4 Micro model for tool execution, and the Google Gemma 3 4B model for image processing.

Open a new terminal window and execute the following docker command to download the Google Gemma 3 4B model:


$ docker exec -it ollama ollama run gemma3:4b


Once the model download completes, exit the interactive prompt by entering the following:


>>> /bye


Next, in the same terminal window, execute the following docker command to download the IBM Granite 4 Micro model:


$ docker exec -it ollama ollama run granite4:micro


Once the model download completes, exit the interactive prompt by entering the following:


>>> /bye


Finally, in the same terminal window, execute the following docker command to download the Microsoft Phi-4 Mini model:


$ docker exec -it ollama ollama run phi4-mini
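

Optionally, to confirm that all three models have been downloaded, query the Ollama API for the list of local models (assuming the same IP address and port as above):


$ curl http://192.168.1.25:11434/api/tags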


With this, we are ready to showcase the common code recipes using LangChain.


Code Recipes for LangChain

Create a file called recipes.env with the following environment variables defined:


LLM_TEMPERATURE=0.2
OLLAMA_CHAT_MODEL='ollama:phi4-mini:latest'
OLLAMA_TOOLS_MODEL='ollama:granite4:micro'
OLLAMA_IMAGE_MODEL='ollama:gemma3:4b'
OLLAMA_BASE_URL='http://192.168.1.25:11434'
CHROMA_DB_DIR='/home/polarsparc/.chromadb'
BOOKS_DATASET='./data/leadership_books.csv'
GPU_DATASET='./data/gpu_specs.csv'
RECEIPT_IMAGE='./data/test-receipt.jpg'
MPG_SQLITE_URL='sqlite:///./data/auto_mpg.db'

To load the environment variables and assign them to Python variables, execute the following code snippet:


from dotenv import dotenv_values

config = dotenv_values('recipes.env')

llm_temperature = float(config.get('LLM_TEMPERATURE'))
ollama_chat_model = config.get('OLLAMA_CHAT_MODEL')
ollama_tools_model = config.get('OLLAMA_TOOLS_MODEL')
ollama_image_model = config.get('OLLAMA_IMAGE_MODEL')
ollama_base_url = config.get('OLLAMA_BASE_URL')
gpu_dataset = config.get('GPU_DATASET')
chroma_db_dir = config.get('CHROMA_DB_DIR')
receipt_image = config.get('RECEIPT_IMAGE')
mpg_sqlite_url = config.get('MPG_SQLITE_URL')

Executing the above Python code snippet generates no output.

To initialize instances of the chat model, the tools model, and the image processing model, execute the following code snippet:


from langchain.chat_models import init_chat_model

ollama_chat_llm = init_chat_model(ollama_chat_model, base_url=ollama_base_url, temperature=llm_temperature)
ollama_tools_llm = init_chat_model(ollama_tools_model, base_url=ollama_base_url, temperature=llm_temperature)
ollama_image_llm = init_chat_model(ollama_image_model, base_url=ollama_base_url, temperature=llm_temperature)

Executing the above Python code snippet generates no output.

To initialize an instance of the vector embedding class corresponding to the chat model, execute the following code snippet:


from langchain_core.embeddings.embeddings import Embeddings
from langchain_ollama import OllamaEmbeddings

ollama_embedding: Embeddings = OllamaEmbeddings(base_url=ollama_base_url, model=ollama_chat_model[ollama_chat_model.index(':')+1:])

Executing the above Python code snippet generates no output.
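
As an optional sanity check, the embedding instance can be exercised directly; the embed_query method returns a dense vector whose length depends on the underlying model:


# Quick check that the Ollama embedding endpoint is reachable and working
vector = ollama_embedding.embed_query('hello world')
print(len(vector))
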


Recipe 1 : Execute a simple LLM prompt


The following code snippet creates a simple prompt template and an LLM chain, and then executes the chain to get a response from the LLM model:


from langchain_core.prompts import PromptTemplate

template = """
Question: {question}

Answer: Summarize the response in less than {tokens} words.
"""

prompt = PromptTemplate.from_template(template=template)

chain = prompt | ollama_chat_llm

result = chain.invoke({'question': 'describe the python langchain ai framework', 'tokens': 50})
print(result)

Executing the above Python code snippet generates the following typical output:


Output.1

content='The LangChain AI framework is designed for building applications with large language models, offering tools and components to streamline development processes.' additional_kwargs={} response_metadata={'model': 'phi4-mini:latest', 'created_at': '2025-11-11T17:50:43.474265533Z', 'done': True, 'done_reason': 'stop', 'total_duration': 15304528663, 'load_duration': 14966942499, 'prompt_eval_count': 28, 'prompt_eval_duration': 19753959, 'eval_count': 25, 'eval_duration': 280905325, 'model_name': 'phi4-mini:latest', 'model_provider': 'ollama'} id='lc_run--ff20604f-a758-46d1-bde2-ba9f11d08970-0' usage_metadata={'input_tokens': 28, 'output_tokens': 25, 'total_tokens': 53}

We have successfully demonstrated the Python code snippet for Recipe 1.
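
As a side note, the same chain can also stream the response token by token, which is often preferable for interactive applications. The following is a minimal sketch using the chain defined above:


# Stream the chunks of the response as they are generated by the LLM model
for chunk in chain.stream({'question': 'describe the python langchain ai framework', 'tokens': 50}):
  print(chunk.content, end='', flush=True)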


Recipe 2 : Execute a simple LLM chat using Messages


The following code snippet creates the system and human messages and invokes the chat model to get a response from the LLM chat model:


from langchain_core.messages import HumanMessage, SystemMessage

human_msg = HumanMessage(content='What is the current time in India if the current time in New York is 8 am EST?')
system_msg = SystemMessage(content='You are a helpful assistant.')
messages = [system_msg, human_msg]

result = ollama_chat_llm.invoke(messages)
result.pretty_print()

Executing the above Python code snippet generates the following typical output:


Output.2

================================== Ai Message ==================================

India typically follows Indian Standard Time (IST), which is UTC+5:30 hours ahead of Coordinated Universal Time (UTC). If it’s currently 8 AM Eastern Standard Time (EST) on October 1st, we need to convert this into IST.

First, let's find the time difference between EST and UTC. New York operates under either EDT or EST depending upon daylight saving adjustments; however, for simplicity's sake in our calculation we'll assume it's during standard time when there is no DST adjustment involved (UTC-5).

Now converting 8 AM EST to UTC:

08:00 - (-05) = 13:00 (1 PM)

Next we convert this from UTC to IST by adding the +5.30 hours difference.

13:00 + 5:30 = 18:30

We have successfully demonstrated the Python code snippet for Recipe 2.


Recipe 3 : Generate Structured Output from LLM


The following code snippet defines a data class that the structured output must conform to, instructs the LLM chat instance to generate a structured response, and invokes the LLM chat model to get that structured response:


from pydantic import BaseModel

class GpuSpecs(BaseModel):
  name: str
  manufacturer: str
  chip: str
  cores: int
  memory: int

structured_llm = ollama_chat_llm.with_structured_output(GpuSpecs)

result = structured_llm.invoke('Get the GPU specs for NVidia RTX 3070 Ti GPU')
print(result)

Executing the above Python code snippet generates the following typical output:


Output.3

name='get_gpu_specs' manufacturer='NVidia' chip='RTX 3070 Ti' cores=4608 memory=12
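
Notice that the name field in the above output came back as 'get_gpu_specs' rather than the actual product name. Adding Field descriptions to the Pydantic class often helps steer the model towards the intended values. The following is a minimal sketch (not part of the original recipe):


from pydantic import BaseModel, Field

class GpuSpecs(BaseModel):
  name: str = Field(description='Product name of the GPU card')
  manufacturer: str = Field(description='Company that manufactures the GPU card')
  chip: str = Field(description='GPU chip identifier')
  cores: int = Field(description='Number of GPU cores')
  memory: int = Field(description='GPU memory size in GB')

structured_llm = ollama_chat_llm.with_structured_output(GpuSpecs)

result = structured_llm.invoke('Get the GPU specs for NVidia RTX 3070 Ti GPU')
print(result)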

We have successfully demonstrated the Python code snippet for Recipe 3.


Recipe 4 : Chat with LLM preserving History


The following code snippet creates an in-memory chat history store, a chat prompt template, and an LLM chain that uses the history store, and then executes the chain within a chat session to get responses from the LLM model:


from langchain_core.chat_history import BaseChatMessageHistory, InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory

in_memory_store = {}

def get_chat_history(session_id: str) -> BaseChatMessageHistory:
  if session_id not in in_memory_store:
    in_memory_store[session_id] = InMemoryChatMessageHistory()
  return in_memory_store[session_id]

prompt = ChatPromptTemplate.from_messages([
  ('system', 'You are a helpful AI assistant!'),
  MessagesPlaceholder(variable_name='chat_history'),
  ('human', '{input}')
])

config = {'configurable': {'session_id': 'sid-1'}}

chain = prompt | ollama_chat_llm

chain_with_history = RunnableWithMessageHistory(chain, get_chat_history, input_messages_key='input', history_messages_key='chat_history')

result = chain_with_history.invoke({'input': 'Suggest the top 3 budget consumer mirrorless cameras as a bulleted list'}, config=config)
print(result.content)

print('-------------------------')

result = chain_with_history.invoke({'input': 'Not impressed with the suggestion, try again'}, config=config)
print(result.content)

Executing the above Python code snippet generates the following typical output:


Output.4

Certainly, here is an updated and accurate representation of three popular entry-level (budget) camera brands that produce high-quality mirrorless systems:

- Canon EOS M50
- Sony Alpha A6000/A6300 series
- Fujifilm X-T200

These models are well-regarded for their performance in the budget category. Please note, however, as my knowledge is up to 2023 and prices can fluctuate over time due to market conditions or new releases by manufacturers.

Canon EOS M50:
* Price: Approximately $1,000 - $1,300
* Features include a 24.2 MP APS-C CMOS sensor with DIGIC processing for fast performance.
* Offers features like Wi-Fi connectivity (built-in), an articulating touchscreen display and electronic viewfinder that can be flipped up.

Sony Alpha A6000/A6300 series:
* Price: Approximately $1,000 - $1,300
* Features include a 24.2 MP APS-C Exmor CMOS sensor with BIONZ X processing for excellent image quality.
* Offers features like Wi-Fi connectivity (built-in), an articulating touchscreen display and electronic viewfinder that can be flipped up.

Fujifilm X-T200:
* Price: Approximately $1,000 - $1,300
* Features include a 24.2 MP APS-C CMOS sensor with Fujifilm's unique film simulation modes.
* Offers features like Wi-Fi connectivity (built-in), an articulating touchscreen display and electronic viewfinder that can be flipped up.

These cameras are excellent choices for photographers looking to enter the mirrorless market without breaking their budget, offering a balance of quality image sensors, ease-of-use with touchscreens or EVFs/OVF displays, as well as modern connectivity options.
-------------------------
I apologize if my previous suggestions did not meet your expectations for top-tier value in mirrorless cameras within an entry-level budget range.

Here are three more refined choices that have been highly praised by both consumers and professionals alike:

- Sony Alpha A6000/A6100 series
- Fujifilm X-T30/X-E4 (X-T Series)
- Olympus OM-D E-M10 Mark III

These models continue to offer excellent value for money, combining robust features with affordability.

Sony Alpha A6000/A6100 series:
* Price: Approximately $1,000 - $1,300
* Features include a 24.2 MP APS-C Exmor CMOS sensor and BIONZ X image processor.
* Offers Wi-Fi connectivity (built-in), an articulating touchscreen display that can flip up to reveal the electronic viewfinder.

Fujifilm X-T30/X-E4:
* Price: Approximately $1,000 - $1,300
* Features include a 24.2 MP APS-C CMOS sensor with Fujifilm's unique film simulation modes for creative flexibility.
* Offers Wi-Fi connectivity (built-in), an articulating touchscreen display that can flip up to reveal the electronic viewfinder.

Olympus OM-D E-M10 Mark III:
* Price: Approximately $1,000 - $1,300
* Features include a 16.3 MP Four Thirds sensor with TruePic VIII processor.
* Offers Wi-Fi connectivity (built-in), an articulating touchscreen display that can flip up to reveal the electronic viewfinder.

These cameras are well-suited for photographers who want high-quality imaging capabilities without compromising on features like touchscreens, EVFs/OVF displays or modern wireless connections. Each of these models offers a unique set of characteristics and creative tools suitable for various photography styles from casual enthusiasts to serious hobbyists looking to step up their game in the mirrorless camera market.

I hope this revised list better aligns with your expectations regarding value-for-money options within an entry-level budget range, offering you both quality imaging capabilities as well as modern features that cater to today's photographers.
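
To inspect the messages that were accumulated for the session in the in-memory store, one can peek at the chat history object directly (a quick check, not part of the recipe; the messages attribute is the list held by InMemoryChatMessageHistory):


# Print the type and a preview of each message stored for the session 'sid-1'
for message in in_memory_store['sid-1'].messages:
  print(f'{message.type}: {message.content[:80]}')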

We have successfully demonstrated the Python code snippet for Recipe 4.


Recipe 5 : Q&A on a CSV file content using LLM


For this recipe, we will create a CSV file called gpu_specs.csv that contains the GPU card specs for a handful of popular GPU cards. The following lists the contents of the CSV file:


manufacturer,productName,memorySize,memoryClock,gpuChip
NVIDIA,GeForce RTX 4060,12,5888,AD104
NVIDIA,GeForce RTX 4070,12,7680,AD104
NVIDIA,GeForce RTX 4080,16,9728,AD103
NVIDIA,GeForce RTX 4090,24,17408,AD102
AMD,Radeon RX 6950 XT,16,5120,Navi 21
AMD,Radeon RX 7700 XT,8,4096,Navi 33
AMD,Radeon RX 7800 XT,12,8192,Navi 32
AMD,Radeon RX 7900 XT,16,12288,Navi 31
NVIDIA,GeForce RTX 3060 Ti,8,4864,GA103S
NVIDIA,GeForce RTX 3070 Ti,8,5632,GA104
NVIDIA,GeForce RTX 3080,12,8960,GA102
NVIDIA,GeForce RTX 3090 Ti,24,10752,GA102
AMD,Radeon RX 6400,4,768,Navi 24
AMD,Radeon RX 6500 XT,4,1024,Navi 24
AMD,Radeon RX 6650 XT,8,2048,Navi 23
AMD,Radeon RX 6750 XT,12,2560,Navi 22
AMD,Radeon RX 6850M XT,12,2560,Navi 22
Intel,Arc A770,16,4096,DG2-512
Intel,Arc A780,16,4096,DG2-512

The following code snippet creates a CSV file loader, loads the rows from the file as documents into an in-memory vector store using the embedding class, creates a prompt template with the data from the CSV file as the context, creates a Q&A LLM chain, and executes the chain to get answers to questions from the LLM model:


from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_classic.document_loaders import CSVLoader
from langchain_classic.chains import RetrievalQA

loader = CSVLoader(gpu_dataset, encoding='utf-8')

encoding = '\ufeff'
newline = '\n'
comma_space = ', '

documents = [Document(metadata=doc.metadata, page_content=doc.page_content.lstrip(encoding)
                      .replace(newline, comma_space)) for doc in loader.load()]

vector_store = InMemoryVectorStore(ollama_embedding)

vector_store.add_documents(documents)

template = """
Given the following context, answer the question based only on the provided context.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)

retriever = vector_store.as_retriever()

qa_chain = RetrievalQA.from_chain_type(llm=ollama_chat_llm, retriever=retriever, chain_type_kwargs={'prompt': prompt})

result = qa_chain.invoke({'query': 'memory size of RTX 3070 Ti'})
print(result)

Executing the above Python code snippet generates the following typical output:


Output.5

{'query': 'memory size of RTX 3070 Ti', 'result': 'The provided context does not include information about the memory size of an NVIDIA GeForce RTX 3070 Ti. Therefore, based on this specific given data alone, it is impossible to determine or answer what the memory size for that product would be. More detailed and relevant sources are needed in order to find out such specifications accurately.\n\n'}
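
Notice that the retriever failed to surface the row for the RTX 3070 Ti, even though it is present in the CSV file. By default, the retriever returns only the top 4 matching documents, and the general-purpose chat model used here for embeddings may not rank the relevant row highly. Widening the search is one simple adjustment that may help; the following is a minimal sketch:


# Retrieve more documents per query to improve the chance of including the relevant row
retriever = vector_store.as_retriever(search_kwargs={'k': 8})

qa_chain = RetrievalQA.from_chain_type(llm=ollama_chat_llm, retriever=retriever, chain_type_kwargs={'prompt': prompt})

result = qa_chain.invoke({'query': 'memory size of RTX 3070 Ti'})
print(result)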

We have successfully demonstrated the Python code snippet for Recipe 5.


Recipe 6 : Q&A on a PDF file content using LLM


For this recipe, we will analyze the Nvidia 3rd Quarter 2024 financial report !!!

Also, ensure that the additional Python module(s) are installed by executing the following command:

$ pip install pypdf

The following code snippet creates a PDF file loader, loads the page chunks from the PDF file as embedded documents into the persistent vector store Chroma using the embedding class, creates a prompt template with the data from the PDF file as the context, creates a Q&A LLM chain, and executes the chain to get answers to questions from the LLM model:


from langchain_chroma import Chroma
from langchain_classic.chains import RetrievalQA
from langchain_classic.document_loaders import PyPDFLoader

nvidia_q3_2024 = './data/NVIDIA-3Q-2024.pdf'

pdf_loader = PyPDFLoader(nvidia_q3_2024)

pdf_pages = pdf_loader.load_and_split()

vector_store = Chroma(collection_name='pdf_docs', embedding_function=ollama_embedding, persist_directory=chroma_db_dir)

vector_store.add_documents(pdf_pages)

template = """
Given the following context, answer the question based only on the provided context.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)

retriever = vector_store.as_retriever()

qa_chain = RetrievalQA.from_chain_type(llm=ollama_chat_llm, retriever=retriever, chain_type_kwargs={'prompt': prompt})

result = qa_chain.invoke({'query': 'what was the revenue in q1 2025'})
print(result)

print('-------------------------')

result = qa_chain.invoke({'query': 'what were the expenses in q4 2024'})
print(result)

Executing the above Python code snippet generates the following typical output:


Output.6

{'query': 'what was the revenue in q1 2025', 'result': 'The first-quarter (Q1) fiscal year 2025 revenue for NVIDIA Corporation was $26.0 billion.'}
-------------------------
{'query': 'what were the expenses in q4 2024', 'result': "The operating expenses for Q4 FY24 (fourth quarter of fiscal year 2024) are not explicitly stated within the provided context. However, it is mentioned that there was an increase from $2,210 million to $3,497 million between Q1 and Q4 in the same fiscal period.\n\nTo find out the exact expenses for Q4 FY24 specifically, one would need access to NVIDIA's full financial statements or notes accompanying those reports where detailed breakdowns of operating expenses are typically provided. Since this information is not included within your given context, I cannot provide you with a precise figure for Q4 2024 expenses from that specific data set alone."}

We have successfully demonstrated the Python code snippet for Recipe 6.


Recipe 7 : Basic usage of the ReAct Framework


The LangChain ReAct Framework is a technique that enables LangChain virtual agents to interact with LLMs through prompting, which mimics the reasoning (Re) and acting (Act) behavior of a human to solve problems in an environment, using various external tools. In other words, the ReAct framework enables LLMs to reason and act based on the situation in the environment.

For this recipe, we will demonstrate a virtual agent that mimics the behavior of a basic sysadmin !!!

The following code snippet creates a custom tool for executing shell commands, creates a system prompt, creates a ReAct agent (which enables the multi-step reasoning process), and invokes the virtual agent to get specific answers to the questions from the LLM model:


import subprocess
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver
from langchain.tools import tool

@tool
def execute_shell_command(command: str) -> str:
  """Tool to execute shell commands
  Args:
    command (str): Shell command to execute
  Returns:
    str: Output of the command execution
  """
  print(f'Executing shell command: {command}')

  try:
    output = subprocess.run(command, shell=True, check=True, text=True, capture_output=True)
    if output.returncode != 0:
      return f'Error executing shell command - {command}'
    return output.stdout
  except subprocess.CalledProcessError as e:
    print(e)
    return f'Error executing shell command - {command}: {e}'

tools = [execute_shell_command]

prompt = '''
You are a helpful assistant. Be concise and accurate.
You have access to the following tools:
{tools}
Use these tools to answer the user's question.
Only respond with the output of the command execution.
'''

config = {'configurable': {'thread_id': 'thr-1'}}

agent = create_agent(ollama_tools_llm,
                     tools,
                     system_prompt=prompt,
                     checkpointer=InMemorySaver())

question_1 = 'Can you list all the network interfaces on this system?'
question_2 = 'What is the primary Ethernet interface on this system?'

result = agent.invoke(
  {'messages': [
    {
      'role': 'user',
      'content': question_1
    }
  ]},
  config=config
)
result['messages'][-1].pretty_print()

print('-------------------------')

result = agent.invoke(
  {'messages': [
    {
      'role': 'user',
      'content': question_2
    }
  ]},
  config=config
)
result['messages'][-1].pretty_print()

Executing the above Python code snippet generates the following trimmed output:


Output.7

Executing shell command: ip addr show
================================== Ai Message ==================================

Here are the network interfaces on this system:

1. **lo** (Loopback)
   - Description: Localhost interface for internal communication within the system.
   - Status: UP
   - IP Addresses:
     - IPv4: 127.0.0.1/8
     - IPv6: ::1/128

2. **enp42s0**
   - Description: Ethernet interface (likely a physical network port).
   - Status: UP
   - MTU: 1500 bytes
   - IP Addresses:
     - IPv4: 192.168.1.25/24
     - IPv6: fe80::a85e:4dca:5db3:2919/64 (link-local)
   - MAC Address: 04:7c:16:17:0e:5a

3. **wlo1**
   - Description: Wi-Fi interface.
   - Status: DOWN
   - MTU: 1500 bytes
   - MAC Address: 50:c2:e8:ee:d1:73 (and aliased as `wlp41s0`)

4. **br-b83cfae028ea**
   - Description: Bridge interface.
   - Status: DOWN
   - MTU: 1500 bytes
   - IP Address: 172.19.0.1/16
   - MAC Address: b2:90:02:d7:1b:a6

5. **docker0**
   - Description: Docker bridge interface for container networking.
   - Status: UP
   - MTU: 1500 bytes
   - IP Addresses:
     - IPv4: 172.17.0.1/16
     - IPv6: fe80::d42c:83ff:fe59:1cca/64 (link-local)
   - MAC Address: d6:2c:83:59:1c:ca

6. **veth7ea82ba**
   - Description: Virtual Ethernet interface pair.
   - Status: UP
   - MTU: 1500 bytes
   - IP Addresses:
     - IPv6: fe80::dc34:1aff:fed9:b335/64 (link-local)
   - MAC Address: de:34:1a:d9:b3:35

These interfaces provide connectivity for various purposes, including local communication, network access via Ethernet or Wi-Fi, container networking with Docker, and bridging between different network segments.
-------------------------
================================== Ai Message ==================================

The primary Ethernet interface on this system is **enp42s0**. It has a status of UP, an IP address of 192.168.1.25/24, and a MAC address of 04:7c:16:17:0e:5a. This interface is likely the main network connection for accessing external networks or other devices on the local network.

We have successfully demonstrated the Python code snippet for Recipe 7.


Recipe 8 : Q&A on a CSV file content using the ReAct Framework


For this recipe, we will reuse the CSV file called gpu_specs.csv that contains the GPU card specs for a handful of popular GPU cards. The following lists the contents of the CSV file:


manufacturer,productName,memorySize,memoryClock,gpuChip
NVIDIA,GeForce RTX 4060,12,5888,AD104
NVIDIA,GeForce RTX 4070,12,7680,AD104
NVIDIA,GeForce RTX 4080,16,9728,AD103
NVIDIA,GeForce RTX 4090,24,17408,AD102
AMD,Radeon RX 6950 XT,16,5120,Navi 21
AMD,Radeon RX 7700 XT,8,4096,Navi 33
AMD,Radeon RX 7800 XT,12,8192,Navi 32
AMD,Radeon RX 7900 XT,16,12288,Navi 31
NVIDIA,GeForce RTX 3060 Ti,8,4864,GA103S
NVIDIA,GeForce RTX 3070 Ti,8,5632,GA104
NVIDIA,GeForce RTX 3080,12,8960,GA102
NVIDIA,GeForce RTX 3090 Ti,24,10752,GA102
AMD,Radeon RX 6400,4,768,Navi 24
AMD,Radeon RX 6500 XT,4,1024,Navi 24
AMD,Radeon RX 6650 XT,8,2048,Navi 23
AMD,Radeon RX 6750 XT,12,2560,Navi 22
AMD,Radeon RX 6850M XT,12,2560,Navi 22
Intel,Arc A770,16,4096,DG2-512
Intel,Arc A780,16,4096,DG2-512

The following code snippet creates a CSV file loader, loads the rows from the file as documents into an in-memory vector store using the embedding class, creates a custom tool for retrieving documents relevant to the user query from the vector store, creates a prompt template, creates an agent, and invokes the virtual agent to get specific answers to the questions from the LLM model:


from langchain.agents import create_agent
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_classic.document_loaders import CSVLoader
from langchain_classic.chains import RetrievalQA
from langchain.tools import tool

loader = CSVLoader(gpu_dataset, encoding='utf-8')

encoding = '\ufeff'
newline = '\n'
comma_space = ', '

ollama_embedding: Embeddings = OllamaEmbeddings(base_url=ollama_base_url, model=ollama_tools_model[ollama_tools_model.index(':')+1:])

documents = [Document(metadata=doc.metadata, page_content=doc.page_content.lstrip(encoding)
                      .replace(newline, comma_space)) for doc in loader.load()]

vector_store = InMemoryVectorStore(ollama_embedding)

vector_store.add_documents(documents)

@tool(response_format="content_and_artifact")
def retrieve_context(query: str) -> tuple[str, list[Document]]:
  """Tool to retrieve information to help answer a query.
  Args:
    query (str): User query to answer
  Returns:
    tuple: Output of the retrieved documents with the documents metadata
  """
  n_top = 2
  retrieved_docs = vector_store.similarity_search(query, k=n_top)
  serialized = "\n\n".join(
    (f'Source: {doc.metadata}\nContent: {doc.page_content}')
    for doc in retrieved_docs
  )
  return serialized, retrieved_docs

tools = [retrieve_context]

prompt = '''
You are a helpful assistant. Be concise and accurate.
You have access to the following tools:
{tools}
Use these tools to retrieve the context and answer the user's question.
Only respond with the output of the command execution.
'''

agent = create_agent(ollama_tools_llm,
                     tools,
                     system_prompt=prompt)

query = 'memory size of RTX 3070 Ti'

result = agent.invoke(
  {'messages': [
    {
      'role': 'user',
      'content': query
    }
  ]}
)
result['messages'][-1].pretty_print()

Executing the above Python code snippet generates the following typical output:


Output.8

================================== Ai Message ==================================

The memory size of the RTX 3070 Ti is not listed in the provided data. The closest match for an AMD Radeon RX card shows a memory size of 16 GB, while the NVIDIA GeForce RTX 3090 Ti has 24 GB. However, there's no specific entry for the RTX 3070 Ti in the dataset.

We have successfully demonstrated the Python code snippet for Recipe 8.


Recipe 9 : Image Processing using a Multimodal Model


A multimodal model is an AI model that can process multiple types of data, such as text, images, and audio.

For this recipe, we will demonstrate the Optical Character Recognition (OCR) capabilities of the multimodal model by processing the image of a receipt of transactions !!!

The following code snippet defines a function to convert a JPG image to a base64 string, defines a function to create a chat prompt, creates a chain using the chat prompt function and the multimodal chat instance, and finally invokes the chain, passing in the prompt text and the image to process:


import base64
from io import BytesIO
from langchain_core.messages import HumanMessage
from PIL import Image

def jpg_to_base64(image):
  jpg_buffer = BytesIO()
  pil_image = Image.open(image)
  pil_image.save(jpg_buffer, format='JPEG')
  return base64.b64encode(jpg_buffer.getvalue()).decode('utf-8')

def create_chat_prompt(data):
  text = data['text']
  image = data['image']

  image_part = {
    'type': 'image_url',
    'image_url': f'data:image/jpeg;base64,{image}',
  }

  content_parts = []

  text_part = {'type': 'text', 'text': text}

  content_parts.append(image_part)
  content_parts.append(text_part)

  return [HumanMessage(content=content_parts)]

chain = create_chat_prompt | ollama_image_llm

result = chain.invoke({'text': 'Itemize all the transactions from the receipt image in detail', 'image': jpg_to_base64(receipt_image)})
print(result.content)

Executing the above Python code snippet generates the following trimmed output:


Output.9

Okay, here's a detailed breakdown of all the transactions from the receipt image:

**REY SKYWALKER #9876: Total Transactions**

*   **Feb 15**
    *   Wegmans #93PRINCTONNJ - $17.79
    *   Trader Joe's #607PRINCTONNJ - $2.69
    *   Shoprite LAWRNCVILLE S1LAWRENCVILLE - $30.16
    *   Wegmans #93PRINCTONNJ - $19.35
    *   Halo Farm LAWRENCVILLE - $13.96
*   **Feb 17**
    *   TJ MAX #0224LAWRENCVILLE - $21.31
    *   Patel Brothers East Windsorn - $77.75
    *   TJ MAX #828EAST WINDSORN - $90.58
    *   Wholefoods PRN 1018PRINCTONNJ - $6.48
*   **Feb 19**
    *   Amazon MKTPL*N606Z9FA3MZN.COM/BILLWA - $29.99
    *   Amazon MKTPL*L89WB21L3MZN.COM/BILLWA - $74.63

**Total: $258.76**

Let me know if you'd like me to perform any calculations or have any other questions about the receipt!

We have successfully demonstrated the Python code snippet for Recipe 9.


Recipe 10 : SQL Database Query Agent


For this recipe, we will leverage the Auto MPG dataset saved as a SQLite database, which can be downloaded from HERE !!!

The following code snippet creates an instance of the SQLite database wrapper using the provided dataset, creates a custom tool for executing SQL queries, creates a prompt template, creates an agent, and invokes the virtual agent to get answers to questions by executing SQL queries through the LLM model:


from dataclasses import dataclass
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver
from langchain_community.utilities import SQLDatabase
from langgraph.runtime import get_runtime
from langchain.tools import tool

mpg_db = SQLDatabase.from_uri(mpg_sqlite_url)

@dataclass
class RuntimeDbContext:
  db: SQLDatabase

@tool
def execute_sql(query: str) -> str:
  """
  Execute SQL query and return results.
  Args:
    query (str): SQL query to execute
  Returns:
    Query results as a string
  """
  runtime = get_runtime(RuntimeDbContext)
  db = runtime.context.db
  try:
    return db.run(query)
  except Exception as e:
    return f'Error: {str(e)}'

db_prompt = '''
You are a helpful SQLite assistant. You are able to answer questions about the Auto MPG database.
Rules:
- Think step-by-step
- When you need data, call the tool execute_sql with ONLY one SQL query as input
- DO NOT use INSERT, UPDATE, DELETE, or DROP statements
- Limit up to 5 rows of output
- If the tool returns an 'Error:', revise the SQL query and try again
'''

agent = create_agent(
  ollama_tools_llm,
  tools=[execute_sql],
  system_prompt=db_prompt,
  context_schema=RuntimeDbContext,
  checkpointer=InMemorySaver()
)

config = {'configurable': {'thread_id': 'thr-1'}}

question = 'what are the names of all the tables and their columns in the Auto MPG database?'

for step in agent.stream(
  {'messages': question},
  context=RuntimeDbContext(db=mpg_db),
  config=config,
  stream_mode='values',
):
  step['messages'][-1].pretty_print()

Executing the above Python code snippet generates the following typical output:


Output.10

================================ Human Message =================================

what are the names of all the tables and their columns in the Auto MPG database?
================================== Ai Message ==================================
Tool Calls:
  execute_sql (20dfbb60-fa8c-4a2c-bb81-55bc7ded1980)
 Call ID: 20dfbb60-fa8c-4a2c-bb81-55bc7ded1980
  Args:
    query: SELECT name FROM sqlite_master WHERE type='table';
================================= Tool Message =================================
Name: execute_sql

[('Origins',), ('Vehicles',)]
================================== Ai Message ==================================
Tool Calls:
  execute_sql (84985976-936b-4395-9219-e07fd717a859)
 Call ID: 84985976-936b-4395-9219-e07fd717a859
  Args:
    query: PRAGMA table_info(Origins);
================================= Tool Message =================================
Name: execute_sql

[(0, 'origin_id', 'BIGINT', 0, None, 1), (1, 'origin', 'TEXT', 0, None, 0)]
================================== Ai Message ==================================
Tool Calls:
  execute_sql (e85ed25c-18e3-49e6-971c-9500703e7663)
 Call ID: e85ed25c-18e3-49e6-971c-9500703e7663
  Args:
    query: PRAGMA table_info(Vehicles);
================================= Tool Message =================================
Name: execute_sql

[(0, 'vehicle_id', 'BIGINT', 0, None, 1), (1, 'mpg', 'FLOAT', 0, None, 0), (2, 'cylinders', 'BIGINT', 0, None, 0), (3, 'displacement', 'FLOAT', 0, None, 0), (4, 'horsepower', 'FLOAT', 0, None, 0), (5, 'weight', 'FLOAT', 0, None, 0), (6, 'acceleration', 'FLOAT', 0, None, 0), (7, 'model_year', 'BIGINT', 0, None, 0), (8, 'origin', 'BIGINT', 0, None, 0), (9, 'car_name', 'TEXT', 0, None, 0)]
================================== Ai Message ==================================

The Auto MPG database contains two tables:

1. **Origins**
   - Columns:
     - `origin_id` (INTEGER)
     - `origin` (TEXT)

2. **Vehicles**  
   - Columns:
     - `vehicle_id` (INTEGER)
     - `mpg` (REAL)
     - `cylinders` (INTEGER)
     - `displacement` (REAL)
     - `horsepower` (REAL)
     - `weight` (REAL)
     - `acceleration` (REAL)
     - `model_year` (INTEGER)
     - `origin` (INTEGER)
     - `car_name` (TEXT)
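
Since the agent was created with a checkpointer and a thread_id, the conversation can be continued on the same thread with a follow-up question. The following is a minimal sketch reusing the same invocation pattern (the SQL generated by the model will vary):


# Follow-up question on the same conversation thread 'thr-1'
question = 'which 5 cars have the best mpg?'

for step in agent.stream(
  {'messages': question},
  context=RuntimeDbContext(db=mpg_db),
  config=config,
  stream_mode='values',
):
  step['messages'][-1].pretty_print()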

We have successfully demonstrated the Python code snippet for Recipe 10.


Recipe 11 : Human in the Loop Agent


For this recipe, we will leverage the Auto MPG dataset saved as a SQLite database, which can be downloaded from HERE !!!

The following code snippet creates a fictitious custom tool for updating a database, creates a prompt template, creates an agent with the human-in-the-loop middleware, and invokes the virtual agent, which then waits for approval from a human before proceeding:


from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command
from langchain.tools import tool

@tool
def update_database(query: str) -> str:
  '''
  Tool to perform database update operations
  :param query: update query
  :return: result of the update operation
  '''
  print(f'Updating database using {query} ...')
  return 'success'

db_prompt = '''
You are a helpful database assistant. You are able to perform various database update operations.
'''

agent = create_agent(
  ollama_tools_llm,
  tools=[update_database],
  system_prompt=db_prompt,
  middleware=[HumanInTheLoopMiddleware(
    interrupt_on={
      "update_database": {'allowed_decisions': ['approve', 'reject']}
    },
    description_prefix='Tool execution pending approval'
  )],
  checkpointer=InMemorySaver()
)

config = {'configurable': {'thread_id': 'thr-1'}}

for step in agent.stream(
  {'messages': [
    {
      'role': 'user',
      'content': 'update the database with the following query: DROP TABLE Vehicles;'
    }
  ]},
  config=config,
  stream_mode='values',
):
  if step.get('__interrupt__'):
    print(step['__interrupt__'][-1])

print('-------------------------')

for step in agent.stream(
  Command(
    resume={'decisions': [{'type': 'reject', 'message': 'Sorry, not approving this update!'}]}
  ),
  config=config,
  stream_mode='values',
):
  if step.get('messages'):
    print(step['messages'][-1])

Executing the above Python code snippet generates the following typical output:


Output.11

Interrupt(value={'action_requests': [{'name': 'update_database', 'args': {'query': 'DROP TABLE Vehicles;'}, 'description': "Tool execution pending approval\n\nTool: update_database\nArgs: {'query': 'DROP TABLE Vehicles;'}"}], 'review_configs': [{'action_name': 'update_database', 'allowed_decisions': ['approve', 'reject']}]}, id='90e6023ab469209ff1a7d388880ac98b')
-------------------------
content='' additional_kwargs={} response_metadata={'model': 'granite4:micro', 'created_at': '2025-11-11T21:00:52.627077781Z', 'done': True, 'done_reason': 'stop', 'total_duration': 394903755, 'load_duration': 101198792, 'prompt_eval_count': 209, 'prompt_eval_duration': 35285144, 'eval_count': 23, 'eval_duration': 240206013, 'model_name': 'granite4:micro', 'model_provider': 'ollama'} id='lc_run--cf371675-1181-47a9-aadd-c6f27df6b0fb-0' tool_calls=[{'name': 'update_database', 'args': {'query': 'DROP TABLE Vehicles;'}, 'id': 'f72d2c07-5ed4-4b20-86ce-a1e8844588d5', 'type': 'tool_call'}] usage_metadata={'input_tokens': 209, 'output_tokens': 23, 'total_tokens': 232}
content='Sorry, not approving this update!' name='update_database' id='30f3f4a8-5e4d-4d25-8429-67a98ae07809' tool_call_id='f72d2c07-5ed4-4b20-86ce-a1e8844588d5' status='error'
content='The database update operation was not approved. Please check the permissions or try again later.' additional_kwargs={} response_metadata={'model': 'granite4:micro', 'created_at': '2025-11-11T21:01:07.638316348Z', 'done': True, 'done_reason': 'stop', 'total_duration': 336427979, 'load_duration': 93138311, 'prompt_eval_count': 251, 'prompt_eval_duration': 17295260, 'eval_count': 18, 'eval_duration': 206266072, 'model_name': 'granite4:micro', 'model_provider': 'ollama'} id='lc_run--6bb9e3d3-4452-48b9-8547-bb75a5a97fa7-0' usage_metadata={'input_tokens': 251, 'output_tokens': 18, 'total_tokens': 269}
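
To approve the pending tool call instead of rejecting it, one would resume the interrupted agent with an 'approve' decision. The following is a minimal sketch based on the same decisions format used above:


# Resume the interrupted agent, this time approving the pending tool call
for step in agent.stream(
  Command(
    resume={'decisions': [{'type': 'approve'}]}
  ),
  config=config,
  stream_mode='values',
):
  if step.get('messages'):
    print(step['messages'][-1])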

We have successfully demonstrated the Python code snippet for Recipe 11.


References

Quick Primer on LangChain

LangChain Documentation



© PolarSPARC