PolarSPARC

Quick Primer on Model Context Protocol (MCP)


Bhaskar S *UPDATED* 08/16/2025


Overview

When LLMs first made their appearance, the Enterprise apps built using LLMs were restricted to the knowledge the LLMs were trained on. These apps were useful for a set of tasks such as text generation, text sentiment analysis, text summarization, etc.

The next evolution for LLM apps was the integration with Enterprise data assets via Vector Stores for contextual knowledge retrieval using Retrieval Augmented Generation (RAG).

As agentic frameworks like LangChain came along with support for tools integration for automating manual tasks, the LLM apps evolved to drive automation in the Enterprise environment.

The challenge, however, was that there was no industry standard for tools integration; every framework had its own approach.

Enter the Model Context Protocol (or MCP), which has changed the landscape for tools integration.

Think of MCP as an industry standard layer on top of the other Enterprise services that allows any agentic framework (such as LangChain, LlamaIndex, etc.) to consistently integrate with the Enterprise tools.

In other words, MCP is an open protocol that enables seamless integration between the LLM apps and the external data sources (databases, files, etc.) and tools (GitHub, ServiceNow, etc.).

The MCP specification consists of the following core components:

MCP Host - the LLM app (for example, an agentic app, an IDE, or a chat interface) that coordinates one or more MCP Clients

MCP Client - the component within the Host that maintains a one-to-one connection with an MCP Server

MCP Server - a lightweight program that exposes capabilities (tools, resources, and prompts) through the protocol


Installation and Setup

The installation and setup will be on a Ubuntu 24.04 LTS based Linux desktop. Ensure that the Python 3.x programming language is installed and set up on the desktop.

In addition, ensure that Ollama is installed and set up on the Linux desktop (refer to for instructions).

Assuming that the IP address of the Linux desktop is 192.168.1.25, start the Ollama platform by executing the following command in the terminal window:


$ docker run --rm --name ollama -p 192.168.1.25:11434:11434 -v $HOME/.ollama:/root/.ollama ollama/ollama:0.11.4


For the LLM model, we will be using the recently released IBM Granite-3.3 2B model.

Open a new terminal window and execute the following docker command to download the LLM model:


$ docker exec -it ollama ollama run granite3.3:2b


To install the necessary Python modules for this primer, execute the following command:


$ pip install authlib python-dotenv fastapi httpx langchain langchain-core langchain-ollama langgraph langchain-mcp-adapters mcp


This completes all the installation and setup for the MCP hands-on demonstrations using Python.


Hands-on with MCP (using Python)


In the following sections, we will get our hands dirty with MCP using Ollama and LangChain. So, without further ado, let us get started !!!

Create a file called .env with the following environment variables defined:


.env
LLM_TEMPERATURE=0.2
OLLAMA_MODEL='granite3.3:2b'
OLLAMA_BASE_URL='http://192.168.1.25:11434'
PY_PROJECT_DIR='/projects/python/MCP/'
MCP_BASE_URL='http://192.168.1.25:8000/mcp'
JWT_SECRET_KEY='0xB0R3D2CR3AT3AV3RYG00DS3CUR3K3Y'
JWT_ALGORITHM='HS256'
JWT_EXPIRATION_SECS=60
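The find_dotenv/load_dotenv helpers from the python-dotenv package locate this file and export its entries into os.environ. Purely for illustration, a rough stdlib-only sketch of what that loading does (simplified; the real parser also handles comments, export prefixes, and multiline values) might look like:

```python
import os

def load_env_file(path: str = '.env') -> None:
    """Minimal stand-in for python-dotenv's load_dotenv (illustration only)."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if not line or line.startswith('#'):
                continue
            key, _, value = line.partition('=')
            # Drop surrounding quotes, as in OLLAMA_MODEL='granite3.3:2b'
            os.environ[key.strip()] = value.strip().strip('\'"')
```

In the actual code that follows, we rely on python-dotenv rather than this sketch.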

The following is a simple LangChain based ReACT app:


interest_client.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

import asyncio
import logging
import os

from dotenv import load_dotenv, find_dotenv
from langchain_core.tools import tool
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('interest_client')

load_dotenv(find_dotenv())

llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')

ollama_chat_llm = ChatOllama(base_url=ollama_base_url,
                             model=ollama_model,
                             temperature=llm_temperature)

@tool
def dummy():
    """This is a dummy tool"""
    return None

async def main():
  tools = [dummy]

  # Initialize a ReACT agent
  agent = create_react_agent(ollama_chat_llm, tools)

  # Case - 1 : Simple interest definition
  agent_response_1 = await agent.ainvoke(
    {'messages': 'what is the simple interest ?'})
  logger.info(agent_response_1['messages'][::-1])

  # Case - 2 : Simple interest calculation
  agent_response_2 = await agent.ainvoke(
    {'messages': 'compute the simple interest for a principal of 1000 at rate 3.75 ?'})
  logger.info(agent_response_2['messages'][::-1])

  # Case - 3 : Compound interest calculation
  agent_response_3 = await agent.ainvoke(
    {'messages': 'compute the compound interest for a principal of 1000 at rate 4.25 ?'})
  logger.info(agent_response_3['messages'][::-1])

if __name__ == '__main__':
  asyncio.run(main())

To execute the above Python code, execute the following command in a terminal window:


$ python interest_client.py


The following would be the typical output:


Output.1

INFO 2025-08-16 19:20:21,603 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:20:21,605 - [AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:20:21.602914444Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1457132498, 'load_duration': 1253303039, 'prompt_eval_count': 152, 'prompt_eval_duration': 123422099, 'eval_count': 7, 'eval_duration': 79538283, 'model_name': 'granite3.3:2b'}, id='run--4aeb6ca9-4428-4ca9-ae63-f1791813136d-0', usage_metadata={'input_tokens': 152, 'output_tokens': 7, 'total_tokens': 159}), HumanMessage(content='what is the simple interest ?', additional_kwargs={}, response_metadata={}, id='1535ce50-1f86-4447-92a2-baa59086439d')]
INFO 2025-08-16 19:20:21,673 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:20:21,674 - [AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:20:21.67256802Z', 'done': True, 'done_reason': 'stop', 'total_duration': 65829682, 'load_duration': 11137243, 'prompt_eval_count': 167, 'prompt_eval_duration': 18703442, 'eval_count': 3, 'eval_duration': 35560119, 'model_name': 'granite3.3:2b'}, id='run--3dbe5edf-0cb0-4f97-827b-5510ad82bc39-0', usage_metadata={'input_tokens': 167, 'output_tokens': 3, 'total_tokens': 170}), HumanMessage(content='compute the simple interest for a principal of 1000 at rate 3.75 ?', additional_kwargs={}, response_metadata={}, id='ae20e0dc-deaf-4c07-b857-85174440f526')]
INFO 2025-08-16 19:20:21,725 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:20:21,726 - [AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:20:21.725018008Z', 'done': True, 'done_reason': 'stop', 'total_duration': 49124052, 'load_duration': 8228915, 'prompt_eval_count': 167, 'prompt_eval_duration': 7458129, 'eval_count': 3, 'eval_duration': 32951641, 'model_name': 'granite3.3:2b'}, id='run--b6d78f51-61a6-4a4f-b61a-62bd793685d9-0', usage_metadata={'input_tokens': 167, 'output_tokens': 3, 'total_tokens': 170}), HumanMessage(content='compute the compound interest for a principal of 1000 at rate 4.25 ?', additional_kwargs={}, response_metadata={}, id='aaa1cc18-8a25-4bec-b75a-2e46f901b42a')]

It is evident from the above Output.1 that the LLM app was not able to compute either the simple interest or the compound interest.

Let us now build our first MCP Server for computing both the simple interest (for a year) and the compound interest (for a year).

The following is our first MCP Server code in Python:


interest_mcp_server.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from mcp.server.fastmcp import FastMCP

import logging

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('interest_mcp_server')

mcp = FastMCP('InterestCalculator')

@mcp.tool()
def yearly_simple_interest(principal: float, rate:float) -> float:
  """Tool to compute simple interest rate for a year."""
  logger.info(f'Simple interest -> Principal: {principal}, Rate: {rate}')
  return principal * rate / 100.00

@mcp.tool()
def yearly_compound_interest(principal: float, rate:float) -> float:
  """Tool to compute compound interest rate for a year."""
  logger.info(f'Compound interest -> Principal: {principal}, Rate: {rate}')
  return principal * (1 + rate / 100.0)

if __name__ == '__main__':
  logger.info(f'Starting the interest MCP server...')
  mcp.run(transport='stdio')
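Before wiring this server into an agent, the tool math can be sanity-checked in isolation. Note that, as written, yearly_compound_interest returns the total amount after one year (principal plus interest), not just the interest; the functions below simply mirror the two tool bodies:

```python
def yearly_simple_interest(principal: float, rate: float) -> float:
    # Mirrors the MCP tool body: interest earned in one year
    return principal * rate / 100.0

def yearly_compound_interest(principal: float, rate: float) -> float:
    # Mirrors the MCP tool body: note this is the total amount after one year
    return principal * (1 + rate / 100.0)

print(yearly_simple_interest(1000, 3.75))    # → 37.5
print(yearly_compound_interest(1000, 4.25))  # → 1042.5
```

These are exactly the values the agent reports later in Output.2.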

There are two types of transports supported by the MCP specification, which are as follows:

stdio - the MCP Client launches the MCP Server as a local subprocess and communicates with it over standard input/output

streamable-http - the MCP Server runs as an independent HTTP service that the MCP Client connects to over the network

In our example, we choose the stdio transport.
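Both transports are driven by the same FastMCP server code; only the construction and run call differ, as this side-by-side fragment (drawn from the two server listings in this primer) shows:

```python
# stdio: the client launches this server as a subprocess
mcp = FastMCP('InterestCalculator')
mcp.run(transport='stdio')

# streamable-http: this server listens as a network service
mcp = FastMCP('InterestCalculator-2', host='192.168.1.25', port=8000)
mcp.run(transport='streamable-http')
```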

The next step is to build an MCP Host (LLM app) that uses an MCP Client to access the above MCP Server for computing both the simple interest and the compound interest.

The following is our first MCP Host LLM app code in Python:


interest_mcp_client.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from dotenv import load_dotenv, find_dotenv
from langchain_ollama import ChatOllama
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

import asyncio
import logging
import os

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('interest_mcp_client')

load_dotenv(find_dotenv())

home_dir = os.getenv('HOME')
llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')
py_project_dir = os.getenv('PY_PROJECT_DIR')

server_params = StdioServerParameters(
  command='python',
  # Full absolute path to mcp server
  args=[home_dir + py_project_dir + 'interest_mcp_server.py'],
)

ollama_chat_llm = ChatOllama(base_url=ollama_base_url,
                             model=ollama_model,
                             temperature=llm_temperature)

async def main():
  # Will launch the MCP server and communicate via stdio
  async with stdio_client(server_params) as (read, write):
    # Create a MCP client session
    async with ClientSession(read, write) as session:
      # Connect to the MCP stdio server
      await session.initialize()

      # List all registered MCP tools
      tools = await load_mcp_tools(session)
      logger.info(f'List of MCP Tools:')
      for tool in tools:
        logger.info(f'\tMCP Tool: {tool.name}')

      # Initialize a ReACT agent
      agent = create_react_agent(ollama_chat_llm, tools)

      # Case - 1 : Simple interest definition
      agent_response_1 = await agent.ainvoke(
        {'messages': 'explain the definition of simple interest ?'})
      logger.info(agent_response_1['messages'][::-1])

      # Case - 2 : Simple interest calculation
      agent_response_2 = await agent.ainvoke(
        {'messages': 'compute the simple interest for a principal of 1000 at rate 3.75 ?'})
      logger.info(agent_response_2['messages'][::-1])

      # Case - 3 : Compound interest calculation
      agent_response_3 = await agent.ainvoke(
        {'messages': 'compute the compound interest for a principal of 1000 at rate 4.25 ?'})
      logger.info(agent_response_3['messages'][::-1])

if __name__ == '__main__':
  asyncio.run(main())

To execute the above Python code, execute the following command in a terminal window:


$ python interest_mcp_client.py


The following would be the typical output:


Output.2

INFO 2025-08-16 19:30:04,594 - Starting the interest MCP server...
INFO 2025-08-16 19:30:04,600 - Processing request of type ListToolsRequest
INFO 2025-08-16 19:30:04,601 - List of MCP Tools:
INFO 2025-08-16 19:30:04,601 - 	MCP Tool: yearly_simple_interest
INFO 2025-08-16 19:30:04,601 - 	MCP Tool: yearly_compound_interest
INFO 2025-08-16 19:30:06,078 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:30:06,080 - [AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:30:06.078158752Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1468597747, 'load_duration': 1258247094, 'prompt_eval_count': 247, 'prompt_eval_duration': 149575775, 'eval_count': 3, 'eval_duration': 60005999, 'model_name': 'granite3.3:2b'}, id='run--dfa7b142-9d96-4445-aa2d-2ae4e5f37265-0', usage_metadata={'input_tokens': 247, 'output_tokens': 3, 'total_tokens': 250}), HumanMessage(content='explain the definition of simple interest ?', additional_kwargs={}, response_metadata={}, id='30b83317-6474-4b1f-9806-fb9a07b38f3d')]
INFO 2025-08-16 19:30:06,412 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:30:06,425 - Processing request of type CallToolRequest
INFO 2025-08-16 19:30:06,425 - Simple interest -> Principal: 1000.0, Rate: 3.75
INFO 2025-08-16 19:30:06,480 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:30:06,751 - [AIMessage(content='The simple interest for a principal of 1000 at rate 3.75% per annum is 37.5.', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:30:06.75021703Z', 'done': True, 'done_reason': 'stop', 'total_duration': 320796391, 'load_duration': 12294680, 'prompt_eval_count': 307, 'prompt_eval_duration': 21977964, 'eval_count': 31, 'eval_duration': 283575064, 'model_name': 'granite3.3:2b'}, id='run--30bb979e-ce23-48bf-ba9e-3350a8b73565-0', usage_metadata={'input_tokens': 307, 'output_tokens': 31, 'total_tokens': 338}), ToolMessage(content='37.5', name='yearly_simple_interest', id='bef3a7d5-c225-4867-b299-06ba8b242bd6', tool_call_id='8e64a163-07fc-4e4a-928e-9276fde276d0'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:30:06.421882999Z', 'done': True, 'done_reason': 'stop', 'total_duration': 340008665, 'load_duration': 13866878, 'prompt_eval_count': 261, 'prompt_eval_duration': 18568800, 'eval_count': 34, 'eval_duration': 307096796, 'model_name': 'granite3.3:2b'}, id='run--a6741564-ca2d-4a50-8f31-824aa0674e9f-0', tool_calls=[{'name': 'yearly_simple_interest', 'args': {'principal': 1000, 'rate': 3.75}, 'id': '8e64a163-07fc-4e4a-928e-9276fde276d0', 'type': 'tool_call'}], usage_metadata={'input_tokens': 261, 'output_tokens': 34, 'total_tokens': 295}), HumanMessage(content='compute the simple interest for a principal of 1000 at rate 3.75 ?', additional_kwargs={}, response_metadata={}, id='c3e14a51-bca4-413b-91fb-625652ef0904')]
INFO 2025-08-16 19:30:07,082 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:30:07,096 - Processing request of type CallToolRequest
INFO 2025-08-16 19:30:07,096 - Compound interest -> Principal: 1000.0, Rate: 4.25
INFO 2025-08-16 19:30:07,139 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 19:30:07,452 - [AIMessage(content='The compound interest for a principal of 1000 at rate 4.25% per annum over one year is $1042.5.', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:30:07.451543833Z', 'done': True, 'done_reason': 'stop', 'total_duration': 350104650, 'load_duration': 11884457, 'prompt_eval_count': 309, 'prompt_eval_duration': 8644166, 'eval_count': 36, 'eval_duration': 325965191, 'model_name': 'granite3.3:2b'}, id='run--3e43847f-7495-4c98-8baa-2fccd8609e50-0', usage_metadata={'input_tokens': 309, 'output_tokens': 36, 'total_tokens': 345}), ToolMessage(content='1042.5', name='yearly_compound_interest', id='2a2a115e-c1cb-4ba1-a362-d669facb8848', tool_call_id='73f1c8ee-26c8-4582-a00f-e70f5c48db22'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-16T23:30:07.092730309Z', 'done': True, 'done_reason': 'stop', 'total_duration': 339616505, 'load_duration': 10269221, 'prompt_eval_count': 261, 'prompt_eval_duration': 21752244, 'eval_count': 34, 'eval_duration': 307149258, 'model_name': 'granite3.3:2b'}, id='run--783ad955-5051-49f9-8180-b506d4a48ce4-0', tool_calls=[{'name': 'yearly_compound_interest', 'args': {'principal': 1000, 'rate': 4.25}, 'id': '73f1c8ee-26c8-4582-a00f-e70f5c48db22', 'type': 'tool_call'}], usage_metadata={'input_tokens': 261, 'output_tokens': 34, 'total_tokens': 295}), HumanMessage(content='compute the compound interest for a principal of 1000 at rate 4.25 ?', additional_kwargs={}, response_metadata={}, id='46734154-fe65-478e-86b5-3c0849c51e0c')]

BINGO - it is evident from the above Output.2 that the LLM app was able not only to define what simple interest is, but also to compute the simple interest and the compound interest using the tools exposed by the MCP Server.

A typical Enterprise LLM agentic app invokes multiple MCP servers to perform a particular task. For our next example, the LLM host app will demonstrate how one can set up and use multiple tools.

The following is our second MCP Server code in Python that will invoke shell commands:


shell_mcp_server.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

import subprocess

from mcp.server.fastmcp import FastMCP

import logging

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('shell_mcp_server')

mcp = FastMCP('ShellCommandExecutor')

# DISCLAIMER: This is purely for demonstration purposes and NOT to be used in production environment

@mcp.tool()
def execute_shell_command(command: str) -> str:
  """Tool to execute shell commands"""
  logger.info(f'Executing shell command: {command}')
  try:
    result = subprocess.run(command,
                            shell=True,
                            check=True,
                            text=True,
                            capture_output=True)
    return result.stdout
  except subprocess.CalledProcessError as e:
    # With check=True, a non-zero exit status raises CalledProcessError
    logger.error(e)
    return f'Error executing shell command - {command}'

if __name__ == '__main__':
  logger.info(f'Starting the shell executor MCP server...')
  mcp.run(transport='stdio')
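The execute_shell_command tool above will happily run any command string the LLM produces, which is why the disclaimer matters. One common hardening approach (our own suggestion, not part of the MCP specification) is to validate the command against an allowlist of permitted executables before invoking it. A sketch of such a guard, which could replace the subprocess call inside the tool (the allowlist contents here are hypothetical):

```python
import shlex
import subprocess

# Hypothetical allowlist of permitted executables (adjust to your environment)
ALLOWED_COMMANDS = {'free', 'df', 'uptime', 'echo'}

def safe_execute(command: str) -> str:
    """Run a shell command only if its executable is on the allowlist."""
    parts = shlex.split(command)
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        return f'Command not permitted - {command}'
    try:
        # shell=False (the default) since the command is already tokenized
        result = subprocess.run(parts, check=True, text=True,
                                capture_output=True)
        return result.stdout
    except (subprocess.CalledProcessError, FileNotFoundError) as e:
        return f'Error executing shell command - {command}: {e}'
```

Tokenizing with shlex.split and avoiding shell=True also prevents the LLM from chaining arbitrary commands with `;` or `&&`.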

The following is our second MCP Host LLM app code in Python using multiple tools:


multi_mcp_client.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from dotenv import load_dotenv, find_dotenv
from langchain_ollama import ChatOllama
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent

import asyncio
import logging
import os

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('multi_mcp_client')

load_dotenv(find_dotenv())

home_dir = os.getenv('HOME')
llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')
py_project_dir = os.getenv('PY_PROJECT_DIR')

config = {
  'interest_calculators': {
    'transport': 'stdio',
    'command': 'python',
    'args': [home_dir + py_project_dir + 'interest_mcp_server.py'],
  },
  'shell_command_executor': {
    'transport': 'stdio',
    'command': 'python',
    'args': [home_dir + py_project_dir + 'shell_mcp_server.py'],
  }
}

ollama_chat_llm = ChatOllama(base_url=ollama_base_url,
                             model=ollama_model,
                             temperature=llm_temperature)

async def main():
  # Initialize MCP client to connect to multiple MCP server(s)
  client = MultiServerMCPClient(config)

  # Get the list of all the registered tools
  tools = await client.get_tools()

  logger.info(f'Loaded Multiple MCP Tools -> {tools}')

  # Initialize a ReACT agent with multiple tools
  agent = create_react_agent(ollama_chat_llm, tools)

  # Case - 1 : Compound interest definition
  agent_response_1 = await agent.ainvoke(
    {'messages': 'explain the definition of compound interest'})
  logger.info(agent_response_1['messages'][::-1])

  # Case - 2 : Compound interest calculation
  agent_response_2 = await agent.ainvoke(
    {'messages': 'what is the compound interest for a principal of 1000 at rate 3.75 ?'})
  logger.info(agent_response_2['messages'][::-1])

  # Case - 3 : Execute a shell command
  agent_response_3 = await agent.ainvoke(
    {'messages': 'Execute the free shell command to find how much system memory'})
  logger.info(agent_response_3['messages'][::-1])

if __name__ == '__main__':
  asyncio.run(main())

To execute the above Python code, execute the following command in a terminal window:


$ python multi_mcp_client.py


The following would be the typical output:


Output.3

INFO 2025-08-16 20:15:50,960 - Starting the shell executor MCP server...
INFO 2025-08-16 20:15:50,962 - Starting the interest MCP server...
INFO 2025-08-16 20:15:50,966 - Processing request of type ListToolsRequest
INFO 2025-08-16 20:15:50,967 - Processing request of type ListToolsRequest
INFO 2025-08-16 20:15:51,026 - Loaded Multiple MCP Tools -> [StructuredTool(name='yearly_simple_interest', description='Tool to compute simple interest rate for a year.', args_schema={'properties': {'principal': {'title': 'Principal', 'type': 'number'}, 'rate': {'title': 'Rate', 'type': 'number'}}, 'required': ['principal', 'rate'], 'title': 'yearly_simple_interestArguments', 'type': 'object'}, response_format='content_and_artifact', coroutine=<function convert_mcp_tool_to_langchain_tool.<locals>.call_tool at 0x772fbbeb23e0>), StructuredTool(name='yearly_compound_interest', description='Tool to compute compound interest rate for a year.', args_schema={'properties': {'principal': {'title': 'Principal', 'type': 'number'}, 'rate': {'title': 'Rate', 'type': 'number'}}, 'required': ['principal', 'rate'], 'title': 'yearly_compound_interestArguments', 'type': 'object'}, response_format='content_and_artifact', coroutine=<function convert_mcp_tool_to_langchain_tool.<locals>.call_tool at 0x772fb849bf60>), StructuredTool(name='execute_shell_command', description='Tool to execute shell commands', args_schema={'properties': {'command': {'title': 'Command', 'type': 'string'}}, 'required': ['command'], 'title': 'execute_shell_commandArguments', 'type': 'object'}, response_format='content_and_artifact', coroutine=<function convert_mcp_tool_to_langchain_tool.<locals>.call_tool at 0x772fb842a3e0>)]
INFO 2025-08-16 20:15:52,557 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:15:52,559 - [AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:15:52.556710556Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1516538055, 'load_duration': 1292170364, 'prompt_eval_count': 293, 'prompt_eval_duration': 153175502, 'eval_count': 3, 'eval_duration': 69927371, 'model_name': 'granite3.3:2b'}, id='run--ad78af28-b589-411f-a616-bf131edb64bd-0', usage_metadata={'input_tokens': 293, 'output_tokens': 3, 'total_tokens': 296}), HumanMessage(content='explain the definition of compound interest', additional_kwargs={}, response_metadata={}, id='030f9ae7-1f32-41c7-8106-241cea65b8a9')]
INFO 2025-08-16 20:15:52,894 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:15:53,242 - Starting the interest MCP server...
INFO 2025-08-16 20:15:53,247 - Processing request of type CallToolRequest
INFO 2025-08-16 20:15:53,247 - Compound interest -> Principal: 1000.0, Rate: 3.75
INFO 2025-08-16 20:15:53,249 - Processing request of type ListToolsRequest
INFO 2025-08-16 20:15:53,338 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:15:53,651 - [AIMessage(content='The compound interest for a principal of 1000 at rate 3.75 per annum over one year is $1037.5.', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:15:53.649863658Z', 'done': True, 'done_reason': 'stop', 'total_duration': 356813613, 'load_duration': 12707879, 'prompt_eval_count': 357, 'prompt_eval_duration': 14963906, 'eval_count': 35, 'eval_duration': 325611202, 'model_name': 'granite3.3:2b'}, id='run--fa46748f-d168-43dc-906d-b060f3d9e827-0', usage_metadata={'input_tokens': 357, 'output_tokens': 35, 'total_tokens': 392}), ToolMessage(content='1037.5', name='yearly_compound_interest', id='c1351145-5695-4356-a44d-cb11765a13df', tool_call_id='b7f7baed-c156-4c40-886e-15facc5d28be'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:15:52.903395557Z', 'done': True, 'done_reason': 'stop', 'total_duration': 341595430, 'load_duration': 15583401, 'prompt_eval_count': 309, 'prompt_eval_duration': 22799791, 'eval_count': 33, 'eval_duration': 302662023, 'model_name': 'granite3.3:2b'}, id='run--22c4ff5c-c646-44df-96dc-6675fee50b76-0', tool_calls=[{'name': 'yearly_compound_interest', 'args': {'principal': 1000, 'rate': 3.75}, 'id': 'b7f7baed-c156-4c40-886e-15facc5d28be', 'type': 'tool_call'}], usage_metadata={'input_tokens': 309, 'output_tokens': 33, 'total_tokens': 342}), HumanMessage(content='what is the compound interest for a principal of 1000 at rate 3.75 ?', additional_kwargs={}, response_metadata={}, id='b9b33197-7b8c-45c6-9c24-0156667342ad')]
INFO 2025-08-16 20:15:53,871 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:15:54,216 - Starting the shell executor MCP server...
INFO 2025-08-16 20:15:54,221 - Processing request of type CallToolRequest
INFO 2025-08-16 20:15:54,222 - Executing shell command: free -h
INFO 2025-08-16 20:15:54,226 - Processing request of type ListToolsRequest
INFO 2025-08-16 20:15:54,324 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:15:55,037 - [AIMessage(content="The system has a total of 62 Gi memory, with 7.9 Gi currently in use and 31 Gi available for processes. Additionally, there's 153Mi shared memory and 24Gi buff/cache used for caching purposes. The swap space is empty at the moment, with 14Gi allocated but no data stored on it.", additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:15:55.036437772Z', 'done': True, 'done_reason': 'stop', 'total_duration': 764703272, 'load_duration': 13539071, 'prompt_eval_count': 389, 'prompt_eval_duration': 15281894, 'eval_count': 80, 'eval_duration': 731655968, 'model_name': 'granite3.3:2b'}, id='run--11d5288d-e1e0-44aa-9bed-221bd811f304-0', usage_metadata={'input_tokens': 389, 'output_tokens': 80, 'total_tokens': 469}), ToolMessage(content='               total        used        free      shared  buff/cache   available\nMem:            62Gi       7.9Gi        31Gi       153Mi        24Gi        54Gi\nSwap:           14Gi          0B        14Gi\n', name='execute_shell_command', id='c587933f-bd49-4340-8774-23c4978aa3b1', tool_call_id='81fb04e9-f23c-4db7-8e7f-f680d9ee2800'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:15:53.881588538Z', 'done': True, 'done_reason': 'stop', 'total_duration': 228487971, 'load_duration': 14330482, 'prompt_eval_count': 298, 'prompt_eval_duration': 6625396, 'eval_count': 23, 'eval_duration': 206990208, 'model_name': 'granite3.3:2b'}, id='run--487db455-b95a-445c-9caf-943e265d0a6e-0', tool_calls=[{'name': 'execute_shell_command', 'args': {'command': 'free -h'}, 'id': '81fb04e9-f23c-4db7-8e7f-f680d9ee2800', 'type': 'tool_call'}], usage_metadata={'input_tokens': 298, 'output_tokens': 23, 'total_tokens': 321}), HumanMessage(content='Execute the free shell command to find how much system memory', additional_kwargs={}, response_metadata={}, id='a698e400-a4e5-4708-9948-fff7615bfa3d')]

BOOM - it is evident from the above Output.3 that the LLM app was able to use the multiple tools exposed by the different MCP servers.

Until now we have used the stdio transport as the mode of communication between the MCP Client and the MCP Server. As indicated earlier, the other transport mode is the streamable-http transport.

The following is our MCP Server code in Python using the streamable-http transport:


interest_mcp_server2.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

import logging

from mcp.server.fastmcp import FastMCP

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('interest_mcp_server2')

mcp = FastMCP('InterestCalculator-2', host='192.168.1.25', port=8000)

@mcp.tool()
def yearly_simple_interest(principal: float, rate:float) -> float:
  """Tool to compute simple interest rate for a year."""
  logger.info(f'Simple interest -> Principal: {principal}, Rate: {rate}')
  return principal * rate / 100.00

@mcp.tool()
def yearly_compound_interest(principal: float, rate:float) -> float:
  """Tool to compute compound interest rate for a year."""
  logger.info(f'Compound interest -> Principal: {principal}, Rate: {rate}')
  return principal * (1 + rate / 100.0)

if __name__ == "__main__":
  logger.info(f'Starting the interest calculator MCP server using HTTP ...')
  mcp.run(transport='streamable-http')

To execute the above Python code, execute the following command in a terminal window:


$ python interest_mcp_server2.py


The following would be the typical output:


Output.4

INFO 2025-08-16 20:23:28,319 - Starting the interest calculator MCP server using HTTP ...
INFO:     Started server process [100394]
INFO:     Waiting for application startup.
INFO 2025-08-16 20:23:28,331 - StreamableHTTP session manager started
INFO:     Application startup complete.
INFO:     Uvicorn running on http://192.168.1.25:8000 (Press CTRL+C to quit)

The following is our MCP Host LLM app code in Python, which invokes multiple tools, one of which is exposed as a network service:


multi_mcp_client2.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from dotenv import load_dotenv, find_dotenv
from langchain_ollama import ChatOllama
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent

import asyncio
import logging
import os

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('multi_mcp_client2')

load_dotenv(find_dotenv())

home_dir = os.getenv('HOME')
llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')
py_project_dir = os.getenv('PY_PROJECT_DIR')
mcp_base_url = os.getenv('MCP_BASE_URL')

config = {
  'interest_calculators': {
    'transport': 'streamable_http',
    'url': mcp_base_url,
  },
  'shell_command_executor': {
    'transport': 'stdio',
    'command': 'python',
    'args': [home_dir + py_project_dir + 'shell_mcp_server.py'],
  }
}

ollama_chat_llm = ChatOllama(base_url=ollama_base_url,
                             model=ollama_model,
                             temperature=llm_temperature)

async def main():
  # Initialize MCP client to connect to multiple MCP server(s)
  client = MultiServerMCPClient(config)

  # Get the list of all the registered tools
  tools = await client.get_tools()

  logger.info(f'Loaded Multiple MCP Tools -> {tools}')

  # Initialize a ReACT agent with multiple tools
  agent = create_react_agent(ollama_chat_llm, tools)

  # Case - 1 : Compound interest definition
  agent_response_1 = await agent.ainvoke(
    {'messages': 'explain the definition of compound interest'})
  logger.info(agent_response_1['messages'][::-1])

  # Case - 2 : Compound interest calculation
  agent_response_2 = await agent.ainvoke(
    {'messages': 'what is the compound interest for a principal of 1000 at rate 3.75 ?'})
  logger.info(agent_response_2['messages'][::-1])

  # Case - 3 : Execute a shell command
  agent_response_3 = await agent.ainvoke(
    {'messages': 'Execute the free shell command to find how much system memory'})
  logger.info(agent_response_3['messages'][::-1])

if __name__ == '__main__':
  asyncio.run(main())
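
The client above reads its settings from a .env file via load_dotenv. A minimal example, with values matching this article's setup, might look like the following (the LLM_TEMPERATURE and PY_PROJECT_DIR values are illustrative placeholders; PY_PROJECT_DIR is appended to $HOME to locate shell_mcp_server.py):

```
LLM_TEMPERATURE=0.0
OLLAMA_MODEL=granite3.3:2b
OLLAMA_BASE_URL=http://192.168.1.25:11434
MCP_BASE_URL=http://192.168.1.25:8000/mcp
PY_PROJECT_DIR=/Projects/MCP/
```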

To execute the above Python code, execute the following command in a terminal window:


$ python multi_mcp_client2.py


The following would be the typical output:


Output.5

INFO 2025-08-16 20:25:14,780 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:14,780 - Received session ID: d38659ff7d0d45ffa0c3499573054c7b
INFO 2025-08-16 20:25:14,781 - Negotiated protocol version: 2025-06-18
INFO 2025-08-16 20:25:14,783 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 202 Accepted"
INFO 2025-08-16 20:25:14,784 - HTTP Request: GET http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:14,784 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:14,786 - HTTP Request: DELETE http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:15,107 - Starting the shell executor MCP server...
INFO 2025-08-16 20:25:15,112 - Processing request of type ListToolsRequest
INFO 2025-08-16 20:25:15,157 - Loaded Multiple MCP Tools -> [StructuredTool(name='yearly_simple_interest', description='Tool to compute simple interest rate for a year.', args_schema={'properties': {'principal': {'title': 'Principal', 'type': 'number'}, 'rate': {'title': 'Rate', 'type': 'number'}}, 'required': ['principal', 'rate'], 'title': 'yearly_simple_interestArguments', 'type': 'object'}, response_format='content_and_artifact', coroutine=.call_tool at 0x75a7f7b602c0>), StructuredTool(name='yearly_compound_interest', description='Tool to compute compound interest rate for a year.', args_schema={'properties': {'principal': {'title': 'Principal', 'type': 'number'}, 'rate': {'title': 'Rate', 'type': 'number'}}, 'required': ['principal', 'rate'], 'title': 'yearly_compound_interestArguments', 'type': 'object'}, response_format='content_and_artifact', coroutine=.call_tool at 0x75a7f7b61080>), StructuredTool(name='execute_shell_command', description='Tool to execute shell commands', args_schema={'properties': {'command': {'title': 'Command', 'type': 'string'}}, 'required': ['command'], 'title': 'execute_shell_commandArguments', 'type': 'object'}, response_format='content_and_artifact', coroutine=.call_tool at 0x75a7fb5b63e0>)]
INFO 2025-08-16 20:25:16,665 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:16,667 - [AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:25:16.664859411Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1493443858, 'load_duration': 1289038640, 'prompt_eval_count': 293, 'prompt_eval_duration': 133439701, 'eval_count': 3, 'eval_duration': 70118457, 'model_name': 'granite3.3:2b'}, id='run--c34fc3f1-2805-4ce8-8bce-8ad7e010cdac-0', usage_metadata={'input_tokens': 293, 'output_tokens': 3, 'total_tokens': 296}), HumanMessage(content='explain the definition of compound interest', additional_kwargs={}, response_metadata={}, id='0ce5b6ad-d447-497d-b19f-9215a2036dd0')]
INFO 2025-08-16 20:25:17,002 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,041 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,041 - Received session ID: 5630a1021ee64545a0b30800ea023c99
INFO 2025-08-16 20:25:17,042 - Negotiated protocol version: 2025-06-18
INFO 2025-08-16 20:25:17,045 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 202 Accepted"
INFO 2025-08-16 20:25:17,045 - HTTP Request: GET http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,047 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,051 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,054 - HTTP Request: DELETE http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,097 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,424 - [AIMessage(content='The compound interest for a principal of 1000 at rate 3.75% per annum over one year is $1037.50.', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:25:17.42283919Z', 'done': True, 'done_reason': 'stop', 'total_duration': 365561612, 'load_duration': 14554440, 'prompt_eval_count': 357, 'prompt_eval_duration': 8626905, 'eval_count': 37, 'eval_duration': 339212193, 'model_name': 'granite3.3:2b'}, id='run--55d5923d-d964-4eba-8fea-fb1a40f809bc-0', usage_metadata={'input_tokens': 357, 'output_tokens': 37, 'total_tokens': 394}), ToolMessage(content='1037.5', name='yearly_compound_interest', id='99aed036-9995-4bcc-8899-97be74ba1ca6', tool_call_id='8e3b2fcf-2907-4416-bb14-a7144e120df8'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:25:17.012344803Z', 'done': True, 'done_reason': 'stop', 'total_duration': 342865193, 'load_duration': 11765171, 'prompt_eval_count': 309, 'prompt_eval_duration': 22520911, 'eval_count': 33, 'eval_duration': 308035357, 'model_name': 'granite3.3:2b'}, id='run--9b17e405-9891-4217-aa36-3ebeda317df4-0', tool_calls=[{'name': 'yearly_compound_interest', 'args': {'principal': 1000, 'rate': 3.75}, 'id': '8e3b2fcf-2907-4416-bb14-a7144e120df8', 'type': 'tool_call'}], usage_metadata={'input_tokens': 309, 'output_tokens': 33, 'total_tokens': 342}), HumanMessage(content='what is the compound interest for a principal of 1000 at rate 3.75 ?', additional_kwargs={}, response_metadata={}, id='4c1a0d37-ae19-4a28-8e95-dfce3c740320')]
INFO 2025-08-16 20:25:17,653 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:17,999 - Starting the shell executor MCP server...
INFO 2025-08-16 20:25:18,004 - Processing request of type CallToolRequest
INFO 2025-08-16 20:25:18,005 - Executing shell command: free -h
INFO 2025-08-16 20:25:18,009 - Processing request of type ListToolsRequest
INFO 2025-08-16 20:25:18,118 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-16 20:25:18,771 - [AIMessage(content="The system has a total of 62 Gi memory, with 8.0 Gi currently in use and 31 Gi available for processes. Additionally, there's 153Mi of shared memory and 24Gi of buffered/cached memory. The swap space is empty at the moment, with 14Gi available.", additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:25:18.769959306Z', 'done': True, 'done_reason': 'stop', 'total_duration': 712721106, 'load_duration': 11893199, 'prompt_eval_count': 389, 'prompt_eval_duration': 25637608, 'eval_count': 73, 'eval_duration': 670447412, 'model_name': 'granite3.3:2b'}, id='run--4b0b4741-dd0a-4592-9597-f71bea38edd2-0', usage_metadata={'input_tokens': 389, 'output_tokens': 73, 'total_tokens': 462}), ToolMessage(content='               total        used        free      shared  buff/cache   available\nMem:            62Gi       8.0Gi        31Gi       153Mi        24Gi        54Gi\nSwap:           14Gi          0B        14Gi\n', name='execute_shell_command', id='c1c9ba95-8800-4100-9f73-a7acc5436cdc', tool_call_id='05b5f439-8724-447e-99ae-3120830b3976'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T00:25:17.664049762Z', 'done': True, 'done_reason': 'stop', 'total_duration': 236810650, 'load_duration': 14083194, 'prompt_eval_count': 298, 'prompt_eval_duration': 8979711, 'eval_count': 23, 'eval_duration': 213207601, 'model_name': 'granite3.3:2b'}, id='run--ebd2d568-b83f-497e-a97b-60fbf3b134ff-0', tool_calls=[{'name': 'execute_shell_command', 'args': {'command': 'free -h'}, 'id': '05b5f439-8724-447e-99ae-3120830b3976', 'type': 'tool_call'}], usage_metadata={'input_tokens': 298, 'output_tokens': 23, 'total_tokens': 321}), HumanMessage(content='Execute the free shell command to find how much system memory', additional_kwargs={}, response_metadata={}, id='545a8fb3-4bcc-4e76-be76-fa9f171a5201')]

VOILA - it is evident from the above Output.5 that the LLM app was able to successfully communicate with the MCP server using both transport modes !!!
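
As a quick sanity check, the tool results captured in Output.5 line up with the formulas the interest calculator MCP server implements. Note that the yearly_compound_interest tool, as defined earlier in this article, returns the year-end amount (principal plus one year of interest), which is why the agent reports 1037.5 for a principal of 1000 at a rate of 3.75:

```python
# Same formulas as the interest calculator MCP server tools
def yearly_simple_interest(principal: float, rate: float) -> float:
    """Simple interest earned over one year."""
    return principal * rate / 100.0

def yearly_compound_interest(principal: float, rate: float) -> float:
    """Year-end amount (principal plus one year of interest)."""
    return principal * (1 + rate / 100.0)

print(yearly_simple_interest(1000, 3.75))    # 37.5
print(yearly_compound_interest(1000, 3.75))  # 1037.5
```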

Until now, the MCP servers were unprotected (without any kind of authentication enforcement), with open access to any MCP client. In the next example, we will demonstrate how one can use a JWT token to allow or deny access to an MCP client.

The following is the JWT Manager code in Python using the authlib module:


jwt_manager.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from authlib.jose import JoseError, JsonWebToken
from dotenv import load_dotenv, find_dotenv

import datetime
import logging
import os

# Logging Configuration
logging.basicConfig(
  format='%(asctime)s | %(levelname)s -> %(message)s',
  datefmt='%Y-%m-%d %H:%M:%S',
  level=logging.INFO
)

# Logger
logger = logging.getLogger('jwt_manager')

# JWT class with method for generating and verifying JWT key
class JWTManager(object):
  def __init__(self):
    load_dotenv(find_dotenv())
    self.jwt_secret_key = os.getenv('JWT_SECRET_KEY')
    self.jwt_algorithm = os.getenv('JWT_ALGORITHM')
    self.jwt_expiration_secs = int(os.getenv('JWT_EXPIRATION_SECS'))

  def expiry_time_secs(self):
    """This method returns the expiration time."""
    return self.jwt_expiration_secs

  def create_token(self, subject: str) -> str:
    """This method creates a new JWT token."""
    logger.info(f'Creating JWT token for subject {subject}')
    now = datetime.datetime.now(datetime.UTC)
    payload = {
      'sub': subject,
      'iat': int(now.timestamp()),
      'exp': int((now + datetime.timedelta(seconds=self.jwt_expiration_secs)).timestamp()),
      'iss': 'polarsparc.com',
      'aud': 'polarsparc.com/mcp',
    }
    jws = JsonWebToken(self.jwt_algorithm)
    return jws.encode({'alg': self.jwt_algorithm}, payload, self.jwt_secret_key).decode('utf-8')

  def verify_token(self, token: str) -> bool:
    """This method verifies a JWT token."""
    logger.info('Verifying JWT token')
    valid = True
    jws = JsonWebToken(self.jwt_algorithm)
    try:
      claims = jws.decode(token, self.jwt_secret_key)
      claims.validate()  # also checks registered claims such as 'exp'
    except JoseError:
      valid = False
    return valid
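
To make the token format concrete, the HS256 signing that authlib performs under the hood can be sketched with just the Python standard library. This is an illustrative round-trip (it skips claim validation such as 'exp'), not a replacement for the JWTManager above:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> bytes:
    # Base64url encoding without padding, as required by the JWT spec
    return base64.urlsafe_b64encode(data).rstrip(b'=')

def hs256_encode(payload: dict, secret: str) -> str:
    # A JWT is: base64url(header) . base64url(payload) . base64url(signature)
    header = b64url(json.dumps({'alg': 'HS256', 'typ': 'JWT'}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = header + b'.' + body
    signature = b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return (signing_input + b'.' + signature).decode()

def hs256_verify(token: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 signature and compare in constant time
    try:
        header, body, signature = token.split('.')
    except ValueError:
        return False
    signing_input = (header + '.' + body).encode()
    expected = b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()).decode()
    return hmac.compare_digest(signature, expected)

token = hs256_encode({'sub': 'user@polarsparc.com', 'iss': 'polarsparc.com'}, 's3cr3t')
print(hs256_verify(token, 's3cr3t'))   # True
print(hs256_verify(token, 'wrong'))    # False
```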

The following is the JWT Service code in Python, which exposes a URL endpoint for JWT token creation (and validation) using the fastapi module:


jwt_service.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from typing import Any
from jwt_manager import JWTManager

import logging
import uvicorn

# Logging Configuration
logging.basicConfig(
  format='%(asctime)s | %(levelname)s -> %(message)s',
  datefmt='%Y-%m-%d %H:%M:%S',
  level=logging.INFO
)

# Logger
logger = logging.getLogger('jwt_service')

# FastAPI app
app = FastAPI()

# JWT manager
jwt = JWTManager()

# Utility method(s)
def unauthorized(message: str):
  response = JSONResponse({'error': message}, status_code=401)
  response.headers['WWW-Authenticate'] = 'Bearer realm="JWT token required"'
  return response

# FastAPI route handlers

# Usage: curl -X POST http://192.168.1.25:8080/api/generate -H 'Content-Type: application/json' -d '{"subject": "user@polarsparc.com"}'
@app.post('/api/generate')
async def generate(request: Request) -> JSONResponse:
  logger.info(f'Generating new token for request: {request}')
  try:
    body = await request.json()
  except Exception:
    body = {}
  subject = body.get('subject', 'unknown') if isinstance(body, dict) else 'unknown'
  token = jwt.create_token(subject)
  return JSONResponse({'jwt_token': token, 'expires_in': jwt.expiry_time_secs()})

# Usage: curl http://192.168.1.25:8080/api/validate -H 'Authorization: Bearer <token>'
@app.get('/api/validate')
async def validate(request: Request) -> JSONResponse:
  logger.info(f'Validating JWT token, request: {request}')
  auth_header = request.headers.get('Authorization', '')
  if not auth_header.startswith('Bearer '):
    return unauthorized('Missing or malformed Authorization Bearer header')
  token = auth_header[len('Bearer '):].strip()
  if not jwt.verify_token(token):
    return unauthorized('Invalid JWT token')
  return JSONResponse({'message': 'Success'})

# Run with uvicorn server using app
if __name__ == "__main__":
  uvicorn.run(app, host='192.168.1.25', port=8080)

Note that the JWT service will be hosted at the endpoint 192.168.1.25:8080.

To execute the above Python code, execute the following command in a terminal window:


$ python jwt_service.py


The following would be the typical output:


Output.6

INFO:     Started server process [12148]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://192.168.1.25:8080 (Press CTRL+C to quit)

The following is our JWT protected MCP Server code in Python, which uses the streamable-http transport:


interest_mcp_server3.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from mcp.server.auth.provider import AccessToken, TokenVerifier
from mcp.server.auth.settings import AuthSettings
from mcp.server.fastmcp import FastMCP
from jwt_manager import JWTManager

import logging

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('interest_mcp_server3')

# Simple custom token verifier
class SimpleTokenVerifier(TokenVerifier):
  def __init__(self):
    self.jwt = JWTManager()

  async def verify_token(self, token: str) -> AccessToken | None:
    if self.jwt.verify_token(token):
      return AccessToken(token=token, client_id='user@polarsparc.com', scopes=['user'])
    return None

mcp = FastMCP('InterestCalculator-3',
              host='192.168.1.25',
              port=8000,
              token_verifier=SimpleTokenVerifier(),
              auth=AuthSettings(issuer_url='http://192.168.1.25:8000',
                                resource_server_url='http://192.168.1.25:8000'))

@mcp.tool()
def yearly_simple_interest(principal: float, rate: float) -> float:
  """Tool to compute simple interest rate for a year."""
  logger.info(f'Simple interest -> Principal: {principal}, Rate: {rate}')
  return principal * rate / 100.00

@mcp.tool()
def yearly_compound_interest(principal: float, rate: float) -> float:
  """Tool to compute compound interest rate for a year."""
  logger.info(f'Compound interest -> Principal: {principal}, Rate: {rate}')
  return principal * (1 + rate / 100.0)

if __name__ == "__main__":
  logger.info('Starting the protected interest calculator MCP server using HTTP ...')
  mcp.run(transport='streamable-http')

To execute the above Python code, execute the following command in a terminal window:


$ python interest_mcp_server3.py


The output would be similar to that of Output.4 and hence we will not repeat it here.

The following is our MCP Host LLM app code in Python, which invokes a protected tool exposed via an HTTP service:


multi_mcp_client3.py
#
# @Author: Bhaskar S
# @Blog:   https://polarsparc.github.io
# @Date:   16 August 2025
#

from dotenv import load_dotenv, find_dotenv
from langchain_ollama import ChatOllama
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

import asyncio
import httpx
import logging
import os

logging.basicConfig(format='%(levelname)s %(asctime)s - %(message)s',
                    level=logging.INFO)

logger = logging.getLogger('interest_mcp_client')

load_dotenv(find_dotenv())

llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')
mcp_base_url = os.getenv('MCP_BASE_URL')

ollama_chat_llm = ChatOllama(base_url=ollama_base_url,
                             model=ollama_model,
                             temperature=llm_temperature)

async def main():
  valid = bool(os.getenv('VALID'))  # any non-empty value of VALID enables fetching a real token
  token = '123'
  if valid:
    # Make an HTTP request to the JWT service to get a token
    response = httpx.post('http://192.168.1.25:8080/api/generate', json={'subject': 'user@polarsparc.com'})
    token = response.json()['jwt_token']

  # Connect to the MCP server over the streamable-http transport
  async with streamablehttp_client(mcp_base_url, headers={'Authorization': 'Bearer '+token}) as (read, write, _):
    # Create a MCP client session
    async with ClientSession(read, write) as session:
      # Initialize the MCP session with the remote server
      await session.initialize()

      # List all registered MCP tools
      tools = await load_mcp_tools(session)
      logger.info('List of MCP Tools:')
      for tool in tools:
        logger.info(f'\tMCP Tool: {tool.name}')

      # Initialize a ReACT agent
      agent = create_react_agent(ollama_chat_llm, tools)

      # Case - 1 : Simple interest definition
      agent_response_1 = await agent.ainvoke(
        {'messages': 'explain the definition of simple interest ?'})
      logger.info(agent_response_1['messages'][::-1])

      # Case - 2 : Simple interest calculation
      agent_response_2 = await agent.ainvoke(
        {'messages': 'compute the simple interest for a principal of 1000 at rate 3.75 ?'})
      logger.info(agent_response_2['messages'][::-1])

      # Case - 3 : Compound interest calculation
      agent_response_3 = await agent.ainvoke(
        {'messages': 'compute the compound interest for a principal of 1000 at rate 4.25 ?'})
      logger.info(agent_response_3['messages'][::-1])

if __name__ == '__main__':
  try:
    asyncio.run(main())
  except Exception as e:
    logger.error(e)

To execute the above Python code (with an invalid token), execute the following command in a terminal window:


$ python multi_mcp_client3.py


The following would be the typical output:


Output.7

INFO 2025-08-17 08:19:50,874 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 401 Unauthorized"
ERROR 2025-08-17 08:19:51,102 - unhandled errors in a TaskGroup (1 sub-exception)

Once again, to execute the above Python code (with a valid token), execute the following command in a terminal window:


$ VALID=True python multi_mcp_client3.py


The following would be the typical output:


Output.8

INFO 2025-08-17 08:20:17,536 - HTTP Request: POST http://192.168.1.25:8080/api/generate "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:17,558 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:17,558 - Received session ID: e868e7ea574b4a6780a7c9acc849b88c
INFO 2025-08-17 08:20:17,558 - Negotiated protocol version: 2025-06-18
INFO 2025-08-17 08:20:17,561 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 202 Accepted"
INFO 2025-08-17 08:20:17,561 - HTTP Request: GET http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:17,562 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:17,563 - List of MCP Tools:
INFO 2025-08-17 08:20:17,563 - 	MCP Tool: yearly_simple_interest
INFO 2025-08-17 08:20:17,563 - 	MCP Tool: yearly_compound_interest
INFO 2025-08-17 08:20:17,637 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:18,229 - [AIMessage(content="Simple interest is calculated on the initial principal amount only. It's determined by multiplying the principal, rate, and time (in years) using the formula: Simple Interest = Principal x Rate x Time. The rate is typically expressed as a yearly percentage and does not change over the life of the loan or investment.", additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T12:20:18.227718947Z', 'done': True, 'done_reason': 'stop', 'total_duration': 656390282, 'load_duration': 12340150, 'prompt_eval_count': 247, 'prompt_eval_duration': 41217018, 'eval_count': 66, 'eval_duration': 602321947, 'model_name': 'granite3.3:2b'}, id='run--48ec0bce-20bf-4324-894e-24ebaa273668-0', usage_metadata={'input_tokens': 247, 'output_tokens': 66, 'total_tokens': 313}), HumanMessage(content='explain the definition of simple interest ?', additional_kwargs={}, response_metadata={}, id='73e4a4f1-33d3-4ec1-9d9f-444d0c0fd5f7')]
INFO 2025-08-17 08:20:18,545 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:18,559 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:18,601 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:18,881 - [AIMessage(content='The simple interest for a principal of 1000 at rate 3.75% per annum is 37.5.', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T12:20:18.880227187Z', 'done': True, 'done_reason': 'stop', 'total_duration': 317195121, 'load_duration': 11934293, 'prompt_eval_count': 307, 'prompt_eval_duration': 8634267, 'eval_count': 31, 'eval_duration': 293748873, 'model_name': 'granite3.3:2b'}, id='run--46152029-0d6f-4851-96f7-763a56953ef3-0', usage_metadata={'input_tokens': 307, 'output_tokens': 31, 'total_tokens': 338}), ToolMessage(content='37.5', name='yearly_simple_interest', id='de4ed659-6a55-4b2e-bddb-05238641c496', tool_call_id='4f9e33dd-fe74-42d1-94f9-247b9896e67d'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T12:20:18.554593738Z', 'done': True, 'done_reason': 'stop', 'total_duration': 323198128, 'load_duration': 12281900, 'prompt_eval_count': 261, 'prompt_eval_duration': 8379429, 'eval_count': 33, 'eval_duration': 302083542, 'model_name': 'granite3.3:2b'}, id='run--34c77dda-77fb-4d10-9dd6-5b2e0ff2d974-0', tool_calls=[{'name': 'yearly_simple_interest', 'args': {'principal': 1000, 'rate': 3.75}, 'id': '4f9e33dd-fe74-42d1-94f9-247b9896e67d', 'type': 'tool_call'}], usage_metadata={'input_tokens': 261, 'output_tokens': 33, 'total_tokens': 294}), HumanMessage(content='compute the simple interest for a principal of 1000 at rate 3.75 ?', additional_kwargs={}, response_metadata={}, id='b63ec687-747d-4c22-bbac-b14c4ae3c1f3')]
INFO 2025-08-17 08:20:19,200 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:19,215 - HTTP Request: POST http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:19,262 - HTTP Request: POST http://192.168.1.25:11434/api/chat "HTTP/1.1 200 OK"
INFO 2025-08-17 08:20:19,583 - [AIMessage(content='The compound interest for a principal of 1000 at rate 4.25% per annum over one year is $1042.5.', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T12:20:19.582794272Z', 'done': True, 'done_reason': 'stop', 'total_duration': 363018637, 'load_duration': 14603073, 'prompt_eval_count': 309, 'prompt_eval_duration': 10043967, 'eval_count': 36, 'eval_duration': 334569664, 'model_name': 'granite3.3:2b'}, id='run--77078b25-a80d-4a66-9d41-6f0a229c5a93-0', usage_metadata={'input_tokens': 309, 'output_tokens': 36, 'total_tokens': 345}), ToolMessage(content='1042.5', name='yearly_compound_interest', id='65b2d670-3867-4c0a-be61-4a9a3cf64dc1', tool_call_id='3bcd368b-d9fb-4497-a087-385ab6240536'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'granite3.3:2b', 'created_at': '2025-08-17T12:20:19.211207428Z', 'done': True, 'done_reason': 'stop', 'total_duration': 327466317, 'load_duration': 11437887, 'prompt_eval_count': 261, 'prompt_eval_duration': 6376604, 'eval_count': 34, 'eval_duration': 309202350, 'model_name': 'granite3.3:2b'}, id='run--68cea6ef-4657-43ee-906c-4998c0a369e3-0', tool_calls=[{'name': 'yearly_compound_interest', 'args': {'principal': 1000, 'rate': 4.25}, 'id': '3bcd368b-d9fb-4497-a087-385ab6240536', 'type': 'tool_call'}], usage_metadata={'input_tokens': 261, 'output_tokens': 34, 'total_tokens': 295}), HumanMessage(content='compute the compound interest for a principal of 1000 at rate 4.25 ?', additional_kwargs={}, response_metadata={}, id='fa2f3d6a-438b-4b10-841c-ad8118123611')]
INFO 2025-08-17 08:20:19,585 - HTTP Request: DELETE http://192.168.1.25:8000/mcp "HTTP/1.1 200 OK"

BAM - it is evident from the above Output.7 and Output.8 that the MCP client behaved correctly when trying to access a protected MCP server !!!

With this, we conclude the various hands-on demonstrations of using the MCP framework for building and deploying agentic LLM apps !!!


References

Model Context Protocol Documentation



© PolarSPARC