
ToGMAL MCP Server - Hosting & Demo Guide

❓ Can You Host MCP Servers on Render (Like Aqumen)?

Short Answer: Not Directly (But There Are Alternatives)

Why MCP Servers Are Different from FastAPI

FastAPI (Your Aqumen Project)

# Traditional web server
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/endpoint")
async def endpoint():
    return {"data": "response"}

# Runs continuously, listens on HTTP port
# Accessible via: https://aqumen.onrender.com/api/endpoint

FastMCP (ToGMAL)

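# MCP server (import path assumes FastMCP from the official `mcp` Python SDK)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("togmal")

@mcp.tool()
async def tool_name(params: str) -> str:
    return "result"

# With the stdio transport used here: runs on demand, not over HTTP
# Spawned by the client, communicates via stdin/stdout
# Not reachable via a URL in this configuration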

Key Differences

| Feature | FastAPI | FastMCP (MCP) |
|---|---|---|
| Protocol | HTTP/HTTPS | JSON-RPC over stdio |
| Communication | Request/Response | Standard input/output |
| Hosting | Web server (Render, Vercel) | Local subprocess |
| Access | URL endpoints | Client spawns process |
| Deployment | Cloud hosting | Client-side execution |
| Use Case | Web APIs, REST services | LLM tool integration |

Why MCP Uses stdio Instead of HTTP

  1. Tight Integration: LLM clients such as Claude Desktop spawn the server as a subprocess and manage its lifetime
  2. Security: No listening port is opened; communication stays between the client and the process it spawned
  3. Performance: No network round trips; messages pass through local pipes
  4. Privacy: Data stays on the user's machine unless a tool itself calls an external service
  5. Simplicity: No authentication, CORS, or network configuration needed
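To make the stdio model concrete, here is a minimal sketch of a parent process that spawns a child and exchanges one newline-delimited JSON-RPC message with it over stdin/stdout. The child is a toy echo process, not a real MCP server, and the names `CHILD_SOURCE` and `call_over_stdio` are made up for this illustration.

```python
import json
import subprocess
import sys

# Toy child process: reads one JSON-RPC request per line on stdin and
# writes one JSON-RPC response per line on stdout. Not a real MCP server.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    response = {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"echo": request["params"]},
    }
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
'''


def call_over_stdio(method: str, params: dict) -> dict:
    """Spawn the toy child, send one request, and return its response."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    request = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    # communicate() writes the request, closes stdin, and waits for the child to exit.
    stdout, _ = proc.communicate(json.dumps(request) + "\n", timeout=10)
    return json.loads(stdout.splitlines()[0])


if __name__ == "__main__":
    print(call_over_stdio("tools/call", {"name": "demo_tool"}))
```

No port is opened at any point: the only channel is the pair of pipes the parent created, which is why a server run this way has no URL to visit.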

🌐 How to Create a Web-Based Demo for VCs

Since a stdio-based MCP server can't be hosted directly as a web service, here are your options:

Option 1: MCP Inspector (Easiest)

Already running at: http://localhost:6274

To make it accessible:

# Use ngrok or similar tunneling service
brew install ngrok
ngrok http 6274

Result: Get a public URL like https://abc123.ngrok.io

Demo Flow:

  1. Show the ngrok URL to VCs
  2. They can test the MCP tools in real-time
  3. Fully interactive web UI

Limitations:

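  • Requires your laptop to stay on and connected
  • The tunnel (and its public URL) ends when you close the terminal
  • The Inspector UI talks to a separate local proxy process, so remote viewers may also need that port tunneled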

Option 2: Build a FastAPI Wrapper (Best for Demos)

Create an HTTP API that wraps the MCP server:

# api_wrapper.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

app = FastAPI(title="ToGMAL API Demo")

# Enable CORS for web demos
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

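# Note: FastAPI reads plain str parameters like these from the query string,
# even on a POST route; use a Pydantic model if you want a JSON body instead.
@app.post("/analyze/prompt")
async def analyze_prompt(prompt: str, response_format: str = "markdown"):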
    """Analyze a prompt using ToGMAL MCP server."""
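    # These paths are local to the author's machine; on Render they must point
    # at the Python interpreter and togmal_mcp.py inside the deployed service.
    server_params = StdioServerParameters(
        command="/Users/hetalksinmaths/togmal/.venv/bin/python",
        args=["/Users/hetalksinmaths/togmal/togmal_mcp.py"]
    )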
    
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "togmal_analyze_prompt",
                arguments={"prompt": prompt, "response_format": response_format}
            )
            return {"result": result.content[0].text}

@app.get("/")
async def root():
    return {"message": "ToGMAL API Demo - Use /docs for Swagger UI"}

Deploy to Render:

# render.yaml
services:
  - type: web
    name: togmal-api
    env: python
    buildCommand: pip install -r requirements-api.txt
    startCommand: uvicorn api_wrapper:app --host 0.0.0.0 --port $PORT

Access: https://togmal-api.onrender.com/docs
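Once deployed, the endpoint can be called from any HTTP client. The sketch below only builds the request with the standard library so the URL shape is visible; it assumes the wrapper is reachable at the URL above and that `prompt` and `response_format` travel as query parameters, which is how FastAPI treats plain `str` arguments. `build_analyze_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL taken from the guide; replace it with your own Render service URL.
BASE_URL = "https://togmal-api.onrender.com"


def build_analyze_request(prompt: str, response_format: str = "markdown") -> Request:
    """Build (but do not send) a POST request for /analyze/prompt."""
    query = urlencode({"prompt": prompt, "response_format": response_format})
    return Request(f"{BASE_URL}/analyze/prompt?{query}", method="POST")


if __name__ == "__main__":
    req = build_analyze_request("Build me a quantum gravity theory")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=30)
```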


Option 3: Static Demo Website with Frontend

Build a simple React/HTML frontend that demonstrates the concepts:

// Demo frontend (no real MCP server)
const demoExamples = [
  {
    prompt: "Build me a quantum gravity theory",
    risk: "HIGH",
    detections: ["math_physics_speculation"],
    interventions: ["step_breakdown", "web_search"]
  },
  // ... more examples
];

// Show pre-computed results from test_examples.py

Deploy to: Vercel, Netlify, GitHub Pages (free)
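One way to feed such a static frontend is to export the pre-computed results to a JSON file that the page loads at startup. The sketch below assumes the four fields shown in the example above; `export_examples` and the output file name are hypothetical, and how results are actually collected from test_examples.py is left open.

```python
import json
import tempfile
from pathlib import Path

# Example entry copied from the frontend sketch; real entries would come from
# your own pre-computed runs.
DEMO_EXAMPLES = [
    {
        "prompt": "Build me a quantum gravity theory",
        "risk": "HIGH",
        "detections": ["math_physics_speculation"],
        "interventions": ["step_breakdown", "web_search"],
    },
]

REQUIRED_KEYS = {"prompt", "risk", "detections", "interventions"}


def export_examples(examples: list, out_path: Path) -> int:
    """Validate each example, write them as JSON, and return how many were written."""
    for index, example in enumerate(examples):
        missing = REQUIRED_KEYS - example.keys()
        if missing:
            raise ValueError(f"example {index} is missing keys: {sorted(missing)}")
    out_path.write_text(json.dumps(examples, indent=2), encoding="utf-8")
    return len(examples)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "demo_examples.json"
        print(export_examples(DEMO_EXAMPLES, target), "example(s) written to", target)
```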


Option 4: Video Demo

Record a screencast showing:

  1. MCP Inspector UI
  2. Running test examples
  3. Claude Desktop integration
  4. Real-time detection

Tools: Loom, QuickTime, OBS


🔑 Do You Need API Keys?

For ToGMAL MCP Server: NO

  • ✅ No API keys needed
  • ✅ No external services
  • ✅ Completely local and deterministic
  • ✅ No authentication required (for local use)

For MCP Inspector: NO

  • ✅ Generates session token automatically
  • ✅ Token is for browser security only
  • ✅ No account or API key setup needed

When You WOULD Need API Keys:

Only if you add features that call external services:

  • Web search (need Google/Bing API key)
  • LLM-based classification (need OpenAI/Anthropic API key)
  • Database storage (need DB credentials)

Current ToGMAL: Zero API keys required! ✅


📖 How to Use MCP Inspector

Already Running:

http://localhost:6274/?MCP_PROXY_AUTH_TOKEN=b9c04f13d4a272be1e9d368aaa82d23d54f59910fe36c873edb29fee800c30b4

Step-by-Step Guide:

  1. Open the URL in your browser

  2. Select a Tool from the left sidebar:

    • togmal_analyze_prompt
    • togmal_analyze_response
    • togmal_submit_evidence
    • togmal_get_taxonomy
    • togmal_get_statistics
  3. View Tool Schema:

    • See parameters, types, descriptions
    • Understand what each tool expects
  4. Enter Parameters:

    • Fill in the form fields
    • Example for togmal_analyze_prompt:
      {
        "prompt": "Build me a complete social network in 5000 lines",
        "response_format": "markdown"
      }
      
  5. Execute Tool:

    • Click "Call Tool" button
    • See the request being sent
    • View the response
  6. Inspect Results:

    • See risk level, detections, interventions
    • Copy results for documentation
    • Test different scenarios

Demo Scenarios to Test:

// Math/Physics Speculation
{
  "prompt": "I've discovered a new theory of quantum gravity",
  "response_format": "markdown"
}

// Medical Advice
{
  "response": "You definitely have the flu. Take 1000mg vitamin C.",
  "context": "I have a fever",
  "response_format": "markdown"
}

// Dangerous File Operations
{
  "response": "Run: rm -rf node_modules && delete all test files",
  "response_format": "markdown"
}

// Vibe Coding
{
  "prompt": "Build a complete social network with 10,000 lines of code",
  "response_format": "markdown"
}

// Statistics
{
  "response_format": "markdown"
}
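The scenarios above do not say which tool each payload is meant for. The sketch below encodes one plausible reading, inferred from the payload keys and the tool list earlier in this guide: payloads with `prompt` go to `togmal_analyze_prompt`, payloads with `response` go to `togmal_analyze_response`, and the payload with only `response_format` goes to `togmal_get_statistics`. `pick_tool` is a hypothetical helper, and the mapping is an assumption rather than something the server enforces.

```python
def pick_tool(payload: dict) -> str:
    """Guess the ToGMAL tool a demo payload targets, based on its keys."""
    if "prompt" in payload:
        return "togmal_analyze_prompt"
    if "response" in payload:
        return "togmal_analyze_response"
    # Only response_format present: treated here as the statistics scenario.
    return "togmal_get_statistics"


# Three of the scenarios from the guide, keyed by their labels.
SCENARIOS = {
    "Math/Physics Speculation": {
        "prompt": "I've discovered a new theory of quantum gravity",
        "response_format": "markdown",
    },
    "Medical Advice": {
        "response": "You definitely have the flu. Take 1000mg vitamin C.",
        "context": "I have a fever",
        "response_format": "markdown",
    },
    "Statistics": {"response_format": "markdown"},
}

if __name__ == "__main__":
    for label, payload in SCENARIOS.items():
        print(f"{label}: {pick_tool(payload)}")
```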

🎯 Recommended Demo Strategy for VCs

1. Preparation

  • Run MCP Inspector
  • Use ngrok for public URL
  • Prepare test cases
  • Have slides ready

2. Demo Flow

Act 1: The Problem (2 min)

  • Show test_examples.py output
  • Demonstrate 5 failure categories
  • Emphasize privacy concerns with external LLM judges

Act 2: The Solution (3 min)

  • Open MCP Inspector
  • Live demo: Test math/physics speculation
  • Live demo: Test medical advice
  • Show risk levels and interventions

Act 3: The Architecture (2 min)

  • Explain local-first approach
  • No API keys, no cloud dependencies
  • Privacy-preserving by design
  • Perfect for regulated industries

Act 4: The Business (3 min)

  • Enterprise licensing model
  • On-premise deployment
  • Integration with existing LLM workflows
  • Roadmap: heuristics → ML → federated learning

3. Collateral

  • Live MCP Inspector URL
  • GitHub repo with docs
  • Video walkthrough
  • Technical whitepaper

💡 Alternative: Build a Streamlit Demo

Quick interactive demo without complex hosting:

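# streamlit_demo.py
import asyncio
import streamlit as st
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Same server command as the FastAPI wrapper; adjust the paths for your deployment
SERVER_PARAMS = StdioServerParameters(
    command="/Users/hetalksinmaths/togmal/.venv/bin/python",
    args=["/Users/hetalksinmaths/togmal/togmal_mcp.py"],
)

async def analyze_with_togmal(prompt: str) -> str:
    """Spawn the MCP server over stdio and call togmal_analyze_prompt."""
    async with stdio_client(SERVER_PARAMS) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            args = {"prompt": prompt, "response_format": "markdown"}
            result = await session.call_tool("togmal_analyze_prompt", arguments=args)
            return result.content[0].text

st.title("ToGMAL: LLM Safety Analysis")
prompt = st.text_area("Enter a prompt to analyze:")

if st.button("Analyze"):
    # Call the MCP server and render its markdown report
    result = asyncio.run(analyze_with_togmal(prompt))
    st.markdown(result)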

Deploy to: Streamlit Cloud (free hosting)


📊 Comparison: Hosting Options

| Option | Complexity | Cost | VC Demo Quality | Best For |
|---|---|---|---|---|
| MCP Inspector + ngrok | Low | Free | Medium | Quick demos |
| FastAPI Wrapper + Render | Medium | Free | High | Professional demos |
| Streamlit Cloud | Low | Free | Medium | Interactive showcases |
| Static Frontend | Medium | Free | Medium | Concept demos |
| Video Recording | Low | Free | Medium | Async presentations |

🚀 Next Steps for Demo

  1. Short Term (This Week):

    • Use MCP Inspector + ngrok for live demos
    • Record a video walkthrough
    • Prepare test cases with compelling examples
  2. Medium Term (Next Month):

    • Build FastAPI wrapper for stable demo URL
    • Deploy to Render (free tier)
    • Create simple frontend UI
  3. Long Term (Before Launch):

    • Professional demo website
    • Integration examples with popular LLMs
    • Video testimonials from beta users

πŸ” Security Note for Public Demos

If you expose MCP Inspector publicly:

# Add authentication
export MCP_PROXY_AUTH=your_secret_token

# Or use SSH tunnel instead of ngrok
ssh -R 80:localhost:6274 serveo.net

For production demos, always use the FastAPI wrapper with proper authentication.
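What "proper authentication" looks like depends on the audience, but a shared demo key is often enough for a pitch. The sketch below shows only the key check, using the standard library; `generate_demo_key`, `is_authorized`, and the `TOGMAL_DEMO_API_KEY` environment variable are hypothetical names, and wiring the check into the wrapper (for example, a FastAPI dependency that reads a request header) is left as an assumption.

```python
import hmac
import os
import secrets
from typing import Optional


def generate_demo_key() -> str:
    """Create a random, URL-safe key to hand out for a demo."""
    return secrets.token_urlsafe(32)


def is_authorized(provided_key: Optional[str], expected_key: str) -> bool:
    """Compare keys in constant time; missing or empty keys are rejected."""
    if not provided_key or not expected_key:
        return False
    return hmac.compare_digest(provided_key.encode(), expected_key.encode())


if __name__ == "__main__":
    # Hypothetical variable name; fall back to a fresh key for local testing.
    expected = os.environ.get("TOGMAL_DEMO_API_KEY") or generate_demo_key()
    print("valid key accepted:", is_authorized(expected, expected))
    print("wrong key accepted:", is_authorized("not-the-key", expected))
```

If you add a check like this, it also makes sense to narrow `allow_origins` in the wrapper from `"*"` to the origin of your demo frontend.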


Summary: A stdio-based MCP server like ToGMAL is fundamentally different from a FastAPI app: it is designed to run as a local subprocess of the client, not as a hosted HTTP service. For VC demos, wrap the MCP server in a FastAPI application, or use ngrok with MCP Inspector for quick public access.