# Flask API Documentation
## Overview
The Research AI Assistant API provides a RESTful interface for interacting with an AI-powered research assistant. The API uses local GPU models for inference and supports conversational interactions with context management.
**Base URL (HF Spaces):** `https://jatinautonomouslabs-research-ai-assistant-api.hf.space`
**Alternative Base URL:** `https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API`
**API Version:** 1.0
**Content-Type:** `application/json`
> **Note:** For Hugging Face Spaces Docker deployments, use the `.hf.space` domain format. The space name is converted to lowercase with hyphens.
## Features
- 🤖 **AI-Powered Responses** - Local GPU model inference (Tesla T4)
- 💬 **Conversational Context** - Maintains conversation history and user context
- 🔒 **CORS Enabled** - Ready for web integration
- ⚡ **Async Processing** - Efficient request handling
- 📊 **Transparent Reasoning** - Returns reasoning chains and performance metrics
---
## Authentication
Currently, the API does not require authentication. However, for production use, you should:
1. Set `HF_TOKEN` environment variable for Hugging Face model access
2. Implement API key authentication if needed
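If you do add API key authentication, one simple pattern is comparing a request header against a configured secret. The sketch below is hypothetical (the current API accepts no key; the `X-API-Key` header name is an assumption), written as a plain function that a Flask before-request hook could call:

```python
def check_api_key(headers, expected_key):
    """Return True when no key is configured, or when the X-API-Key header matches.

    Hypothetical sketch: the current API has no authentication; the
    "X-API-Key" header name and expected_key source are assumptions.
    """
    if not expected_key:
        return True  # auth disabled when no key is configured
    return headers.get("X-API-Key") == expected_key
```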
---
## Endpoints
### 1. Get API Information
**Endpoint:** `GET /`
**Description:** Returns API information, version, and available endpoints.
**Request:**
```http
GET / HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
```
**Response:**
```json
{
  "name": "AI Assistant Flask API",
  "version": "1.0",
  "status": "running",
  "orchestrator_ready": true,
  "features": {
    "local_gpu_models": true,
    "max_workers": 4,
    "hardware": "NVIDIA T4 Medium"
  },
  "endpoints": {
    "health": "GET /api/health",
    "chat": "POST /api/chat",
    "initialize": "POST /api/initialize",
    "context_mode_get": "GET /api/context/mode",
    "context_mode_set": "POST /api/context/mode"
  }
}
```
**Status Codes:**
- `200 OK` - Success
---
### 2. Health Check
**Endpoint:** `GET /api/health`
**Description:** Checks if the API and orchestrator are ready to handle requests.
**Request:**
```http
GET /api/health HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
```
**Response:**
```json
{
  "status": "healthy",
  "orchestrator_ready": true
}
```
**Status Codes:**
- `200 OK` - API is healthy; the `orchestrator_ready` field in the body indicates readiness:
  - `true` - Ready to process requests
  - `false` - Still initializing
**Example Response (Initializing):**
```json
{
  "status": "initializing",
  "orchestrator_ready": false
}
```
---
### 3. Chat Endpoint
**Endpoint:** `POST /api/chat`
**Description:** Send a message to the AI assistant and receive a response with reasoning and context.
**Request Headers:**
```http
Content-Type: application/json
```
**Request Body:**
```json
{
  "message": "Explain quantum entanglement in simple terms",
  "history": [
    ["User message 1", "Assistant response 1"],
    ["User message 2", "Assistant response 2"]
  ],
  "session_id": "session-123",
  "user_id": "user-456"
}
```
**Request Fields:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `message` | string | ✅ Yes | User's message/question (max 10,000 characters) |
| `history` | array | ❌ No | Conversation history as array of `[user, assistant]` pairs |
| `session_id` | string | ❌ No | Unique session identifier for context continuity |
| `user_id` | string | ❌ No | User identifier (defaults to "anonymous") |
| `context_mode` | string | ❌ No | Context retrieval mode: `"fresh"` (no user context) or `"relevant"` (only relevant context). Defaults to `"fresh"` if not set. |
**Response (Success):**
```json
{
  "success": true,
  "message": "Quantum entanglement is when two particles become linked...",
  "history": [
    ["Explain quantum entanglement", "Quantum entanglement is when two particles become linked..."]
  ],
  "reasoning": {
    "intent": "educational_query",
    "steps": ["Understanding request", "Gathering information", "Synthesizing response"],
    "confidence": 0.95
  },
  "performance": {
    "response_time_ms": 2345,
    "tokens_generated": 156,
    "model_used": "mistralai/Mistral-7B-Instruct-v0.2"
  }
}
```
**Response Fields:**
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Whether the request was successful |
| `message` | string | AI assistant's response |
| `history` | array | Updated conversation history including the new exchange |
| `reasoning` | object | AI reasoning process and confidence metrics |
| `performance` | object | Performance metrics (response time, tokens, model used) |
**Status Codes:**
- `200 OK` - Request processed successfully
- `400 Bad Request` - Invalid request (missing message, empty message, too long, wrong type)
- `500 Internal Server Error` - Server error processing request
- `503 Service Unavailable` - Orchestrator not ready (still initializing)
**Error Response:**
```json
{
  "success": false,
  "error": "Message is required",
  "message": "Error processing your request. Please try again."
}
```
**Context Mode Feature:**
The `context_mode` parameter controls how user context is retrieved and used:
- **`"fresh"`** (default): No user context is included; each conversation starts fresh. Ideal for:
  - General questions requiring no prior context
  - Avoiding context contamination
  - Faster responses (no context retrieval overhead)
- **`"relevant"`**: Only relevant user context is included, based on relevance classification. The system:
  - Analyzes all previous interactions for the session
  - Classifies which interactions are relevant to the current query
  - Includes only relevant context summaries

  Ideal for:
  - Follow-up questions that build on previous conversations
  - Maintaining continuity within a research session
  - Personalized responses based on user history
**Example with Context Mode:**
```json
{
  "message": "Can you remind me what we discussed about quantum computing?",
  "session_id": "session-123",
  "user_id": "user-456",
  "context_mode": "relevant"
}
```
---
### 4. Initialize Orchestrator
**Endpoint:** `POST /api/initialize`
**Description:** Manually trigger orchestrator initialization (useful if initialization failed on startup).
**Request:**
```http
POST /api/initialize HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
Content-Type: application/json
```
**Request Body:**
```json
{}
```
**Response (Success):**
```json
{
  "success": true,
  "message": "Orchestrator initialized successfully"
}
```
**Response (Failure):**
```json
{
  "success": false,
  "message": "Initialization failed. Check logs for details."
}
```
**Status Codes:**
- `200 OK` - Initialization successful
- `500 Internal Server Error` - Initialization failed
---
### 5. Get Context Mode
**Endpoint:** `GET /api/context/mode`
**Description:** Retrieve the current context retrieval mode for a session.
**Request:**
```http
GET /api/context/mode?session_id=session-123 HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
```
**Query Parameters:**
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `session_id` | string | ✅ Yes | Session identifier |
**Response (Success):**
```json
{
  "success": true,
  "session_id": "session-123",
  "context_mode": "fresh",
  "description": {
    "fresh": "No user context included - starts fresh each time",
    "relevant": "Only relevant user context included based on relevance classification"
  }
}
```
**Response Fields:**
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Whether the request was successful |
| `session_id` | string | Session identifier |
| `context_mode` | string | Current mode: `"fresh"` or `"relevant"` |
| `description` | object | Description of each mode |
**Status Codes:**
- `200 OK` - Success
- `400 Bad Request` - Missing `session_id` parameter
- `500 Internal Server Error` - Server error
- `503 Service Unavailable` - Orchestrator not ready or context mode not available
**Error Response:**
```json
{
  "success": false,
  "error": "session_id query parameter is required"
}
```
---
### 6. Set Context Mode
**Endpoint:** `POST /api/context/mode`
**Description:** Set the context retrieval mode for a session (fresh or relevant).
**Request Headers:**
```http
Content-Type: application/json
```
**Request Body:**
```json
{
  "session_id": "session-123",
  "mode": "relevant",
  "user_id": "user-456"
}
```
**Request Fields:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `session_id` | string | ✅ Yes | Session identifier |
| `mode` | string | ✅ Yes | Context mode: `"fresh"` or `"relevant"` |
| `user_id` | string | ❌ No | User identifier (defaults to "anonymous") |
**Response (Success):**
```json
{
  "success": true,
  "session_id": "session-123",
  "context_mode": "relevant",
  "message": "Context mode set successfully"
}
```
**Response Fields:**
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Whether the request was successful |
| `session_id` | string | Session identifier |
| `context_mode` | string | The mode that was set |
| `message` | string | Success message |
**Status Codes:**
- `200 OK` - Context mode set successfully
- `400 Bad Request` - Invalid request (missing fields, invalid mode)
- `500 Internal Server Error` - Server error or failed to set mode
- `503 Service Unavailable` - Orchestrator not ready or context mode not available
**Error Response:**
```json
{
  "success": false,
  "error": "mode must be 'fresh' or 'relevant'"
}
```
**Usage Notes:**
- The context mode persists for the session until changed
- Setting `mode` to `"relevant"` enables relevance classification, which analyzes all previous interactions to include only relevant context
- Setting `mode` to `"fresh"` disables context retrieval, providing faster responses without user history
- The mode can also be set per-request via the `context_mode` parameter in `/api/chat`
---
## Code Examples
### Python
```python
import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

# Check health
def check_health():
    response = requests.get(f"{BASE_URL}/api/health")
    return response.json()

# Send chat message
def send_message(message, session_id=None, user_id=None, history=None, context_mode=None):
    payload = {
        "message": message,
        "session_id": session_id,
        "user_id": user_id or "anonymous",
        "history": history or []
    }
    if context_mode:
        payload["context_mode"] = context_mode
    response = requests.post(
        f"{BASE_URL}/api/chat",
        json=payload,
        headers={"Content-Type": "application/json"}
    )
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage
if __name__ == "__main__":
    # Check if API is ready
    health = check_health()
    print(f"API Status: {health}")

    if health.get("orchestrator_ready"):
        # Send a message
        result = send_message(
            message="What is machine learning?",
            session_id="my-session-123",
            user_id="user-456"
        )
        print(f"Response: {result['message']}")
        print(f"Reasoning: {result.get('reasoning', {})}")

        # Set context mode to relevant for follow-up
        requests.post(
            f"{BASE_URL}/api/context/mode",
            json={
                "session_id": "my-session-123",
                "mode": "relevant",
                "user_id": "user-456"
            }
        )

        # Continue conversation with relevant context
        history = result['history']
        result2 = send_message(
            message="Can you explain neural networks?",
            session_id="my-session-123",
            user_id="user-456",
            history=history,
            context_mode="relevant"
        )
        print(f"Follow-up Response: {result2['message']}")
```
### JavaScript (Fetch API)
```javascript
const BASE_URL = 'https://jatinautonomouslabs-research-ai-assistant-api.hf.space';

// Check health
async function checkHealth() {
  const response = await fetch(`${BASE_URL}/api/health`);
  return await response.json();
}

// Get context mode for a session
async function getContextMode(sessionId) {
  const response = await fetch(`${BASE_URL}/api/context/mode?session_id=${sessionId}`);
  if (!response.ok) {
    throw new Error(`API Error: ${response.status}`);
  }
  return await response.json();
}

// Set context mode for a session
async function setContextMode(sessionId, mode, userId = null) {
  const payload = {
    session_id: sessionId,
    mode: mode
  };
  if (userId) {
    payload.user_id = userId;
  }
  const response = await fetch(`${BASE_URL}/api/context/mode`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  });
  if (!response.ok) {
    const error = await response.json();
    throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
  }
  return await response.json();
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = [], contextMode = null) {
  const payload = {
    message: message,
    session_id: sessionId,
    user_id: userId || 'anonymous',
    history: history
  };
  if (contextMode) {
    payload.context_mode = contextMode;
  }
  const response = await fetch(`${BASE_URL}/api/chat`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  });
  if (!response.ok) {
    const error = await response.json();
    throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
  }
  return await response.json();
}

// Example usage
async function main() {
  try {
    // Check if API is ready
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      // Send a message
      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456'
      );
      console.log('Response:', result.message);
      console.log('Reasoning:', result.reasoning);

      // Continue conversation with relevant context
      await setContextMode('my-session-123', 'relevant', 'user-456');
      const result2 = await sendMessage(
        'Can you explain neural networks?',
        'my-session-123',
        'user-456',
        result.history,
        'relevant'
      );
      console.log('Follow-up Response:', result2.message);

      // Check current context mode
      const modeInfo = await getContextMode('my-session-123');
      console.log('Current context mode:', modeInfo.context_mode);
    }
  } catch (error) {
    console.error('Error:', error);
  }
}

main();
```
### cURL
```bash
# Check health
curl -X GET "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/health"

# Get context mode
curl -X GET "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/context/mode?session_id=my-session-123"

# Set context mode to relevant
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/context/mode" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "my-session-123",
    "mode": "relevant",
    "user_id": "user-456"
  }'

# Send chat message
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is machine learning?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "context_mode": "relevant",
    "history": []
  }'

# Continue conversation
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Can you explain neural networks?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": [
      ["What is machine learning?", "Machine learning is a subset of artificial intelligence..."]
    ]
  }'
```
### Node.js (Axios)
```javascript
const axios = require('axios');

const BASE_URL = 'https://jatinautonomouslabs-research-ai-assistant-api.hf.space';

// Check health
async function checkHealth() {
  const response = await axios.get(`${BASE_URL}/api/health`);
  return response.data;
}

// Get context mode
async function getContextMode(sessionId) {
  const response = await axios.get(`${BASE_URL}/api/context/mode`, {
    params: { session_id: sessionId }
  });
  return response.data;
}

// Set context mode
async function setContextMode(sessionId, mode, userId = null) {
  const payload = {
    session_id: sessionId,
    mode: mode
  };
  if (userId) payload.user_id = userId;
  const response = await axios.post(`${BASE_URL}/api/context/mode`, payload);
  return response.data;
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = [], contextMode = null) {
  try {
    const payload = {
      message: message,
      session_id: sessionId,
      user_id: userId || 'anonymous',
      history: history
    };
    if (contextMode) payload.context_mode = contextMode;
    const response = await axios.post(`${BASE_URL}/api/chat`, payload, {
      headers: {
        'Content-Type': 'application/json'
      }
    });
    return response.data;
  } catch (error) {
    if (error.response) {
      throw new Error(`API Error: ${error.response.status} - ${error.response.data.error || error.response.data.message}`);
    }
    throw error;
  }
}

// Example usage
(async () => {
  try {
    const health = await checkHealth();
    console.log('API Status:', health);

    if (health.orchestrator_ready) {
      // Set context mode to relevant
      await setContextMode('my-session-123', 'relevant', 'user-456');

      const result = await sendMessage(
        'What is machine learning?',
        'my-session-123',
        'user-456',
        [],
        'relevant'
      );
      console.log('Response:', result.message);

      // Check current mode
      const modeInfo = await getContextMode('my-session-123');
      console.log('Context mode:', modeInfo.context_mode);
    }
  } catch (error) {
    console.error('Error:', error.message);
  }
})();
```
---
## Error Handling
### Common Error Responses
#### 400 Bad Request
**Missing Message:**
```json
{
  "success": false,
  "error": "Message is required"
}
```
**Empty Message:**
```json
{
  "success": false,
  "error": "Message cannot be empty"
}
```
**Message Too Long:**
```json
{
  "success": false,
  "error": "Message too long. Maximum length is 10000 characters"
}
```
**Invalid Type:**
```json
{
  "success": false,
  "error": "Message must be a string"
}
```
#### 503 Service Unavailable
**Orchestrator Not Ready:**
```json
{
  "success": false,
  "error": "Orchestrator not ready",
  "message": "AI system is initializing. Please try again in a moment."
}
```
**Solution:** Wait a few seconds and retry, or check the `/api/health` endpoint.
#### 500 Internal Server Error
**Generic Error:**
```json
{
  "success": false,
  "error": "Error message here",
  "message": "Error processing your request. Please try again."
}
```
---
## Best Practices
### 1. Session Management
- **Use consistent session IDs** for maintaining conversation context
- **Generate unique session IDs** per user conversation thread
- **Include conversation history** in subsequent requests for better context
```python
# Good: Maintains context
session_id = "user-123-session-1"
history = []
# First message
result1 = send_message("What is AI?", session_id=session_id, history=history)
history = result1['history']
# Follow-up message (includes context)
result2 = send_message("Can you explain more?", session_id=session_id, history=history)
```
### 2. Error Handling
Always implement retry logic for 503 errors:
```python
import time

def send_message_with_retry(message, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            return send_message(message)
        except Exception as e:
            if "503" in str(e) and attempt < max_retries - 1:
                time.sleep(retry_delay)
                continue
            raise
```
### 3. Health Checks
Check API health before sending requests:
```python
def is_api_ready():
    try:
        health = check_health()
        return health.get("orchestrator_ready", False)
    except Exception:
        return False

if is_api_ready():
    # Send request
    result = send_message("Hello")
else:
    print("API is not ready yet")
```
### 4. Rate Limiting
- **No explicit rate limits** are currently enforced
- **Recommended:** Implement client-side rate limiting (e.g., 1 request per second)
- **Consider:** Implementing request queuing for high-volume applications
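The recommended one-request-per-second limit can be enforced client-side with a small helper. This is a sketch, not part of the API; the `send_message` call refers to the function defined in the Python code example above:

```python
import time

class RateLimiter:
    """Minimal client-side limiter: at most one call per `min_interval` seconds."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        # Sleep just long enough to keep calls min_interval seconds apart.
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

# Usage sketch:
# limiter = RateLimiter(1.0)
# limiter.wait()            # blocks if the previous call was under 1 s ago
# send_message("Hello")     # send_message as defined in the Python example
```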
### 5. Message Length
- **Maximum:** 10,000 characters per message
- **Recommended:** Keep messages concise for faster processing
- **For long content:** Split into multiple messages or summarize
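One way to handle content over the 10,000-character limit is to split it before sending, preferring paragraph boundaries. A sketch (the splitting strategy is a suggestion, not API behavior):

```python
MAX_LEN = 10_000  # per-message limit enforced by /api/chat

def split_message(text, limit=MAX_LEN):
    """Split long text into chunks under the API limit, breaking on
    paragraph boundaries where possible."""
    if len(text) <= limit:
        return [text]
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single paragraph may itself exceed the limit: hard-split it.
            while len(para) > limit:
                chunks.append(para[:limit])
                para = para[limit:]
            current = para
    if current:
        chunks.append(current)
    return chunks
```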
### 6. Context Management
- **Include history** in requests to maintain conversation context
- **Session IDs** help track conversations across multiple requests
- **User IDs** enable personalization and user-specific context
---
## Integration Examples
### React Component
```jsx
import React, { useState } from 'react';

const AIAssistant = () => {
  const [message, setMessage] = useState('');
  const [history, setHistory] = useState([]);
  const [loading, setLoading] = useState(false);
  const [sessionId] = useState(`session-${Date.now()}`);

  const sendMessage = async () => {
    if (!message.trim()) return;
    setLoading(true);
    try {
      const response = await fetch('https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: message,
          session_id: sessionId,
          user_id: 'user-123',
          history: history
        })
      });
      const data = await response.json();
      if (data.success) {
        setHistory(data.history);
        setMessage('');
      }
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <div className="chat-history">
        {history.map(([user, assistant], idx) => (
          <div key={idx}>
            <div><strong>You:</strong> {user}</div>
            <div><strong>Assistant:</strong> {assistant}</div>
          </div>
        ))}
      </div>
      <input
        value={message}
        onChange={(e) => setMessage(e.target.value)}
        onKeyPress={(e) => e.key === 'Enter' && sendMessage()}
        disabled={loading}
      />
      <button onClick={sendMessage} disabled={loading}>
        {loading ? 'Sending...' : 'Send'}
      </button>
    </div>
  );
};
```
### Python CLI Tool
```python
#!/usr/bin/env python3
import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

class ChatCLI:
    def __init__(self):
        self.session_id = f"cli-session-{hash(__file__)}"
        self.history = []

    def chat(self, message):
        response = requests.post(
            f"{BASE_URL}/api/chat",
            json={
                "message": message,
                "session_id": self.session_id,
                "user_id": "cli-user",
                "history": self.history
            }
        )
        if response.status_code == 200:
            data = response.json()
            self.history = data['history']
            return data['message']
        else:
            return f"Error: {response.status_code} - {response.text}"

    def run(self):
        print("AI Assistant CLI (Type 'exit' to quit)")
        print("=" * 50)
        while True:
            user_input = input("\nYou: ").strip()
            if user_input.lower() in ['exit', 'quit']:
                break
            print("Assistant: ", end="", flush=True)
            response = self.chat(user_input)
            print(response)

if __name__ == "__main__":
    cli = ChatCLI()
    cli.run()
```
---
## Response Times
- **Typical Response:** 2-10 seconds
- **First Request:** May take longer due to model loading (10-30 seconds)
- **Subsequent Requests:** Faster due to cached models (2-5 seconds)
**Factors Affecting Response Time:**
- Message length
- Model loading (first request)
- GPU availability
- Concurrent requests
---
## Troubleshooting
### Common Issues
#### 404 Not Found
**Problem:** Getting 404 when accessing the API
**Solutions:**
1. **Verify the Space is running:**
- Check the Hugging Face Space page to ensure it's built and running
- Wait for the initial build to complete (5-10 minutes)
2. **Check URL format:**
- ✅ Correct: `https://jatinautonomouslabs-research-ai-assistant-api.hf.space`
- ❌ Wrong: `https://jatinautonomouslabs-research_ai_assistant_api.hf.space` (underscores)
- ✅ Alternative: `https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API`
3. **Verify endpoint paths:**
- Health: `GET /api/health`
- Chat: `POST /api/chat`
- Root: `GET /`
4. **Test with root endpoint first:**
```bash
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/
```
#### 503 Service Unavailable
**Problem:** Orchestrator not ready
**Solutions:**
1. Wait 30-60 seconds for initialization
2. Check `/api/health` endpoint
3. Use `/api/initialize` to manually trigger initialization
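The steps above can be combined into a small helper that polls `/api/health` and triggers `/api/initialize` once if the orchestrator stays unready. A sketch (`BASE_URL` as in the code examples; the timeout and poll interval are suggestions):

```python
import time
import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

def wait_until_ready(timeout=60, poll_interval=5):
    """Poll /api/health until orchestrator_ready, triggering /api/initialize once.

    Returns True when the orchestrator reports ready, False on timeout.
    """
    deadline = time.time() + timeout
    initialized = False
    while time.time() < deadline:
        try:
            health = requests.get(f"{BASE_URL}/api/health", timeout=10).json()
            if health.get("orchestrator_ready"):
                return True
            if not initialized:
                # Manually kick off initialization, but only once.
                requests.post(f"{BASE_URL}/api/initialize", json={}, timeout=30)
                initialized = True
        except requests.RequestException:
            pass  # network hiccup: keep polling until the deadline
        time.sleep(poll_interval)
    return False
```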
#### CORS Errors
**Problem:** CORS errors in browser
**Solutions:**
- The API has CORS enabled for all origins
- If issues persist, check browser console for specific errors
- Ensure you're using the correct base URL
### Testing API Connectivity
**Quick Health Check:**
```bash
# Test root endpoint
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/
# Test health endpoint
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/health
```
**Python Test Script:**
```python
import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

# Test root
try:
    response = requests.get(f"{BASE_URL}/", timeout=10)
    print(f"Root endpoint: {response.status_code} - {response.json()}")
except Exception as e:
    print(f"Root endpoint failed: {e}")

# Test health
try:
    response = requests.get(f"{BASE_URL}/api/health", timeout=10)
    print(f"Health endpoint: {response.status_code} - {response.json()}")
except Exception as e:
    print(f"Health endpoint failed: {e}")
```
## Support
For issues, questions, or contributions:
- **Repository:** [GitHub Repository URL]
- **Hugging Face Space:** [https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API](https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API)
---
## Changelog
### Version 1.0 (Current)
- Initial API release
- Chat endpoint with context management
- Context mode endpoints (fresh vs relevant)
- Health check endpoint
- Local GPU model inference
- CORS enabled for web integration
---
## License
This API is provided as-is. Please refer to the main project README for license information.