Spaces: JatinAutonomousLabs / Research_AI_Assistant
JatsTheAIGen committed on Nov 7

Commit 291e38e (1 parent: a58b1f9)

feat: Add ZeroGPU per-user mode (Option B: Multi-tenant)

- Add ZeroGPUUserManager for per-user account management
- Implement automatic user registration and approval
- Add user mapping database table for local-to-API user mapping
- Update LLM router to support both service account and per-user modes
- Add per-user mode configuration options
- Update ZeroGPU client to accept pre-existing tokens
- Add comprehensive documentation for per-user mode

Per-user mode features:
- Automatic user registration on first use
- Per-user usage tracking and statistics
- Per-user rate limits
- Better audit trail per user
- Multi-tenant support

Configuration:
- Set ZERO_GPU_PER_USER_MODE=true to enable
- Requires ZERO_GPU_ADMIN_EMAIL and ZERO_GPU_ADMIN_PASSWORD
- Falls back to service account mode if per-user fails

Database:
- Creates zero_gpu_user_mapping table
- Stores user credentials, tokens, and mapping
- Automatic token refresh and management

Files changed (7)

ZEROGPU_PER_USER_MODE.md +238 -0
app.py +26 -4
config.py +4 -0
flask_api_standalone.py +26 -4
src/llm_router.py +84 -31
zero_gpu_client.py +12 -4
zero_gpu_user_manager.py +411 -0

ZEROGPU_PER_USER_MODE.md ADDED

@@ -0,0 +1,238 @@

+# ZeroGPU Per-User Mode (Option B: Multi-tenant)
+## Overview
+Per-user mode creates a separate ZeroGPU API account for each application user, providing:
+- ✅ **Per-user usage tracking** - Track usage statistics per user
+- ✅ **Per-user rate limits** - Individual rate limits per user
+- ✅ **Better audit trail** - Each user's requests logged separately
+- ✅ **Multi-tenant support** - Ideal for multi-tenant applications
+## Configuration
+### Environment Variables
+```bash
+# Enable ZeroGPU API
+USE_ZERO_GPU=true
+# Enable per-user mode (Option B)
+ZERO_GPU_PER_USER_MODE=true
+# ZeroGPU API base URL
+ZERO_GPU_API_URL=http://your-pod-ip:8000
+# Admin credentials (for creating/approving users)
+ZERO_GPU_ADMIN_EMAIL=admin@example.com
+ZERO_GPU_ADMIN_PASSWORD=your-admin-password
+# Database path (for user mapping storage)
+DB_PATH=sessions.db
+```
+### Service Account Mode (Option A)
+If `ZERO_GPU_PER_USER_MODE=false` or not set, the system uses service account mode:
+```bash
+USE_ZERO_GPU=true
+ZERO_GPU_PER_USER_MODE=false  # or omit
+ZERO_GPU_API_URL=http://your-pod-ip:8000
+ZERO_GPU_EMAIL=service@example.com
+ZERO_GPU_PASSWORD=your-password
+```
+## How It Works
+### User Registration Flow
+1. **First Request**: When a user makes their first request:
+   - System checks if ZeroGPU account exists for this user
+   - If not, automatically registers new user with ZeroGPU API
+   - Generates unique email: `user_{hash}@zerogpu.local`
+   - Generates secure random password
+   - Auto-approves user via admin API
+   - Stores mapping in local database
+2. **Subsequent Requests**:
+   - System retrieves user's ZeroGPU client from cache or database
+   - Uses user's tokens for API calls
+   - Automatically refreshes tokens when needed
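Review note: the registration flow described above can be sketched as a small get-or-create routine. This is a minimal illustration, not the commit's implementation. `FakeZeroGPUAPI` and `UserRegistry` are hypothetical names, the remote API is replaced by an in-memory stand-in, and the dictionary `store` stands in for the `zero_gpu_user_mapping` table. Only the email format and the use of `secrets.token_urlsafe(32)` are taken from the diff.

```python
import hashlib
import secrets
from typing import Dict


class FakeZeroGPUAPI:
    """Hypothetical in-memory stand-in for the remote register/approve calls."""

    def __init__(self) -> None:
        self.users: Dict[str, Dict] = {}

    def register(self, email: str, password: str) -> int:
        user_id = len(self.users) + 1
        self.users[email] = {"id": user_id, "approved": False}
        return user_id

    def approve(self, email: str) -> None:
        self.users[email]["approved"] = True


class UserRegistry:
    """Cache first, then the persistent store, then register and approve."""

    def __init__(self, api: FakeZeroGPUAPI) -> None:
        self.api = api
        self.cache: Dict[str, Dict] = {}  # in-memory cache of mappings
        self.store: Dict[str, Dict] = {}  # stand-in for the mapping table

    def get_or_create(self, local_user_id: str) -> Dict:
        if local_user_id in self.cache:
            return self.cache[local_user_id]
        mapping = self.store.get(local_user_id)
        if mapping is None:
            # Deterministic email derived from the local user ID
            digest = hashlib.sha256(local_user_id.encode()).hexdigest()[:16]
            email = f"user_{digest}@zerogpu.local"
            password = secrets.token_urlsafe(32)  # random, never reused
            api_user_id = self.api.register(email, password)
            self.api.approve(email)  # auto-approval via the admin account
            mapping = {"api_user_id": api_user_id, "api_email": email}
            self.store[local_user_id] = mapping
        self.cache[local_user_id] = mapping
        return mapping


api = FakeZeroGPUAPI()
registry = UserRegistry(api)
first = registry.get_or_create("User123")
second = registry.get_or_create("User123")
```

The second call is served from the cache, so only one remote account is created for a given local user ID.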
+### Database Schema
+The system creates a `zero_gpu_user_mapping` table:
+```sql
+CREATE TABLE zero_gpu_user_mapping (
+    local_user_id TEXT PRIMARY KEY,
+    api_user_id INTEGER,
+    api_email TEXT UNIQUE,
+    api_password_hash TEXT,
+    access_token TEXT,
+    refresh_token TEXT,
+    token_expires_at TIMESTAMP,
+    is_approved INTEGER DEFAULT 0,
+    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+    last_used TIMESTAMP,
+    usage_stats_cache TEXT
+);
+```
+### User Mapping
+- **Local User ID**: Your application's user identifier (e.g., "Admin_J", "User123")
+- **API User ID**: ZeroGPU API's internal user ID
+- **API Email**: Generated email for ZeroGPU account
+- **Tokens**: Stored for authentication (auto-refreshed)
+## Usage
+### Automatic User Creation
+Users are automatically created on first use:
+```python
+# In your orchestrator or agent code
+result = await llm_router.route_inference(
+    task_type="general_reasoning",
+    prompt="What is machine learning?",
+    user_id="User123"  # Pass user_id for per-user mode
+)
+```
+### Getting User Statistics
+```python
+from zero_gpu_user_manager import ZeroGPUUserManager
+# Initialize manager (usually done in LLMRouter)
+user_manager = ZeroGPUUserManager(
+    base_url="http://your-pod-ip:8000",
+    admin_email="admin@example.com",
+    admin_password="password",
+    db_path="sessions.db"
+)
+# Get usage stats for a user
+stats = user_manager.get_user_stats("User123")
+# Returns: {
+#     "user_id": 1,
+#     "total_requests": 150,
+#     "total_tokens": 45000,
+#     "requests_by_task": {...},
+#     ...
+# }
+```
+## Integration Points
+### LLM Router
+The LLM router automatically handles per-user clients:
+```python
+# In src/llm_router.py
+async def route_inference(self, task_type: str, prompt: str,
+                         context: Optional[List[Dict]] = None,
+                         user_id: Optional[str] = None, **kwargs):
+    # ...
+    # Automatically gets or creates user client if per-user mode enabled
+    if self.zero_gpu_mode == "per_user" and user_id:
+        client = await self.zero_gpu_user_manager.get_or_create_user_client(user_id)
+    # ...
+```
+### Orchestrator Integration
+Update the orchestrator to pass `user_id`:
+```python
+# In orchestrator_engine.py
+result = await self.llm_router.route_inference(
+    task_type="general_reasoning",
+    prompt=prompt,
+    user_id=self.current_user_id  # Pass user_id
+)
+```
+## Advantages
+1. **Per-User Tracking**: Each user's usage tracked separately
+2. **Rate Limiting**: Per-user rate limits (60/min, 1000/hour, 10000/day)
+3. **Audit Trail**: Complete audit trail per user
+4. **Multi-Tenant**: Ideal for SaaS applications
+5. **Usage Analytics**: Per-user usage statistics available
+## Considerations
+1. **User Management Overhead**: Each user requires a ZeroGPU account
+2. **Token Storage**: Need to securely store user tokens
+3. **Password Management**: Generated passwords stored as hashes
+4. **Approval Workflow**: Users auto-approved via admin API
+## Security
+- **Password Storage**: Passwords stored as SHA-256 hashes
+- **Token Management**: Tokens are auto-refreshed and persisted in the `zero_gpu_user_mapping` table, so access to the database file should be restricted
+- **Email Generation**: Deterministic but unique emails per user
+- **Admin Access**: Admin credentials required for user approval
+## Migration from Service Account
+To migrate from service account (Option A) to per-user (Option B):
+1. Set `ZERO_GPU_PER_USER_MODE=true`
+2. Set `ZERO_GPU_ADMIN_EMAIL` and `ZERO_GPU_ADMIN_PASSWORD`
+3. Restart application
+4. Users will be automatically created on first use
+## Troubleshooting
+### User Not Created
+- Check admin credentials are correct
+- Verify ZeroGPU API is accessible
+- Check database permissions
+- Review logs for registration errors
+### Token Refresh Issues
+- Tokens auto-refresh on expiry
+- If refresh fails, user will be re-authenticated
+- Check ZeroGPU API availability
+### Performance
+- User clients are cached in memory
+- Database lookups by `local_user_id` use the table's primary-key index
+- First request per user may be slower (registration)
+## Example Configuration
+```python
+# config.py
+zero_gpu_config = {
+    "enabled": True,
+    "base_url": "http://your-pod-ip:8000",
+    "per_user_mode": True,  # Enable per-user mode
+    "admin_email": "admin@example.com",
+    "admin_password": "secure-password",
+    "db_path": "sessions.db"
+}
+```
+## API Endpoints Used
+- `POST /register` - Register new user
+- `POST /login` - Login and get tokens
+- `POST /admin/approve-user` - Approve user (admin)
+- `GET /usage/stats` - Get usage statistics
+- `POST /chat` - Make inference request
+---
+**Status**: ✅ Implemented
+**Mode**: Per-User Accounts (Multi-tenant)
+**Fallback**: Service Account mode if per-user fails

app.py CHANGED

@@ -2028,14 +2028,36 @@ def initialize_orchestrator():
         zero_gpu_config = None
         try:
             from config import settings
-            if settings.zero_gpu_enabled and settings.zero_gpu_email and settings.zero_gpu_password:
                 zero_gpu_config = {
                     "enabled": True,
                     "base_url": settings.zero_gpu_base_url,
-                    "email": settings.zero_gpu_email,
-                    "password": settings.zero_gpu_password
                 }
-                logger.info("ZeroGPU API enabled in configuration")
         except Exception as e:
             logger.debug(f"Could not load ZeroGPU config: {e}")

         zero_gpu_config = None
         try:
             from config import settings
+            if settings.zero_gpu_enabled:
                 zero_gpu_config = {
                     "enabled": True,
                     "base_url": settings.zero_gpu_base_url,
+                    "per_user_mode": settings.zero_gpu_per_user_mode
                 }
+                if settings.zero_gpu_per_user_mode:
+                    # Option B: Per-user accounts (multi-tenant)
+                    if settings.zero_gpu_admin_email and settings.zero_gpu_admin_password:
+                        zero_gpu_config.update({
+                            "admin_email": settings.zero_gpu_admin_email,
+                            "admin_password": settings.zero_gpu_admin_password,
+                            "db_path": settings.db_path
+                        })
+                        logger.info("ZeroGPU API enabled in per-user mode (multi-tenant)")
+                    else:
+                        logger.warning("ZeroGPU per-user mode enabled but admin credentials not provided")
+                        zero_gpu_config = None
+                else:
+                    # Option A: Service account (single-tenant)
+                    if settings.zero_gpu_email and settings.zero_gpu_password:
+                        zero_gpu_config.update({
+                            "email": settings.zero_gpu_email,
+                            "password": settings.zero_gpu_password
+                        })
+                        logger.info("ZeroGPU API enabled in service account mode")
+                    else:
+                        logger.warning("ZeroGPU enabled but credentials not provided")
+                        zero_gpu_config = None
         except Exception as e:
             logger.debug(f"Could not load ZeroGPU config: {e}")

config.py CHANGED

@@ -41,6 +41,10 @@ class Settings(BaseSettings):
     zero_gpu_base_url: str = os.getenv("ZERO_GPU_API_URL", "http://localhost:8000")
     zero_gpu_email: str = os.getenv("ZERO_GPU_EMAIL", "")
     zero_gpu_password: str = os.getenv("ZERO_GPU_PASSWORD", "")
     class Config:
         env_file = ".env"

     zero_gpu_base_url: str = os.getenv("ZERO_GPU_API_URL", "http://localhost:8000")
     zero_gpu_email: str = os.getenv("ZERO_GPU_EMAIL", "")
     zero_gpu_password: str = os.getenv("ZERO_GPU_PASSWORD", "")
+    # Per-user mode (Option B: Multi-tenant)
+    zero_gpu_per_user_mode: bool = os.getenv("ZERO_GPU_PER_USER_MODE", "false").lower() == "true"
+    zero_gpu_admin_email: str = os.getenv("ZERO_GPU_ADMIN_EMAIL", "")
+    zero_gpu_admin_password: str = os.getenv("ZERO_GPU_ADMIN_PASSWORD", "")
     class Config:
         env_file = ".env"
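Review note: the new `zero_gpu_per_user_mode` setting is true only when the variable is exactly `true` (case-insensitive), so values such as `1` or `yes` silently leave per-user mode off. If a more lenient parse is wanted, something along these lines could work. `env_flag` and the accepted spellings are assumptions for illustration, not part of the commit.

```python
import os
from typing import Mapping, Optional

# Assumed set of truthy spellings; adjust to the project's conventions.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean flag from the environment, tolerating common spellings."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


# The expression used in the commit, for comparison.
def strict_flag(raw: str) -> bool:
    return raw.lower() == "true"
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.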

flask_api_standalone.py CHANGED

@@ -59,14 +59,36 @@ def initialize_orchestrator():
         zero_gpu_config = None
         try:
             from config import settings
-            if settings.zero_gpu_enabled and settings.zero_gpu_email and settings.zero_gpu_password:
                 zero_gpu_config = {
                     "enabled": True,
                     "base_url": settings.zero_gpu_base_url,
-                    "email": settings.zero_gpu_email,
-                    "password": settings.zero_gpu_password
                 }
-                logger.info("ZeroGPU API enabled in configuration")
         except Exception as e:
             logger.debug(f"Could not load ZeroGPU config: {e}")

         zero_gpu_config = None
         try:
             from config import settings
+            if settings.zero_gpu_enabled:
                 zero_gpu_config = {
                     "enabled": True,
                     "base_url": settings.zero_gpu_base_url,
+                    "per_user_mode": settings.zero_gpu_per_user_mode
                 }
+                if settings.zero_gpu_per_user_mode:
+                    # Option B: Per-user accounts (multi-tenant)
+                    if settings.zero_gpu_admin_email and settings.zero_gpu_admin_password:
+                        zero_gpu_config.update({
+                            "admin_email": settings.zero_gpu_admin_email,
+                            "admin_password": settings.zero_gpu_admin_password,
+                            "db_path": settings.db_path
+                        })
+                        logger.info("ZeroGPU API enabled in per-user mode (multi-tenant)")
+                    else:
+                        logger.warning("ZeroGPU per-user mode enabled but admin credentials not provided")
+                        zero_gpu_config = None
+                else:
+                    # Option A: Service account (single-tenant)
+                    if settings.zero_gpu_email and settings.zero_gpu_password:
+                        zero_gpu_config.update({
+                            "email": settings.zero_gpu_email,
+                            "password": settings.zero_gpu_password
+                        })
+                        logger.info("ZeroGPU API enabled in service account mode")
+                    else:
+                        logger.warning("ZeroGPU enabled but credentials not provided")
+                        zero_gpu_config = None
         except Exception as e:
             logger.debug(f"Could not load ZeroGPU config: {e}")
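Review note: the config-building block is identical in `app.py` and `flask_api_standalone.py`, so it could be factored into one shared helper. The sketch below is a suggestion under assumptions. `build_zero_gpu_config` is a hypothetical name, and `db_path` is read with a default because the `config.py` hunk in this commit does not show that field.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def build_zero_gpu_config(settings: Any) -> Optional[Dict[str, Any]]:
    """Return the ZeroGPU config dict, or None when ZeroGPU cannot be used."""
    if not getattr(settings, "zero_gpu_enabled", False):
        return None
    per_user = bool(getattr(settings, "zero_gpu_per_user_mode", False))
    config: Dict[str, Any] = {
        "enabled": True,
        "base_url": settings.zero_gpu_base_url,
        "per_user_mode": per_user,
    }
    if per_user:
        # Option B: admin credentials are required to register/approve users
        if not (settings.zero_gpu_admin_email and settings.zero_gpu_admin_password):
            return None
        config.update(
            admin_email=settings.zero_gpu_admin_email,
            admin_password=settings.zero_gpu_admin_password,
            # Assumption: fall back to the documented default if the field is absent
            db_path=getattr(settings, "db_path", "sessions.db"),
        )
    else:
        # Option A: a single service account
        if not (settings.zero_gpu_email and settings.zero_gpu_password):
            return None
        config.update(email=settings.zero_gpu_email,
                      password=settings.zero_gpu_password)
    return config


example_settings = SimpleNamespace(
    zero_gpu_enabled=True,
    zero_gpu_base_url="http://localhost:8000",
    zero_gpu_per_user_mode=True,
    zero_gpu_admin_email="admin@example.com",
    zero_gpu_admin_password="secret",
    zero_gpu_email="",
    zero_gpu_password="",
)
example_config = build_zero_gpu_config(example_settings)
```

Both entry points could then call the helper inside their existing `try` block and keep their own logging.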

src/llm_router.py CHANGED

@@ -13,8 +13,10 @@ class LLMRouter:
         self.health_status = {}
         self.use_local_models = use_local_models
         self.local_loader = None
-        self.zero_gpu_client = None
         self.use_zero_gpu = False
         logger.info("LLMRouter initialized")
         if hf_token:
@@ -24,32 +26,63 @@ class LLMRouter:
         # Initialize ZeroGPU client if configured
         if zero_gpu_config and zero_gpu_config.get("enabled", False):
-            try:
-                from zero_gpu_client import ZeroGPUChatClient
-                base_url = zero_gpu_config.get("base_url", os.getenv("ZERO_GPU_API_URL", "http://localhost:8000"))
-                email = zero_gpu_config.get("email", os.getenv("ZERO_GPU_EMAIL", ""))
-                password = zero_gpu_config.get("password", os.getenv("ZERO_GPU_PASSWORD", ""))
-                if email and password:
-                    self.zero_gpu_client = ZeroGPUChatClient(base_url, email, password)
-                    self.use_zero_gpu = True
-                    logger.info("✓ ZeroGPU API client initialized")
-                    # Wait for API to be ready (non-blocking, will fallback if not ready)
-                    try:
-                        if not self.zero_gpu_client.wait_for_ready(timeout=10):
-                            logger.warning("ZeroGPU API not ready, will use HF fallback")
                             self.use_zero_gpu = False
-                    except Exception as e:
-                        logger.warning(f"Could not verify ZeroGPU API readiness: {e}. Will use HF fallback.")
-                        self.use_zero_gpu = False
-                else:
-                    logger.warning("ZeroGPU enabled but credentials not provided")
-            except ImportError:
-                logger.warning("zero_gpu_client not available, ZeroGPU disabled")
-            except Exception as e:
-                logger.warning(f"Could not initialize ZeroGPU client: {e}. Falling back to HF API.")
-                self.use_zero_gpu = False
         # Initialize local model loader if enabled
         if self.use_local_models:
@@ -67,10 +100,17 @@ class LLMRouter:
                 self.use_local_models = False
                 self.local_loader = None
-    async def route_inference(self, task_type: str, prompt: str, context: Optional[List[Dict]] = None, **kwargs):
         """
         Smart routing based on task specialization
         Tries local models first, then ZeroGPU API, falls back to HF Inference API if needed
         """
         logger.info(f"Routing inference for task: {task_type}")
         model_config = self._select_model(task_type)
@@ -95,9 +135,9 @@ class LLMRouter:
                 logger.debug("Exception details:", exc_info=True)
         # Try ZeroGPU API if enabled
-        if self.use_zero_gpu and self.zero_gpu_client:
             try:
-                result = await self._call_zero_gpu_endpoint(task_type, prompt, context, **kwargs)
                 if result is not None:
                     logger.info(f"Inference complete for {task_type} (ZeroGPU API)")
                     return result
@@ -194,7 +234,7 @@ class LLMRouter:
             logger.error(f"Error calling local embedding model: {e}", exc_info=True)
             return None
-    async def _call_zero_gpu_endpoint(self, task_type: str, prompt: str, context: Optional[List[Dict]] = None, **kwargs) -> Optional[str]:
         """
         Call ZeroGPU API endpoint
@@ -202,12 +242,25 @@ class LLMRouter:
             task_type: Task type (e.g., "intent_classification", "general_reasoning")
             prompt: User prompt/message
             context: Optional conversation context
             **kwargs: Additional generation parameters
         Returns:
             Generated text response or None if failed
         """
-        if not self.zero_gpu_client:
             return None
         try:
@@ -254,7 +307,7 @@ class LLMRouter:
                 generation_params["system_prompt"] = kwargs['system_prompt']
             # Call ZeroGPU API
-            response = self.zero_gpu_client.chat(
                 message=prompt,
                 task=zero_gpu_task,
                 context=context_messages,

         self.health_status = {}
         self.use_local_models = use_local_models
         self.local_loader = None
+        self.zero_gpu_client = None  # Service account client (Option A)
+        self.zero_gpu_user_manager = None  # Per-user manager (Option B)
         self.use_zero_gpu = False
+        self.zero_gpu_mode = "service_account"  # "service_account" or "per_user"
         logger.info("LLMRouter initialized")
         if hf_token:
         # Initialize ZeroGPU client if configured
         if zero_gpu_config and zero_gpu_config.get("enabled", False):
+            # Check if per-user mode is enabled
+            per_user_mode = zero_gpu_config.get("per_user_mode", False)
+            if per_user_mode:
+                # Option B: Per-User Accounts (Multi-tenant)
+                try:
+                    from zero_gpu_user_manager import ZeroGPUUserManager
+                    base_url = zero_gpu_config.get("base_url", os.getenv("ZERO_GPU_API_URL", "http://localhost:8000"))
+                    admin_email = zero_gpu_config.get("admin_email", os.getenv("ZERO_GPU_ADMIN_EMAIL", ""))
+                    admin_password = zero_gpu_config.get("admin_password", os.getenv("ZERO_GPU_ADMIN_PASSWORD", ""))
+                    db_path = zero_gpu_config.get("db_path", os.getenv("DB_PATH", "sessions.db"))
+                    if admin_email and admin_password:
+                        self.zero_gpu_user_manager = ZeroGPUUserManager(
+                            base_url, admin_email, admin_password, db_path
+                        )
+                        self.use_zero_gpu = True
+                        self.zero_gpu_mode = "per_user"
+                        logger.info("✓ ZeroGPU per-user mode enabled (multi-tenant)")
+                    else:
+                        logger.warning("ZeroGPU per-user mode enabled but admin credentials not provided")
+                except ImportError:
+                    logger.warning("zero_gpu_user_manager not available, falling back to service account mode")
+                    per_user_mode = False
+                except Exception as e:
+                    logger.warning(f"Could not initialize ZeroGPU user manager: {e}. Falling back to service account mode.")
+                    per_user_mode = False
+            if not per_user_mode:
+                # Option A: Service Account (Single-tenant)
+                try:
+                    from zero_gpu_client import ZeroGPUChatClient
+                    base_url = zero_gpu_config.get("base_url", os.getenv("ZERO_GPU_API_URL", "http://localhost:8000"))
+                    email = zero_gpu_config.get("email", os.getenv("ZERO_GPU_EMAIL", ""))
+                    password = zero_gpu_config.get("password", os.getenv("ZERO_GPU_PASSWORD", ""))
+                    if email and password:
+                        self.zero_gpu_client = ZeroGPUChatClient(base_url, email, password)
+                        self.use_zero_gpu = True
+                        self.zero_gpu_mode = "service_account"
+                        logger.info("✓ ZeroGPU API client initialized (service account mode)")
+                        # Wait up to 10 s for the API to be ready (non-fatal: falls back to HF if not ready)
+                        try:
+                            if not self.zero_gpu_client.wait_for_ready(timeout=10):
+                                logger.warning("ZeroGPU API not ready, will use HF fallback")
+                                self.use_zero_gpu = False
+                        except Exception as e:
+                            logger.warning(f"Could not verify ZeroGPU API readiness: {e}. Will use HF fallback.")
                             self.use_zero_gpu = False
+                    else:
+                        logger.warning("ZeroGPU enabled but credentials not provided")
+                except ImportError:
+                    logger.warning("zero_gpu_client not available, ZeroGPU disabled")
+                except Exception as e:
+                    logger.warning(f"Could not initialize ZeroGPU client: {e}. Falling back to HF API.")
+                    self.use_zero_gpu = False
         # Initialize local model loader if enabled
         if self.use_local_models:
                 self.use_local_models = False
                 self.local_loader = None
+    async def route_inference(self, task_type: str, prompt: str, context: Optional[List[Dict]] = None, user_id: Optional[str] = None, **kwargs):
         """
         Smart routing based on task specialization
         Tries local models first, then ZeroGPU API, falls back to HF Inference API if needed
+        Args:
+            task_type: Task type (e.g., "intent_classification", "general_reasoning")
+            prompt: User prompt/message
+            context: Optional conversation context
+            user_id: Optional user ID for per-user ZeroGPU accounts (Option B)
+            **kwargs: Additional generation parameters
         """
         logger.info(f"Routing inference for task: {task_type}")
         model_config = self._select_model(task_type)
                 logger.debug("Exception details:", exc_info=True)
         # Try ZeroGPU API if enabled
+        if self.use_zero_gpu:
             try:
+                result = await self._call_zero_gpu_endpoint(task_type, prompt, context, user_id, **kwargs)
                 if result is not None:
                     logger.info(f"Inference complete for {task_type} (ZeroGPU API)")
                     return result
             logger.error(f"Error calling local embedding model: {e}", exc_info=True)
             return None
+    async def _call_zero_gpu_endpoint(self, task_type: str, prompt: str, context: Optional[List[Dict]] = None, user_id: Optional[str] = None, **kwargs) -> Optional[str]:
         """
         Call ZeroGPU API endpoint
             task_type: Task type (e.g., "intent_classification", "general_reasoning")
             prompt: User prompt/message
             context: Optional conversation context
+            user_id: Optional user ID for per-user accounts (Option B)
             **kwargs: Additional generation parameters
         Returns:
             Generated text response or None if failed
         """
+        # Get appropriate client based on mode
+        client = None
+        if self.zero_gpu_mode == "per_user" and self.zero_gpu_user_manager and user_id:
+            # Option B: Per-user accounts
+            client = await self.zero_gpu_user_manager.get_or_create_user_client(user_id)
+            if not client:
+                logger.warning(f"Could not get ZeroGPU client for user {user_id}, falling back to service account")
+                client = self.zero_gpu_client
+        else:
+            # Option A: Service account
+            client = self.zero_gpu_client
+        if not client:
             return None
         try:
                 generation_params["system_prompt"] = kwargs['system_prompt']
             # Call ZeroGPU API
+            response = client.chat(
                 message=prompt,
                 task=zero_gpu_task,
                 context=context_messages,
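Review note: in the constructor shown above, `self.zero_gpu_client` is only created on the service-account path. When per-user mode initializes successfully, the per-request fallback to `self.zero_gpu_client` therefore appears to resolve to `None`, and the call falls through to the next provider. The sketch below makes that selection explicit and reports which path was taken. `ClientSelector` is a hypothetical name, and it is synchronous for brevity, whereas the commit awaits `get_or_create_user_client`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ClientSelector:
    """Pick a ZeroGPU client and say which mode actually served the request."""

    mode: str  # "per_user" or "service_account"
    service_client: Optional[object] = None
    user_client_factory: Optional[Callable[[str], Optional[object]]] = None

    def pick(self, user_id: Optional[str]) -> Tuple[Optional[object], str]:
        if self.mode == "per_user" and self.user_client_factory and user_id:
            client = self.user_client_factory(user_id)
            if client is not None:
                return client, "per_user"
        if self.service_client is not None:
            return self.service_client, "service_account"
        # Nothing usable: the caller should move on to its next provider
        return None, "none"


def failing_factory(user_id: str) -> Optional[object]:
    """Stand-in for a per-user lookup that fails."""
    return None


no_fallback = ClientSelector(mode="per_user", user_client_factory=failing_factory)
with_fallback = ClientSelector(mode="per_user",
                               service_client="service-client",
                               user_client_factory=failing_factory)
```

Creating the service-account client in per-user mode as well, when its credentials are present, would make the documented fallback effective.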

zero_gpu_client.py CHANGED

@@ -15,7 +15,7 @@ logger = logging.getLogger(__name__)
 class ZeroGPUChatClient:
     """Client for ZeroGPU Chat API with automatic token refresh"""
-    def __init__(self, base_url: str, email: str, password: str):
         """
         Initialize ZeroGPU API client
@@ -23,16 +23,24 @@ class ZeroGPUChatClient:
             base_url: Base URL of ZeroGPU API (e.g., "http://your-pod-ip:8000")
             email: User email for authentication
             password: User password for authentication
         """
         self.base_url = base_url.rstrip('/')
         self.email = email
         self.password = password
-        self.access_token = None
-        self.refresh_token = None
         self._last_token_refresh = None
         logger.info(f"Initializing ZeroGPU client for {self.base_url}")
-        self.login(email, password)
     def login(self, email: str, password: str):
         """Login and get authentication tokens"""

 class ZeroGPUChatClient:
     """Client for ZeroGPU Chat API with automatic token refresh"""
+    def __init__(self, base_url: str, email: str, password: str, access_token: str = None, refresh_token: str = None):
         """
         Initialize ZeroGPU API client
             base_url: Base URL of ZeroGPU API (e.g., "http://your-pod-ip:8000")
             email: User email for authentication
             password: User password for authentication
+            access_token: Optional pre-existing access token
+            refresh_token: Optional pre-existing refresh token
         """
         self.base_url = base_url.rstrip('/')
         self.email = email
         self.password = password
+        self.access_token = access_token
+        self.refresh_token = refresh_token
         self._last_token_refresh = None
         logger.info(f"Initializing ZeroGPU client for {self.base_url}")
+        # If tokens provided, use them; otherwise login
+        if access_token and refresh_token:
+            self._last_token_refresh = time.time()
+            logger.info("Using provided tokens")
+        else:
+            self.login(email, password)
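Review note: when tokens are passed in, the constructor records them as refreshed "now", even though tokens read back from the mapping table may be older than that. One option is to check the stored `token_expires_at` before reusing them. The helper below is a sketch with hypothetical names. It assumes the timestamp has the `YYYY-MM-DD HH:MM:SS` UTC form that SQLite's `datetime('now', ...)` produces.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_sqlite_utc(value: str) -> datetime:
    """Parse a SQLite datetime('now') style string as an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def token_is_fresh(expires_at: Optional[str],
                   now: Optional[datetime] = None,
                   skew_seconds: int = 30) -> bool:
    """True if the stored token is still valid, with a small safety margin."""
    if not expires_at:
        return False
    current = now or datetime.now(timezone.utc)
    deadline = parse_sqlite_utc(expires_at) - timedelta(seconds=skew_seconds)
    return deadline > current


fixed_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
```

A caller could pass stored tokens to `ZeroGPUChatClient` only when `token_is_fresh(...)` holds, and otherwise let the constructor log in.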
     def login(self, email: str, password: str):
         """Login and get authentication tokens"""

zero_gpu_user_manager.py ADDED

@@ -0,0 +1,411 @@

+# zero_gpu_user_manager.py
+"""
+ZeroGPU User Management - Per-User Accounts (Multi-tenant)
+Manages mapping between local users and ZeroGPU API user accounts
+"""
+import logging
+import sqlite3
+import hashlib
+import secrets
+from typing import Optional, Dict, Any
+from datetime import datetime
+from zero_gpu_client import ZeroGPUChatClient
+logger = logging.getLogger(__name__)
+class ZeroGPUUserManager:
+    """Manages per-user ZeroGPU API accounts with automatic registration and token management"""
+    def __init__(self, base_url: str, admin_email: str, admin_password: str, db_path: str = "sessions.db"):
+        """
+        Initialize user manager
+        Args:
+            base_url: ZeroGPU API base URL
+            admin_email: Admin email for creating/approving users
+            admin_password: Admin password
+            db_path: Path to database for user mapping storage
+        """
+        self.base_url = base_url
+        self.admin_email = admin_email
+        self.admin_password = admin_password
+        self.db_path = db_path
+        self.admin_client = None
+        self.user_clients = {}  # Cache of ZeroGPU clients per user
+        # Initialize admin client for user management operations
+        try:
+            self.admin_client = ZeroGPUChatClient(base_url, admin_email, admin_password)
+            logger.info("✓ ZeroGPU admin client initialized")
+        except Exception as e:
+            logger.error(f"Failed to initialize admin client: {e}")
+            raise
+        # Initialize database
+        self._init_database()
+    def _init_database(self):
+        """Initialize database tables for user mapping"""
+        try:
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            # Create user mapping table
+            cursor.execute("""
+                CREATE TABLE IF NOT EXISTS zero_gpu_user_mapping (
+                    local_user_id TEXT PRIMARY KEY,
+                    api_user_id INTEGER,
+                    api_email TEXT UNIQUE,
+                    api_password_hash TEXT,
+                    access_token TEXT,
+                    refresh_token TEXT,
+                    token_expires_at TIMESTAMP,
+                    is_approved INTEGER DEFAULT 0,
+                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                    last_used TIMESTAMP,
+                    usage_stats_cache TEXT
+                )
+            """)
+            # Create index for faster lookups
+            cursor.execute("""
+                CREATE INDEX IF NOT EXISTS idx_zero_gpu_email
+                ON zero_gpu_user_mapping(api_email)
+            """)
+            conn.commit()
+            conn.close()
+            logger.info("✓ ZeroGPU user mapping database initialized")
+        except Exception as e:
+            logger.error(f"Failed to initialize user mapping database: {e}")
+            raise
+    def _generate_user_credentials(self, local_user_id: str) -> Dict[str, str]:
+        """
+        Generate ZeroGPU API credentials for a local user
+        Args:
+            local_user_id: Local application user ID
+        Returns:
+            Dictionary with email, password, and password hash
+        """
+        # Generate deterministic but unique email based on local user ID
+        # Format: user_{hash}@zerogpu.local
+        user_hash = hashlib.sha256(local_user_id.encode()).hexdigest()[:16]
+        email = f"user_{user_hash}@zerogpu.local"
+        # Generate secure random password
+        password = secrets.token_urlsafe(32)
+        password_hash = hashlib.sha256(password.encode()).hexdigest()
+        return {
+            "email": email,
+            "password": password,
+            "password_hash": password_hash
+        }
+    def _get_user_mapping(self, local_user_id: str) -> Optional[Dict[str, Any]]:
+        """Get user mapping from database"""
+        try:
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            cursor.execute("""
+                SELECT local_user_id, api_user_id, api_email, api_password_hash,
+                       access_token, refresh_token, token_expires_at, is_approved,
+                       last_used, usage_stats_cache
+                FROM zero_gpu_user_mapping
+                WHERE local_user_id = ?
+            """, (local_user_id,))
+            row = cursor.fetchone()
+            conn.close()
+            if row:
+                return {
+                    "local_user_id": row[0],
+                    "api_user_id": row[1],
+                    "api_email": row[2],
+                    "api_password_hash": row[3],
+                    "access_token": row[4],
+                    "refresh_token": row[5],
+                    "token_expires_at": row[6],
+                    "is_approved": bool(row[7]),
+                    "last_used": row[8],
+                    "usage_stats_cache": row[9]
+                }
+            return None
+        except Exception as e:
+            logger.error(f"Error getting user mapping: {e}")
+            return None
+    def _save_user_mapping(self, local_user_id: str, api_user_id: int, api_email: str,
+                          password_hash: str, access_token: Optional[str] = None,
+                          refresh_token: Optional[str] = None):
+        """Save user mapping to database"""
+        try:
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            cursor.execute("""
+                INSERT OR REPLACE INTO zero_gpu_user_mapping
+                (local_user_id, api_user_id, api_email, api_password_hash,
+                 access_token, refresh_token, token_expires_at, last_used)
+                VALUES (?, ?, ?, ?, ?, ?, datetime('now', '+15 minutes'), datetime('now'))
+            """, (local_user_id, api_user_id, api_email, password_hash,
+                  access_token, refresh_token))
+            conn.commit()
+            conn.close()
+        except Exception as e:
+            logger.error(f"Error saving user mapping: {e}")
+    def _update_user_tokens(self, local_user_id: str, access_token: str, refresh_token: str):
+        """Update user tokens in database"""
+        try:
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            cursor.execute("""
+                UPDATE zero_gpu_user_mapping
+                SET access_token = ?, refresh_token = ?,
+                    token_expires_at = datetime('now', '+15 minutes'),
+                    last_used = datetime('now')
+                WHERE local_user_id = ?
+            """, (access_token, refresh_token, local_user_id))
+            conn.commit()
+            conn.close()
+        except Exception as e:
+            logger.error(f"Error updating user tokens: {e}")
+    def _update_approval_status(self, local_user_id: str, is_approved: bool):
+        """Update user approval status"""
+        try:
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            cursor.execute("""
+                UPDATE zero_gpu_user_mapping
+                SET is_approved = ?
+                WHERE local_user_id = ?
+            """, (1 if is_approved else 0, local_user_id))
+            conn.commit()
+            conn.close()
+        except Exception as e:
+            logger.error(f"Error updating approval status: {e}")
+    async def get_or_create_user_client(self, local_user_id: str) -> Optional[ZeroGPUChatClient]:
+        """
+        Get or create ZeroGPU client for a local user
+        Args:
+            local_user_id: Local application user ID
+        Returns:
+            ZeroGPUChatClient instance or None if failed
+        """
+        # Check cache first
+        if local_user_id in self.user_clients:
+            client = self.user_clients[local_user_id]
+            # Verify client is still valid
+            if client.health_check():
+                return client
+            else:
+                # Remove invalid client from cache
+                del self.user_clients[local_user_id]
+        # Get user mapping
+        mapping = self._get_user_mapping(local_user_id)
+        if mapping:
+            # User exists, try to create client
+            # Note: only the password hash is stored, but login needs the plaintext password
+            # Options: store the password encrypted, or derive it deterministically
+            try:
+                # Known limitation: _generate_user_credentials() draws a fresh random
+                # password on every call, so this does NOT reproduce the original password
+                creds = self._generate_user_credentials(local_user_id)
+                # The hash check below therefore fails for previously registered users
+                # until password generation is made deterministic
+                if creds["password_hash"] != mapping["api_password_hash"]:
+                    logger.error(f"Password hash mismatch for user {local_user_id}")
+                    return None
+                # Create client with the regenerated password
+                client = ZeroGPUChatClient(
+                    self.base_url,
+                    mapping["api_email"],
+                    creds["password"],  # Only valid if the hash check above passed
+                    mapping.get("access_token"),
+                    mapping.get("refresh_token")
+                )
+                # Cache client
+                self.user_clients[local_user_id] = client
+                return client
+            except Exception as e:
+                logger.error(f"Error creating client for existing user: {e}")
+                return None
+        else:
+            # New user - register with ZeroGPU API
+            return await self._register_new_user(local_user_id)
+    async def _register_new_user(self, local_user_id: str) -> Optional[ZeroGPUChatClient]:
+        """
+        Register a new user with ZeroGPU API
+        Args:
+            local_user_id: Local application user ID
+        Returns:
+            ZeroGPUChatClient instance or None if failed
+        """
+        try:
+            # Generate credentials
+            creds = self._generate_user_credentials(local_user_id)
+            # Register user with ZeroGPU API
+            import requests
+            response = requests.post(
+                f"{self.base_url}/register",
+                json={
+                    "full_name": f"User {local_user_id}",
+                    "email": creds["email"],
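+                    # Placeholder mobile number derived from SHA-256; the built-in hash() is
+                    # salted per process for strings, so its value would not be reproducible
+                    "mobile": f"+1{int(hashlib.sha256(local_user_id.encode()).hexdigest(), 16) % 10**10:010d}",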
+                    "password": creds["password"]
+                },
+                timeout=10
+            )
+            if response.status_code == 200:
+                user_data = response.json()
+                api_user_id = user_data["id"]
+                is_approved = user_data.get("is_approved", False)
+                # If not auto-approved, approve via admin endpoint
+                if not is_approved and self.admin_client:
+                    try:
+                        # Approve user via admin API
+                        admin_response = requests.post(
+                            f"{self.base_url}/admin/approve-user",
+                            headers={"Authorization": f"Bearer {self.admin_client.access_token}"},
+                            json={
+                                "user_id": api_user_id,
+                                "approve": True,
+                                "notes": f"Auto-approved for local user {local_user_id}"
+                            },
+                            timeout=10
+                        )
+                        if admin_response.status_code == 200:
+                            is_approved = True
+                            logger.info(f"Auto-approved ZeroGPU user {api_user_id} for local user {local_user_id}")
+                    except Exception as e:
+                        logger.warning(f"Could not auto-approve user: {e}")
+                # Login to get tokens
+                login_response = requests.post(
+                    f"{self.base_url}/login",
+                    json={
+                        "email": creds["email"],
+                        "password": creds["password"]
+                    },
+                    timeout=10
+                )
+                if login_response.status_code == 200:
+                    login_data = login_response.json()
+                    # Create client
+                    client = ZeroGPUChatClient(
+                        self.base_url,
+                        creds["email"],
+                        creds["password"]
+                    )
+                    # Save mapping (store password hash, not plain password)
+                    self._save_user_mapping(
+                        local_user_id,
+                        api_user_id,
+                        creds["email"],
+                        creds["password_hash"],
+                        login_data["access_token"],
+                        login_data["refresh_token"]
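+                    )
+                    # Persist the approval result; _save_user_mapping does not write is_approved
+                    self._update_approval_status(local_user_id, is_approved)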
+                    # Cache client
+                    self.user_clients[local_user_id] = client
+                    logger.info(f"✓ Registered and logged in ZeroGPU user for local user: {local_user_id}")
+                    return client
+                else:
+                    logger.error(f"Failed to login after registration: {login_response.text}")
+                    return None
+            else:
+                # User might already exist, try to login
+                if response.status_code == 400:
+                    logger.info(f"User {creds['email']} may already exist, attempting login...")
+                    login_response = requests.post(
+                        f"{self.base_url}/login",
+                        json={
+                            "email": creds["email"],
+                            "password": creds["password"]
+                        },
+                        timeout=10
+                    )
+                    if login_response.status_code == 200:
+                        login_data = login_response.json()
+                        user_info = requests.get(
+                            f"{self.base_url}/me",
+                            headers={"Authorization": f"Bearer {login_data['access_token']}"},
+                            timeout=10
+                        )
+                        if user_info.status_code == 200:
+                            user_data = user_info.json()
+                            client = ZeroGPUChatClient(
+                                self.base_url,
+                                creds["email"],
+                                creds["password"]
+                            )
+                            self._save_user_mapping(
+                                local_user_id,
+                                user_data["id"],
+                                creds["email"],
+                                creds["password_hash"],
+                                login_data["access_token"],
+                                login_data["refresh_token"]
+                            )
+                            self.user_clients[local_user_id] = client
+                            return client
+                logger.error(f"Failed to register user: {response.text}")
+                return None
+        except Exception as e:
+            logger.error(f"Error registering new user: {e}", exc_info=True)
+            return None
+    def get_user_stats(self, local_user_id: str) -> Optional[Dict[str, Any]]:
+        """Get usage statistics for a user"""
+        mapping = self._get_user_mapping(local_user_id)
+        if not mapping or not mapping.get("api_user_id"):
+            return None
+        # Get client
+        client = self.user_clients.get(local_user_id)
+        if not client:
+            return None
+        try:
+            stats = client.get_usage_stats(days=30)
+            return stats
+        except Exception as e:
+            logger.error(f"Error getting user stats: {e}")
+            return None
+    def cleanup_expired_clients(self):
+        """Remove expired clients from cache (not yet implemented; currently a no-op)"""
+        # Placeholder: intended to remove clients that haven't been used recently
+        # In production, implement more sophisticated cache management
+        pass
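
Note on the existing-user path in `get_or_create_user_client`: it compares a regenerated password hash against the stored one, which only works if the password can be reproduced. One possible approach, sketched here under assumptions, is to derive the password with HMAC-SHA256 from a server-side secret. The helper name `derive_user_password` and the `server_secret` value are hypothetical and not part of this commit; how the secret is stored and rotated is left open.

```python
import base64
import hashlib
import hmac


def derive_user_password(local_user_id: str, server_secret: bytes) -> str:
    """Hypothetical helper: derive a stable per-user password from a server secret."""
    # HMAC keeps the result unpredictable without the secret, yet reproducible with it
    digest = hmac.new(server_secret, local_user_id.encode("utf-8"), hashlib.sha256).digest()
    # URL-safe base64 without padding, similar in shape to secrets.token_urlsafe(32)
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def password_hash(password: str) -> str:
    """Same SHA-256 hex digest the diff stores as api_password_hash."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Assumption: in a real deployment the secret would come from configuration, not a literal
secret = b"example-secret-loaded-from-env"
first = derive_user_password("alice", secret)
second = derive_user_password("alice", secret)
other = derive_user_password("bob", secret)
```

With a derivation like this, two calls for the same `local_user_id` yield the same password, so the stored-hash comparison can succeed across restarts. The trade-off is that anyone holding the secret can reconstruct every user's password, which is why storing an encrypted password per user is the other option the code comments mention.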