safe-challenge-2025/example-submission

sachin sharma committed on Nov 7

Commit

0b393b6

1 Parent(s): 9d9449a

updated README.md

Files changed (2)

.dockerignore +38 -0
README.md +197 -272

.dockerignore ADDED

	@@ -0,0 +1,38 @@

+__pycache__
+*.pyc
+*.pyo
+*.pyd
+.Python
+*.so
+*.egg
+*.egg-info
+dist
+build
+.venv
+venv
+ENV
+env
+.git
+.gitignore
+.idea
+.vscode
+.claude
+*.md
+README.md
+Dockerfile
+.dockerignore
+test_*.http
+test_results
+scripts/test_datasets
+.pytest_cache
+.coverage
+htmlcov
+*.log
+.DS_Store
+.python-version

README.md CHANGED

@@ -1,89 +1,50 @@
-# ML Inference Service (FastAPI)
-A FastAPI-based inference server designed to make it easy to serve your ML models. The repo includes a complete working example using ResNet-18 for image classification, but the architecture is built to be model-agnostic. You implement a simple abstract base class, and everything else just works.
-Key features:
-- Abstract InferenceService class that you subclass for your model
-- Example ResNet-18 implementation showing how to do it
-- FastAPI application with clean separation (routes → controller → service)
-- Model loaded once at startup and reused across requests
-- Background threading for inference so the server stays responsive
-- Type-safe request/response handling with Pydantic
-- Single generic endpoint that works with any model
-## What you get
-The service exposes a single endpoint `POST /predict` that accepts a base64-encoded image and returns:
-- `prediction` - the predicted class label
-- `confidence` - softmax probability for the prediction
-- `predicted_label` - numeric class index
-- `model` - identifier for which model produced this prediction
-- `mediaType` - echoed from the request
-The inference runs in a background thread using asyncio so long-running model predictions don't block the server from handling other requests.
-## Project Layout
-```
-ml-inference-service/
-├─ main.py                    # Entry point
-├─ app/
-│  ├─ core/
-│  │  ├─ app.py               # Everything: config, DI, lifespan, app factory
-│  │  └─ logging.py           # Logger setup
-│  ├─ api/
-│  │  ├─ models.py            # Pydantic request/response schemas
-│  │  ├─ controllers.py       # HTTP → service orchestration
-│  │  └─ routes/
-│  │     └─ prediction.py     # POST /predict endpoint
-│  └─ services/
-│     ├─ base.py              # Abstract InferenceService class
-│     └─ inference.py         # ResNetInferenceService (example implementation)
-├─ models/
-│  └─ microsoft/
-│     └─ resnet-18/           # Model files (preserves org structure)
-├─ scripts/
-│  ├─ generate_test_datasets.py
-│  ├─ test_datasets.py
-│  └─ test_datasets/
-├─ requirements.txt
-└─ test_main.http             # Example HTTP request
-```
-The key change from a typical FastAPI app is that `app/core/app.py` consolidates configuration, dependency injection, lifecycle management, and the app factory into one file. This avoids the complexity of managing global variables across multiple modules.
-## Quickstart
-1) Install dependencies (Python 3.9+)
 ```bash
 python -m venv .venv
-source .venv/bin/activate   # Windows: .venv\Scripts\activate
 pip install -r requirements.txt
-```
-2) Download the example model
-```bash
 bash scripts/model_download.bash
-```
-This downloads ResNet-18 from Hugging Face and saves it to `models/microsoft/resnet-18/` (note the org structure is preserved).
-3) Run the server
-```bash
 uvicorn main:app --reload
 ```
-Server starts on `http://127.0.0.1:8000`.
-4) Test the API
-Use `test_main.http` from your IDE or curl:
 ```bash
-curl -X POST http://127.0.0.1:8000/predict \
   -H "Content-Type: application/json" \
   -d '{
     "image": {
       "mediaType": "image/jpeg",
-      "data": "<base64-encoded-bytes>"
     }
   }'
 ```
@@ -92,45 +53,77 @@ Example response:
 ```json
 {
   "prediction": "tiger cat",
-  "confidence": 0.9971,
   "predicted_label": 282,
   "model": "microsoft/resnet-18",
   "mediaType": "image/jpeg"
 }
 ```
-## Integrating Your Own Model
-To use your own model, you implement the `InferenceService` abstract base class. The rest of the infrastructure (API routes, controllers, dependency injection) is already generic and works with any implementation.
-### Step 1: Implement the InferenceService ABC
-Create a new file `app/services/your_model_service.py`:
 ```python
 from app.services.base import InferenceService
 from app.api.models import ImageRequest, PredictionResponse
 class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
     def __init__(self, model_name: str):
         self.model_name = model_name
-        self.model_path = os.path.join("models", model_name)
         self.model = None
         self._is_loaded = False
     async def load_model(self) -> None:
-        # Load your model here
         self.model = load_your_model(self.model_path)
         self._is_loaded = True
     async def predict(self, request: ImageRequest) -> PredictionResponse:
-        # Offload to background thread (important for performance)
         return await asyncio.to_thread(self._predict_sync, request)
     def _predict_sync(self, request: ImageRequest) -> PredictionResponse:
-        # Decode image, run inference, return typed response
         image = decode_base64_image(request.image.data)
         result = self.model(image)
         return PredictionResponse(
             prediction=result.label,
             confidence=result.confidence,
@@ -144,275 +137,184 @@ class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
         return self._is_loaded
 ```
-The key points:
-- Subclass `InferenceService[RequestType, ResponseType]` with your request/response types
-- Implement three methods: `load_model()`, `predict()`, and `is_loaded` property
-- Use `asyncio.to_thread()` to offload CPU-intensive inference to a background thread
-- Return typed Pydantic models, not dicts
-### Step 2: Register your service at startup
-Edit `app/core/app.py` and find the lifespan function (around line 134):
 ```python
-# Replace this:
 service = ResNetInferenceService(model_name="microsoft/resnet-18")
-# With this:
 service = YourModelService(model_name="your-org/your-model")
 ```
-That's it. The same `/predict` endpoint now serves your model.
-### Model file structure
-Your model files should be organized as:
 ```
 models/
 └── your-org/
     └── your-model/
         ├── config.json
         ├── weights.bin
-        └── ... other files
 ```
-The full org/model structure is preserved - no more dropping the org prefix.
-### Example: Swapping ResNet for ViT
-```python
-# app/services/vit_service.py
-from transformers import ViTForImageClassification, ViTImageProcessor
-class ViTService(InferenceService[ImageRequest, PredictionResponse]):
-    async def load_model(self) -> None:
-        self.processor = ViTImageProcessor.from_pretrained(self.model_path)
-        self.model = ViTForImageClassification.from_pretrained(self.model_path)
-        self._is_loaded = True
-    # ... implement predict() following the pattern above
-```
-Then in `app/core/app.py`:
-```python
-service = ViTService(model_name="google/vit-base-patch16-224")
-```
-No other changes needed - the routes, controller, and dependency injection are all model-agnostic.
-## Validating your setup
-When you start the server, the logs should show:
-```
-INFO: Starting ML Inference Service...
-INFO: Initializing ResNet service with local model: models/microsoft/resnet-18
-INFO: Loading ResNet model from: models/microsoft/resnet-18
-INFO: ResNet model loaded successfully
-INFO: Startup completed successfully
-```
-If you see errors like `Model directory not found`, check that your model files exist at the expected path with the full org/model structure.
-## Request & Response Shapes
-### Request
-```json
-{
-  "image": {
-    "mediaType": "image/jpeg",
-    "data": "<base64-encoded image bytes>"
-  }
-}
 ```
-### Response
-```json
-{
-  "prediction": "string label",
-  "confidence": 0.0,
-  "predicted_label": 0,
-  "model": "your-org/your-model",
-  "mediaType": "image/jpeg"
-}
-```
-## Configuration
-Settings are defined in `app/core/app.py` in the `Settings` class. The defaults are:
-- `app_name` - "ML Inference Service"
-- `app_version` - "0.1.0"
-- `debug` - False
-- `host` - "0.0.0.0"
-- `port` - 8000
-You can override these via environment variables or a `.env` file. If you want to make the model configurable via environment variable, add it to the Settings class:
-```python
-class Settings(BaseSettings):
-    # ... existing fields ...
-    model_name: str = Field("microsoft/resnet-18")
-# Then in the lifespan function:
-service = ResNetInferenceService(model_name=settings.model_name)
 ```
 ## Deployment
-For development:
 ```bash
 uvicorn main:app --reload
 ```
-For production, use gunicorn with uvicorn workers:
 ```bash
 gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
 ```
 The service runs on CPU by default. For GPU inference, install CUDA-enabled PyTorch and modify your service to move tensors to the GPU device.
-## PyArrow Test Datasets
-This project includes a comprehensive **PyArrow-based dataset generation system** designed specifically for academic challenges and ML model validation. The system generates **100 standardized test datasets** that allow participants to validate their models against consistent, reproducible test cases.
-### File Structure
 ```
-standard_test_001.parquet         # Actual test data (images, requests, responses)
-standard_test_001_metadata.json   # Human-readable description and stats
 ```
-### Dataset Categories (25 each = 100 total)
-#### 1. **Standard Test Cases** (`standard_test_*.parquet`)
-**Purpose**: Baseline functionality validation
-**Content**: Normal images with expected successful predictions
-- **Image Types**: Random patterns, geometric shapes, gradients, text overlays, solid colors
-- **Formats**: JPEG, PNG with proper MIME types
-- **Sizes**: 224x224, 256x256, 299x299, 384x384 (common ML input sizes)
-- **Expected Behavior**: HTTP 200 responses with valid prediction structure
-#### 2. **Edge Case Tests** (`edge_case_*.parquet`)
-**Purpose**: Robustness and error handling validation
-**Content**: Challenging scenarios that test model resilience
-- **Tiny Images**: 32x32, 1x1 pixels (tests preprocessing robustness)
-- **Huge Images**: 2048x2048 (tests memory management and resizing)
-- **Extreme Aspect Ratios**: 1000x50 (tests preprocessing assumptions)
-- **Corrupted Data**: Invalid base64, malformed requests (tests error handling)
-- **Expected Behavior**: Graceful degradation, proper error responses
-#### 3. **Performance Benchmarks** (`performance_test_*.parquet`)
-**Purpose**: Latency and throughput measurement
-**Content**: Varying batch sizes for performance profiling
-- **Batch Sizes**: 1, 5, 10, 25, 50, 100 images per test
-- **Latency Tracking**: Expected max response times based on batch size
-- **Throughput Metrics**: Requests per second under different loads
-- **Expected Behavior**: Consistent performance within acceptable bounds
-#### 4. **Model Comparison** (`model_comparison_*.parquet`)
-**Purpose**: Cross-model validation and benchmarking
-**Content**: Identical inputs tested across different model architectures
-- **Model Types**: ResNet-18/50, ViT, ConvNext, Swin Transformer
-- **Consistent Inputs**: Same 10 base images per dataset
-- **Comparative Analysis**: Enables direct performance comparison between models
-- **Expected Behavior**: Architecture-specific but structurally consistent responses
-### Generation Process
-The dataset generation follows a **deterministic, reproducible approach**:
-#### Step 1: Synthetic Image Creation
-```python
-# Why synthetic images instead of real photos?
-# 1. Copyright-free for academic distribution
-# 3. Programmatically generated edge cases
-def create_synthetic_image(width, height, image_type):
-    if image_type == "random":
-        # RGB noise - tests model noise robustness
-        array = np.random.randint(0, 256, (height, width, 3))
-    elif image_type == "geometric":
-        # Shapes and patterns - tests feature detection
-        # ... geometric pattern generation
-    # ... other synthetic types
-```
-#### Step 2: API Request Structure Generation
-```python
-# Matches exact API format for drop-in testing
 {
-    "image": {
-        "mediaType": "image/jpeg",  # Proper MIME types
-        "data": "<base64-encoded-image>"  # Standard encoding
-    }
 }
 ```
-#### Step 3: Expected Response Generation
-```python
-# Realistic prediction responses with proper structure
 {
-    "prediction": "tiger_cat",           # ImageNet-style labels
-    "confidence": 0.8742,                # Realistic confidence scores
-    "predicted_label": 282,              # Numeric label indices
-    "model": "microsoft/resnet-18",      # Model identification
-    "mediaType": "image/jpeg"            # Echo input format
 }
 ```
-#### Step 4: PyArrow Table Creation
-```python
-# Columnar storage for efficient querying
-table = pa.table({
-    "dataset_id": [...],        # Unique dataset identifier
-    "image_id": [...],          # Individual image identifier
-    "api_request": [...],       # JSON-serialized requests
-    "expected_response": [...], # JSON-serialized expected responses
-    "test_category": [...],     # Category classification
-    "difficulty": [...],        # Complexity indicator
-    # ... additional metadata columns
-})
-```
-### Usage Guide
-**1. Generate Test Datasets**
 ```bash
-# Create all 100 datasets (~2-5 minutes depending on hardware)
 python scripts/generate_test_datasets.py
-# What this creates:
-# - scripts/test_datasets/*.parquet (actual test data)
-# - scripts/test_datasets/*_metadata.json (human-readable info)
-# - scripts/test_datasets/datasets_summary.json (overview)
 ```
-**2. Validate API**
 ```bash
-# Start your ML service
 uvicorn main:app --reload
 # Quick test (5 samples per dataset)
 python scripts/test_datasets.py --quick
-# Full validation (all samples)
 python scripts/test_datasets.py
-# Category-specific testing
 python scripts/test_datasets.py --category edge_case
-python scripts/test_datasets.py --category performance
 ```
-### Testing Output and Metrics
-The test runner provides comprehensive validation metrics:
 ```
 DATASET TESTING SUMMARY
@@ -427,7 +329,7 @@ Test duration: 45.2s
 Performance:
   Avg latency: 123.4ms
   Median latency: 98.7ms
-  Min latency: 45.2ms
   Max latency: 2,341.0ms
   Requests/sec: 27.6
@@ -436,6 +338,29 @@ Category breakdown:
   edge_case: 25 datasets, 76.8% avg success
   performance: 25 datasets, 91.1% avg success
   model_comparison: 25 datasets, 89.3% avg success
-Failed datasets: edge_case_023, edge_case_019, performance_012
 ```

+# ML Inference Service
+FastAPI service for serving ML models over HTTP. Comes with ResNet-18 for image classification out of the box, but you can swap in any model you want.
+## Quick Start
+**Local development:**
 ```bash
+# Install dependencies
 python -m venv .venv
+source .venv/bin/activate
 pip install -r requirements.txt
+# Download the example model
 bash scripts/model_download.bash
+# Run it
 uvicorn main:app --reload
 ```
+Server runs on `http://127.0.0.1:8000`. Check `/docs` for the interactive API documentation.
+**Docker:**
+```bash
+# Build
+docker build -t ml-inference-service:test .
+# Run
+docker run -d --name ml-inference-test -p 8000:8000 ml-inference-service:test
+# Check logs
+docker logs -f ml-inference-test
+# Stop
+docker stop ml-inference-test && docker rm ml-inference-test
+```
+## Testing the API
 ```bash
+# Using curl
+curl -X POST http://localhost:8000/predict \
   -H "Content-Type: application/json" \
   -d '{
     "image": {
       "mediaType": "image/jpeg",
+      "data": "<base64-encoded-image>"
     }
   }'
 ```
 ```json
 {
   "prediction": "tiger cat",
+  "confidence": 0.394,
   "predicted_label": 282,
   "model": "microsoft/resnet-18",
   "mediaType": "image/jpeg"
 }
 ```
+## Project Structure
+```
+ml-inference-service/
+├── main.py                      # Entry point
+├── app/
+│   ├── core/
+│   │   ├── app.py               # App factory, config, DI, lifecycle
+│   │   └── logging.py           # Logging setup
+│   ├── api/
+│   │   ├── models.py            # Request/response schemas
+│   │   ├── controllers.py       # Business logic
+│   │   └── routes/
+│   │       └── prediction.py    # POST /predict
+│   └── services/
+│       ├── base.py              # Abstract InferenceService class
+│       └── inference.py         # ResNet implementation
+├── models/
+│   └── microsoft/
+│       └── resnet-18/           # Model weights and config
+├── scripts/
+│   ├── model_download.bash
+│   ├── generate_test_datasets.py
+│   └── test_datasets.py
+├── Dockerfile                   # Multi-stage build
+├── .env.example                 # Environment config template
+└── requirements.txt
+```
+The key design decision here is that `app/core/app.py` consolidates everything—config, dependency injection, lifecycle, and the app factory. This avoids the mess of managing global state across multiple files.
+## How to Plug In Your Own Model
+The whole service is built around one abstract base class: `InferenceService`. Implement it for your model, and everything else just works.
+### Step 1: Create Your Service Class
 ```python
+# app/services/your_model_service.py
 from app.services.base import InferenceService
 from app.api.models import ImageRequest, PredictionResponse
+import asyncio
 class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
     def __init__(self, model_name: str):
         self.model_name = model_name
+        self.model_path = f"models/{model_name}"
         self.model = None
         self._is_loaded = False
     async def load_model(self) -> None:
+        """Load your model here. Called once at startup."""
         self.model = load_your_model(self.model_path)
         self._is_loaded = True
     async def predict(self, request: ImageRequest) -> PredictionResponse:
+        """Run inference. Offload heavy work to thread pool."""
         return await asyncio.to_thread(self._predict_sync, request)
     def _predict_sync(self, request: ImageRequest) -> PredictionResponse:
+        """Actual inference happens here."""
         image = decode_base64_image(request.image.data)
         result = self.model(image)
         return PredictionResponse(
             prediction=result.label,
             confidence=result.confidence,
         return self._is_loaded
 ```
+**Important:** Use `asyncio.to_thread()` to run CPU-heavy inference in a background thread. This keeps the server responsive while your model is working.
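To illustrate why the offloading matters, here is a minimal, self-contained sketch. It is not taken from the service code: `blocking_inference` and `heartbeat` are hypothetical stand-ins, and `time.sleep` replaces a real model call. The heartbeat coroutine keeps ticking while the blocking call runs in a worker thread, which is the behaviour `asyncio.to_thread()` (Python 3.9+) is meant to provide.

```python
import asyncio
import time


def blocking_inference(delay: float) -> str:
    """Hypothetical stand-in for a blocking model call (sleeps instead of computing)."""
    time.sleep(delay)
    return "done"


async def heartbeat(ticks: list) -> None:
    """Append a timestamp every 10 ms; this only progresses while the event loop is free."""
    for _ in range(5):
        await asyncio.sleep(0.01)
        ticks.append(time.monotonic())


async def run_demo() -> tuple:
    """Run the blocking call in a worker thread alongside the heartbeat."""
    ticks: list = []
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_inference, 0.2),
        heartbeat(ticks),
    )
    return result, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

If `blocking_inference` were called directly inside a coroutine instead, the heartbeat could not run until the call returned.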
+### Step 2: Register Your Service
+Open `app/core/app.py` and find the lifespan function:
 ```python
+# Change this line:
 service = ResNetInferenceService(model_name="microsoft/resnet-18")
+# To this:
 service = YourModelService(model_name="your-org/your-model")
 ```
+That's it. The `/predict` endpoint now serves your model.
+### Model Files
+Put your model files under `models/` with the full org/model structure:
 ```
 models/
 └── your-org/
     └── your-model/
         ├── config.json
         ├── weights.bin
+        └── (other files)
 ```
+No renaming, no dropping the org prefix—it just mirrors the Hugging Face structure.
+## Configuration
+Settings are managed via environment variables or a `.env` file. See `.env.example` for all available options.
+**Default values:**
+- `APP_NAME`: "ML Inference Service"
+- `APP_VERSION`: "0.1.0"
+- `DEBUG`: false
+- `HOST`: "0.0.0.0"
+- `PORT`: 8000
+- `MODEL_NAME`: "microsoft/resnet-18"
+**To customize:**
+```bash
+# Copy the example
+cp .env.example .env
+# Edit values
+vim .env
 ```
+Or set environment variables directly:
+```bash
+export MODEL_NAME="google/vit-base-patch16-224"
+uvicorn main:app --reload
 ```
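How `app/core/app.py` reads these values is not shown in this diff. As a rough sketch of the "environment variable overrides default" behaviour described above, the standard-library version below uses the listed defaults; `load_settings` and its return shape are assumptions for illustration, not the service's actual settings class.

```python
import os

# Defaults copied from the list above; stored as strings, as environment variables are.
DEFAULTS = {
    "APP_NAME": "ML Inference Service",
    "APP_VERSION": "0.1.0",
    "DEBUG": "false",
    "HOST": "0.0.0.0",
    "PORT": "8000",
    "MODEL_NAME": "microsoft/resnet-18",
}


def load_settings(environ=None) -> dict:
    """Hypothetical helper: take each value from the environment, else the default."""
    env = os.environ if environ is None else environ
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "app_name": raw["APP_NAME"],
        "app_version": raw["APP_VERSION"],
        "debug": raw["DEBUG"].strip().lower() in {"1", "true", "yes"},
        "host": raw["HOST"],
        "port": int(raw["PORT"]),
        "model_name": raw["MODEL_NAME"],
        # Mirrors the f"models/{model_name}" convention used in the service example.
        "model_path": f"models/{raw['MODEL_NAME']}",
    }


if __name__ == "__main__":
    print(load_settings({"MODEL_NAME": "google/vit-base-patch16-224"}))
```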
 ## Deployment
+**Development:**
 ```bash
 uvicorn main:app --reload
 ```
+**Production:**
 ```bash
 gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
 ```
 The service runs on CPU by default. For GPU inference, install CUDA-enabled PyTorch and modify your service to move tensors to the GPU device.
+**Docker:**
+- Multi-stage build keeps the image small
+- Runs as non-root user (`appuser`)
+- Python dependencies installed in user site-packages
+- Model files baked into the image
+## What Happens When You Start the Server
 ```
+INFO: Starting ML Inference Service...
+INFO: Initializing ResNet service: models/microsoft/resnet-18
+INFO: Loading model from models/microsoft/resnet-18
+INFO: Model loaded: 1000 classes
+INFO: Startup completed successfully
+INFO: Uvicorn running on http://0.0.0.0:8000
 ```
+If you see "Model directory not found", check that your model files exist at the expected path with the full org/model structure.
+## API Reference
+**Endpoint:** `POST /predict`
+**Request:**
+```json
 {
+  "image": {
+    "mediaType": "image/jpeg",  // or "image/png"
+    "data": "<base64-encoded-image>"
+  }
 }
 ```
+**Response:**
+```json
 {
+  "prediction": "string",      // Human-readable label
+  "confidence": 0.0,           // Softmax probability
+  "predicted_label": 0,        // Numeric class index
+  "model": "org/model-name",   // Model identifier
+  "mediaType": "image/jpeg"    // Echoed from request
 }
 ```
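As a sketch of how a client might build the request body and check the response shape shown above, using only the standard library. `build_predict_payload` and `parse_prediction` are hypothetical helper names, and the sketch assumes `data` is plain standard base64 with no `data:` URI prefix.

```python
import base64
import json

# Field names taken from the response schema above.
RESPONSE_FIELDS = {"prediction", "confidence", "predicted_label", "model", "mediaType"}


def build_predict_payload(image_bytes: bytes, media_type: str = "image/jpeg") -> str:
    """Serialize raw image bytes into the JSON body expected by POST /predict."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": {"mediaType": media_type, "data": encoded}})


def parse_prediction(body: str) -> dict:
    """Parse a response body and fail loudly if a documented field is missing."""
    data = json.loads(body)
    missing = RESPONSE_FIELDS - data.keys()
    if missing:
        raise ValueError(f"response is missing fields: {sorted(missing)}")
    return data


if __name__ == "__main__":
    print(build_predict_payload(b"\xff\xd8\xff", "image/jpeg"))
```

The resulting string can be sent with any HTTP client, for example as the `-d` argument of the `curl` command shown earlier.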
+**Docs:**
+- Swagger UI: `http://localhost:8000/docs`
+- ReDoc: `http://localhost:8000/redoc`
+- OpenAPI JSON: `http://localhost:8000/openapi.json`
+## PyArrow Test Datasets
+We've included a test dataset system for validating your model. It generates 100 standardized test cases covering normal inputs, edge cases, performance benchmarks, and model comparisons.
+### Generate Datasets
 ```bash
 python scripts/generate_test_datasets.py
 ```
+This creates:
+- `scripts/test_datasets/*.parquet` - Test data (images, requests, expected responses)
+- `scripts/test_datasets/*_metadata.json` - Human-readable descriptions
+- `scripts/test_datasets/datasets_summary.json` - Overview of all datasets
+### Run Tests
 ```bash
+# Start your service first
 uvicorn main:app --reload
 # Quick test (5 samples per dataset)
 python scripts/test_datasets.py --quick
+# Full validation
 python scripts/test_datasets.py
+# Test specific category
 python scripts/test_datasets.py --category edge_case
 ```
+### Dataset Categories (25 datasets each)
+**1. Standard Tests** (`standard_test_*.parquet`)
+- Normal images: random patterns, shapes, gradients
+- Common sizes: 224x224, 256x256, 299x299, 384x384
+- Formats: JPEG, PNG
+- Purpose: Baseline validation
+**2. Edge Cases** (`edge_case_*.parquet`)
+- Tiny images (32x32, 1x1)
+- Huge images (2048x2048)
+- Extreme aspect ratios (1000x50)
+- Corrupted data, malformed requests
+- Purpose: Test error handling
+**3. Performance Benchmarks** (`performance_test_*.parquet`)
+- Batch sizes: 1, 5, 10, 25, 50, 100 images
+- Latency and throughput tracking
+- Purpose: Performance profiling
+**4. Model Comparisons** (`model_comparison_*.parquet`)
+- Same inputs across different architectures
+- Models: ResNet-18/50, ViT, ConvNext, Swin
+- Purpose: Cross-model benchmarking
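The edge-case category includes corrupted base64 input. One way a service could reject such input cleanly is sketched below. This is an illustration only: `decode_image_data` is a hypothetical helper, not the `decode_base64_image` function referenced in the service example, whose behaviour is not shown in this diff.

```python
import base64
import binascii


def decode_image_data(data: str) -> bytes:
    """Decode a base64 string strictly, raising ValueError on corrupted or empty input."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        decoded = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError(f"invalid base64 image data: {exc}") from exc
    if not decoded:
        raise ValueError("image data is empty")
    return decoded


if __name__ == "__main__":
    print(decode_image_data(base64.b64encode(b"pixels").decode("ascii")))
```

A route handler could catch the `ValueError` and return a 4xx response rather than letting the request fail inside the model call.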
+### Test Output
 ```
 DATASET TESTING SUMMARY
 Performance:
   Avg latency: 123.4ms
   Median latency: 98.7ms
+  p95 latency: 342.1ms
   Max latency: 2,341.0ms
   Requests/sec: 27.6
   edge_case: 25 datasets, 76.8% avg success
   performance: 25 datasets, 91.1% avg success
   model_comparison: 25 datasets, 89.3% avg success
+```
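For readers who want to reproduce these figures from their own measurements, here is a small sketch of the summary statistics. How `scripts/test_datasets.py` computes them is not shown in this diff; `summarize_latencies` is a hypothetical helper, and the nearest-rank method for p95 is an assumption.

```python
import statistics


def summarize_latencies(latencies_ms: list, duration_s: float) -> dict:
    """Compute avg, median, p95 (nearest-rank), max latency, and requests per second."""
    if not latencies_ms or duration_s <= 0:
        raise ValueError("need at least one latency and a positive duration")
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "avg_ms": statistics.fmean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
        "max_ms": ordered[-1],
        "requests_per_sec": len(ordered) / duration_s,
    }


if __name__ == "__main__":
    print(summarize_latencies([45.2, 98.7, 123.4, 342.1], duration_s=1.0))
```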
+## Common Issues
+**Port 8000 already in use:**
+```bash
+# Find what's using it
+lsof -i :8000
+# Or just use a different port
+uvicorn main:app --port 8080
 ```
+**Model not loading:**
+- Check the path: models should be in `models/<org>/<model-name>/`
+- Make sure you ran `bash scripts/model_download.bash`
+- Check logs for the exact error
+**Slow inference:**
+- Inference runs on CPU by default
+- For GPU: install CUDA PyTorch and modify service to use GPU device
+- Consider using smaller models or quantization
+## License
+MIT