---
license: apache-2.0
---
# Example Submission for the SAFE: Image Edit Detection and Localization Challenge 2025
This project provides a starting point for implementing a submission to the [SAFE: Image Edit Detection and Localization Challenge 2025](https://app.dyff.io/challenges/dc509a8c771b492b90c43012fde9a04f). You do not need to use this code to participate in the challenge.
## Clone this repository
To use the code and tools in this repository, [clone it](https://huggingface.co/docs/hub/en/repositories-getting-started#cloning-repositories) with `git`:
```
git clone https://huggingface.co/safe-challenge-2025/example-submission
```
# How to participate
To participate in the challenge, you need to do three things:
1. Visit the [challenge home page](https://app.dyff.io/challenges/dc509a8c771b492b90c43012fde9a04f) and sign up using the linked registration form. After verifying your email, you will receive access credentials for the submission platform.
2. Implement your detector model. You can [use this repository](#how-to-implement-a-detector) as a starting point, but you don't have to.
3. [Submit your detector model](#how-to-make-a-submission) for evaluation. You can build your submission package yourself and submit it [using a CLI tool](#submitting-using-the-dyff-api) (preferred), or you can [build your submission in a HuggingFace Space](#submitting-a-docker-huggingface-space) and submit the Space using a web form.
# How to make a submission
The infrastructure for the challenge runs on [DSRI's Dyff platform](https://app.dyff.io). Submissions to the challenge must be in the form of a containerized web service that serves a simple JSON HTTP API.
If you're comfortable building a Docker image yourself, the [preferred way](#submitting-using-the-dyff-api) to make a submission is to upload and submit a built image using the [Dyff client](https://docs.dyff.io/python-api/dyff.client/).
Alternatively, you can create a Docker [HuggingFace Space](https://huggingface.co/new-space?sdk=docker) and [create submissions from the Space](#submitting-a-docker-huggingface-space) using a [web form](https://challenge.dyff.io/submit). The advantage of using an HF Space is that it builds the Docker image for you. However, HF Spaces also have some limitations that you'll need to account for.
## General considerations
* Your submission will run **without Internet access** during evaluation. All of the files required to run your submission must be packaged along with it. You can either include files in the Docker image, or upload the files as a separate package and mount them in your application container during execution.
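One way to catch missing files before you submit is a startup check like the sketch below. It is not part of this repository: `REQUIRED_FILES` and `check_model_dir` are hypothetical names, and the file list is only an illustration. The `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` variables matter only if your detector uses the Hugging Face libraries, and they must be set before those libraries are imported.

```python
import os
from pathlib import Path

# Ask the Hugging Face libraries (if you use them) not to touch the network.
# setdefault() keeps any value that is already set in the environment.
os.environ.setdefault("HF_HUB_OFFLINE", "1")
os.environ.setdefault("TRANSFORMERS_OFFLINE", "1")

# Hypothetical list: replace it with the files your own model needs.
REQUIRED_FILES = ["config.json", "weights.bin"]


def check_model_dir(model_dir, required=REQUIRED_FILES):
    """Return the names of required files that are missing from model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]
```

Calling `check_model_dir("models/your-org/your-model")` at startup and failing fast on a non-empty result surfaces packaging mistakes locally instead of during evaluation.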
# Submitting using the Dyff API
If you're able to build a Docker image for your submission yourself, the preferred way to make submissions is via the Dyff API. We provide a command line tool (`challenge-cli.py`) in this repository to simplify the submission process.
In the [terminology of Dyff](https://docs.dyff.io/tutorial/), the thing that you're submitting is an `InferenceService`. You can think of an `InferenceService` as a recipe for spinning up a Docker container that runs an HTTP server that serves an inference API. To create a new submission, you need to upload the Docker image that the service should run, and, optionally, a volume of files such as neural network weights that will be mounted in the container.
## Install the Dyff SDK
You need Python 3.10+ (3.12 recommended). We recommend you install into a virtual environment. If you're using this repository, you can install `dyff` and a few other useful dependencies as described in the [Quick Start](#quick-start) section:
```
# After installing the `uv` tool:
make setup
source venv/bin/activate
```
Or, you can install just `dyff` into a `venv` like this:
```
python3 -m venv venv
source venv/bin/activate
python3 -m pip install --upgrade dyff
```
## Install `skopeo`
To upload Docker images via the Dyff API, you need to have the [`skopeo` tool](https://github.com/containers/skopeo) in your PATH.
## Prepare the submission data
Before creating a submission, you need to build the Docker image that you want to submit locally. For example, running the `make docker-build` command in this repository will build a Docker image in your local Docker daemon with the name `safe-challenge-2025/example-submission:latest`. You can check that the image exists using the `docker images` command:
```
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
safe-challenge-2025/example-submission latest b86a46d856f0 3 hours ago 1.86GB
...
```
If your submission includes large data files such as neural network weights, we recommend that you upload these separately from the Docker image and then arrange for them to be mounted in the running container at run-time. You can upload a local directory recursively to the Dyff platform. Once uploaded, you will get the ID of a Dyff `Model` resource that you can reference in your `InferenceService`.
If you're uploading your large files separately as a `Model`, you'll need to tell Dyff where to mount them in your container. When testing your system locally, you can use the `-v/--volume` flag with `docker run` to [mount a local directory](https://docs.docker.com/engine/storage/volumes/) in the container. Then, just make sure to specify the same mount path when creating your `InferenceService` in Dyff.
## Use the `challenge-cli` tool
This repository contains a CLI script that simplifies the submission process. Usage is like this:
```
$ python3 challenge-cli.py
Usage: challenge-cli.py [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
submit
upload-submission
```
You create a new submission in two steps. First, you upload the submission files and create an `InferenceService`. Then, you create the actual `Submission` resource to tell the Dyff platform that you want to submit the `InferenceService` for the challenge.
### Upload submission files
To upload submission files, use the `upload-submission` command:
```
DYFF_API_TOKEN=<your token> python3 challenge-cli.py upload-submission [OPTIONS]
```
Notice that we're providing an access token via an environment variable.
This command creates a Dyff `Artifact` resource corresponding to your Docker image, an optional Dyff `Model` resource containing your uploaded model files, and a Dyff `InferenceService` resource that references the `Artifact` and the `Model`.
You need to provide your Account ID with `--account`, a name for your system with `--name`, and a Docker image name + tag with `--image`. The tool will create a Dyff `Artifact` resource representing the Docker image.
If your system serves the inference endpoint at a route other than `predict`, use the `--endpoint` flag to specify the correct route.
If you are uploading your large files in a separate data volume, use the `--volume` flag to specify the directory tree to upload, and use the `--volume-mount` flag to set the path where this directory should be mounted in the running container. When you use these flags, the tool creates a Dyff `Model` resource representing the uploaded files.
You can also use the `--artifact` and `--model` flags to provide the ID of an `Artifact` or `Model` that already exists instead of creating a new one. For example, if you always use the same Docker image but you mount different model weights in it for different submissions, you can create the Docker image `Artifact` once, and then reference its ID with `--artifact` to avoid uploading it again.
### Submit your system for evaluation
After uploading the submission files, you create the actual `Submission` resource with:
```
DYFF_API_TOKEN=<your token> python3 challenge-cli.py submit [OPTIONS]
```
When submitting, you provide the ID of the `InferenceService` you created in the previous step in `--service`. You also provide your Account ID in `--account`, your Team ID for the challenge in `--team`, and the Task ID that you're submitting to in `--task`.
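If you script your submissions, the two commands can be assembled from Python. The sketch below only builds argument lists from the flags described above and runs them with `subprocess`; the helper names are hypothetical, and every ID and name you pass in is your own value. It assumes `challenge-cli.py` is in the current directory.

```python
import os
import subprocess
import sys


def upload_args(account, name, image, endpoint=None, volume=None, volume_mount=None):
    """Build the argument list for `challenge-cli.py upload-submission`."""
    args = [sys.executable, "challenge-cli.py", "upload-submission",
            "--account", account, "--name", name, "--image", image]
    if endpoint:
        args += ["--endpoint", endpoint]
    if volume:
        args += ["--volume", volume]
    if volume_mount:
        args += ["--volume-mount", volume_mount]
    return args


def submit_args(account, team, task, service):
    """Build the argument list for `challenge-cli.py submit`."""
    return [sys.executable, "challenge-cli.py", "submit",
            "--account", account, "--team", team,
            "--task", task, "--service", service]


def run_cli(args):
    """Run the CLI; the token is read from the DYFF_API_TOKEN variable."""
    if "DYFF_API_TOKEN" not in os.environ:
        raise RuntimeError("Set DYFF_API_TOKEN before running the CLI")
    subprocess.run(args, check=True)
```

Keeping the token in the environment rather than in the argument list means it does not end up in your shell history or in scripts you commit.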
# Submitting a Docker HuggingFace Space
If you can't build a Docker image yourself, or if the steps above seem too confusing, you can make a submission without interacting with the Dyff API directly by submitting from a HuggingFace Space. HF Spaces that use the Docker SDK will build a Docker image from the contents of the repository associated with the Space when the Space is run. You can then grant the Dyff platform permission to pull this image and use a web form to trigger a new submission.
These are the steps to prepare a HF Space for making submissions to the challenge:
1. Create a new HuggingFace [**Organization**](https://huggingface.co/organizations/new) (**not a user account**) for your challenge team. **The length of your combined Organization name + Space name must be less than 47 characters** due to a limitation of the HuggingFace API.
2. Create a new `Space` within your `Organization`. The Space must use the [Docker SDK](https://huggingface.co/new-space?sdk=docker). **Private Spaces are OK and they will work with the submission process.** **The length of your combined Organization name + Space name must be less than 47 characters** due to a limitation of the HuggingFace API.
3. Create a file called `DYFF_TEAM` in the root directory of your HF Space. The contents of the file should be your Team ID (not your Account ID). This file allows our infrastructure to verify that your Team controls this HF Space.
4. Create a `Dockerfile` in your Space that builds your challenge submission image.
5. Run the Space; this will build the Docker image.
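The name-length limit and the `DYFF_TEAM` file are easy to get wrong, so a small check can help. This is a sketch with hypothetical helper names. It assumes that "combined length" means the plain sum of the two name lengths, and that the file should contain the Team ID with no trailing newline; neither detail is confirmed by the challenge documentation.

```python
from pathlib import Path

MAX_COMBINED_LENGTH = 47  # the combined length must be less than this


def names_fit(org_name, space_name):
    """Check the Organization + Space name length limit."""
    # Assumption: the limit applies to the sum of the two name lengths.
    return len(org_name) + len(space_name) < MAX_COMBINED_LENGTH


def write_team_file(space_root, team_id):
    """Write the Team ID to DYFF_TEAM in the root of the Space checkout."""
    path = Path(space_root) / "DYFF_TEAM"
    path.write_text(team_id.strip())
    return path
```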
To make a challenge submission from your Space:
1. Add the [official SAFE Challenge user account](https://huggingface.co/safe-challenge-2025-submissions) as a Member of your organization with `read` permissions. **Make sure you are adding the correct user account;** the account name is `safe-challenge-2025-submissions`. This grants permission to our infrastructure to pull the Docker image built by your Space.
2. When you're ready to submit, use the [submission web form](https://challenge.dyff.io/submit) and enter the URL of your Space and the branch that you want to submit.
## Handling large models
There is a size limitation on Space repositories. If your submission contains large files (such as neural network weights), it may be too large to store in the Space. In this case, you need to fetch your files from somewhere else **during the Docker build process**.
This means that your Dockerfile should contain something like this:
```
COPY download-my-model.sh ./
RUN ./download-my-model.sh
```
One convenient option is to create a separate [HuggingFace Model repository](https://huggingface.co/new?owner=my-challenge-org) and use `git clone` in your Dockerfile to fetch the repository files.
## Handling private models
If access credentials are required to download your model files, you should provide them using the [Secrets feature](https://huggingface.co/docs/hub/spaces-overview#managing-secrets) of HuggingFace Spaces. **Do not hard-code credentials in your Dockerfile or anywhere else in your Space or Organization!**
Access credentials are necessary if you want to clone a private HuggingFace Model repository during your Docker build process.
Access the secrets as described in the [Secrets > Buildtime section](https://huggingface.co/docs/hub/spaces-sdks-docker#secrets). Remember that you can't download files at run-time because your system will not have access to the Internet.
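As an alternative to `git clone`, the download step can be a short Python script that runs during the build. The sketch below uses `snapshot_download` from the `huggingface_hub` package, which must be installed in the image. The secret name `HF_TOKEN` is hypothetical, and the script assumes your Dockerfile mounts the secret for that `RUN` step so that it appears as a file under `/run/secrets/`.

```python
from pathlib import Path


def read_build_secret(name, secrets_dir="/run/secrets"):
    """Return the value of a mounted build secret, or None if it is absent."""
    path = Path(secrets_dir) / name
    return path.read_text().strip() if path.is_file() else None


def download_model(repo_id, local_dir, secret_name="HF_TOKEN"):
    """Download every file of a Model repository into local_dir."""
    # Imported here so that read_build_secret works without huggingface_hub.
    from huggingface_hub import snapshot_download

    # token=None is fine for public repositories.
    token = read_build_secret(secret_name)
    return snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)
```

Reading the token from the mounted file keeps it out of the image layers, which is the point of using build-time secrets.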
# How to implement a detector
To implement a new detector that you can submit to the challenge, you need to implement an HTTP server that serves the required JSON API for inference requests. This repository contains a template that you can use as a starting point for implementing a detector in Python. You should be able to adapt this template easily to support common model formats such as neural networks built with PyTorch.
You are also free to build detectors with any other technologies and software stacks that you want, but you may have to figure out packaging on your own. All that's required of your submission is that it runs in a Docker container and that it supports the [required inference API](#api-reference).
## Quick Start
**Install `uv`:**
https://docs.astral.sh/uv/getting-started/installation/
**Local development:**
```bash
# Install dependencies
make setup
source venv/bin/activate
# Download the example model
make download
# Run it
make serve
```
In a second terminal:
```bash
# Process an example input
./prompt.sh cat.json
```
The server runs on `http://127.0.0.1:8000`. Check `/docs` for the interactive API documentation.
**Docker:**
```bash
# Build
make docker-build
# Run
make docker-run
```
The Docker container also runs the server at `http://127.0.0.1:8000`.
## What Happens When You Start the Server
```
INFO: Starting ML Inference Service...
INFO: Initializing ResNet service: models/microsoft/resnet-18
INFO: Loading model from models/microsoft/resnet-18
INFO: Model loaded: 1000 classes
INFO: Startup completed successfully
INFO: Uvicorn running on http://0.0.0.0:8000
```
If you see "Model directory not found", check that your model files exist at the expected path with the full org/model structure.
## Testing the API
By default, the server serves the inference API at `/predict`:
```bash
# Using curl
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "image": {
      "mediaType": "image/jpeg",
      "data": "<base64-encoded-image-data>"
    }
  }'
```
Example response:
```json
{
"logprobs": [-0.859380304813385,-1.2701971530914307,-2.1918208599090576,-1.69235098361969],
"localizationMask": {
"mediaType":"image/png",
"data":"iVBORw0KGgoAAAANSUhEUgAAA8AAAAKDAQAAAAD9Fl5AAAAAu0lEQVR4nO3NsREAMAgDMWD/nZMVKEwn1T5/FQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMCl3g5f+HC24TRhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAj70gwKsTlmdBwAAAABJRU5ErkJggg=="
}
}
```
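The same request can be made from Python with only the standard library. This is a sketch: `build_request` and `predict` are hypothetical helper names, and the URL assumes the server is running locally on the default port and route.

```python
import base64
import json
import urllib.request


def build_request(image_bytes, media_type="image/jpeg"):
    """Wrap raw image bytes in the JSON structure that /predict expects."""
    return {
        "image": {
            "mediaType": media_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }


def predict(image_bytes, media_type="image/jpeg",
            url="http://localhost:8000/predict"):
    """POST one image to the server and return the decoded JSON response."""
    body = json.dumps(build_request(image_bytes, media_type)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, read any JPEG or PNG file in binary mode and pass its bytes to `predict()`; the result is a dictionary with the `logprobs` list and, if the detector provides one, the `localizationMask` object.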
## Project Structure
```
example-submission/
├── main.py # Entry point
├── app/
│   ├── core/
│   │   ├── app.py # <= INSTANTIATE YOUR DETECTOR HERE
│   │   └── logging.py
│   ├── api/
│   │   ├── models.py # Request/response schemas
│   │   ├── controllers.py # Business logic
│   │   └── routes/
│   │       └── prediction.py # POST /predict
│   └── services/
│       ├── base.py # <= YOUR DETECTOR IMPLEMENTS THIS INTERFACE
│       └── inference.py # Example service based on ResNet-18
├── models/
│   └── microsoft/
│       └── resnet-18/ # Model weights and config
├── scripts/
│   ├── model_download.bash # Downloads resnet-18
│   ├── generate_test_datasets.py # Creates test datasets
│   └── test_datasets.py # Runs inference on test datasets
├── Dockerfile
├── .env.example # Environment config template
├── cat.json # An example /predict request object
├── makefile
├── prompt.sh # Script that makes a /predict request
├── requirements.cpu.in
├── requirements.cpu.txt
├── requirements.torch.cpu.in
├── requirements.torch.cpu.txt
├── response.json # An example /predict response object
└──
```
## How to Plug In Your Own Model
To integrate your model, implement the `InferenceService` abstract class defined in `app/services/base.py`. You can follow the example implementation in `app/services/inference.py`, which is based on ResNet-18. After implementing the required interface, instantiate your model in the `lifespan()` function in `app/core/app.py`, replacing the `ResNetInferenceService` instance.
### Step 1: Create Your Service Class
```python
# app/services/your_model_service.py
from app.services.base import InferenceService
from app.api.models import ImageRequest, PredictionResponse


class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
    def __init__(self, model_name: str):
        self.model_name = model_name
        self.model_path = f"models/{model_name}"
        self.model = None
        self._is_loaded = False

    def load_model(self) -> None:
        """Load your model here. Called once at startup."""
        # `load_your_model` is a placeholder for your own loading code.
        self.model = load_your_model(self.model_path)
        self._is_loaded = True

    def predict(self, request: ImageRequest) -> PredictionResponse:
        """Actual inference happens here."""
        # `decode_base64_image` is a placeholder for your own decoding code.
        image = decode_base64_image(request.image.data)
        result = self.model(image)
        logprobs = ...  # derive the 4 log-probabilities from `result`
        mask = ...  # optional localization mask derived from `result`
        return PredictionResponse(
            logprobs=logprobs,
            localizationMask=mask,
        )

    @property
    def is_loaded(self) -> bool:
        return self._is_loaded
```
### Step 2: Register Your Service
Open `app/core/app.py` and find the lifespan function:
```python
# Change this line:
service = ResNetInferenceService(model_name="microsoft/resnet-18")
# To this:
service = YourModelService(...)
```
That's it. The `/predict` endpoint now serves your model.
### Model Files
Put your model files under the `models/` directory:
```
models/
└── your-org/
    └── your-model/
        ├── config.json
        ├── weights.bin
        └── (other files)
```
## GPU inference
The default configuration in this repo runs the model on CPU and does not contain the necessary dependencies for using GPUs.
To enable GPU inference, you need to:
1. Base your Docker image on an image that contains the CUDA system packages such as [this one](https://hub.docker.com/layers/nvidia/cuda/13.0.2-cudnn-runtime-ubuntu24.04/images/sha256-4d242f206abc4b9588a6506cce2d88932cc879849395aae3785075179718cc49). If you're using the `nvidia/cuda` images, you probably want one of the `-runtime-` tags, as the `-devel-` versions contain dependencies you probably don't need.
2. Install the GPU version of PyTorch (or whichever framework you use).
3. Use the PyTorch `.to()` function (or its equivalent in your framework) in the `load_model()` and `predict()` functions to move model weights and input data to and from the CUDA device.
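For step 3, the device handling in `load_model()` and `predict()` could look roughly like the sketch below. The helper names are hypothetical and the code assumes a PyTorch model; `torch` is imported inside the functions so the snippet can be read and loaded without a GPU build installed.

```python
def select_device(cuda_available):
    """Return the device name to use: "cuda" if available, else "cpu"."""
    return "cuda" if cuda_available else "cpu"


def load_on_device(model):
    """Move a torch.nn.Module to the selected device and set eval mode."""
    import torch  # requires PyTorch

    device = torch.device(select_device(torch.cuda.is_available()))
    return model.to(device).eval(), device


def run_inference(model, batch, device):
    """Run one forward pass on `device` and return the output on the CPU."""
    import torch

    with torch.no_grad():  # no gradients are needed for inference
        output = model(batch.to(device))
    return output.cpu()
```

Falling back to the CPU when CUDA is unavailable lets the same image run on your development machine and on GPU hardware.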
## Configuration
Settings are managed via environment variables or a `.env` file. See `.env.example` for all available options.
**Default values:**
- `APP_NAME`: "ML Inference Service"
- `APP_VERSION`: "0.1.0"
- `DEBUG`: false
- `HOST`: "0.0.0.0"
- `PORT`: 8000
- `MODEL_NAME`: "microsoft/resnet-18"
**To customize:**
```bash
# Copy the example
cp .env.example .env
# Edit values
vim .env
```
Or set environment variables directly:
```bash
export MODEL_NAME="google/vit-base-patch16-224"
uvicorn main:app --reload
```
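If you build your own server instead of using this template, the same settings can be read with a few lines of standard-library Python. This sketch mirrors the variable names and defaults listed above, but it is not the implementation used in this repository, and it does not read `.env` files.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy strings such as "true" or "1"."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    app_name: str = "ML Inference Service"
    app_version: str = "0.1.0"
    debug: bool = False
    host: str = "0.0.0.0"
    port: int = 8000
    model_name: str = "microsoft/resnet-18"


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        app_name=env.get("APP_NAME", defaults.app_name),
        app_version=env.get("APP_VERSION", defaults.app_version),
        debug=_as_bool(env["DEBUG"]) if "DEBUG" in env else defaults.debug,
        host=env.get("HOST", defaults.host),
        port=int(env.get("PORT", defaults.port)),
        model_name=env.get("MODEL_NAME", defaults.model_name),
    )
```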
## API Reference
**Endpoint:** `POST /predict`
**Request:**
```json
{
  "image": {
    "mediaType": "image/jpeg", // or "image/png"
    "data": "<base64 string>"
  }
}
```
To decode a request, first convert `.image.data` from a base64 string to binary data (i.e., a Python `bytes` string), then interpret the binary data as image data of the type specified in `.image.mediaType`. The `.image.mediaType` will be either `image/jpeg` or `image/png`.
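The decoding step can be sketched with the standard library as follows. The function name is hypothetical, the request is treated as a plain dictionary rather than the template's `ImageRequest` object, and the signature check is an optional extra that the API does not require.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def decode_image_bytes(request):
    """Return (raw_bytes, media_type) for a parsed /predict request."""
    image = request["image"]
    media_type = image["mediaType"]
    if media_type not in ("image/jpeg", "image/png"):
        raise ValueError(f"unsupported mediaType: {media_type}")
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(image["data"], validate=True)
    expected = PNG_SIGNATURE if media_type == "image/png" else JPEG_SIGNATURE
    if not raw.startswith(expected):
        raise ValueError("image data does not match mediaType")
    return raw, media_type
```

The returned bytes can then be handed to an image library; with Pillow, for example, `Image.open(io.BytesIO(raw))` produces an image object.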
**Response:**
```json
{
  "logprobs": [float], // Log-probabilities of each label (length 4)
  "localizationMask": { // [Optional] binary mask
    "mediaType": "image/png", // Must be 'image/png'
    "data": "<base64 string>" // Image data
  }
}
```
The `.logprobs` field must contain a list of `floats` of length `4`. Each index in the list corresponds to the log-probability of the associated label. The possible labels are described in the `app.api.models.Labels` enumeration:
```python
Natural = 0
FullySynthesized = 1
LocallyEdited = 2
LocallySynthesized = 3
```
The `Synthesized` labels mean that the image was partially or fully synthesized by a tool such as a generative image model. The `LocallyEdited` label means that the image was manipulated in some way other than by synthesizing content, such as by copying and pasting content from another image using image editing software.
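If your model produces unnormalized scores (logits) for the four labels, a log-softmax turns them into log-probabilities. The sketch below assumes that setup; the function names are hypothetical, and the label order follows the enumeration above.

```python
import math

LABELS = ["Natural", "FullySynthesized", "LocallyEdited", "LocallySynthesized"]


def log_softmax(scores):
    """Convert raw scores into log-probabilities that exponentiate to sum 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_norm for s in scores]


def predicted_label(logprobs):
    """Return the name of the label with the highest log-probability."""
    if len(logprobs) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} values, got {len(logprobs)}")
    best = max(range(len(logprobs)), key=logprobs.__getitem__)
    return LABELS[best]
```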
The `.localizationMask` field is optional, but you should populate it if your detector is capable of localizing its detections. The mask is a binary (`0/1`) bitmap encoded as a PNG image. A non-zero value for a pixel means that the detector thinks that that pixel has been manipulated. A Python function to convert a `numpy` array to a PNG mask is provided in `app.api.models.BinaryMask.from_numpy()`.
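If your detector is not built on this template, you can produce the mask yourself. The sketch below encodes a 2-D grid of `0/1` values as a 1-bit grayscale PNG using only the standard library. `encode_mask` is a hypothetical name, and the choice of a 1-bit grayscale PNG is an assumption about what the platform accepts; when in doubt, prefer the helper provided in `app.api.models`.

```python
import base64
import struct
import zlib


def _chunk(tag, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_mask(mask):
    """Encode a 2-D sequence of 0/1 values as a localizationMask object."""
    height, width = len(mask), len(mask[0])
    raw = bytearray()
    for row in mask:
        raw.append(0)  # filter type 0 (none) starts every scanline
        byte, nbits = 0, 0
        for pixel in row:
            byte = (byte << 1) | (1 if pixel else 0)  # pack bits MSB first
            nbits += 1
            if nbits == 8:
                raw.append(byte)
                byte, nbits = 0, 0
        if nbits:  # pad the last byte of the row with zero bits
            raw.append(byte << (8 - nbits))
    # IHDR: width, height, bit depth 1, grayscale, default compression,
    # default filter method, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 1, 0, 0, 0, 0)
    png = (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
           + _chunk(b"IDAT", zlib.compress(bytes(raw))) + _chunk(b"IEND", b""))
    return {"mediaType": "image/png",
            "data": base64.b64encode(png).decode("ascii")}
```

A `numpy` array can be passed after calling `.tolist()` on it; any non-zero entry is written as a set bit.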
**Docs:**
The server in this repository serves API docs at the following endpoints:
- Swagger UI: `http://localhost:8000/docs`
- ReDoc: `http://localhost:8000/redoc`
- OpenAPI JSON: `http://localhost:8000/openapi.json`
## PyArrow Test Datasets
We've included a test dataset system for validating your model. It generates 100 standardized test datasets (25 in each of four categories) covering normal inputs, edge cases, performance benchmarks, and model comparisons.
### Generate Datasets
```bash
python scripts/generate_test_datasets.py
```
This creates:
- `scripts/test_datasets/*.parquet` - Test data (images, requests, expected responses)
- `scripts/test_datasets/*_metadata.json` - Human-readable descriptions
- `scripts/test_datasets/datasets_summary.json` - Overview of all datasets
### Run Tests
```bash
# Start your service first
make serve
```
In another terminal:
```bash
# Quick test (5 samples per dataset)
python scripts/test_datasets.py --quick
# Full validation
python scripts/test_datasets.py
# Test specific category
python scripts/test_datasets.py --category edge_case
```
### Dataset Categories (25 datasets each)
**1. Standard Tests** (`standard_test_*.parquet`)
- Normal images: random patterns, shapes, gradients
- Common sizes: 224x224, 256x256, 299x299, 384x384
- Formats: JPEG, PNG
- Purpose: Baseline validation
**2. Edge Cases** (`edge_case_*.parquet`)
- Tiny images (32x32, 1x1)
- Huge images (2048x2048)
- Extreme aspect ratios (1000x50)
- Corrupted data, malformed requests
- Purpose: Test error handling
**3. Performance Benchmarks** (`performance_test_*.parquet`)
- Batch sizes: 1, 5, 10, 25, 50, 100 images
- Latency and throughput tracking
- Purpose: Performance profiling
**4. Model Comparisons** (`model_comparison_*.parquet`)
- Same inputs across different architectures
- Models: ResNet-18/50, ViT, ConvNext, Swin
- Purpose: Cross-model benchmarking
### Test Output
```
DATASET TESTING SUMMARY
============================================================
Datasets tested: 100
Successful datasets: 95
Failed datasets: 5
Total samples: 1,247
Overall success rate: 87.3%
Test duration: 45.2s
Performance:
Avg latency: 123.4ms
Median latency: 98.7ms
p95 latency: 342.1ms
Max latency: 2,341.0ms
Requests/sec: 27.6
Category breakdown:
standard: 25 datasets, 94.2% avg success
edge_case: 25 datasets, 76.8% avg success
performance: 25 datasets, 91.1% avg success
model_comparison: 25 datasets, 89.3% avg success
```
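If you collect your own timings, latency figures like the ones in this summary can be computed as in the sketch below. This is not the code used by `scripts/test_datasets.py`; the function name is hypothetical, and the p95 uses the nearest-rank method, which may differ from what the script does.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Return average, median, p95, and maximum of a list of latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency measurement")
    ordered = sorted(latencies_ms)
    # Nearest-rank p95: the smallest value with at least 95% of samples
    # at or below it. Integer arithmetic avoids float rounding surprises.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return {
        "avg": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```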
## Common Issues
**Port 8000 already in use:**
```bash
# Find what's using it
lsof -i :8000
# Or just use a different port
uvicorn main:app --port 8080
```
**Model not loading:**
- Check the path: models should be in `models/<org>/<model-name>/`
- If you're trying to run the example ResNet-based model, make sure you ran `make download` to fetch the model weights.
- Check logs for the exact error
**Slow inference:**
- Inference runs on CPU by default
- For GPU: install CUDA PyTorch and modify service to use GPU device
- Consider using smaller models or quantization
# Dyff Web Portal – Quick Start Guide
## Signing In
1. Obtain a **Dyff API Key**.
2. Go to: <https://app.dyff.io/home>
3. Click the **Sign in** button in the top-right corner.
4. Select **Sign in with key**.
5. Paste in your **Dyff API Key**.
6. Click **Verify**.
---
## Finding Your Submission
1. After signing in, click **Operator** in the navigation bar.
2. In the dropdown menu, click **Submissions**.
This will take you to the **Submissions** page, where you can see the status of all submissions associated with your account or team.
![Operator Dashboard](operatorDash.png)
---
## Using the Submissions Page
![Submissions Dashboard](submissionsDash.png)
The **Submissions** page shows the detailed status of your submissions. You can:
- **Search by submission ID**
Use the **Submission ID** filter at the top of the table to find a specific submission (1).
- **Search by team ID**
Use the **Team ID** filter to find all submissions associated with a particular team (2).
To view details for a specific submission:
1. Find the row for your submission.
2. Click on the **Status** value for that submission (3).
This opens a detailed view where you can see information about:
- **Inference Service** (1)
- **Challenge** (2)
- **Evaluation** (3)
- **Safety Case** (4)
![Submissions Dashboard](submissions.png)
---
## Evaluations
![Evaluations](evaluations.png)
In the **Evaluation** section of a submission, you can:
- View the **raw JSON data** in the **Raw JSON** tab.
---
## Safety Case
![Safety Case](safetycase.png)
In the **Safety Case** section, you can:
- View logs and details related to the **safety case** for the given submission ID.
---
## TL;DR
- Use **Submissions** to see the **overall status** of your work.
- You can find your submission by:
- Entering your **Submission ID**, or
- Entering your **Team ID**.
- Click on the **Status** of a submission to see detailed information about its Inference Service, Challenge, Evaluation, and Safety Case.
## License
Apache 2.0