Example Submission for the SAFE: Image Edit Detection and Localization Challenge 2025

This project provides a starting point for implementing a submission to the SAFE: Image Edit Detection and Localization Challenge 2025. You do not need to use this code to participate in the challenge.

Clone this repository

To use the code and tools in this repository, clone it with git:

git clone https://huggingface.co/safe-challenge-2025/example-submission

How to participate

To participate in the challenge, you need to do three things:

  1. Visit the challenge home page and sign up using the linked registration form. After verifying your email, you will receive access credentials for the submission platform.
  2. Implement your detector model. You can use this repository as a starting point, but you don't have to.
  3. Submit your detector model for evaluation. You can build your submission package yourself and submit it using a CLI tool (preferred), or you can build your submission in a HuggingFace Space and submit the Space using a web form.

How to make a submission

The infrastructure for the challenge runs on DSRI's Dyff platform. Submissions to the challenge must be in the form of a containerized web service that serves a simple JSON HTTP API.

If you're comfortable building a Docker image yourself, the preferred way to make a submission is to upload and submit a built image using the Dyff client.

Alternatively, you can create a Docker HuggingFace Space and create submissions from the space using a webform. The advantage of using an HF Space is that it builds the Docker image for you. However, HF Spaces also have some limitations that you'll need to account for.

General considerations

  • Your submission will run without Internet access during evaluation. All of the files required to run your submission must be packaged along with it. You can either include files in the Docker image, or upload the files as a separate package and mount them in your application container during execution.

Submitting using the Dyff API

If you're able to build a Docker image for your submission yourself, the preferred way to make submissions is via the Dyff API. We provide a command line tool (challenge-cli.py) in this repository to simplify the submission process.

In the terminology of Dyff, the thing that you're submitting is an InferenceService. You can think of an InferenceService as a recipe for spinning up a Docker container that runs an HTTP server that serves an inference API. To create a new submission, you need to upload the Docker image that the service should run, and, optionally, a volume of files such as neural network weights that will be mounted in the container.

Install the Dyff SDK

You need Python 3.10+ (3.12 recommended). We recommend you install into a virtual environment. If you're using this repository, you can install dyff and a few other useful dependencies as described in the Quick Start section:

# After installing the `uv` tool:
make setup
source venv/bin/activate

Or, you can install just dyff into a venv like this:

python3 -m venv venv
source venv/bin/activate
python3 -m pip install --upgrade dyff

Install skopeo

To upload Docker images via the Dyff API, you need to have the skopeo tool in your PATH.

Prepare the submission data

Before creating a submission, you need to build the Docker image that you want to submit locally. For example, running the make docker-build command in this repository will build a Docker image in your local Docker daemon with the name safe-challenge-2025/example-submission:latest. You can check that the image exists using the docker images command:

$ docker images
REPOSITORY                                TAG       IMAGE ID       CREATED        SIZE
safe-challenge-2025/example-submission    latest    b86a46d856f0   3 hours ago    1.86GB
...

If your submission includes large data files such as neural network weights, we recommend that you upload these separately from the Docker image and then arrange for them to be mounted in the running container at run-time. You can upload a local directory recursively to the Dyff platform. Once uploaded, you will get the ID of a Dyff Model resource that you can reference in your InferenceService.

If you're uploading your large files separately as a Model, you'll need to tell Dyff where to mount them in your container. When testing your system locally, you can use the -v/--volume flag with docker run to mount a local directory in the container. Then, just make sure to specify the same mount path when creating your InferenceService in Dyff.

Use the challenge-cli tool

This repository contains a CLI script that simplifies the submission process. Usage is like this:

$ python3 challenge-cli.py
Usage: challenge-cli.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  submit
  upload-submission

You create a new submission in two steps. First, you upload the submission files and create an InferenceService. Then, you create the actual Submission resource to tell the Dyff platform that you want to submit the InferenceService for the challenge.

Upload submission files

To upload submission files, use the upload-submission command:

DYFF_API_TOKEN=<your token> python3 challenge-cli.py upload-submission [OPTIONS]

Notice that we're providing an access token via an environment variable.

This command creates a Dyff Artifact resource corresponding to your Docker image, an optional Dyff Model resource containing your uploaded model files, and a Dyff InferenceService resource that references the Artifact and the Model.

You need to provide your Account ID with --account, a name for your system with --name, and a Docker image name + tag with --image. The tool will create a Dyff Artifact resource representing the Docker image.

If your system serves the inference endpoint at a route other than predict, use the --endpoint flag to specify the correct route.

If you are uploading your large files in a separate data volume, use the --volume flag to specify the directory tree to upload, and use the --volume-mount flag to set the path where this directory should be mounted in the running container. When you use these flags, the tool creates a Dyff Model resource representing the uploaded files.

You can also use the --artifact and --model flags to provide the ID of an Artifact or Model that already exists instead of creating a new one. For example, if you always use the same Docker image but you mount different model weights in it for different submissions, you can create the Docker image Artifact once, and then reference its ID with --artifact to avoid uploading it again.
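To make the flag combinations above concrete, here is a minimal Python sketch that assembles an upload-submission command line. The helper name build_upload_args and the example values are hypothetical. The sketch assumes that --image and --artifact are alternatives, and likewise --volume and --model; confirm the options the tool actually accepts in its help output before relying on it.

```python
from typing import Optional


def build_upload_args(
    account: str,
    name: str,
    image: Optional[str] = None,
    artifact: Optional[str] = None,
    endpoint: Optional[str] = None,
    volume: Optional[str] = None,
    volume_mount: Optional[str] = None,
    model: Optional[str] = None,
) -> list[str]:
    """Assemble argv for `challenge-cli.py upload-submission` (illustrative only)."""
    # Assumption: a new image (--image) and an existing Artifact ID (--artifact)
    # are mutually exclusive ways to specify the Docker image.
    if (image is None) == (artifact is None):
        raise ValueError("provide exactly one of image or artifact")
    # Assumption: a new volume upload and an existing Model ID are alternatives.
    if volume is not None and model is not None:
        raise ValueError("provide at most one of volume or model")
    if volume is not None and volume_mount is None:
        raise ValueError("volume requires volume_mount")

    args = ["python3", "challenge-cli.py", "upload-submission",
            "--account", account, "--name", name]
    optional = [
        ("--image", image),
        ("--artifact", artifact),
        ("--endpoint", endpoint),
        ("--volume", volume),
        ("--volume-mount", volume_mount),
        ("--model", model),
    ]
    for flag, value in optional:
        if value is not None:
            args += [flag, value]
    return args
```

The resulting list can be passed to subprocess.run() together with an environment that contains DYFF_API_TOKEN.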

Submit your system for evaluation

After uploading the submission files, you create the actual Submission resource with:

DYFF_API_TOKEN=<your token> python3 challenge-cli.py submit [OPTIONS]

When submitting, provide the ID of the InferenceService you created in the previous step with --service. Also provide your Account ID with --account, your Team ID for the challenge with --team, and the ID of the Task that you're submitting to with --task.
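If you script your submissions, a sketch like the following keeps the access token out of the command line by passing it through the environment, as the examples above do. The function names and the placeholder IDs are hypothetical; only the flags and the DYFF_API_TOKEN variable come from this section.

```python
import os
from typing import Mapping, Optional


def build_submit_command(account: str, team: str, task: str, service: str) -> list[str]:
    """Assemble argv for `challenge-cli.py submit` from the four required IDs."""
    return [
        "python3", "challenge-cli.py", "submit",
        "--account", account,
        "--team", team,
        "--task", task,
        "--service", service,
    ]


def build_cli_env(token: str, base_env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Copy the environment and add the token, leaving the caller's environment untouched."""
    env = dict(os.environ if base_env is None else base_env)
    env["DYFF_API_TOKEN"] = token
    return env
```

To run the command, pass both values to the standard library: subprocess.run(command, env=env, check=True).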

Submitting a Docker HuggingFace Space

If you can't build a Docker image yourself, or if the steps above seem too confusing, you can make a submission without interacting with the Dyff API directly by submitting from a HuggingFace Space. HF Spaces that use the Docker SDK will build a Docker image from the contents of the repository associated with the Space when the Space is run. You can then grant the Dyff platform permission to pull this image and use a web form to trigger a new submission.

These are the steps to prepare a HF Space for making submissions to the challenge:

  1. Create a new HuggingFace Organization (not a user account) for your challenge team. The length of your combined Organization name + Space name must be less than 47 characters due to a limitation of the HuggingFace API.
  2. Create a new Space within your Organization. The Space must use the Docker SDK. Private Spaces are OK and they will work with the submission process. Choose a Space name that keeps the combined Organization name + Space name within the length limit described in step 1.
  3. Create a file called DYFF_TEAM in the root directory of your HF Space. The contents of the file should be your Team ID (not your Account ID). This file allows our infrastructure to verify that your Team controls this HF Space.
  4. Create a Dockerfile in your Space that builds your challenge submission image.
  5. Run the Space; this will build the Docker image.
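Two of the steps above are easy to get wrong, so here is a small Python sketch that checks the name-length limit and writes the DYFF_TEAM file. The function names are hypothetical. The sketch assumes that only the characters of the two names count toward the limit (no separator), and that the file should contain the Team ID and nothing else.

```python
from pathlib import Path

# The combined Organization name + Space name must be shorter than this.
NAME_LENGTH_LIMIT = 47


def names_fit_limit(organization: str, space: str) -> bool:
    """Return True if the combined name length is less than 47 characters."""
    # Assumption: a separator between the names is not counted.
    return len(organization) + len(space) < NAME_LENGTH_LIMIT


def write_team_file(space_root: Path, team_id: str) -> Path:
    """Create DYFF_TEAM in the root of the Space repository with the Team ID as its content."""
    path = space_root / "DYFF_TEAM"
    path.write_text(team_id.strip(), encoding="utf-8")
    return path
```

Running the length check before you create the Organization avoids having to rename it later.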

To make a challenge submission from your Space:

  1. Add the official SAFE Challenge user account as a Member of your organization with read permissions. Make sure you are adding the correct user account; the account name is safe-challenge-2025-submissions. This grants permission to our infrastructure to pull the Docker image built by your Space.
  2. When you're ready to submit, use the submission web form and enter the URL of your Space and the branch that you want to submit.

Handling large models

There is a size limitation on Space repositories. If your submission contains large files (such as neural network weights), it may be too large to store in the space. In this case, you need to fetch your files from somewhere else during the Docker build process.

This means that your Dockerfile should contain something like this:

COPY download-my-model.sh ./ 
RUN ./download-my-model.sh

One convenient option is to create a separate HuggingFace Model repository and use git clone in your Dockerfile to fetch the repository files.

Handling private models

If access credentials are required to download your model files, you should provide them using the Secrets feature of HuggingFace Spaces. Do not hard-code credentials in your Dockerfile or anywhere else in your Space or Organization!

Access credentials are necessary if you want to clone a private HuggingFace Model repository during your Docker build process.

Access the secrets at build time as described in the Secrets > Buildtime section of the HuggingFace Docker Spaces documentation. Remember that you can't download files at run-time because your system will not have access to the Internet.
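If your download step is written in Python, a helper like this can read a build-time secret. This is a sketch under an assumption: that the secret is mounted as a file under /run/secrets/<name> for the duration of a RUN step, which is how Docker build secrets are commonly exposed. Check the HuggingFace documentation for the exact mount syntax. The function name and the secret name used in the usage note are hypothetical.

```python
from pathlib import Path


def read_build_secret(name: str, secrets_dir: str = "/run/secrets") -> str:
    """Read a secret that the build mounted as a file, without echoing it anywhere."""
    secret_path = Path(secrets_dir) / name
    if not secret_path.is_file():
        raise FileNotFoundError(f"Build secret not mounted: {secret_path}")
    # Strip the trailing newline that editors and web forms often add.
    return secret_path.read_text(encoding="utf-8").strip()
```

For example, read_build_secret("HF_TOKEN") would return the token to pass to your download code. Avoid printing the value or writing it into an image layer.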

How to implement a detector

To implement a new detector that you can submit to the challenge, you need to implement an HTTP server that serves the required JSON API for inference requests. This repository contains a template that you can use as a starting point for implementing a detector in Python. You should be able to adapt this template easily to support common model formats such as neural networks built with PyTorch.

You are also free to build detectors with any other technologies and software stacks that you want, but you may have to figure out packaging on your own. All that's required of your submission is that it runs in a Docker container and that it supports the required inference API.

Quick Start

Install uv: https://docs.astral.sh/uv/getting-started/installation/

Local development:

# Install dependencies
make setup
source venv/bin/activate

# Download the example model
make download

# Run it
make serve

In a second terminal:

# Process an example input
./prompt.sh cat.json

The server runs on http://127.0.0.1:8000. Check /docs for the interactive API documentation.

Docker:

# Build
make docker-build

# Run
make docker-run

The Docker container also runs the server at http://127.0.0.1:8000.

What Happens When You Start the Server

INFO: Starting ML Inference Service...
INFO: Initializing ResNet service: models/microsoft/resnet-18
INFO: Loading model from models/microsoft/resnet-18
INFO: Model loaded: 1000 classes
INFO: Startup completed successfully
INFO: Uvicorn running on http://0.0.0.0:8000

If you see "Model directory not found", check that your model files exist at the expected path with the full org/model structure.
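The path check behind that message can be sketched as follows. This is not the repository's actual code; the function name and the default models root are assumptions based on the directory layout described in this README.

```python
from pathlib import Path


def resolve_model_dir(model_name: str, models_root: str = "models") -> Path:
    """Map an org/model name such as 'microsoft/resnet-18' to its local directory."""
    model_dir = Path(models_root) / model_name
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    return model_dir
```

Calling a check like this before loading weights fails fast with a readable path, which matters because the container cannot download missing files during evaluation.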

Testing the API

By default, the server serves the inference API at /predict:

# Using curl
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "image": {
      "mediaType": "image/jpeg",
      "data": "<base64-encoded-image-data>"
    }
  }'

Example response:

{
  "logprobs": [-0.859380304813385,-1.2701971530914307,-2.1918208599090576,-1.69235098361969],
  "localizationMask": {
    "mediaType":"image/png",
    "data":"iVBORw0KGgoAAAANSUhEUgAAA8AAAAKDAQAAAAD9Fl5AAAAAu0lEQVR4nO3NsREAMAgDMWD/nZMVKEwn1T5/FQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMCl3g5f+HC24TRhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAj70gwKsTlmdBwAAAABJRU5ErkJggg=="
  }
}
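The same request can be made from Python using only the standard library. This is a minimal sketch: the function names are hypothetical, and the URL assumes the server is running locally on the default port.

```python
import base64
import json
import urllib.request


def build_predict_request(image_bytes: bytes, media_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes in the JSON structure that /predict expects."""
    if media_type not in ("image/jpeg", "image/png"):
        raise ValueError(f"Unsupported media type: {media_type}")
    return {
        "image": {
            "mediaType": media_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }


def post_predict(payload: dict, url: str = "http://localhost:8000/predict") -> dict:
    """POST the payload as JSON and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, post_predict(build_predict_request(open("cat.jpg", "rb").read())) returns a dictionary with the logprobs and, if the detector provides one, the localizationMask (cat.jpg is a placeholder file name).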

Project Structure

example-submission/
├── main.py                        # Entry point
├── app/
│   ├── core/
│   │   ├── app.py                 # <= INSTANTIATE YOUR DETECTOR HERE
│   │   └── logging.py
│   ├── api/
│   │   ├── models.py              # Request/response schemas
│   │   ├── controllers.py         # Business logic
│   │   └── routes/
│   │       └── prediction.py      # POST /predict
│   └── services/
│       ├── base.py                # <= YOUR DETECTOR IMPLEMENTS THIS INTERFACE
│       └── inference.py           # Example service based on ResNet-18
├── models/
│   └── microsoft/
│       └── resnet-18/             # Model weights and config
├── scripts/
│   ├── model_download.bash        # Downloads resnet-18
│   ├── generate_test_datasets.py  # Creates test datasets
│   └── test_datasets.py           # Runs inference on test datasets
├── Dockerfile
├── .env.example                   # Environment config template
├── cat.json                       # An example /predict request object
├── makefile
├── prompt.sh                      # Script that makes a /predict request
├── requirements.cpu.in
├── requirements.cpu.txt
├── requirements.torch.cpu.in
├── requirements.torch.cpu.in
├── response.json                  # An example /predict response object
└──

How to Plug In Your Own Model

To integrate your model, implement the InferenceService abstract class defined in app/services/base.py. You can follow the example implementation in app/services/inference.py, which is based on ResNet-18. After implementing the required interface, instantiate your model in the lifespan() function in app/core/app.py, replacing the ResNetInferenceService instance.

Step 1: Create Your Service Class

# app/services/your_model_service.py
from app.services.base import InferenceService
from app.api.models import ImageRequest, PredictionResponse

class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
    def __init__(self, model_name: str):
        self.model_name = model_name
        self.model_path = f"models/{model_name}"
        self.model = None
        self._is_loaded = False

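    def load_model(self) -> None:
        """Load your model here. Called once at startup."""
        # load_your_model() is a placeholder for your framework's loading code.
        self.model = load_your_model(self.model_path)
        self._is_loaded = True

    def predict(self, request: ImageRequest) -> PredictionResponse:
        """Actual inference happens here."""
        # decode_base64_image() is a placeholder for your own decoding helper.
        image = decode_base64_image(request.image.data)
        result = self.model(image)

        # Derive these from `result`: 4 log-probabilities and an optional mask.
        logprobs = ...
        mask = ...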

        return PredictionResponse(
            logprobs=logprobs,
            localizationMask=mask,
        )

    @property
    def is_loaded(self) -> bool:
        return self._is_loaded

Step 2: Register Your Service

Open app/core/app.py and find the lifespan function:

# Change this line:
service = ResNetInferenceService(model_name="microsoft/resnet-18")

# To this:
service = YourModelService(...)

That's it. The /predict endpoint now serves your model.

Model Files

Put your model files under the models/ directory:

models/
└── your-org/
    └── your-model/
        ├── config.json
        ├── weights.bin
        └── (other files)

GPU inference

The default configuration in this repo runs the model on CPU and does not contain the necessary dependencies for using GPUs.

To enable GPU inference, you need to:

  1. Base your Docker image on an image that contains the CUDA system packages, such as one of the nvidia/cuda images. If you're using the nvidia/cuda images, you probably want one of the -runtime- tags, as the -devel- versions contain dependencies you probably don't need.
  2. Install the GPU version of PyTorch (or whichever framework you use).
  3. Use the PyTorch .to() function (or its equivalent in your framework) in the load_model() and predict() functions to move model weights and input data to and from the CUDA device.

Configuration

Settings are managed via environment variables or a .env file. See .env.example for all available options.

Default values:

  • APP_NAME: "ML Inference Service"
  • APP_VERSION: "0.1.0"
  • DEBUG: false
  • HOST: "0.0.0.0"
  • PORT: 8000
  • MODEL_NAME: "microsoft/resnet-18"

To customize:

# Copy the example
cp .env.example .env

# Edit values
vim .env

Or set environment variables directly:

export MODEL_NAME="google/vit-base-patch16-224"
uvicorn main:app --reload
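The precedence described here (an environment variable wins, otherwise the default applies) can be sketched with the standard library alone. The repository's actual settings loader is not shown in this README, so treat the class and function below as hypothetical; this sketch also does not read a .env file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Defaults copied from the list above."""
    app_name: str = "ML Inference Service"
    app_version: str = "0.1.0"
    debug: bool = False
    host: str = "0.0.0.0"
    port: int = 8000
    model_name: str = "microsoft/resnet-18"


def load_settings(environ: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to the defaults."""
    env = os.environ if environ is None else environ
    defaults = Settings()
    # Assumption: these spellings count as "true" for DEBUG.
    debug_raw = env.get("DEBUG", str(defaults.debug))
    return Settings(
        app_name=env.get("APP_NAME", defaults.app_name),
        app_version=env.get("APP_VERSION", defaults.app_version),
        debug=debug_raw.strip().lower() in ("1", "true", "yes"),
        host=env.get("HOST", defaults.host),
        port=int(env.get("PORT", defaults.port)),
        model_name=env.get("MODEL_NAME", defaults.model_name),
    )
```

Passing a plain dictionary instead of os.environ makes the loader easy to exercise in tests.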

API Reference

Endpoint: POST /predict

Request:

{
  "image": {
    "mediaType": "image/jpeg",  // or "image/png"
    "data": "<base64 string>"
  }
}

To decode a request, first convert .image.data from a base64 string to binary data (i.e., a Python bytes string), then interpret the binary data as image data of the type specified in .image.mediaType. The .image.mediaType will be either image/jpeg or image/png.
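The first half of that decoding step can be sketched with the standard library. The function name is hypothetical, and the check against the file signature is an extra safeguard that this README does not require.

```python
import base64

# Leading bytes that identify each supported format.
_SIGNATURES = {
    "image/jpeg": b"\xff\xd8\xff",
    "image/png": b"\x89PNG\r\n\x1a\n",
}


def decode_image_bytes(request: dict) -> tuple[str, bytes]:
    """Return (media type, raw image bytes) for a parsed /predict request body."""
    image = request["image"]
    media_type = image["mediaType"]
    if media_type not in _SIGNATURES:
        raise ValueError(f"Unsupported media type: {media_type}")
    # validate=True rejects characters outside the base64 alphabet.
    data = base64.b64decode(image["data"], validate=True)
    if not data.startswith(_SIGNATURES[media_type]):
        raise ValueError(f"Data does not look like {media_type}")
    return media_type, data
```

The returned bytes can then be handed to an image library, for example PIL.Image.open(io.BytesIO(data)) if you use Pillow.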

Response:

{
  "logprobs": [float],         // Log-probabilities of each label (length 4)
  "localizationMask": {        // [Optional] binary mask
    "mediaType": "image/png",  // Must be 'image/png'
    "data": "<base64 string>"  // Image data
  }
}

The .logprobs field must contain a list of floats of length 4. Each index in the list corresponds to the log-probability of the associated label. The possible labels are described in the app.api.models.Labels enumeration:

Natural = 0
FullySynthesized = 1
LocallyEdited = 2
LocallySynthesized = 3

The Synthesized labels mean that the image was partially or fully synthesized by a tool such as a generative image model. The LocallyEdited label means that the image was manipulated in some way other than by synthesizing content, such as by copying and pasting content from another image using image editing software.
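If your model produces raw scores (logits) for the four labels, a log-softmax turns them into log-probabilities. The sketch below uses plain Python for clarity. The Labels class mirrors the indices listed above and stands in for the repository's own enumeration. Whether the evaluation requires the values to be normalized is not stated here, so treat normalization as an assumption.

```python
import math
from enum import IntEnum


class Labels(IntEnum):
    """Stand-in for app.api.models.Labels, using the indices listed above."""
    Natural = 0
    FullySynthesized = 1
    LocallyEdited = 2
    LocallySynthesized = 3


def log_softmax(scores: list[float]) -> list[float]:
    """Convert raw scores to log-probabilities whose exponentials sum to 1."""
    # Subtracting the maximum first avoids overflow in exp().
    peak = max(scores)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_norm for s in scores]


def most_likely_label(logprobs: list[float]) -> Labels:
    """Return the label whose log-probability is highest."""
    return Labels(max(range(len(logprobs)), key=logprobs.__getitem__))
```

In PyTorch, torch.nn.functional.log_softmax performs the same normalization on tensors.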

The .localizationMask field is optional, but you should populate it if your detector is capable of localizing its detections. The mask is a binary (0/1) bitmap encoded as a PNG image. A non-zero value for a pixel means that the detector thinks that that pixel has been manipulated. A Python function to convert a numpy array to a PNG mask is provided in app.api.models.BinaryMask.from_numpy().
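To show what ends up in the data field, here is a standard-library sketch that encodes a 0/1 mask as a 1-bit grayscale PNG and wraps it in the response structure. It is an illustration and does not replace BinaryMask.from_numpy(), whose exact signature is not documented in this README. The function names are hypothetical, and the choice of a 1-bit grayscale encoding is an assumption.

```python
import base64
import struct
import zlib


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def mask_to_png(mask: list[list[int]]) -> bytes:
    """Encode rows of 0/1 values as a 1-bit grayscale PNG."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    if width == 0 or any(len(row) != width for row in mask):
        raise ValueError("mask must be a non-empty rectangle")
    raw = bytearray()
    for row in mask:
        raw.append(0)  # filter type 0 (no filtering) starts every scanline
        for start in range(0, width, 8):
            byte = 0
            # Pack 8 pixels per byte, leftmost pixel in the highest bit.
            for offset, value in enumerate(row[start:start + 8]):
                if value:
                    byte |= 0x80 >> offset
            raw.append(byte)
    # IHDR: width, height, bit depth 1, color type 0 (grayscale), defaults.
    ihdr = struct.pack(">IIBBBBB", width, height, 1, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(bytes(raw)))
        + _chunk(b"IEND", b"")
    )


def mask_to_response_field(mask: list[list[int]]) -> dict:
    """Wrap the PNG bytes in the structure used by .localizationMask."""
    data = base64.b64encode(mask_to_png(mask)).decode("ascii")
    return {"mediaType": "image/png", "data": data}
```

In the template, prefer the provided BinaryMask.from_numpy() helper so that your masks match what the evaluation code expects.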

Docs:

The server in this repository serves API docs at the following endpoints:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
  • OpenAPI JSON: http://localhost:8000/openapi.json

PyArrow Test Datasets

We've included a test dataset system for validating your model. It generates 100 standardized test datasets (25 in each of four categories) covering normal inputs, edge cases, performance benchmarks, and model comparisons.

Generate Datasets

python scripts/generate_test_datasets.py

This creates:

  • scripts/test_datasets/*.parquet - Test data (images, requests, expected responses)
  • scripts/test_datasets/*_metadata.json - Human-readable descriptions
  • scripts/test_datasets/datasets_summary.json - Overview of all datasets

Run Tests

# Start your service first
make serve

In another terminal:

# Quick test (5 samples per dataset)
python scripts/test_datasets.py --quick

# Full validation
python scripts/test_datasets.py

# Test specific category
python scripts/test_datasets.py --category edge_case

Dataset Categories (25 datasets each)

1. Standard Tests (standard_test_*.parquet)

  • Normal images: random patterns, shapes, gradients
  • Common sizes: 224x224, 256x256, 299x299, 384x384
  • Formats: JPEG, PNG
  • Purpose: Baseline validation

2. Edge Cases (edge_case_*.parquet)

  • Tiny images (32x32, 1x1)
  • Huge images (2048x2048)
  • Extreme aspect ratios (1000x50)
  • Corrupted data, malformed requests
  • Purpose: Test error handling

3. Performance Benchmarks (performance_test_*.parquet)

  • Batch sizes: 1, 5, 10, 25, 50, 100 images
  • Latency and throughput tracking
  • Purpose: Performance profiling

4. Model Comparisons (model_comparison_*.parquet)

  • Same inputs across different architectures
  • Models: ResNet-18/50, ViT, ConvNext, Swin
  • Purpose: Cross-model benchmarking

Test Output

DATASET TESTING SUMMARY
============================================================
Datasets tested: 100
Successful datasets: 95
Failed datasets: 5
Total samples: 1,247
Overall success rate: 87.3%
Test duration: 45.2s

Performance:
  Avg latency: 123.4ms
  Median latency: 98.7ms
  p95 latency: 342.1ms
  Max latency: 2,341.0ms
  Requests/sec: 27.6

Category breakdown:
  standard: 25 datasets, 94.2% avg success
  edge_case: 25 datasets, 76.8% avg success
  performance: 25 datasets, 91.1% avg success
  model_comparison: 25 datasets, 89.3% avg success
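If you collect your own timings, statistics like those in the Performance block can be computed as in the sketch below. The function name is hypothetical, and the nearest-rank definition of p95 is an assumption; the test script may compute its percentiles differently.

```python
import statistics


def summarize_latencies(latencies_ms: list[float]) -> dict[str, float]:
    """Compute average, median, 95th percentile, and maximum latency."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest value with at least 95% of the
    # samples at or below it. Integer arithmetic computes ceil(0.95 * n).
    rank = (95 * len(ordered) + 99) // 100
    return {
        "avg": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

A large gap between the median and the p95 value usually points to a few slow requests, such as the huge images in the edge-case datasets.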

Common Issues

Port 8000 already in use:

# Find what's using it
lsof -i :8000

# Or just use a different port
uvicorn main:app --port 8080
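On systems without lsof, a quick Python check can tell you whether something is already listening on a port. This is a minimal sketch with a hypothetical function name; it only detects TCP listeners that are reachable on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, port_in_use(8000) returning True before you start the server means another process already holds the port.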

Model not loading:

  • Check the path: models should be in models/<org>/<model-name>/
  • If you're trying to run the example ResNet-based model, make sure you ran make download to fetch the model weights.
  • Check logs for the exact error

Slow inference:

  • Inference runs on CPU by default
  • For GPU: install CUDA PyTorch and modify service to use GPU device
  • Consider using smaller models or quantization

Dyff Web Portal – Quick Start Guide

Signing In

  1. Obtain a Dyff API Key.
  2. Go to: https://app.dyff.io/home
  3. Click the Sign in button in the top-right corner.
  4. Select Sign in with key.
  5. Paste in your Dyff API Key.
  6. Click Verify.

Finding Your Submission

  1. After signing in, click Operator in the navigation bar.
  2. In the dropdown menu, click Submissions.

This will take you to the Submissions page, where you can see the status of all submissions associated with your account or team.

Using the Submissions Page

The Submissions page shows the detailed status of your submissions. You can:

  • Search by submission ID
    Use the Submission ID filter at the top of the table to find a specific submission (1).

  • Search by team ID
    Use the Team ID filter to find all submissions associated with a particular team (2).

To view details for a specific submission:

  1. Find the row for your submission.
  2. Click on the Status value for that submission (3).

This opens a detailed view where you can see information about:

  • Inference Service (1)
  • Challenge (2)
  • Evaluation (3)
  • Safety Case (4)

Evaluations

In the Evaluation section of a submission, you can:

  • View the raw JSON data in the Raw JSON tab.

Safety Case

In the Safety Case section, you can:

  • View logs and details related to the safety case for the given submission ID.

TL;DR

  • Use Submissions to see the overall status of your work.
  • You can find your submission by:
    • Entering your Submission ID, or
    • Entering your Team ID.
  • Click on the Status of a submission to see detailed information about its Inference Service, Challenge, Evaluation, and Safety Case.

License

Apache 2.0
