DeQA-Doc-Color: Document Image Color Quality Assessment
DeQA-Doc-Color is a vision-language model specialized in assessing the color quality of document images. It evaluates color fidelity, saturation, white balance, and color-related artifacts in scanned or photographed documents.
Model Family
This model is part of the DeQA-Doc family, which includes three specialized models:
| Model | Description | HuggingFace |
|---|---|---|
| DeQA-Doc-Overall | Overall document quality | mapo80/DeQA-Doc-Overall |
| DeQA-Doc-Color | Color quality assessment (this model) | mapo80/DeQA-Doc-Color |
| DeQA-Doc-Sharpness | Sharpness/clarity assessment | mapo80/DeQA-Doc-Sharpness |
Quick Start
import torch
from transformers import AutoModelForCausalLM
from PIL import Image
# Load the model
model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Score an image
image = Image.open("document.jpg").convert("RGB")
score = model.score([image])
print(f"Color Quality Score: {score.item():.2f} / 5.0")
What Does Color Quality Measure?
The color quality score evaluates:
- Color Fidelity: How accurately colors are reproduced
- White Balance: Neutral whites without color casts (e.g., yellow or blue tints)
- Saturation: Appropriate color intensity (not washed out or oversaturated)
- Color Artifacts: Absence of color bleeding, banding, or chromatic aberration
- Uniformity: Consistent color reproduction across the document
Score Interpretation
| Score Range | Quality Level | Typical Issues |
|---|---|---|
| 4.5 - 5.0 | Excellent | None visible; accurate color reproduction |
| 3.5 - <4.5 | Good | Minor color shifts, slight tinting |
| 2.5 - <3.5 | Fair | Noticeable color cast, uneven colors |
| 1.5 - <2.5 | Poor | Strong color distortion, washed out |
| 1.0 - <1.5 | Bad | Severe color problems, unusable |
Batch Processing
# Reuses `model` and `Image` from the Quick Start
images = [
    Image.open("doc1.jpg").convert("RGB"),
    Image.open("doc2.jpg").convert("RGB"),
    Image.open("doc3.jpg").convert("RGB"),
]
scores = model.score(images)
for i, score in enumerate(scores):
    print(f"Document {i+1} Color Score: {score.item():.2f} / 5.0")
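For larger collections, passing every image to `model.score` in a single call may exhaust GPU memory. A minimal sketch of splitting the work into chunks follows; the helper `chunked` and the batch size of 3 are hypothetical choices for illustration, not part of the model's API.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Illustrative only: file names stand in for loaded images.
paths = [f"doc{i}.jpg" for i in range(1, 8)]
for batch in chunked(paths, batch_size=3):
    print(batch)
```

Each batch of paths could then be opened with `Image.open(...).convert("RGB")` and passed to `model.score`; a suitable batch size depends on the available VRAM.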
Use Cases
- Scanner Calibration: Detect when scanners need color calibration
- Photo Document QA: Flag photos with poor lighting/white balance
- Color-Critical Documents: Verify color accuracy for maps, charts, branded materials
- Archive Preservation: Identify documents with color degradation
- Print Quality Control: Verify color reproduction in printed documents
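The scanner calibration use case can be sketched as a rolling check on recent scores. The class name, window size, and 3.5 threshold below are assumptions for illustration (3.5 is the lower bound of the "Good" band in the score table); none of them come from the model itself.

```python
from collections import deque
from statistics import mean


class CalibrationMonitor:
    """Track recent color scores and flag a sustained drop."""

    def __init__(self, window: int = 20, threshold: float = 3.5):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def add(self, score: float) -> bool:
        """Record a score; return True once a full window averages below the threshold."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and mean(self.scores) < self.threshold
```

In practice, each value returned by `model.score([image]).item()` would be fed to `add`, and a `True` result would prompt a manual check of the scanner.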
Example: Detect Color Issues
import torch
from transformers import AutoModelForCausalLM
from PIL import Image
model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
def diagnose_color_quality(image_path):
    img = Image.open(image_path).convert("RGB")
    score = model.score([img]).item()
    if score >= 4.5:
        diagnosis = "Excellent color quality"
    elif score >= 3.5:
        diagnosis = "Good - minor color issues"
    elif score >= 2.5:
        diagnosis = "Fair - consider color correction"
    elif score >= 1.5:
        diagnosis = "Poor - needs color correction or rescan"
    else:
        diagnosis = "Bad - severe color problems, rescan required"
    return score, diagnosis
score, diagnosis = diagnose_color_quality("scanned_document.jpg")
print(f"Score: {score:.2f}/5.0 - {diagnosis}")
Multi-Dimensional Quality Assessment
Combine with other DeQA-Doc models for comprehensive assessment:
import torch
from transformers import AutoModelForCausalLM
from PIL import Image
# Load all three models (each needs ~16 GB in fp16; see Hardware Requirements)
models = {
    "overall": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Overall", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
    "color": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Color", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
    "sharpness": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Sharpness", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
}
def full_quality_report(image_path):
    img = Image.open(image_path).convert("RGB")
    scores = {}
    for name, model in models.items():
        scores[name] = model.score([img]).item()
    return scores
report = full_quality_report("document.jpg")
print(f"Overall: {report['overall']:.2f}/5.0")
print(f"Color: {report['color']:.2f}/5.0")
print(f"Sharpness: {report['sharpness']:.2f}/5.0")
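Once the three scores are available, a simple gate can show which dimension is limiting. This is only a sketch: `weakest_dimension`, `failing_dimensions`, and the 3.5 cutoff are hypothetical, and the input is the kind of dictionary that `full_quality_report` returns.

```python
from typing import Dict, List, Tuple


def weakest_dimension(report: Dict[str, float]) -> Tuple[str, float]:
    """Return the (name, score) pair with the lowest score."""
    name = min(report, key=report.get)
    return name, report[name]


def failing_dimensions(report: Dict[str, float], cutoff: float = 3.5) -> List[str]:
    """List dimensions scoring below `cutoff`, lowest first."""
    below = [name for name, score in report.items() if score < cutoff]
    return sorted(below, key=report.get)


# Example values are made up for illustration, not model output.
example = {"overall": 3.9, "color": 2.8, "sharpness": 4.2}
print(weakest_dimension(example))
print(failing_dimensions(example))
```

Reporting the failing dimensions separately makes it easier to decide between color correction, a rescan, or no action.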
Model Architecture
- Base Model: mPLUG-Owl2 (LLaMA2-7B + ViT-L Vision Encoder)
- Vision Encoder: CLIP ViT-L/14 (1024 patch tokens at 448x448, which the Visual Abstractor compresses into a smaller set of visual tokens)
- Language Model: LLaMA2-7B
- Training: Full fine-tuning on document color quality datasets
- Input Resolution: Images are resized to 448x448 (with aspect ratio preservation)
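The 448x448 input and the patch size of 14 in ViT-L/14 imply a 32x32 patch grid. The sketch below works through that arithmetic and shows one common way to preserve aspect ratio: scale the longer side to 448, then pad the shorter side. The padding strategy and function names are assumptions; the model's own preprocessing, loaded via `trust_remote_code`, is authoritative.

```python
from typing import Tuple

TARGET = 448  # input resolution stated in the architecture list
PATCH = 14    # patch size of ViT-L/14


def fit_within_square(width: int, height: int, target: int = TARGET) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals `target`, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def padding_to_square(width: int, height: int, target: int = TARGET) -> Tuple[int, int]:
    """Return total (horizontal, vertical) padding needed to reach target x target."""
    new_w, new_h = fit_within_square(width, height, target)
    return target - new_w, target - new_h


patches_per_side = TARGET // PATCH
print(patches_per_side, patches_per_side ** 2)  # patch grid side and total patches
print(fit_within_square(2480, 3508))            # an A4 page scanned at 300 dpi
print(padding_to_square(2480, 3508))
```

Because a full page is reduced to at most 448 pixels on its longer side, fine color detail in small regions may not survive to the model input.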
Technical Details
| Property | Value |
|---|---|
| Model Size | ~16 GB (float16) |
| Parameters | ~7.2B |
| Input | RGB images (any resolution) |
| Output | Color quality score (1.0 - 5.0) |
| Inference | ~2-3 seconds per image on A100 |
Hardware Requirements
| Setup | VRAM Required | Recommended |
|---|---|---|
| Full precision (fp32) | ~32 GB | A100, H100 |
| Half precision (fp16) | ~16 GB | A100, A40, RTX 4090 |
| With CPU offload | ~8 GB GPU + RAM | RTX 3090, RTX 4080 |
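These VRAM figures follow roughly from parameter count times bytes per parameter, plus overhead for activations and buffers. A back-of-the-envelope sketch using the ~7.2B parameter count from the Technical Details table (overhead is not modeled here, and the function name is hypothetical):

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


params = 7.2e9  # approximate figure from the Technical Details table
for dtype in ("fp32", "fp16"):
    print(f"{dtype}: ~{weight_memory_gb(params, dtype):.1f} GB for weights")
```

The weights-only estimates come out somewhat below the table's figures, which is consistent with the table allowing for runtime overhead.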
Installation
pip install torch transformers accelerate pillow sentencepiece protobuf
Note: Use transformers>=4.36.0 for best compatibility.
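To confirm that the installed version meets that minimum, a small check using the standard library's `importlib.metadata` might look like this. The helper names are hypothetical, and the parser only handles plain numeric versions such as `4.36.0`, ignoring any pre-release suffix.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract leading numeric components, padded to three parts: '4.36' -> (4, 36, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def meets_minimum(installed: str, minimum: str = "4.36.0") -> bool:
    """Compare two dotted versions numerically."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("transformers")
if version is None:
    print("transformers is not installed")
else:
    print(version, "OK" if meets_minimum(version) else "older than 4.36.0")
```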
Limitations
- Optimized for document images (may not generalize to natural photos)
- Color assessment is relative to training data distribution
- Black & white documents may receive lower scores (use Overall model instead)
- Requires GPU with sufficient VRAM for efficient inference
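Because near-grayscale pages may score low on color for reasons unrelated to capture quality, it can help to detect them before choosing a model. A heuristic sketch: treat a page as grayscale when the R, G, and B values of almost every pixel nearly coincide. The function name, tolerance, and fraction are assumptions rather than tuned values.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]


def is_near_grayscale(pixels: Iterable[Pixel], tolerance: int = 12,
                      min_fraction: float = 0.98) -> bool:
    """True if at least `min_fraction` of pixels have max(R,G,B) - min(R,G,B) <= tolerance."""
    total = 0
    neutral = 0
    for r, g, b in pixels:
        total += 1
        if max(r, g, b) - min(r, g, b) <= tolerance:
            neutral += 1
    if total == 0:
        raise ValueError("no pixels supplied")
    return neutral / total >= min_fraction


# Synthetic pixels for illustration: a gray page, then one with a red stamp.
gray_page = [(200, 200, 198)] * 100
stamped_page = [(200, 200, 198)] * 90 + [(220, 30, 30)] * 10
print(is_near_grayscale(gray_page), is_near_grayscale(stamped_page))
```

With Pillow, the pixels could come from a downsampled copy of the image, for example `img.convert("RGB").resize((64, 64)).getdata()`, and pages flagged as grayscale could be routed to DeQA-Doc-Overall instead.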
Credits & Attribution
This model is based on the DeQA-Doc project by Junjie Gao et al., which won the Championship in the VQualA 2025 DIQA (Document Image Quality Assessment) Challenge.
Original Repository: https://github.com/Junjie-Gao19/DeQA-Doc
All credit for the research, training methodology, and model architecture goes to the original authors.
Citation
If you use this model in your research, please cite the original paper:
@inproceedings{deqadoc,
  title={{DeQA-Doc}: Adapting {DeQA-Score} to Document Image Quality Assessment},
  author={Gao, Junjie and Liu, Runze and Peng, Yingzhe and Yang, Shujian and Zhang, Jin and Yang, Kai and You, Zhiyuan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop},
  year={2025},
}
ArXiv: https://arxiv.org/abs/2507.12796
License
Apache 2.0
Related Models
- DeQA-Doc-Overall - Overall quality assessment
- DeQA-Doc-Sharpness - Sharpness assessment