← Back to Projects
🚗
GPT-4 Vision Fine-Tuning for Vehicle Detection
Fine-tuned GPT-4 Vision for vehicle make/model/year detection — 74% classification accuracy, enabling downstream automated price estimation.
OpenAI GPT-4 VisionPythonComputer VisionFine-Tuning
Overview
Fine-tuned GPT-4 Vision on a curated dataset of vehicle images to classify make, model, and year — achieving 74% top-1 accuracy on a held-out test set. The model feeds a downstream automated price estimation pipeline, replacing manual vehicle identification.
Problem
The client needed to automate vehicle valuation from photos submitted by users. Existing off-the-shelf CV models lacked the nuanced understanding required to distinguish similar model years and trim levels. GPT-4 Vision’s visual reasoning capability made it the right tool for this domain.
Approach
- Dataset curation — sourced and cleaned ~8,000 labelled vehicle images across 200+ make/model/year classes
- Prompt engineering baseline — established zero-shot GPT-4V accuracy (~52%) as comparison baseline
- Fine-tuning — used OpenAI’s fine-tuning API with structured output schema enforcing make/model/year fields
- Evaluation — benchmarked on 1,200-image held-out test set; achieved 74% top-1 accuracy
Key Results
- 74% classification accuracy on vehicle make/model/year detection
- +22 percentage points improvement over zero-shot baseline
- Downstream price estimation error reduced by ~35%
Technical Highlights
- OpenAI fine-tuning API with vision-capable model checkpoints
- Structured JSON output schema with confidence scores
- Python data pipeline for image preprocessing and JSONL training set generation
- Custom evaluation harness with per-class confusion matrix analysis