🚗

GPT-4 Vision Fine-Tuning for Vehicle Detection

Fine-tuned GPT-4 Vision for vehicle make/model/year detection | 74% classification accuracy, enabling downstream automated price estimation.

OpenAI GPT-4 VisionPythonComputer VisionFine-Tuning

Overview

Fine-tuned GPT-4 Vision on a curated dataset of vehicle images to classify make, model, and year, achieving 74% top-1 accuracy on a held-out test set. The model feeds a downstream automated price estimation pipeline, replacing manual vehicle identification.

Problem

The client needed to automate vehicle valuation from photos submitted by users. Existing off-the-shelf CV models lacked the nuanced understanding required to distinguish similar model years and trim levels. GPT-4 Vision’s visual reasoning capability made it the right tool for this domain.

Approach

Dataset curation: sourced and cleaned ~8,000 labelled vehicle images across 200+ make/model/year classes
Prompt engineering baseline: established zero-shot GPT-4V accuracy (~52%) as comparison baseline
Fine-tuning: used OpenAI’s fine-tuning API with structured output schema enforcing make/model/year fields
Evaluation: benchmarked on 1,200-image held-out test set; achieved 74% top-1 accuracy

Key Results

74% classification accuracy on vehicle make/model/year detection
+22 percentage points improvement over zero-shot baseline
Downstream price estimation error reduced by ~35%

Technical Highlights

OpenAI fine-tuning API with vision-capable model checkpoints
Structured JSON output schema with confidence scores
Python data pipeline for image preprocessing and JSONL training set generation
Custom evaluation harness with per-class confusion matrix analysis

← All Projects