← Back to Projects
🚗

GPT-4 Vision Fine-Tuning for Vehicle Detection

Fine-tuned GPT-4 Vision for vehicle make/model/year detection — 74% classification accuracy, enabling downstream automated price estimation.

OpenAI GPT-4 VisionPythonComputer VisionFine-Tuning

Overview

Fine-tuned GPT-4 Vision on a curated dataset of vehicle images to classify make, model, and year — achieving 74% top-1 accuracy on a held-out test set. The model feeds a downstream automated price estimation pipeline, replacing manual vehicle identification.

Problem

The client needed to automate vehicle valuation from photos submitted by users. Existing off-the-shelf CV models lacked the nuanced understanding required to distinguish similar model years and trim levels. GPT-4 Vision’s visual reasoning capability made it the right tool for this domain.

Approach

  1. Dataset curation — sourced and cleaned ~8,000 labelled vehicle images across 200+ make/model/year classes
  2. Prompt engineering baseline — established zero-shot GPT-4V accuracy (~52%) as comparison baseline
  3. Fine-tuning — used OpenAI’s fine-tuning API with structured output schema enforcing make/model/year fields
  4. Evaluation — benchmarked on 1,200-image held-out test set; achieved 74% top-1 accuracy

Key Results

  • 74% classification accuracy on vehicle make/model/year detection
  • +22 percentage points improvement over zero-shot baseline
  • Downstream price estimation error reduced by ~35%

Technical Highlights

  • OpenAI fine-tuning API with vision-capable model checkpoints
  • Structured JSON output schema with confidence scores
  • Python data pipeline for image preprocessing and JSONL training set generation
  • Custom evaluation harness with per-class confusion matrix analysis