31st January 2025 OpenAI’s o3-mini is out today. As with other o-series models it’s a slightly difficult one to evaluate—we now need to decide if a prompt is best run using GPT-4o, o1, o3-mini or (if we have access) o1 Pro. Confusing matters further, the benchmarks in the o3-mini system card (PDF) aren’t a universal win for o3-mini across all categories. It generally benchmarks higher than GPT-4o