OpenAI’s new AI agent benchmark competition

Oct 11, 2024

OpenAI introduced MLE-bench, a new benchmark designed to evaluate how well AI agents perform on real-world machine learning engineering tasks...