This research explores the potential of GPT-4, Claude 3 Opus, Mistral 8x22B and Gemini 1.5 Pro to automate business optimisation and decision-making, tasks that have traditionally relied on human expertise.

The research uses a dataset of 16 real-world business problems spanning four optimisation classes: Linear Programming (LP), Integer Programming (IP), Mixed-Integer Programming (MIP), and Nonlinear Programming (NLP). It compares two pipelines, single-step prompting and Chain-of-Thought prompting, in each of which the large language model (LLM) generates Pyomo solver code to find optimal solutions. Although the LLMs show an ability to model problems and translate them into solver code, significant issues were identified in formulating accurate constraints and producing consistent results. These challenges must be addressed before LLMs can be used reliably for optimisation tasks. Overall, Gemini 1.5 Pro delivered the best performance, followed by Claude 3 Opus, then Mistral 8x22B, and finally GPT-4.
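
The difference between the two pipelines can be illustrated with a minimal Python sketch. It is not the prompts or code used in the research: the function names, the prompt wording, the split of the Chain-of-Thought pipeline into exactly two steps, and the `LLM` callable (a stand-in for whichever model client is used) are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical stand-in for a model client: prompt text in, reply text out.
LLM = Callable[[str], str]


def single_step(problem: str, llm: LLM) -> str:
    """One prompt: go straight from the problem description to Pyomo code."""
    prompt = (
        "Write Pyomo code that models and solves this optimisation problem.\n\n"
        f"Problem:\n{problem}"
    )
    return llm(prompt)


def chain_of_thought(problem: str, llm: LLM) -> str:
    """Two prompts (an assumed split): formulate the model, then translate it."""
    # Step 1: ask for the mathematical formulation only.
    formulation = llm(
        "Formulate this problem mathematically. List the decision variables, "
        "the objective function and every constraint.\n\n"
        f"Problem:\n{problem}"
    )
    # Step 2: feed the formulation back and ask for the Pyomo translation.
    return llm(
        "Translate this mathematical model into Pyomo code.\n\n"
        f"Model:\n{formulation}"
    )
```

In this sketch the Chain-of-Thought pipeline exposes the intermediate formulation, so an error in a constraint can be inspected before any code is generated, whereas the single-step pipeline returns only the final code.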