Interview with OR2025 Semi-Plenary Speaker Prof. Manuel López-Ibáñez

This interview was conducted by Prof. Kevin Tierney on the occasion of my semi-plenary talk “The Future of Optimisation Research” at the OR2025 Conference in Bielefeld. The interview was published in OR News Nr. 84, August 2025.

Kevin Tierney (KT): Tell us a little about your talk.

Manuel López-Ibáñez (MLI): Optimisation has been widely successful at solving difficult problems and it is widely used in practical applications. Nevertheless, there are several research practices that are holding optimisation research back. In addition, recent developments in commercial solvers and machine learning, Large Language Models (LLMs) in particular, are posing new challenges as well as new opportunities.

KT: In the abstract of your talk, you bring up some important criticisms of the field right now. Let’s talk first about novelty. What is the challenge facing the field of OR and what needs to be done to improve the novelty of our approaches?

MLI: There are two main aspects to novelty in optimisation research: novel methods and novel applications. With regard to novel methods, the most egregious example is the ever-increasing plethora of metaphor-inspired metaheuristics that lack any novelty (Aranha et al. 2022). But there are also less obvious examples of the same idea being rediscovered over and over again. In addition, it has been pointed out that a great deal of optimisation research involves a kind of “up-the-wall game”, where the goal is to “outperform” a selected number of “competitors” on well-known artificial “benchmarks”, without any understanding of why the new method performs “better” than the previous ones. There is significant research to be done in synthesising and formalising algorithmic ideas in optimisation, such that algorithms are classified not by an arbitrary “nickname”, but by the core ideas and features of their algorithmic components.

With respect to novel applications, it is not difficult nowadays to find papers that address a very particular variant of some problem, without any intention of generalising the peculiarities of the problem to a broader problem class or finding connections with similar problems. In the most blatant cases, the actual model is nearly indistinguishable from problems that are more widely studied, but the authors lean on qualitative peculiarities to claim a degree of novelty and avoid the work of finding connections or comparing with previous works. The opposite of this practice is exemplified by works that systematically categorise the variants of a problem class by their features and provide insights on how those features should guide the design of solution methods (Vidal et al. 2014).

KT: Regarding reproducibility, what should authors be focusing on to improve this?

MLI: “Reproducibility” is an overloaded term that has multiple meanings. In López-Ibáñez, Branke, and Paquete (2021), we distinguish four levels of reproducibility and argue that the lowest level, Repeatability (i.e., exactly repeating the original experiment generates precisely the same results), is useful, but the scientific goal should be Replicability (i.e., it is possible to independently repeat the experiment and reach the same conclusion). Although an important part of optimisation research is mathematical, a large part is empirical and, without reproducibility, empirical research is not scientific. But there are other benefits of making your research reproducible. Making code and data publicly available helps a research field progress more rapidly by allowing newcomers to easily build upon the work of others and more quickly identify errors and misconceptions. Completely reproducible experimental setups help in identifying biases or “overfitting” that lead to conclusions failing to generalise beyond the specific conditions evaluated in a study. Platforms like Zenodo (https://zenodo.org) allow researchers to upload not only the specific code used for an experiment, but also the data generated and the code and data necessary for reproducing the analysis. However, the main obstacles to reproducibility are cultural rather than technical, and the main drivers of change would be the policies of research funding bodies and of the large and influential journals.

KT: Can you give some examples of questionable research practices? What should we be watching out for?

MLI: There are three papers that should be mandatory reading for anyone doing empirical research in optimisation (full citations are provided below):

  1. “A Theoretician’s Guide to the Experimental Analysis of Algorithms” by Johnson (2002).
  2. “How Not To Do It” by Gent et al. (1997).
  3. “Testing Heuristics: We Have It All Wrong” by Hooker (1996).

Unfortunately, not only do we still see in recent publications many of the pitfalls that those papers warn about, but some of those pitfalls have become easier to fall into and harder to detect. For example, the wide availability of computational resources means that it is easier than ever to carry out a large amount of manual or automatic fine-tuning of algorithmic parameters and never report any of the details (much less make the tuning fully reproducible). Even in the case of deterministic and exact methods, it is well known that irrelevant changes to the input data may lead to large differences in behaviour, so it has become easier to (intentionally or not) “play” with the data and the methods until they “work”. Algorithms may also have a “structural bias” towards some regions of the search space (Kononova et al. 2015), and thus possibly an advantage on particular benchmarks. Such structural bias is much harder to identify in complex algorithms applied to complex problems.

Finally, there are some questionable practices that are common to many scientific fields but increasingly prevalent in optimisation. One example is citation inflation, where authors cite recent papers, sometimes by colleagues or themselves, when referencing well-known concepts, instead of the seminal papers that actually proposed the concept. Some papers cite a long list of only tangentially related works, a practice that may be intentionally motivated by citation cartels or unintentionally by “bandwagon citations” (i.e., citing particular papers because others have cited the same papers in similar contexts). Such practices have resulted in astonishingly inflated citation counts for papers and authors that you have probably never heard of, yet which are among the most cited works and researchers in the optimisation field. Maybe the answer is that we should not care about citations at all, but I am not sure that ignoring the problem of citation inflation is the best course of action.

KT: The papers you cite are all quite old; if people are still not following these guidelines 25+ years later, clearly something fundamental in our field needs to change, does it not?

MLI: There has been some progress. Some journals (Journal of Heuristics 2015) have adopted policies aimed at enforcing some of those guidelines. Papers that perform statistical analysis and report the details of the parameter tuning process are not uncommon nowadays. And I hope that a majority of those attending my talk at OR 25 will be aware of most of those issues. Thus, it is not my goal to discuss those well-known pitfalls and questionable research practices in detail. My argument is that there is an increasingly large amount of research published in optimisation that suffers from those issues and ignoring the problem will not make it go away.

KT: In your talk’s abstract, you mention the idea of “human-in-the-optimisation-loop.” What is new here and what opportunities does this paradigm hold?

MLI: Digital transformation is reaching more businesses and organisations, with processes becoming increasingly simulated and automated. Nevertheless, there is still a need for a human to make decisions about trade-offs among conflicting objectives and constraints. Interactive optimisation, either for tackling multi-objective problems or problems with unspecified objectives, is not a new concept. However, in an increasingly dynamic and digital environment, there is a need for systems with the ability to dynamically define objectives and constraints, and for solution methods capable of adapting to those changes. Techniques from preference modelling and machine learning are able to quickly learn user preferences and guide the search towards the most preferred solutions without the need to generate a complete approximation of the Pareto frontier. In addition, given the advances in LLMs and the availability of flexible solvers, it is not unthinkable that end-users with no knowledge of optimisation will be able to interactively describe their problem in natural language and design a solution method that suits their needs. Few works in optimisation consider the interaction with end-users, and most of them model such users as unboundedly rational agents, free of cognitive biases and guided by a static utility function. Optimisation researchers need to start talking with colleagues in behavioural OR and cognitive science working on decision-making in order to incorporate more realistic models of human interaction into the design of their optimisation methods.

KT: Regarding LLMs, what promise and pitfalls do you think these offer for OR?

MLI: The publication of FunSearch (Romera-Paredes et al. 2024) and AlphaEvolve (Novikov et al. 2025) has shown that LLMs are capable of generating human-level heuristics for well-known optimisation problems. A large part of the optimisation community is understandably not in awe of these results, as generative hyper-heuristics and genetic programming have been able to do the same for at least a decade and with a much smaller carbon footprint. But a major difference with respect to those techniques is the potential and flexibility of the LLM approach. Once the LLM system is set up, it becomes relatively easy to generate heuristics for any problem in any programming language. In recent work that I will present in the talk, we are exploring the capabilities and limitations of LLMs when generating heuristics for increasingly complex problems. They turned out to be more capable than I expected!

Beyond their ability to generate heuristics, LLMs also have the potential to become a user-friendly interface to models and solvers, by converting natural-language descriptions of objectives and constraints into code that models the problem and invokes the appropriate solver. It is easy to construct examples where this works sufficiently well, but it is equally easy to construct examples where the LLM produces subtle errors or complete nonsense. In the hands of an optimisation expert, it can be a helpful tool (when it works). But without an expert capable of ensuring the correctness of the output, I would not risk deploying anything generated by an LLM in any serious application.

Nevertheless, there is a risk that good “marketing” wins over good science, by providing a biased picture of what LLMs are actually capable of doing relative to other well-known methods. Such a biased picture results both in “AI hype”, which influences decisions about research funding and promotion, and in “AI scepticism”, which leads researchers to avoid LLMs as a method worthy of study.

KT: Having started your career in Computer Science, when did you find Operations Research and what drew you to it?

MLI: The first optimisation problem that caught my attention was the bi-objective quadratic assignment problem (QAP), which I studied during my Diplomarbeit at the University of Darmstadt under the supervision of Thomas Stützle and Luís Paquete. The QAP has a rather simple formulation, but it is somewhat hard to optimise, even when using heuristics. The bi-objective variant has some interesting properties. For example, the correlation between the objectives and the structure of the flow matrices leads to quite different Pareto fronts and algorithmic behaviours. In addition, my PhD work on optimisation applied to water distribution networks showed me the potential of optimisation for real-world impact, as well as the challenges to realising that potential.

KT: What’s your favorite method in OR and why?

MLI: Difficult to say… there are so many interesting ideas in OR!

One of my favourite ideas in optimisation is the core idea of what is commonly known as “Iterated local search” (Lourenço, Martin, and Stützle 2003), that is, alternating small changes that quickly reach a local optimum with large changes that escape the basin of attraction of the current local optimum. It illustrates the fundamental exploration versus exploitation trade-off that arises in optimisation, but also in many other domains (decision-making, machine learning, etc.). This very simple idea keeps being applied within optimisation in different forms (Martin, Otto, and Felten 1991; Wales and Doye 1997; Loshchilov and Hutter 2017). It is also easy to formulate many other methods as variants or generalisations of this idea (Mascia et al. 2014): (G)VNS, Large Neighbourhood Search, iterated greedy, ruin-and-recreate, and hybrids of a global optimiser (e.g., an evolutionary algorithm) and a local one, among others.
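
To make the idea concrete, here is a minimal Python sketch of an iterated local search on a toy problem; it is not taken from any of the cited works, and the objective function, the ±1 neighbourhood and the perturbation strength are invented purely for illustration:

```python
import random

def f(x):
    """A toy multimodal objective with many local optima (minimisation)."""
    return (x - 50) ** 2 / 100.0 + 10 * (x % 7)

def local_search(x, lo=0, hi=100):
    """Greedy hill-climbing over the +/-1 neighbourhood until no neighbour improves."""
    while True:
        best = min((n for n in (x - 1, x + 1) if lo <= n <= hi), key=f)
        if f(best) >= f(x):
            return x
        x = best

def perturb(x, strength=10, lo=0, hi=100):
    """A large random jump intended to escape the current basin of attraction."""
    return min(hi, max(lo, x + random.randint(-strength, strength)))

def iterated_local_search(iterations=100, seed=42):
    """Alternate local search (exploitation) with perturbations (exploration)."""
    random.seed(seed)
    current = local_search(random.randint(0, 100))
    best = current
    for _ in range(iterations):
        candidate = local_search(perturb(current))
        if f(candidate) < f(current):   # simplest acceptance criterion: only improvements
            current = candidate
        if f(candidate) < f(best):
            best = candidate
    return best, f(best)

if __name__ == "__main__":
    # With this toy setup, the search typically ends at the global optimum x = 49.
    print(iterated_local_search())
```

The acceptance criterion is deliberately the simplest possible one; replacing it, or the perturbation, is exactly the kind of design choice that distinguishes the variants and generalisations mentioned above.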

A less-known idea that I find rather fascinating is the empirical attainment function (EAF) (Grunert da Fonseca and Fonseca 2010), which describes the probabilistic distribution of sets of multi-dimensional vectors, and which has applications in analysing the behaviour of multi-objective optimisers as well as the anytime behaviour of single-objective optimisers (López-Ibáñez et al. 2025). I believe there are still a lot of ideas and applications to be unlocked from the EAF.
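
For readers unfamiliar with it, the empirical estimate is simple to state: for a point z in objective space, the EAF is the fraction of independent runs whose output set contains at least one point that weakly dominates z. The following minimal Python sketch (assuming minimisation, and using randomly generated point sets in place of real optimiser output, purely for illustration) computes this estimate:

```python
import numpy as np

def attains(point_set, z):
    """True if at least one point in point_set weakly dominates z (minimisation)."""
    return bool(np.any(np.all(point_set <= z, axis=1)))

def eaf(z, runs):
    """Empirical attainment function at z: fraction of runs whose output attains z."""
    return sum(attains(run, z) for run in runs) / len(runs)

# Example: 5 independent "runs", each returning 10 bi-objective points in [0, 1]^2.
rng = np.random.default_rng(0)
runs = [rng.random((10, 2)) for _ in range(5)]
print(eaf(np.array([0.5, 0.5]), runs))  # estimated probability of attaining (0.5, 0.5)
```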

KT: How do you achieve real-world impact with your methods? What tips would you give to other researchers?

MLI: This is just my personal experience rather than any kind of recipe for “success”.

MLI: I have found it very beneficial to make my code available and easy to use by others. This goes beyond publishing the code to enable reproducibility. As an example, I am the main developer of the “irace” tool for automatic algorithm configuration (https://mlopez-ibanez.github.io/irace/). I have been developing irace for more than 13 years now and I still use it in my own research and teaching. It is especially rewarding when people tell me irace has helped them in some way. I know of researchers who learned about automatic algorithm configuration via irace, but ended up using a different tool. And that is OK, because irace was still useful for allowing me to teach about the benefits of automatic algorithm configuration.

Another “tip” is to learn about and work with a variety of optimisation approaches and techniques. It is often repeated yet still true that: “if the only tool you have is a hammer, you will treat everything as if it were a nail”.

I would also recommend talking to businesses and organisations that are facing optimisation problems, even if they are not immediately interested in research collaborations. This can take the form of knowledge exchange programmes, innovation factories, pro-bono or paid consulting, or simply having a chat with students in Executive Education or speakers at business events. These connections with businesses and non-profit organisations are, to me, one of the main benefits of working at Alliance Manchester Business School. Even if those connections do not directly result in a research publication, I believe such interactions provide insights into the challenges and barriers to solving practical real-world problems.

KT: As an active researcher in the area of algorithm configuration, I am especially impressed by your long commitment to providing irace to the community; it has certainly made an impact.

Finally, you are an editor-in-chief of ACM Transactions on Evolutionary Learning and Optimization, what is your advice to people working on the border of learning and optimization for publishing their work?

MLI: First, the good news: it is becoming easier. More OR journals are accepting papers in the intersection of learning and optimisation. Nevertheless, it is also a good idea to try to publish in AI/ML conferences or conferences that have reviewers with expertise in ML (such as ACM GECCO), to get feedback on your learning approach. Studies that tackle optimisation problems are now routinely presented at AI/ML conferences, including top ones such as NeurIPS.

The bad news is that it is not always easy. Research practices such as actively avoiding overfitting, fully-specified hyper-parameter tuning and empirical reproducibility are not (yet) routine in optimisation research (as discussed above), while they are often considered minimum requirements for acceptance in ML. I have seen expert reviewers from the ML field refuse to review optimisation papers simply because the code and data were not available or there was no clear separation between “training” and “testing” problem instances. On the other hand, some optimisation papers published in top AI/ML conferences would be rejected in OR journals due to tackling problems that are considered too small by today’s standards or lacking a comparison with the state-of-the-art methods for those problems. In my opinion, the best works are able to meet the standards of both research communities. Another mistake that I see sometimes in works combining learning and optimisation is the lack of validation of what the learning component is actually learning and how it contributes to solving the problem (beyond possibly introducing random perturbations and thus, increasing exploration).

And, of course, works on the border of learning and optimisation are welcome at ACM TELO (https://telo.acm.org)!

KT: Thanks for answering my questions and I look forward to your semi-plenary at OR 25 in Bielefeld!

References

Aranha, C., C. L. Camacho Villalón, F. Campelo, et al. 2022. “Metaphor-Based Metaheuristics, a Call for Action: The Elephant in the Room.” Swarm Intelligence. https://doi.org/10.1007/s11721-021-00202-9.
Gent, I. P., S. A. Grant, E. MacIntyre, P. Prosser, P. Shaw, B. M. Smith, and T. Walsh. 1997. “How Not To Do It.” Technical Report. School of Computer Studies, University of Leeds.
Grunert da Fonseca, V., and C. M. Fonseca. 2010. “The Attainment-Function Approach to Stochastic Multiobjective Optimizer Assessment and Comparison.” Edited by T. Bartz-Beielstein, M. Chiarandini, L. Paquete, and M. Preuss. Experimental Methods for the Analysis of Optimization Algorithms. Berlin/Heidelberg: Springer.
Hooker, J. N. 1996. “Testing Heuristics: We Have It All Wrong.” Journal of Heuristics.
Johnson, D. S. 2002. “A Theoretician’s Guide to the Experimental Analysis of Algorithms.” Edited by M. H. Goldwasser, D. S. Johnson, and C. C. McGeoch. Data Structures, Near Neighbor Searches, and Methodology: Fifth and Sixth DIMACS Implementation Challenges. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. Providence, RI: American Mathematical Society.
Journal of Heuristics. 2015. “Policies on Heuristic Search Research.” https://link.springer.com/journal/10732/updates/17199246.
Kononova, A. V., D. W. Corne, P. De Wilde, V. Shneer, and F. Caraffini. 2015. “Structural Bias in Population-Based Algorithms.” Information Sciences. https://doi.org/10.1016/j.ins.2014.11.035.
López-Ibáñez, M., D. Vermetten, J. Dreo, and C. Doerr. 2025. “Using the Empirical Attainment Function for Analyzing Single-Objective Black-Box Optimization Algorithms.” IEEE Transactions on Evolutionary Computation. https://doi.org/10.1109/TEVC.2024.3462758.
López-Ibáñez, M., J. Branke, and L. Paquete. 2021. “Reproducibility in Evolutionary Computation.” ACM Transactions on Evolutionary Learning and Optimization. https://doi.org/10.1145/3466624.
Loshchilov, I., and F. Hutter. 2017. “SGDR: Stochastic Gradient Descent with Warm Restarts.” Proceedings of ICLR 2017.
Lourenço, H. R., O. Martin, and T. Stützle. 2003. “Iterated Local Search.” Handbook of Metaheuristics. International Series in Operations Research & Management Science. Kluwer Academic Publishers. https://doi.org/10.1007/0-306-48056-5_11.
Martin, O., S. W. Otto, and E. W. Felten. 1991. “Large-Step Markov Chains for the Traveling Salesman Problem.” Complex Systems.
Mascia, F., M. López-Ibáñez, J. Dubois-Lacoste, and T. Stützle. 2014. “Grammar-Based Generation of Stochastic Local Search Heuristics Through Automatic Algorithm Configuration Tools.” Computers & Operations Research. https://doi.org/10.1016/j.cor.2014.05.020.
Novikov, A., N. Vu, M. Eisenberger, et al. 2025. “AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery.” Google DeepMind. https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf.
Romera-Paredes, B., M. Barekatain, A. Novikov, et al. 2024. “Mathematical Discoveries from Program Search with Large Language Models.” Nature. https://doi.org/10.1038/s41586-023-06924-6.
Vidal, T., T. G. Crainic, M. Gendreau, and C. Prins. 2014. “A Unified Solution Framework for Multi-Attribute Vehicle Routing Problems.” European Journal of Operational Research.
Wales, D. J., and J. P. K. Doye. 1997. “Global Optimization by Basin-Hopping and the Lowest Energy Structures of Lennard-Jones Clusters Containing up to 110 Atoms.” Journal of Physical Chemistry A.