Algorithms with Predictions
By Michael Mitzenmacher and Sergei Vassilvitskii
Communications of the ACM, July 2022, Vol. 65, No. 7, Pages 33-35. DOI: 10.1145/3528087
The theoretical study of algorithms and data structures has been bolstered by worst-case analysis, where we prove bounds on the running time, space, approximation ratio, competitive ratio, or other measure that hold even in the worst case. Worst-case analysis has proven invaluable for understanding aspects of both the complexity and the practicality of algorithms, providing useful features like the ability to use algorithms as building blocks and subroutines with a clear picture of their worst-case performance. More and more, however, the limitations of worst-case analysis have become apparent, creating new challenges. In practice, we often do not face worst-case scenarios, and the question arises of how we can tune our algorithms to work even better on the kinds of instances we are likely to see, while ideally keeping a rigorous formal framework similar to the one we have developed through worst-case analysis.
A key issue is how we can define the subset of "instances we are likely to see." Here we look at a recent trend in research that draws on machine learning to answer this question. Machine learning is fundamentally about generalizing and predicting from small sets of examples, and so we model additional information about our algorithm's input as a "prediction" about our problem instance to guide and hopefully improve our algorithm. Of course, while ML performance has made tremendous strides in a short amount of time, ML predictions can be error-prone, with unexpected results, so we must take care in how much our algorithms trust their predictors. Also, while we suggest ML-based predictors, predictions really can come from anywhere, and simple predictors may not need sophisticated machine learning techniques. For example, just as yesterday's weather may be a good predictor of today's weather, if we are given a sequence of similar problems to solve, the solution from the last instance may be a good guide for the next.
What we want, then, is simply the best of both worlds. We seek algorithms augmented with predictions that are (these properties are made precise in the sketch after the list):
Consistent: when the predictions are good, the algorithms are near-optimal on a per-instance basis;
Robust: when the predictions are bad, the algorithms are near-optimal on a worst-case basis;
Smooth: the algorithm interpolates gracefully between the robust and consistent settings; and
Learnable: we can learn whatever we are trying to predict with sufficiently few examples.
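These four properties are often made precise along the following lines; the notation here is an illustrative sketch rather than something taken from the article. For a minimization problem, let ALG(I, ŷ) denote the algorithm's cost on instance I when given prediction ŷ, let OPT(I) denote the optimal cost, and let η(I, ŷ) be a problem-specific measure of the prediction's error. One then aims for a guarantee of the form

% Illustrative guarantee: cost is bounded by the better of a prediction-dependent
% factor c(eta) and a worst-case factor r, both relative to the optimum.
\[
  \mathrm{ALG}(I, \hat{y}) \;\le\; \min\bigl(\, c(\eta),\; r \,\bigr) \cdot \mathrm{OPT}(I),
\]

where c(·) is non-decreasing in the error η. Consistency asks that c(0) be close to 1, robustness asks that r be comparable to the best worst-case guarantee, and smoothness asks that c(η) grow gradually with η; learnability is the separate requirement that a good prediction ŷ can be learned from a reasonable number of sample instances.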
Our goal is a new approach that goes beyond worst-case analysis [14]. We identify the part of the problem space that a deployed algorithm is seeing and automatically tune its performance accordingly.
As a natural starting example, let us consider binary search with the addition of predictions. When looking for an element in a large sorted array, classical binary search compares the target with the middle element and then recurses on the appropriate half (see Figure 1). Consider, however, how we find a book in a bookstore or library. If we are looking for a novel by Isaac Asimov, we start searching near the beginning of the shelf and then look around, iteratively doubling our search radius if our initial guess was far off (see Figure 2). We can make this precise and show that there is an algorithm whose running time is logarithmic in the error of our initial guess (measured by how far off we are from the correct location), as opposed to logarithmic in the number of elements in the array, which is the standard result for binary search. Since the error is never larger than the size of the array, we obtain an algorithm that is consistent (small errors allow us to find the element in constant time) and robust (large errors recover the classical O(log n) result, albeit with a larger constant factor). ...
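To make the doubling idea concrete, here is a minimal Python sketch of one way to implement it; the function name and the use of the standard bisect module for the final step are our own choices, not details from the article. It gallops outward from the predicted index until the target is bracketed, then finishes with an ordinary binary search, so the number of comparisons scales with log of the prediction error rather than log of the array size.

import bisect

def predicted_search(a, target, guess):
    """Search sorted list a for target, starting from the predicted index guess.

    Doubles the search radius around the guess until the target is bracketed,
    then finishes with binary search inside that window.  Uses O(log eta)
    comparisons, where eta = |guess - true position|.  Returns an index of
    target in a, or -1 if target is absent.
    """
    n = len(a)
    if n == 0:
        return -1
    guess = max(0, min(guess, n - 1))      # clamp a wild prediction into range

    if a[guess] == target:
        return guess

    if a[guess] < target:
        # Gallop right: invariant a[lo] < target; probe at geometrically
        # increasing offsets until we overshoot or hit the end of the array.
        lo, step = guess, 1
        while lo + step < n and a[lo + step] < target:
            lo += step
            step *= 2
        hi = min(lo + step, n - 1)
    else:
        # Gallop left, the mirror image: invariant a[hi] > target.
        hi, step = guess, 1
        while hi - step >= 0 and a[hi - step] > target:
            hi -= step
            step *= 2
        lo = max(hi - step, 0)

    # Ordinary binary search on the bracketed window [lo, hi].
    i = bisect.bisect_left(a, target, lo, hi + 1)
    return i if i <= hi and a[i] == target else -1

For instance, calling predicted_search(sorted_keys, key, predicted_index) with a guess produced by some (hypothetical) learned model finds the key in time governed by how far the guess is from the key's true position, and never worse than plain binary search up to constant factors.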