Thoughtful and useful piece, though I don't see how this is necessarily universally optimal; that is a broad claim. Link to the full, technical paper below.
How to Construct the Optimal Neural Architecture for Your Machine Learning Task
By Adrian de Wynter
Alexa science
The first step in training a neural network to solve a problem is usually the selection of an architecture: a specification of the number of computational nodes in the network and the connections between them. Architectural decisions are generally based on historical precedent, intuition, and plenty of trial and error.
In a theoretical paper I presented last week at the 28th International Conference on Artificial Neural Networks in Munich, I show that the arbitrary selection of a neural architecture is unlikely to provide the best solution to a given machine learning problem, regardless of the learning algorithm used, the architecture selected, or the tuning of training parameters such as batch size or learning rate.
Rather, my paper suggests, we should use computational methods to generate neural architectures tailored to specific problems. Only by considering a vast space of possibilities can we identify an architecture that comes with theoretical guarantees on the accuracy of its computations.
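As a rough illustration of what "searching a vast space of possibilities" can look like in practice, here is a minimal sketch that brute-forces a handful of fully connected architectures on a toy regression task and keeps the one with the lowest validation error. It uses scikit-learn's MLPRegressor and a synthetic sine target purely for illustration; it is not the method from the paper, which is about theoretical guarantees rather than an empirical sweep.

# Illustrative sketch only (not the paper's algorithm): choose an MLP
# architecture by searching a small space and scoring each candidate
# on held-out data. Requires numpy and scikit-learn.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=1000)  # toy target function
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Candidate architectures: width w repeated over d hidden layers.
search_space = [(w,) * d for w in (4, 16, 64) for d in (1, 2, 3)]

best_arch, best_err = None, float("inf")
for arch in search_space:
    model = MLPRegressor(hidden_layer_sizes=arch, max_iter=2000, random_state=0)
    model.fit(X_tr, y_tr)
    err = np.mean((model.predict(X_val) - y_val) ** 2)
    if err < best_err:
        best_arch, best_err = arch, err

print(f"best architecture: {best_arch}, validation MSE: {best_err:.4f}")

Even a naive sweep like this tends to prefer different architectures for different targets, which is the intuition the paper makes rigorous.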
In fact, the paper is more general than that. Its results don’t just apply to neural networks. They apply to any computational model, provided that it’s Turing equivalent, meaning that it can compute any function that the standard computational model — the Turing machine — can.
To be more specific, we must introduce the function approximation problem. This is a common mathematical formulation of what machine learning actually does: given a function (i.e., your model) and a set of samples, you search through the parameters of the function so that it approximates the outputs of a target function (i.e., the distribution of your data). …
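In symbols, one standard way to write this down (my notation, not necessarily the paper's) is: given a parameterized family of functions f_theta, a target function g, a sample set S, and a loss function l, find

% Illustrative formalization of the function approximation problem.
% f_\theta is the model, g the target, S the sample set, \ell the loss.
\theta^{*} \;=\; \operatorname*{arg\,min}_{\theta \in \Theta} \;
    \frac{1}{|S|} \sum_{(x,\, g(x)) \in S} \ell\bigl(f_{\theta}(x),\, g(x)\bigr)

where the loss might be, for example, squared error. The article's point, roughly, is that how well this minimum can approximate g depends on the architecture that defines f_theta, not only on the learning algorithm or training parameters used to find theta.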