Machine learning is concerned with the design and development of algorithms and techniques that allow computers to 'learn.' The major focus of machine learning research is to extract information from data automatically, using computational and statistical methods. This extracted information may then be generalized into rules and patterns.
Machine learning [is a] type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed, and focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.
Machine learning is [a] field of AI made up of a set of techniques and algorithms that can be used to "train" a machine to automatically recognise patterns in a set of data. By recognising patterns in data, these machines can derive models that explain the data and/or predict future data. In summary, it is a machine that can learn without being explicitly programmed to perform the task.
Modern machine learning is a statistical process that starts with a body of data and tries to derive a rule or procedure that explains the data or can predict future data. This approach, learning from data, contrasts with the older "expert system" approach to AI, in which programmers sit down with human domain experts to learn the rules and criteria used to make decisions, and translate those rules into software code. An expert system aims to emulate the principles used by human experts, whereas machine learning relies on statistical methods to find a decision procedure that works well in practice.
An advantage of machine learning is that it can be used even in cases where it is infeasible or difficult to write down explicit rules to solve a problem. In a sense, machine learning is not an algorithm for solving a specific problem, but rather a more general approach to finding solutions for many different problems, given data about them.
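The contrast between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not a realistic system: the data values, the fixed cutoff of 38.0, and the function names are all hypothetical, and "learning" is reduced to choosing the single threshold that makes the fewest mistakes on the data.

```python
# Hypothetical toy data: (measurement, label). All values are made up.
DATA = [(36.4, 0), (36.8, 0), (37.1, 0), (37.9, 1), (38.4, 1), (39.2, 1)]

def expert_rule(measurement):
    """Expert-system style: a cutoff written down in advance by a person."""
    return 1 if measurement >= 38.0 else 0

def learn_threshold(data):
    """Machine-learning style: try each observed value as a cutoff and
    keep the one that makes the fewest mistakes on the data."""
    def errors(threshold):
        return sum((1 if m >= threshold else 0) != label for m, label in data)
    candidates = sorted(m for m, _ in data)
    return min(candidates, key=errors)

learned = learn_threshold(DATA)

def learned_rule(measurement):
    """Apply the cutoff that was derived from the data."""
    return 1 if measurement >= learned else 0
```

On this toy data the hand-written cutoff misses one example that the learned cutoff gets right, which is the sense in which the learned procedure "works well in practice" on the data it was given.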
To apply machine learning, a practitioner starts with a historical data set, which the practitioner divides into a training set and a test set. The practitioner chooses a model, or mathematical structure that characterizes a range of possible decision-making rules with adjustable parameters. A common analogy is that the model is a "box" that applies a rule, and the parameters are adjustable knobs on the front of the box that control how the box operates. In practice, a model might have many millions of parameters.
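As a rough sketch of these first two steps, the code below splits a small historical data set and defines a model with two adjustable parameters. The function names, the 25 percent test fraction, and the synthetic data are assumptions made for illustration; a real model may have millions of parameters rather than two.

```python
import random

def split_dataset(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the historical data, then split it into
    a training set and a test set."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def linear_model(params, x):
    """The "box": params are the adjustable knobs (slope, intercept)."""
    slope, intercept = params
    return slope * x + intercept

# Hypothetical historical data that follows y = 2x + 1 exactly.
history = [(x, 2.0 * x + 1.0) for x in range(20)]
train_set, test_set = split_dataset(history)
```

Every choice of `params` gives a different rule; the model as a whole is the family of rules that those settings can express.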
The practitioner also defines an objective function used to evaluate the desirability of the outcome that results from a particular choice of parameters. The objective function will typically contain parts that reward the model for closely matching the training set, as well as parts that reward the use of simpler rules.
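One possible shape for such an objective function is sketched below for a two-parameter linear model. The fit term is the negative mean squared error on the training set. The simplicity term penalizes large parameter values, which is one common stand-in for "simpler rules" and is an assumption here, not something the text prescribes. The name `simplicity_weight` and its default value are likewise hypothetical.

```python
def objective(params, train_set, simplicity_weight=0.01):
    """Higher is better: reward a close fit to the training set and
    reward small parameter values (used here as a proxy for simplicity)."""
    slope, intercept = params
    mean_squared_error = sum(
        (slope * x + intercept - y) ** 2 for x, y in train_set
    ) / len(train_set)
    complexity = slope ** 2 + intercept ** 2
    return -mean_squared_error - simplicity_weight * complexity

# Hypothetical training data that follows y = 2x + 1.
toy_train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
good_score = objective((2.0, 1.0), toy_train)  # fits the data exactly
poor_score = objective((0.0, 0.0), toy_train)  # simplest rule, poor fit
```

The two terms pull against each other: the all-zero setting is the simplest, but it fits poorly, so its score is much lower than that of the setting that matches the data.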
Training the model is the process of adjusting the parameters to maximize the objective function. Training is the difficult technical step in machine learning. A model with millions of parameters will have astronomically more possible parameter settings than any algorithm could ever hope to try, so successful training algorithms have to be clever in how they explore the space of parameter settings so as to find very good settings with a feasible level of computational effort.
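A minimal sketch of one such strategy, gradient ascent, is shown below. Instead of trying settings at random, it repeatedly nudges each parameter in the direction that increases the objective. Gradient-based training is a widely used approach but is an assumption here, as are the two-parameter linear model, the step count, the learning rate, and the toy data.

```python
def objective(params, data, simplicity_weight=0.001):
    """Negative mean squared error minus a penalty on large parameters."""
    slope, intercept = params
    mse = sum((slope * x + intercept - y) ** 2 for x, y in data) / len(data)
    return -mse - simplicity_weight * (slope ** 2 + intercept ** 2)

def train(data, steps=2000, learning_rate=0.01, simplicity_weight=0.001):
    """Gradient ascent: start from zero and repeatedly move each
    parameter a small step in the direction that raises the objective."""
    slope, intercept = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Partial derivatives of the objective with respect to each parameter.
        grad_slope = sum(
            -2.0 * (slope * x + intercept - y) * x for x, y in data
        ) / n - 2.0 * simplicity_weight * slope
        grad_intercept = sum(
            -2.0 * (slope * x + intercept - y) for x, y in data
        ) / n - 2.0 * simplicity_weight * intercept
        slope += learning_rate * grad_slope
        intercept += learning_rate * grad_intercept
    return slope, intercept

# Hypothetical training data that follows y = 2x + 1.
toy_data = [(float(x), 2.0 * x + 1.0) for x in range(5)]
trained_params = train(toy_data)
```

With only two parameters an exhaustive search would also be feasible; the point of following the gradient is that the same procedure stays practical when the number of parameters is far too large to enumerate.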
Once a model has been trained, the practitioner can use the test set to evaluate the accuracy and effectiveness of the model. The goal of machine learning is to create a trained model that will generalize — it will be accurate not only on examples in the training set, but also on future cases that it has never seen before. While many of these models can achieve better-than-human performance on narrow tasks such as image labeling, even the best models can fail in unpredictable ways. For example, for many image labeling models it is possible to create images that clearly appear to be random noise to a human but will be falsely labeled as a specific object with high confidence by a trained model.
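The difference between fitting the training set and generalizing can be illustrated with a small sketch. Both rules below are hypothetical. One simply memorizes the training examples, and the other is a simple threshold rule assumed to be the result of training. The data are invented so that the true label is 1 whenever the input is 5 or more.

```python
def accuracy(predict, examples):
    """Fraction of examples for which the prediction matches the label."""
    correct = sum(predict(x) == label for x, label in examples)
    return correct / len(examples)

# Hypothetical data: the label is 1 when x >= 5, otherwise 0.
train_set = [(1, 0), (2, 0), (4, 0), (6, 1), (8, 1), (9, 1)]
test_set = [(0, 0), (3, 0), (5, 1), (7, 1)]

memorized = dict(train_set)

def memorizer(x):
    """Recalls training labels exactly and guesses 0 for unseen inputs."""
    return memorized.get(x, 0)

def threshold_rule(x):
    """A simple rule, assumed here to be the outcome of training."""
    return 1 if x >= 5 else 0

memorizer_scores = (accuracy(memorizer, train_set), accuracy(memorizer, test_set))
threshold_scores = (accuracy(threshold_rule, train_set), accuracy(threshold_rule, test_set))
```

Both rules are perfect on the training set, so only the held-out test set reveals that the memorizer has not generalized.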
Researchers use several methods to train machine learning algorithms, including:
- Supervised machine learning. An algorithm is given labeled data, identifies patterns in that data, and uses those patterns to predict a specified answer to a problem. For example, an algorithm trained on many labeled images of cats and dogs could then classify new, unlabeled images as containing either a cat or a dog.
- Unsupervised machine learning. An algorithm is given unlabeled data and identifies structure in the inputs, for example by clustering similar data, without a preconceived idea of what to expect. With this technique, an algorithm could cluster images into groups based on similar features, such as a group of cat images and a group of dog images, without being told that the images in the training set are those of cats or dogs.
- Semisupervised learning. An algorithm is given a partially labeled training set, uses the labeled data to determine a pattern, and applies labels to the remaining data.
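The three methods above can be sketched side by side on a toy version of the cat and dog example. Everything specific here is an assumption made for illustration: each animal is reduced to a single hypothetical feature (body weight in kilograms), the supervised learner is a nearest-centroid classifier, the unsupervised learner is a one-dimensional two-cluster k-means, and the semisupervised step simply labels the remaining data with the supervised learner. Real image models are far more elaborate.

```python
def mean(values):
    return sum(values) / len(values)

def fit_centroids(labeled):
    """Supervised: use labeled examples to compute one centroid per label."""
    by_label = {}
    for x, label in labeled:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict(centroids, x):
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def two_means(points, iterations=10):
    """Unsupervised: group unlabeled points into two clusters.
    Assumes the points contain at least two distinct values."""
    c0, c1 = min(points), max(points)
    for _ in range(iterations):
        group0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        group1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0, c1 = mean(group0), mean(group1)
    return group0, group1

def propagate_labels(labeled, unlabeled):
    """Semisupervised: learn from the labeled part of the training set,
    then apply labels to the unlabeled remainder."""
    centroids = fit_centroids(labeled)
    return [(x, predict(centroids, x)) for x in unlabeled]

# Hypothetical weights in kilograms.
labeled = [(3.5, "cat"), (4.0, "cat"), (4.5, "cat"),
           (20.0, "dog"), (25.0, "dog"), (30.0, "dog")]
centroids = fit_centroids(labeled)
clusters = two_means([x for x, _ in labeled])
filled_in = propagate_labels([(4.0, "cat"), (25.0, "dog")], [3.5, 4.5, 20.0, 30.0])
```

Note that `two_means` recovers the same two groups as the labels but never learns the words "cat" or "dog"; it only reports that the data fall into two clusters.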
- 2011 Data Mining Report to Congress, at 9 n.37.
- A State Cyber Hub Operations Framework, at 61.
- Unboxing Artificial Intelligence: 10 steps to protect Human Rights, at 24.
- Cyberspace Solarium Commission - Final Report, at 136.
- "Formally, machine learning is a sub-field of artificial intelligence. However, in recent years, some organizations have begun using the terms artificial intelligence and machine learning interchangeably." Machine Learning Glossary (full-text).
- "Overview" section: Preparing for the Future of Artificial Intelligence, at 8-9.
- "Training machine learning algorithms" section: Technology Assessment: Artificial Intelligence in Health Care: Benefits and Challenges of Technologies to Augment Patient Care, at 3.
- Frank Chen, "AI, Deep Learning, and Machine Learning: A Primer," Andreessen Horowitz (June 10, 2016) (full-text).