Deep learning is a type of machine learning in which a model learns to perform classification tasks directly from images, text, or sound. Deep learning is usually implemented using a neural network architecture. Deep learning architectures such as deep neural networks, deep belief networks, recurrent neural networks and convolutional neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to and in some cases superior to human experts.
Deep learning approaches can be categorized as supervised, semi-supervised (or partially supervised), and unsupervised. In supervised machine learning (ML), the ML system learns by example to recognize things, such as objects in images or words in speech. Humans provide these examples to ML systems during training in the form of labeled data. With enough labeled data, we can generally build accurate pattern recognition models.
Labeling is an indispensable stage of data preprocessing in supervised learning. Historical data with predefined target attributes (values) is used for this model training style. An algorithm can only find target attributes if a human mapped them. The problem is that training accurate models currently requires lots of labeled data.
For tasks like machine translation, speech recognition or object recognition, deep neural networks (DNNs) have emerged as the state of the art, due to the superior accuracy they can achieve. To gain this advantage over other techniques, however, DNN models need more data, typically requiring 10^9 or 10^10 labeled training examples to achieve good performance.
For example, conducting sentiment analysis of a company’s reviews on social media and in tech site discussion sections allows a business to evaluate its reputation and expertise compared with competitors. It also gives the business an opportunity to research industry trends and define its development strategy.
You will need to collect and label at least 90,000 reviews to build a model that performs adequately. Assuming that labeling a single review takes a worker 30 seconds, they will need 750 hours, or almost 94 eight-hour work shifts, to complete the task: about three months if working every day. Given that the median hourly rate for a data scientist in the US is $36.27, labeling will cost you $27,202.50.
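The arithmetic behind this estimate can be reproduced in a few lines. The following is a minimal Python sketch that uses only the figures quoted above; the function name `labeling_estimate` and the constant names are illustrative choices, not part of any library.

```python
# Back-of-the-envelope labeling cost, using the figures quoted in the text.
NUM_REVIEWS = 90_000        # reviews to collect and label
SECONDS_PER_LABEL = 30      # assumed time to label one review
SHIFT_HOURS = 8             # length of one work shift
HOURLY_RATE_USD = 36.27     # hourly rate quoted in the text


def labeling_estimate(num_items, seconds_per_item, shift_hours, hourly_rate):
    """Return (hours, shifts, cost) for labeling num_items by hand."""
    hours = num_items * seconds_per_item / 3600  # seconds -> hours
    shifts = hours / shift_hours
    cost = hours * hourly_rate
    return hours, shifts, cost


hours, shifts, cost = labeling_estimate(
    NUM_REVIEWS, SECONDS_PER_LABEL, SHIFT_HOURS, HOURLY_RATE_USD
)
print(f"{hours:.0f} hours, {shifts:.2f} shifts, ${cost:,.2f}")
# prints: 750 hours, 93.75 shifts, $27,202.50
```

Changing any one input, such as the time per label or the number of reviews, scales the total linearly, which is why manual labeling costs grow so quickly with dataset size.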
The commercial world has harvested and created large sets of labeled data for training models. These datasets are often created via crowdsourcing: a cheap and efficient way to create labeled data. Unfortunately, crowdsourcing techniques are often not possible for proprietary or sensitive data. Creating data sets for these sorts of problems can result in 100x higher costs and 50x longer time to label.
To make matters worse, machine learning models are brittle, in that their performance can degrade severely with small changes in their operating environment. For instance, the performance of computer vision systems degrades when data is collected from a new sensor and new collection viewpoints. Similarly, dialog and text understanding systems are very sensitive to changes in formality and register. As a result, additional labels are needed after initial training to adapt these models to new environments and data collection conditions. For many problems, the labeled data required to adapt models to new environments approaches the amount required to train a new model from scratch.
The Learning with Less Labels (LwLL) program aims to make the process of training machine learning models more efficient by reducing the amount of labeled data required to build a model by six or more orders of magnitude, and by reducing the amount of data needed to adapt models to new environments to tens to hundreds of labeled examples.
In order to achieve the massive reductions of labeled data needed to train accurate models, the LwLL program will focus on two technical objectives:
- Developing learning algorithms that learn and adapt efficiently; and
- Formally characterizing machine learning problems and proving the limits of learning and adaptation.
DARPA awards LWLL contract to UBC’s Department of Computer Science
In another big win for the Department of Computer Science, UBC has been awarded a $1.3 million grant from DARPA, the US government agency researching breakthrough technology in the defence industry.
Learning With Less Labels (LWLL) develops more efficient machine learning by massively reducing the amount of labeled data needed to train accurate models. The three-year contract will be overseen by Leonid Sigal and Frank Wood as principal investigators, and delivered by Dr. Wood’s research group Programming Languages for Artificial Intelligence (PLAI).
UBC’s work on the LWLL program is a subcontract through Charles River Analytics, whose team is led by its Chief Scientist Avi Pfeffer; the DARPA Program Manager is Bruce Draper.
As Frank Wood explains, “deep learning is revolutionary but requires huge amounts of labelled training data. Our research under this program is about figuring out ways to make deep learning systems use less data. Leon, myself, and other members of the CRA team have long-standing expertise in leveraging side-information to achieve just this, particularly in visual classification, localization, and segmentation tasks.”