About

Deep learning has radically advanced the state of the art in machine learning and computer vision. Nevertheless, progress has been driven almost entirely by empirical observations, hacks, and tricks. Due to our lack of understanding, every stage of the classical Design/Build/Test (DBT) engineering methodology is broken. Today's deep learning users must (Design) choose an ad hoc deep network architecture, (Build) train the network using an optimizer with hyper-parameters chosen by trial and error, and (Test) produce a model that may be fragile to minor domain shifts, training set outliers, and adversarial perturbations. Without a theoretical foundation, deep learning will continue to result in poorly understood and fragile systems that are not appropriate for a large class of critical applications, particularly for DoD use cases in which reliability is of utmost importance. This MURI project is developing a principled theory of deep learning that is based on rigorous mathematical principles. We are investigating three interconnected research thrusts that address the foundational issues in the Design/Build/Test pipeline.

Design Thrust: Mathematical issues in deep network design. Today’s deep network design process is alchemistic and based on trial and error. We address this issue by developing an approximation theory for deep networks based on spline functions that is compatible with the overparameterization of current deep networks.
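As a small illustration of the spline viewpoint, the sketch below checks numerically that a one-hidden-layer ReLU network with scalar input is a continuous piecewise-affine function whose knots sit where each hidden unit switches on or off. The weights and the helper names (`relu_net`, `knots`, `max_affine_deviation`) are made up for this example and are not taken from the project's code; it is a toy sketch, not the approximation theory itself.

```python
import numpy as np

def relu_net(x, w1, b1, w2, b2):
    """One-hidden-layer ReLU network with scalar input and scalar output."""
    hidden = np.maximum(0.0, np.outer(x, w1) + b1)  # shape (n_points, n_hidden)
    return hidden @ w2 + b2

def knots(w1, b1):
    """Inputs where a hidden unit changes state, i.e. where w*x + b = 0."""
    active = w1 != 0
    return np.sort(-b1[active] / w1[active])

def max_affine_deviation(f, lo, hi, n=50):
    """Largest second difference of f on an even grid; zero if f is affine."""
    grid = np.linspace(lo, hi, n)
    return np.max(np.abs(np.diff(f(grid), n=2)))

# Hypothetical weights chosen by hand for the illustration.
w1 = np.array([1.0, -2.0, 0.5])
b1 = np.array([0.0, 1.0, -1.0])
w2 = np.array([1.0, 0.5, -2.0])
b2 = 0.1

f = lambda x: relu_net(x, w1, b1, w2, b2)
ks = knots(w1, b1)

# The network is affine between consecutive knots ...
piece_deviations = [max_affine_deviation(f, a, b) for a, b in zip(ks[:-1], ks[1:])]
# ... but not across them.
global_deviation = max_affine_deviation(f, -1.0, 3.0)

print("knots:", ks)
print("deviation within pieces:", piece_deviations)
print("deviation across knots:", global_deviation)
```

The second differences vanish (up to rounding) inside each piece and are clearly nonzero on an interval that spans the knots, which is the piecewise-affine spline structure in its simplest form.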

Build Thrust: Mathematical issues in deep network training. The process of training deep networks requires enormous resources for both mining datasets and optimization. Ad hoc training approaches require intractable amounts of labeled data and yield models with unpredictable behaviors. Furthermore, the role of implicit regularization in training remains poorly understood and challenges the classical understanding of model fitting. We aim to understand the build process through three tools: new theories from partial differential equations that explain how a deep network's architecture interacts with optimizers, statistical methods that characterize the implicit bias of stochastic optimizers, and principled methods for learning from less data.
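A textbook example of implicit regularization, given here only as a hedged illustration, is that gradient descent started from zero on an underdetermined least-squares problem converges to the minimum-norm interpolating solution, even though nothing in the loss asks for a small norm. The sketch uses plain full-batch gradient descent as a simpler stand-in for the stochastic optimizers discussed above; the problem sizes, step count, and function name are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 20))  # 5 equations, 20 unknowns: many exact solutions
y = rng.normal(size=5)

def gradient_descent(A, y, steps=5000):
    """Minimize 0.5 * ||A x - y||^2 from a zero initialization."""
    lr = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / (largest singular value)^2
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x -= lr * A.T @ (A @ x - y)  # gradient of the least-squares loss
    return x

x_gd = gradient_descent(A, y)
x_min_norm = np.linalg.pinv(A) @ y  # minimum-norm solution of A x = y

print("fits the data:", np.allclose(A @ x_gd, y))
print("matches min-norm solution:", np.allclose(x_gd, x_min_norm, atol=1e-8))
```

The iterates never leave the row space of `A`, so the optimizer selects one particular interpolant out of infinitely many. Deep networks are far more complicated, but this is the kind of optimizer-induced bias the thrust seeks to characterize.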

Test Thrust: Mathematical issues in characterizing deep network performance. The life of a deep-learning agent does not end when training is over. Deep networks are deployed in complex environments where they face out-of-sample data, adversarial attacks, and high-dimensional inputs. We are developing new methods to verify network performance using formal methods, quantify uncertainty using statistical methods, guarantee robustness to data corruption, and detect manipulated and adversarial inputs.

Validation with 4D (space+time) action recognition. Throughout this research effort, we motivate and validate our theoretical developments with a range of challenging applications in action recognition and video processing.

Impact on DoD capabilities. Most DoD machine learning systems must train using small datasets that are label-poor, and must operate in safety-critical situations where reliability is of utmost importance. This MURI project enhances DoD capabilities by enabling better learning with less data and by producing a suite of methods to improve and certify the reliability of deep network models.