RESEARCH

Our research and product development focus on the following hard problems that organizations face today:

Training with Less Data

While organizations today have large amounts of data, their datasets tend to be noisy, incomplete and imbalanced. This results in data scientists and engineers spending most of their time pre-processing, cleaning, and featurizing the data. These efforts are often insufficient, and deep learning techniques routinely fail on sparse datasets. Organizations are then forced to use classical machine learning techniques that require enormous amounts of manual feature engineering. At RealityEngines.AI, we are actively pursuing the following research areas that will enable training on less data.

Meta-Learning - Deep learning models typically require a large number of training samples. Humans, on the other hand, learn concepts and skills much more quickly and efficiently; we need only a few examples to tell lions apart from cats. Meta-learning is a sub-field of machine learning that aims to teach models how to learn. We hope to build on the work on Model-Agnostic Meta-Learning (MAML) and on first-order meta-learning algorithms. MAML learns an initialization of a model’s parameters from which a new task can be learned in only a small number of gradient steps, while avoiding the overfitting that can occur with a small dataset. Our service uses principles of meta-learning to create robust models even when you have a small number of training examples.
Generative Models for Dataset Augmentation - Dataset augmentation is a technique for synthetically expanding a training dataset by applying a wide array of domain-specific transformations. It is particularly useful for small datasets, and it has even been shown to be effective on large datasets like ImageNet. While it is a standard tool in training supervised deep learning models, it requires extensive domain knowledge, and the transformations must be designed and tested carefully. Over the last 2 years, Generative Adversarial Networks (GANs) have been used successfully for dataset augmentation in various domains including computer vision, anomaly detection, and forecasting. The use of GANs makes dataset augmentation possible even with little or no domain-specific knowledge. Fundamentally, GANs learn to generate synthetic data that is indistinguishable from the original data. However, there are some practical issues with using GANs, and training a GAN is notoriously difficult. GANs have been a very active area of research, and several new types of GANs, including Wasserstein GANs and MMD GANs, address some of these issues. Recently, there has also been some work on domain-agnostic GAN implementations for dataset augmentation. At RealityEngines.AI, we are innovating on state-of-the-art GAN algorithms so that they perform well on noisy and incomplete datasets. We have innovated on Data Augmentation Generative Adversarial Networks to create synthetic datasets that can be combined with original datasets to create more robust models. The demo on our homepage is based on GANs. We plan to publish a few papers in this area over the course of this year.
Combining Neural Nets with Logic Rules/Specifications - Human cognition suggests that people learn not only from concrete examples (as deep neural nets do) but also from different forms of general knowledge and rich experience. It is difficult, however, to encode human intention in a way that guides models to capture the desired patterns. In fact, most enterprise systems today are rule-based: experts have encoded rules based on tribal knowledge from their domains. ML models that are built to replace these rule-based systems often struggle to beat them on accuracy, especially when data is sparse. At RealityEngines.AI, we want to preserve expert knowledge by developing hybrid systems that combine logic rules with neural nets. While there is some recent research in this area, including a recent paper by DeepMind that lays the groundwork for a general-purpose, constraint-driven AI, the field is still nascent. Most research papers don’t address building these hybrid models at scale or incorporating multiple rules into the models. RealityEngines.AI is working on a service that allows developers and data scientists to specify multiple knowledge rules along with training data to develop accurate models. For example, a recommender system may have a rule that ‘dog owners tend to like buying dog toys’, or a learned dynamical system may be constrained to be consistent with physical laws. Our publication in this area combines first-order logic constraints with conventional supervised learning.
Transfer Learning - Transfer learning is a machine learning technique that allows us to reuse what a model has learned on one domain or dataset on a related domain or dataset. By using transfer learning, we enable organizations to train models in a simulated environment and apply them in the real world. State-of-the-art language and vision modeling techniques typically pre-train on a large dataset, then use either fine-tuning or transfer learning to train a custom model on the target dataset. RealityEngines.AI packages and extends the state-of-the-art transfer learning techniques that result in the most performant models. As part of our service, we plan to package pre-trained language and vision models. We’ll also make it easy to fine-tune those models or apply transfer learning to adapt them for a custom task.
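
As a minimal sketch of how a knowledge rule can sit alongside training data, the snippet below adds a soft penalty for the rule that dog owners tend to like buying dog toys to an ordinary cross-entropy loss. The function names, the penalty form (a product relaxation of the implication), and the weight `rule_weight` are illustrative assumptions; this is not the weighted proof tracing approach of our publication, nor the interface of our service.

```python
import numpy as np

def cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs)))

def rule_penalty(owns_dog, probs):
    """Soft violation of the rule 'owns_dog -> buys_dog_toys'.

    Product relaxation (an assumption): the rule is violated to degree
    owns_dog * (1 - p), which is 0 for non-owners and for confident buyers.
    """
    return float(np.mean(owns_dog * (1.0 - probs)))

def hybrid_loss(labels, probs, owns_dog, rule_weight=0.5):
    """Supervised loss plus a weighted rule penalty (hypothetical weighting)."""
    return cross_entropy(labels, probs) + rule_weight * rule_penalty(owns_dog, probs)

# Toy batch: three customers, the first two own dogs.
owns_dog = np.array([1.0, 1.0, 0.0])
labels = np.array([1.0, 1.0, 0.0])
consistent = np.array([0.9, 0.8, 0.1])   # predictions that respect the rule
violating = np.array([0.2, 0.1, 0.1])    # predictions that ignore the rule

print(hybrid_loss(labels, consistent, owns_dog))
print(hybrid_loss(labels, violating, owns_dog))
```

Predictions that ignore the rule receive a larger loss, so gradient-based training is nudged toward models that agree with both the labels and the expert knowledge. Additional rules could be added as further penalty terms, each with its own weight.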
 
Differentiable Functions for Combining First-order Constraints with Deep Learning via Weighted Proof Tracing
Authors: Naveen Sundar Govindarajulu and Colin White
 
AI-Assisted ML

Deep learning has seen great success across a wide variety of domains. The best neural architectures are often carefully constructed by seasoned deep learning experts in each domain. For example, years of experimentation have shown how to arrange bidirectional transformers to work well for language tasks and dilated separable convolutions for image tasks. A relatively new sub-field of deep learning deals with automated machine learning, or as we prefer to call it, AI-assisted machine learning. The fundamental idea is that AI creates a first pass of the deep learning model given a use case or a dataset. Developers and data scientists can then either use that model directly or fine-tune it. We are conducting cutting-edge research in the two main pillars of AI-Assisted ML: hyperparameter optimization (HPO) and neural architecture search (NAS).

Hyperparameter optimization

When developing a deep learning model, there are many knobs and dials to tune that depend on the specific task and dataset at hand. For example, setting the learning rate too high can prevent the algorithm from converging. Setting the learning rate too low can cause the algorithm to converge very slowly or get stuck at a local minimum. There are countless other hyperparameters, such as the number of epochs, batch size, momentum, regularization strength, and the shape and size of the neural network. These hyperparameters all depend on each other and interact in intricate ways, so finding the best hyperparameters for a given dataset is an extremely difficult and highly nonconvex optimization problem.
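
The effect of the learning rate is easy to see on a toy problem. The sketch below runs plain gradient descent on f(w) = w², with step sizes chosen purely for illustration. Because this function is convex, it shows only divergence and slow progress, not the local minima that arise on real loss surfaces.

```python
def gradient_descent(learning_rate, steps=50, start=1.0):
    """Minimize f(w) = w**2 with plain gradient descent and return the final w."""
    w = start
    for _ in range(steps):
        grad = 2.0 * w                  # derivative of w**2
        w = w - learning_rate * grad    # standard gradient step
    return w

for lr in (1.1, 0.3, 0.001):
    print(f"learning_rate={lr}: final w = {gradient_descent(lr):.4g}")
```

With a step size of 1.1 the iterates grow without bound, with 0.3 they converge quickly to the minimum at 0, and with 0.001 they have barely moved from the starting point after 50 steps.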

Randomly testing different sets of hyperparameters may eventually find a decent solution, but it could take years of computation time. Efficiently tuning deep learning hyperparameters is an active area of research. Five years ago, the best algorithms weren’t much better than random search. Today’s algorithms are capable of orders-of-magnitude speedups. At RealityEngines.AI, we use state-of-the-art HPO while training all our models.
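
For reference, the random search baseline can be sketched in a few lines. The search space, the `validation_loss` stand-in, and the trial budget are all hypothetical. In practice each evaluation would be a full training run, which is what makes this approach so expensive.

```python
import math
import random

def validation_loss(learning_rate, batch_size):
    """Stand-in for a training run: a made-up surface whose minimum is at
    learning_rate=1e-2 and batch_size=64 (an assumption for this demo)."""
    return (math.log10(learning_rate) + 2.0) ** 2 + (math.log2(batch_size) - 6.0) ** 2

def random_search(num_trials, seed=0):
    """Sample hyperparameters at random and keep the best configuration seen."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_trials):
        config = {
            "learning_rate": 10 ** rng.uniform(-5, 0),   # log-uniform in [1e-5, 1]
            "batch_size": rng.choice([16, 32, 64, 128, 256]),
        }
        loss = validation_loss(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best_config, best_loss = random_search(num_trials=100)
print(best_config, best_loss)
```

Random search ignores the results of earlier trials when choosing the next configuration. More efficient HPO methods use those results to decide where to sample next, which is where the speedups come from.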

Neural Architecture Search

Neural architecture search (NAS) is a rapidly developing area of research in which the process of choosing the best architecture is automated. Reinforcement learning produced one of the field’s first successes, and more recently continuous optimization and weight-sharing techniques have made the computational cost of NAS tractable. At RealityEngines.AI, we will use NAS both to fine-tune proven deep network paradigms and to learn novel architectures for new domains.

Our goal is to empower data scientists and developers to create custom, production-grade models in days, not months. We have designed our own neural architecture search that beats the state-of-the-art benchmarks and focuses on generalizability. For an introduction to Bayesian optimization for NAS, see our first blog post. To read about our latest research, see our blog post on BANANAS, a new method for neural architecture search.

 
Neural Architecture Search via Bayesian Optimization with a Neural Network Model
Authors: Colin White, Willie Neiswanger, Yash Savani
 
Explainability and Bias in Neural Nets

Business analysts and subject matter experts within organizations are often frustrated when dealing with deep learning models. These models can appear to be black boxes that generate predictions which humans can’t explain. Over the last two years, there has been considerable research on explainability in AI. This has resulted in the release of an open-source tool, LIME, which measures the responsiveness of a model’s outputs to perturbations in its inputs. Then there’s SHAP (SHapley Additive exPlanations), a game-theoretic approach to explaining the output of any machine learning model. Google has introduced Testing with Concept Activation Vectors (TCAV), a technique that may be used to generate insights and hypotheses. Google Brain’s scientists also explored attribution of predictions to input features in their 2017 paper, Axiomatic Attribution for Deep Networks. Our efforts in this area build on these techniques to create a cloud microservice that will explain model predictions and determine if models exhibit bias.
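
To illustrate the game-theoretic idea behind SHAP, the sketch below computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over all feature orderings. This brute-force enumeration is feasible only for a handful of features and is not the SHAP library's API; the model, the baseline, and the function names are hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features outside the coalition are replaced by their baseline value.
    The cost grows factorially with the number of features.
    """
    n = len(x)
    values = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = x[i]                       # add feature i to the coalition
            output = model(current)
            values[i] += output - previous_output   # marginal contribution of i
            previous_output = output
    return [v / len(orderings) for v in values]

def toy_model(features):
    """Made-up scoring function with one interaction term."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + 3.0 * a * c

x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

Here the attributions are 3.5, 2.0, and 1.5. They sum to 7, the difference between the model's output at `x` and at the baseline, and the interaction term is split evenly between the two features involved.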

In addition to explainability, we are actively combating bias in AI models, and we presented our work on de-biasing AI models at NeurIPS 2019. Please find our publication on that topic below and our blog post about it here.

 
DECO: Debiasing through Compositional Optimization of Machine Learning Models
Authors: Naveen Sundar Govindarajulu and Colin White