Yu Abstract | Conference 2018

Deep Learning at the Edge: Real-Time Intelligence on Embedded Systems

Machine learning is undergoing a rapid shift, moving from traditional, naive neural networks toward deep learning models whose neural hierarchy is more rational, optimized, and informative. Many advanced deep learning algorithms require high-end servers and GPUs for training models and inferring new knowledge, and are beyond the capacity of the edge computers and embedded systems that target real-time applications.

In this paper, we address the mismatch between deep learning and embedded systems by applying an essential concept in deep learning: transfer learning. Training and learning are usually tackled from the ground up in an offline stage, under the assumption that the holistic data distribution is well captured in the network structure, so that future inferences and outputs are generated rapidly in a short-path, feed-forward manner dictated by the network construction. Even when a real-life application is specific and beyond the scope of offline training, we can still formulate it as an online learning setting; the learning overhead remains manageable because the model is already pre-trained and the incremental learning cost is distributed across instances. We will focus on transfer learning: we decouple computation- and data-intensive batch processing tasks from latency-sensitive online processing, and migrate complex network models onto mobile platforms to create lightweight deep learning for real-time decision making at the edge, where event occurrences and decision points reside. Furthermore, this novel autonomous system, Deep Learning on SoC, integrates recent successes in neural networks, smart embedded systems (Nvidia Jetson SoC), and the Internet of Things (IoT), and offers a cost-effective prediction system. To meet the real-time requirement, we will propose a hardware-based solution (FPGA) that accelerates inferences and predictions and uses the prediction results to monitor, track, activate, and adjust target devices.
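As a rough illustration of the split between offline pre-training and lightweight online learning, the following minimal sketch freezes a small feature extractor and updates only a logistic-regression head, one instance at a time. It is not the system described in this abstract: the frozen weights, the names `extract_features`, `OnlineHead`, and `run_stream`, and the synthetic data stream are all hypothetical stand-ins chosen for the example.

```python
import math
import random

# Hypothetical stand-in for a network pre-trained offline on a server.
# These weights stay frozen on the device; only the small head is updated.
FROZEN_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]


def extract_features(x):
    """Frozen feed-forward pass: one linear layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)))
            for row in FROZEN_WEIGHTS]


class OnlineHead:
    """Logistic-regression head trained one instance at a time with SGD."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, feats):
        z = self.b + sum(w * f for w, f in zip(self.w, feats))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, feats, label):
        # Gradient of the log loss with respect to the logit is (p - y).
        err = self.predict_proba(feats) - label
        self.w = [w - self.lr * err * f for w, f in zip(self.w, feats)]
        self.b -= self.lr * err


def run_stream(n_instances=500, seed=0):
    """Simulate an on-device stream: one cheap head update per instance."""
    rng = random.Random(seed)
    head = OnlineHead(n_features=len(FROZEN_WEIGHTS))
    for _ in range(n_instances):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        label = 1 if x[0] + x[1] > 0 else 0  # synthetic target for the demo
        head.update(extract_features(x), label)
    return head
```

In this sketch the per-instance cost is one forward pass through the frozen layer plus an update of five parameters, which is the sense in which the incremental learning cost is spread across instances. A real deployment would replace the toy extractor with a genuinely pre-trained network.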
