
Tuesday, August 27, 2019

Speeding Up Deep-Learning Inference by 2X

New methods, technical:

New Technique Speeds Up Deep-Learning Inference on TensorFlow by 2x
by Anthony Alford in InfoQ

Researchers at North Carolina State University recently presented a paper at the International Conference on Supercomputing (ICS) on their new technique, "deep reuse" (DR), which can speed up inference for deep-learning neural networks running on TensorFlow by up to 2x with almost no loss of accuracy.

Dr. Xipeng Shen, along with graduate student Lin Ning, authored the paper describing the technique, which requires no special hardware or changes to the deep-learning model. By taking advantage of similarities in the data values that are input into a neural network layer, DR eliminates redundant computation during inference, reducing the total time taken. Reducing computation also reduces power consumption, a key feature for mobile or embedded applications. In experiments running several common computer-vision deep-learning models on GPUs, including CifarNet, AlexNet, and VGG-19, DR achieved speedups of 1.75X to 2.02X, with an increase in error of only 0.0005. In some cases, DR actually improved accuracy slightly. In similar experiments on a mobile phone, DR "achieves an average of 2.12x speedup for CifarNet and 2.55X for AlexNet." ...
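
To make the reuse idea concrete, here is a minimal NumPy sketch, not the paper's actual algorithm: rows of a layer's input that are nearly identical are grouped together (quantization is used here as a simple stand-in for the similarity clustering the paper describes), the matrix multiply is done once per group representative, and the result is reused for every row in that group. The function name deep_reuse_matmul and the quant_step parameter are illustrative assumptions, not names from the paper.

# Minimal sketch of the "deep reuse" idea, assuming NumPy only.
# Quantizing rows is a hypothetical stand-in for the paper's
# similarity clustering of input "neuron vectors".
import numpy as np

def deep_reuse_matmul(X, W, quant_step=0.25):
    """Approximate X @ W by computing each group of similar rows once.

    X: (n, d) input activations for a layer (rows are input vectors).
    W: (d, m) layer weights.
    quant_step: coarser steps merge more rows -> more reuse, more error.
    """
    # Group similar rows: quantize each row, then find unique groups.
    keys = np.round(X / quant_step).astype(np.int64)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard against 2-D inverse on some NumPy versions
    # Average the rows in each group to get one centroid per group.
    centroids = np.zeros((len(uniq), X.shape[1]))
    np.add.at(centroids, inverse, X)
    counts = np.bincount(inverse, minlength=len(uniq)).reshape(-1, 1)
    centroids /= counts
    partial = centroids @ W   # one multiply per group, not per row
    return partial[inverse]   # reuse each group's result for all its rows

# Usage: redundancy in X (near-duplicate rows) is what makes reuse pay off.
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 16))
X = base[rng.integers(0, 8, size=1024)] + 0.01 * rng.standard_normal((1024, 16))
W = rng.standard_normal((16, 32))
print(np.abs(deep_reuse_matmul(X, W) - X @ W).max())  # small quantization error

In this toy setup the 1024-row multiply collapses to roughly one multiply per distinct group, which is where the speedup comes from; the granularity of the grouping controls the trade-off between saved computation and the small accuracy loss the article reports.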
