Volume 15, No. 2
ETO: Accelerating Optimization of DNN Operators by High-Performance Tensor Program Reuse
Abstract
Recently, deep neural networks (DNNs) have achieved great success in various applications, and reducing DNN response latency has become a key issue in applying DNNs to many of them. Existing solutions either manually tune the kernel library or use search-based compilation to reduce operator latency. However, manual tuning requires significant engineering effort, and the huge search space makes the cost of search-based compilation unaffordable in some situations. In this work, we propose ETO, a framework for speeding up DNN operator optimization based on reusing the information of performant tensor programs. Specifically, ETO defines conditions for information reuse between two operators. For operators satisfying the conditions, given a performant tensor program of one operator, ETO uses a reuse-based tuner to significantly prune the search space of the other while maintaining good effectiveness. ETO also proposes a method to increase the reuse opportunities among a set of operators by injecting extra operators as bridges between two operators that do not satisfy the reuse conditions, and we prove that the problem of adding the smallest set of extra operators is NP-hard. ETO uses heuristics to add extra operators, and solves the reuse relationship decision problem as a directed Steiner tree problem. Experiments show that, compared with various existing methods, ETO is effective and efficient in optimizing DNN operators.
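To make the reuse-relationship decision concrete, the sketch below models it as a graph problem in the spirit the abstract describes: each operator is a node, a directed edge (u, v) with a cost means a performant program for u can seed tuning of v, and tuning an operator from scratch has its own cost. All names here (`cheapest_tuning_plan`, the cost dictionaries) are illustrative assumptions, not ETO's actual API, and the algorithm is a simple shortest-path heuristic from a virtual root, not an exact directed Steiner tree solver.

```python
import heapq

def cheapest_tuning_plan(operators, scratch_cost, reuse_edges):
    """Hypothetical illustration: pick, for each operator, whether to tune
    it from scratch or to tune it by reusing another operator's program.

    operators:    list of operator ids
    scratch_cost: {op: cost of tuning op from scratch}
    reuse_edges:  {u: [(v, cost of tuning v by reusing u's program), ...]}

    A virtual root connects to every operator with its from-scratch cost;
    Dijkstra-style relaxation then grows a tree of reuse decisions. This
    greedy shortest-path tree is only a heuristic stand-in for the
    directed Steiner tree formulation mentioned in the abstract.
    """
    dist = {op: scratch_cost[op] for op in operators}
    parent = {op: None for op in operators}  # None = tuned from scratch
    heap = [(c, op) for op, c in dist.items()]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, cost in reuse_edges.get(u, []):
            # Reusing u's program for v adds 'cost' on top of having
            # reached (tuned) u; relax if that beats v's current plan.
            if d + cost < dist[v]:
                dist[v] = d + cost
                parent[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, parent
```

For example, with three operators where only "a" is cheap to tune from scratch and reuse edges a→b→c exist, the plan tunes "a" from scratch and derives "b" and "c" by reuse, pruning most of their search spaces.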