Summary: | Deep Neural Network (DNN) Intellectual Property (IP) models must be kept undisclosed to avoid revealing trade secrets. Recent works have devised machine learning techniques that leverage on side-channel information leakage of the target platform to reverse engineer DNN architectures. However, these works fail to perform successful attacks on DNNs that have undergone performance optimizations (i.e., operator fusion) using DNN compilers, e.g., Apache Tensor Virtual Machine (TVM). We propose a two-phase attack framework to infer the layer sequences of optimized DNNs through side-channel information leakage. In the first phase, we use a recurrent network with multi-head attention components to learn the intra and interlayer fusion patterns from GPU traces of TVM-optimized DNNs, in order to accurately predict the operation distribution. The second phase uses a model to learn the run-time temporal correlations between operations and layers, which enables the prediction of layer sequence. An encoding strategy is proposed to overcome the convergence issues faced by existing learning-based methods when inferring the layer sequences of optimized DNNs. Extensive experiments show that our learning-based framework outperforms state-of-the-art DNN model extraction techniques. Our framework is also the first to effectively reverse engineer both Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) using side-channel leakage.
|