Summary: | <p>In this thesis, we develop a rich family of efficient and performant Bayesian optimisation (BO) methods to tackle various AutoML tasks. We first introduce a fast information-theoretic BO method, FITBO, that overcomes the computation bottleneck of information-theoretic acquisition functions while maintaining their competitiveness on the noisy optimisation problems frequently encountered in AutoML. We then improve on the idea of local penalisation and develop an asynchronous batch BO solution, PLAyBOOK, to enable more efficient use of parallel computing resources when evaluation runtime varies across configurations. In view of the fact that many practical AutoML problems involve a mixture of multiple continuous and multiple categorical variables, we propose a new framework, named Continuous and Categorical BO (CoCaBO) to handle such mixed-type input spaces. CoCaBO merges the strengths of multi-armed bandits on categorical inputs and that of BO on continuous space, and uses a tailored kernel to permit information sharing across different categorical variables. We also extend CoCaBO by harnessing the concept of local trust region to achieve competitive performance on high-dimensional optimisation problems with mixed input types.</p>
<p>Beyond hyper-parameter tuning, we also investigate the novel use of BO on two important AutoML applications: black-box adversarial attack and neural architecture search. For the former (adversarial attack), we introduce the first BO-based attacks on image and graph classifiers; by actively querying the unknown victim classifier, our BO attacks can successfully find adversarial perturbations with many fewer attempts than competing baselines. They can thus serve as efficient tools for assessing the robustness of models suggested by AutoML. For the latter (neural architecture search), we leverage the Weisfeiler-Lehamn graph kernel to empower our BO search strategy, NAS-BOWL, to naturally handle the directed acyclic graph representation of architectures. Besides achieving superior query efficiency, our NAS-BOWL also returns interpretable sub-features that help explain the architecture performance, thus marking the first step towards interpretable neural architecture search. Finally, we examine the most computation-intense step in AutoML pipeline: generalisation performance evaluation for a new configuration. We propose a cheap yet reliable test performance estimator based on a simple measure of training speed. It consistently outperforms various existing estimators on on a wide range of architecture search spaces and and can be easily incorporated into different search strategies, including BO, to improve the cost efficiency.</p>
|