Abstract: <p>Recent advances in artificial intelligence (AI) have surpassed what was imaginable even five years ago. Today, we can instruct diffusion-based models to generate high-quality videos from human descriptions, or prompt large language models (LLMs) to assist with writing, translation, and even mathematical reasoning. These remarkable abilities arise from training massive deep-learning models on huge amounts of data. However, we do not always have enough data. In some tasks, such as mathematical reasoning or molecule generation, available data are very limited. Furthermore, despite current LLMs utilizing nearly all available data on the Internet, they remain imperfect. Thus, how to enhance the performance of AI systems when it is difficult to increase the amount of training data is a critical question.</p>
<p>In this thesis, we address this challenge from the perspective of inductive biases. Specifically, we investigate how to effectively use human knowledge about data or tasks to optimize the behavior of a machine learning algorithm, without requiring extra data. We first give a brief review of research on inductive biases, and then show how to incorporate inductive biases during the architecture design, training, and inference stages of a machine learning model, respectively. We also perform extensive experiments demonstrating that incorporating appropriate inductive biases can greatly boost model performance on a variety of tasks without the need for additional data.</p>