Resumo: | <p>This dissertation addresses the problem of automatic exploit generation for heap-based buffer overflows in language interpreters. Language interpreters are ubiquitous within modern software, embedded in everything from web browsers to anti-virus engines to cloud computing platforms, and heap-based buffer overflows are a common source of vulnerability within them. Automatic exploit generation for such vulnerabilities is a largely open problem.</p>
<p>In the past decade, greybox methods that combine large-scale input generation with feedback from instrumentation have proven to be the most successful approach to detecting many types of software vulnerabilities. Despite this, prior to the start of my research they had not been a significant component of exploit generation systems. Greybox approaches are attractive because they tend to scale far better than whitebox approaches when applied to large software. However, end-to-end exploit generation is too complex and multi-faceted a task to approach with a single greybox solution. During my research I have analysed the exploit generation problem for heap-based overflows in language interpreters in order to break it down into a set of logical sub-problems that can be addressed with separate greybox solutions. In this dissertation I present these sub-problems and greybox algorithms for each, and demonstrate how the solutions to the sub-problems can be combined to generate an exploit. The most significant of the sub-problems that I address is the heap layout problem, for which I provide a detailed analysis, two different greybox solutions, and methods for integrating solutions to this problem into both manual and automatic exploit generation.</p>
<p>The presented algorithms form the first approach to automatic exploit generation for heap overflows in interpreters. They also provide the first approach to exploit generation in any class of program that integrates a solution for automatic heap layout manipulation. At the core of the approach is a novel method for discovering exploit primitives, that is, inputs to the target program that result in a sensitive operation, such as a function call or a memory write, using attacker-injected data. To produce an exploit primitive from a heap overflow vulnerability, one has to discover a target data structure to corrupt, ensure that an instance of that data structure is adjacent to the source of the overflow on the heap, and ensure that the corrupted data is used after the overflow in a manner desired by the attacker. I present solutions to these three tasks that are automatic, greybox, and modular.</p>
|