What is decidable about string constraints with ReplaceAll function


Bibliographic Details
Main Authors: Chen, T, Chen, Y, Hague, M, Lin, A, Wu, Z
Format: Journal article
Published: Association for Computing Machinery 2017
collection OXFORD
description <p>The theory of strings with concatenation has been widely argued as the basis of constraint solving for verifying string-manipulating programs. However, this theory is far from adequate for expressing many string constraints that are also needed in practice; for example, the use of regular constraints (pattern matching against a regular expression), and the string-replace function (replacing either the first occurrence or all occurrences of a "pattern" string constant/variable/regular expression by a "replacement" string constant/variable), among many others. Both regular constraints and the string-replace function are crucial for such applications as analysis of JavaScript (or more generally HTML5 applications) against cross-site scripting (XSS) vulnerabilities, which motivates us to consider a richer class of string constraints. The importance of the string-replace function (especially the replace-all facility) is increasingly recognised, which can be witnessed by the incorporation of the function in the input languages of several string constraint solvers.</p><p>Recently, it was shown that any theory of strings containing the string-replace function (even the most restricted version where pattern/replacement strings are both constant strings) becomes undecidable if we do not impose some kind of straight-line (aka acyclicity) restriction on the formulas. Despite this, the straight-line restriction is still practically sensible since this condition is typically met by string constraints that are generated by symbolic execution. In this paper, we provide the first systematic study of straight-line string constraints with the string-replace function and the regular constraints as the basic operations. We show that a large class of such constraints (i.e. when only a constant string or a regular expression is permitted in the pattern) is decidable. We note that the string-replace function, even under this restriction, is sufficiently powerful for expressing the concatenation operator and much more (e.g. extensions of regular expressions with string variables). This gives us the most expressive decidable logic containing concatenation, replace, and regular constraints under the same umbrella. Our decision procedure for the straight-line fragment follows an automata-theoretic approach, and is modular in the sense that the string-replace terms are removed one by one to generate more and more regular constraints, which can then be discharged by the state-of-the-art string constraint solvers. We also show that this fragment is, in a way, a maximal decidable subclass of the straight-line fragment with string-replace and regular constraints. To this end, we show undecidability results for the following two extensions: (1) variables are permitted in the pattern parameter of the replace function, (2) length constraints are permitted.</p>
first_indexed 2024-03-06T20:40:53Z
id oxford-uuid:343e996a-91ff-4dc9-a52d-c34607b7800b
institution University of Oxford
last_indexed 2024-03-06T20:40:53Z
spelling oxford-uuid:343e996a-91ff-4dc9-a52d-c34607b7800b | 2022-03-26T13:24:50Z | What is decidable about string constraints with ReplaceAll function | Journal article | http://purl.org/coar/resource_type/c_dcae04bc | uuid:343e996a-91ff-4dc9-a52d-c34607b7800b | Symplectic Elements at Oxford | Association for Computing Machinery | 2017 | Chen, T | Chen, Y | Hague, M | Lin, A | Wu, Z