Sikuli: Using GUI screenshots for search and automation

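We present Sikuli, a visual approach to search and automation of graphical user interfaces using screenshots. Sikuli allows users to take a screenshot of a GUI element (such as a toolbar button, icon, or dialog box) and query a help system using the screenshot instead of the element's name. Sikuli also provides a visual scripting API for automating GUI interactions, using screenshot patterns to direct mouse and keyboard events. We report a web-based user study showing that searching by screenshot is easy to learn and faster to specify than keywords. We also demonstrate several automation tasks suitable for visual scripting, such as map navigation and bus tracking, and show how visual scripting can improve interactive help systems previously proposed in the literature.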

Bibliographic Details
Main Authors: Yeh, Tom, Chang, Tsung-Hsiang, Miller, Robert C.
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Association for Computing Machinery (ACM) 2012
Online Access: http://hdl.handle.net/1721.1/72686
https://orcid.org/0000-0002-0442-691X
collection MIT
description We present Sikuli, a visual approach to search and automation of graphical user interfaces using screenshots. Sikuli allows users to take a screenshot of a GUI element (such as a toolbar button, icon, or dialog box) and query a help system using the screenshot instead of the element's name. Sikuli also provides a visual scripting API for automating GUI interactions, using screenshot patterns to direct mouse and keyboard events. We report a web-based user study showing that searching by screenshot is easy to learn and faster to specify than keywords. We also demonstrate several automation tasks suitable for visual scripting, such as map navigation and bus tracking, and show how visual scripting can improve interactive help systems previously proposed in the literature.
id mit-1721.1/72686
institution Massachusetts Institute of Technology
Affiliations: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Date: 2009-10
Record timestamps: 2012-09-13T14:57:06Z; 2022-10-01T03:07:25Z
Type: Article; http://purl.org/eprint/type/ConferencePaper
ISBN: 978-1-60558-745-5
Citation: Tom Yeh, Tsung-Hsiang Chang, and Robert C. Miller. 2009. Sikuli: using GUI screenshots for search and automation. In Proceedings of the 22nd annual ACM symposium on User interface software and technology (UIST '09). ACM, New York, NY, USA, 183-192.
DOI: http://dx.doi.org/10.1145/1622176.1622213
Published in: Proceedings of the 22nd Annual ACM Symposium on User Interface Software and Technology (UIST '09)
License: Creative Commons Attribution-Noncommercial-Share Alike 3.0 (http://creativecommons.org/licenses/by-nc-sa/3.0/)
File format: application/pdf
Source: Other University Web Domain