Sikuli Java Tutorial


In this article I will demonstrate how to integrate Sikuli (an image based test automation tool) into a Java project, for example a JUnit test suite. I will start with a general overview of front end test automation, then go through the detailed procedure for adding visual inspection capabilities to a typical Java project. Please note that this is not a tutorial on Sikuli itself; however, if you have any questions, please use the comments section at the bottom of this post. Let us get started.

Front End Test Automation

Front end test automation is a challenging topic for many reasons; for the sake of brevity I will mention only two. First, user facing applications such as web apps change frequently (new features, GUI improvements, localization for different markets, dynamic content, download delays, etc.). In the context of test automation, frequent changes are costly because every change can break existing tests, which defeats the purpose of using computers to test computers. Test automation in general, not only front end test automation, is an investment, and one has to do it right or pay the price down the road in maintenance cost, trying to adapt test frameworks to evolving software products. That is the reason on the application side: business needs leave us no choice but to keep adding features and changing the GUI. The second reason, on the tooling side, is the lack of reliable test automation tools. There are many technologies, frameworks and tools for testing front end applications to select from, but a single technology might not satisfy all needs. Even if we disregard the fact that some tools are not robust enough, combining several technologies and tools is not always straightforward to implement. At the end of the day you select whatever fits your needs, but you have to live with the decision you make. In the next two sections I will contrast two types of front end test automation solutions (object based and image based). I am not in favor of one over the other, as I think they complement each other. Each methodology satisfies certain needs and has its own strengths and weaknesses.

Object Based Test Automation

Ultimately, any test automation tool is expected to simulate user actions programmatically. Such tools have to identify objects on the screen (buttons, labels, etc.) and then act on them (clicking, entering text, etc.). Object based test automation tools identify objects by means of an ID (unique identifier) or through a deterministic path within some kind of object hierarchy. The best known example of this type is the Selenium web test automation framework. Selenium uses locators to identify objects in a given HTML document. A locator can be as simple as an object ID or as complex as an XPath or CSS expression. In either case, the DOM (Document Object Model) hierarchy is the context in which the identification makes sense. Object based automation tools are good candidates for testing the functional aspects of an application. This approach, however, is blind to the visual nature of the application; for example, a visual defect that does not break core functionality may go undetected. To compensate for this limitation, quality engineers may explore image based solutions.
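To make the idea of a locator concrete, here is a minimal sketch that finds elements by ID and by hierarchical path. It does not use Selenium; it assumes the XPath engine that ships with the JDK and a small, hypothetical page fragment (the PAGE string and the element IDs are made up for illustration), so treat it as an analogy for how a locator resolves against a document hierarchy rather than as Selenium code.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class LocatorSketch {
    // Hypothetical, well-formed page fragment used only for illustration.
    static final String PAGE =
          "<html><body>"
        + "<form id='login'>"
        + "<input id='username' type='text'/>"
        + "<input id='passwd' type='password'/>"
        + "<button id='signin'>Sign In</button>"
        + "</form>"
        + "</body></html>";

    // Evaluate an XPath locator against PAGE and return the first
    // matching element, or null when nothing matches.
    static Element locate(String xpathLocator) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Element) xpath.evaluate(
                xpathLocator,
                new InputSource(new StringReader(PAGE)),
                XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad locator: " + xpathLocator, e);
        }
    }

    public static void main(String[] args) {
        // Locator based on a unique ID.
        Element byId = locate("//*[@id='signin']");
        // Locator based on a deterministic path through the hierarchy.
        Element byPath = locate("/html/body/form/input[2]");

        System.out.println(byId.getTagName() + ": " + byId.getTextContent());
        System.out.println(byPath.getTagName() + ": " + byPath.getAttribute("id"));
    }
}
```

Note that neither locator says anything about how the button looks on the screen, which is exactly the limitation discussed above.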

Image Based Test Automation

Image based test automation is not as popular as object based test automation. It needs time to mature and get adopted by the industry. Image based testing technology is not aware of what applications are running or which logical objects exist on the screen. Instead, it searches the computer screen pixel buffer for image patterns using non-trivial computer vision algorithms. For example, in order to locate a button on the screen, the algorithm searches the screen pixel buffer for an image pattern that looks like, or is close enough to, the button we want to locate. Image based solutions give us the luxury of not caring about application types (desktop or web), browsers (IE or FF), object locators (XPath, CSS, ID) or even operating systems (Windows, Mac). Most importantly, image based testing can perform visual inspection of the application under test that cannot be done using other technologies. Just to give a taste of what such a methodology could provide, here are some scenarios that are best validated using image based technology:

  • Layout problems caused by CSS
  • Truncation, for example of long translated strings
  • Themes and skins
  • Ads, Flash and video
  • Branding and logos
  • Text alignment in bidirectional languages, for example Arabic
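The pixel search described above can be sketched in a few lines of plain Java. This is a deliberately naive brute-force comparison, not the computer vision algorithm that Sikuli or OpenCV actually use; the class name, the helper methods and the minSimilarity parameter are assumptions made for illustration. It slides a small template image over a larger "screen" image and reports the first position where the fraction of identical pixels reaches a threshold.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

public class PixelSearchSketch {
    // Create an image filled with a single RGB color.
    static BufferedImage solid(int width, int height, int rgb) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    // Copy a patch into an image with its top-left corner at (left, top).
    static void paint(BufferedImage img, int left, int top, BufferedImage patch) {
        for (int y = 0; y < patch.getHeight(); y++) {
            for (int x = 0; x < patch.getWidth(); x++) {
                img.setRGB(left + x, top + y, patch.getRGB(x, y));
            }
        }
    }

    // Fraction of template pixels identical to the screen pixels at offset (ox, oy).
    static double similarityAt(BufferedImage screen, BufferedImage template, int ox, int oy) {
        int matching = 0;
        for (int y = 0; y < template.getHeight(); y++) {
            for (int x = 0; x < template.getWidth(); x++) {
                if (screen.getRGB(ox + x, oy + y) == template.getRGB(x, y)) {
                    matching++;
                }
            }
        }
        return (double) matching / (template.getWidth() * template.getHeight());
    }

    // Scan row by row and return the top-left corner of the first position
    // whose similarity reaches minSimilarity, or null when there is none.
    static Point find(BufferedImage screen, BufferedImage template, double minSimilarity) {
        for (int oy = 0; oy + template.getHeight() <= screen.getHeight(); oy++) {
            for (int ox = 0; ox + template.getWidth() <= screen.getWidth(); ox++) {
                if (similarityAt(screen, template, ox, oy) >= minSimilarity) {
                    return new Point(ox, oy);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Synthetic white "screen" with a blue 4x3 "button" at (7, 4).
        BufferedImage screen = solid(20, 10, 0xFFFFFF);
        BufferedImage button = solid(4, 3, 0x0000FF);
        paint(screen, 7, 4, button);

        System.out.println("Exact match at: " + find(screen, button, 1.0));

        // Damage one pixel of the button: an exact search now fails,
        // while a tolerant search still locates it.
        screen.setRGB(7, 4, 0xFF0000);
        System.out.println("Exact match after damage: " + find(screen, button, 1.0));
        System.out.println("Tolerant match after damage: " + find(screen, button, 0.9));
    }
}
```

The threshold plays a similar role to a similarity setting in an image based tool: a lower value tolerates rendering differences but increases the risk of matching the wrong region.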

Theoretically speaking, image based testing solutions can be used not only for visual inspection but also for functional testing; however, the technology is still not mature enough to replace object based solutions such as Selenium. Later in this article I will demonstrate how to integrate an image based tool (Sikuli) into an existing Java application. You can think of the demo as a proof of concept for combining both worlds. In other words, using Selenium (object based) for functional testing and Sikuli (image based) for visual inspection completes the picture.


Sikuli is an academic research project at MIT. It is an open source, image based technology for automating the testing of graphical user interfaces. Sikuli supports the major OS platforms (Windows, Mac and Linux). It ships with an integrated development environment (IDE) for writing visual scripts. Sikuli script is a Jython and Java library that provides visual inspection automation. The core of Sikuli script consists of the Java Robot, which delivers keyboard and mouse events to the appropriate locations on the screen. Image recognition is provided by the OpenCV (Open Source Computer Vision) C++ engine, which is connected to Java via JNI (Java Native Interface). We are not going to talk about Sikuli script and Sikuli IDE in this article; instead, we are going to demonstrate how to integrate Sikuli into a Java application. For more information about Sikuli, please visit the project's web site.
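To illustrate the event delivery half of that design, here is a minimal sketch that uses the JDK's java.awt.Robot directly. It is not Sikuli's own code; the class and method names are hypothetical, and the rectangle stands in for a region that an image search would have returned. The sketch computes the center of the matched rectangle and, when a display is available, moves the mouse there and clicks.

```java
import java.awt.AWTException;
import java.awt.GraphicsEnvironment;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.event.InputEvent;

public class RobotClickSketch {
    // Center of a matched screen region, the usual click target.
    static Point centerOf(Rectangle match) {
        return new Point(match.x + match.width / 2, match.y + match.height / 2);
    }

    // Move the mouse to the center of the region and left-click.
    // Returns false when no display is available or the Robot cannot be created.
    static boolean clickCenter(Rectangle match) {
        if (GraphicsEnvironment.isHeadless()) {
            return false;
        }
        try {
            Point target = centerOf(match);
            Robot robot = new Robot();
            robot.mouseMove(target.x, target.y);
            robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
            robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
            return true;
        } catch (AWTException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical match: a 40x20 button whose top-left corner is at (100, 50).
        Rectangle match = new Rectangle(100, 50, 40, 20);
        Point target = centerOf(match);
        System.out.println("Click target: " + target.x + "," + target.y);
        // Call clickCenter(match) to actually deliver the click on a real display.
    }
}
```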

Sikuli Java Integration

Please follow the steps as indicated below. I am running Windows 7 but it should not be that different for other platforms.

  1. Download and install the Java Development Kit (JDK).
  2. Install the 32 bit classic version of the Eclipse IDE for Java (the 64 bit version is not compatible with Sikuli).
  3. Download Sikuli IDE. I recommend the self extracting installer because it automatically sets the system variables for you. Sikuli uses a native C++ library for computer vision. This library has to be in the build path, otherwise Sikuli is not going to work.
  4. Launch Eclipse
  5. Go to (File, New, Java Project)
  6. Give your project a name
  7. Click (Next) to go to the (Java Settings) page
  8. Click the (Libraries) tab
  9. Click (Add External JARs)
  10. Navigate to the Sikuli installation folder and select (sikuli-script.jar)
  11. Click (Finish) to create the project
  12. Click (File, New, JUnit Test Case)
  13. Give your test case a name (you can also enter a package name if you wish)
  14. Check the setUp (called before each JUnit test case executes) and tearDown (called after each JUnit test case executes) method stubs.
  15. Leave (Class under test) empty, because we are not going to use JUnit for actual unit testing; we are going to use it for testing the GUI through Sikuli.
  16. Add the test cases you want. Below is the source code needed to visually test logging into Yahoo mail and to verify that the Yahoo brand name exists:
//Import Sikuli script
import org.sikuli.script.*;

//Import JUnit
import org.junit.*;

//Note: several statements were missing from this listing. They are
//restored below and marked "assumed"; adjust them to your Sikuli version.
public class Testing
{
	//Sikuli script object
	private SikuliScript m_sikscr;
	//Computer screen object
	private Screen m_screen;
	//Image of Firefox address bar
	private Pattern m_address;
	//Image of Firefox go button
	private Pattern m_go;
	//Image of Yahoo ID label
	private Pattern m_yid;
	//Image of Yahoo password label
	private Pattern m_pass;
	//Image of Yahoo sign in button
	private Pattern m_signin;
	//Image of Yahoo brand name
	private Pattern m_logo;

	public Testing()
	{
		//Load images from files
		m_address = new Pattern("./img/FirefoxBar.png");
		m_go = new Pattern("./img/FirefoxGo.png");
		m_yid = new Pattern("./img/YahooID.png");
		m_pass = new Pattern("./img/Password.png");
		m_signin = new Pattern("./img/SignIn.png");
		m_logo = new Pattern("./img/Logo.png");

		//Create Sikuli script and screen objects
		try
		{
			m_sikscr = new SikuliScript();
			m_screen = new Screen();
		}
		catch (Exception e)
		{
			e.printStackTrace();
		}
	}

	//This method is invoked before each JUnit test case executes
	@Before
	public void setUp()
	{
		//Launch Firefox (replace "Full path" with the real installation path)
		m_sikscr.openApp("Full path\\firefox.exe");
		//Wait a bit
		m_screen.wait((double) 3.0);

		try
		{
			//Click Firefox address bar then type address
			//(the address string is empty in this listing; put the login page address here)
			m_screen.type(m_address, "");
			//Click the go button (assumed statement)
			m_screen.click(m_go, 0);
			//Wait a bit
			m_screen.wait((double) 3.0);
			//Find login label (exists returns null when nothing matches)
			Match login = m_screen.exists(m_yid.similar((float)0.50));
			//Click below the login label (assumed statement, the 30 pixel range is a guess)
			m_screen.click(login.below(30), 0);
			//CTRL-A to select all text
			m_screen.type("a", KeyModifier.CTRL);
			//Type user name (assumed statement, placeholder value)
			m_screen.type("Your Yahoo ID");

			//Find the password label
			Match pass = m_screen.exists(m_pass.similar((float)0.70));
			//Click below the password label (assumed statement, the 30 pixel range is a guess)
			m_screen.click(pass.below(30), 0);
			//Select all text
			m_screen.type("a", KeyModifier.CTRL);
			//Enter password (assumed statement, placeholder value)
			m_screen.type("Your password");
			//Click the sign in button (assumed statement)
			m_screen.click(m_signin, 0);
			//Wait a bit
			m_screen.wait((double) 3.0);
		}
		catch (FindFailed e)
		{
			e.printStackTrace();
		}
	}

	//This method is invoked after each JUnit test case is executed
	@After
	public void tearDown()
	{
		//Close Firefox app
		m_sikscr.closeApp("Mozilla Firefox");
	}

	//Test case checks if Yahoo logo exists after login
	@Test
	public void testLogo() throws Exception
	{
		//Make sure Yahoo brand name exists (assumed statement)
		Assert.assertNotNull(m_screen.exists(m_logo));
		//Wait a bit
		m_screen.wait((double) 3.0);
	}
}

Please use the comments section below for questions, corrections or feedback. Thanks for reading.
