Selenium WebDriver – Handling Upload Dialogs using AutoIt

Handling an upload dialog while automating test cases using Selenium WebDriver can be a headache. There is a list of problems, to start with the first, WebDriver only works on the current webpage. Things like Upload dialogs, alerts, save as dialogs are handled by the operating system rather than the webpages.

So as the execution crosses the webpage boundary, things can go out of control. There is support in Webdriver to handle popup alerts but when one has to send some keyboard inputs instead of standard Ok/cancel button responses to a dialog, we do it this way as a user:

  1. Use mouse to browse – hard to replicate using code as mouse movements are complex and need lots of assumptions related to screen coordinates
  2. Set focus to the file name box and send keys – Easier!!

I tried emulating the mouse actions using PyAutoGUI, a python library, still working on it. Till that time, in interest of time, have implemented the second approach using another tool “AutoIT”

AutoIT allows me to interact with the windows components like dialogs. I can control focus on the dialogues and click on any element I want.

The simplest program in AutoIT is:

Local $hwnd = WinWait($CmdLine[1], "", 10)
WinActivate($hwnd)
Send($CmdLine[2])
ControlClick($hwnd, $CmdLine[3], int($CmdLine[4]))

Saved this file as “HandleFileUpload.au3” that is an AutoIt script format. It can be executed directly by double-clicking, but since I used command line arguments in it, only makes sense to execute it from command line. Further, AutoIt scripts can be compiled into windows executables (exe) easily. Refer AutoIT Tutorials for further details.

The program above takes 4 arguments

HandleFileUpload.exe <Window_Title_for_searching_window> <Keys_to_send_to_dialog> <Button_Text> <Button_Id>

The first argument will be used to identify the window/dialog we need to bring focus to (for Firefox, this text is “File Upload”, for IE, its “Choose File to Upload”). The second argument is a string to be sent to dialog, third argument is the text on the button, it can be “Open” or “Close”. Fourth argument is a numeric value, an ID assigned to a button on the active window. It can be determined using an Inbuilt AutoIT Tool named “AutoIT Window Info”.

Once we have the exe, we need to call it from the code. The example here is C#, can be Java or any other:

        ///<summary>
        /// Pass keys input to modal file Upload dialog
        /// There are two modes of passing values used in the function - Direct window method for Chrome and AutoIT exe for Firefox and IE
        /// AutoIT method is more reliable as it lets Autoit handle the dialog and keeps focus while sending keys. Chrome, does not support this function.
        /// The AutoIT exe is in the "ExternalTools" folder, it takes command as:
        ///     HandleFileUpload.exe "UPLOAD dialog title" "Keys to send" "Button text" "Button ID"
        ///     "Upload Dialog title" - The title of the upload window for IE, this is "Choose File to Upload" and for Firefox, it is "File Upload"
        ///     "Keys to send" - the desired keys to send
        ///     "Button Text" - Text on the button to click like "Open" and "Close"
        ///     "Button ID" - Button ID, can be found using AutoIT utility, generally, for both IE and Firefox, the ID for save button is "1" and cancel button is "2", we have used 1
        ///     Very unlikely that these values change with future versions of browsers
        /// </summary>
        /// <param name="strKeys">String to pass</param>
        /// <param name="driver">WebDriver Instance</param>
        public static void sendKeysToUploadDialog(IWebDriver driver, String strKeys)
        {
            if(getBrowserName(driver).Equals("chrome")) //getBrowserName
            {
                System.Windows.Forms.SendKeys.SendWait(strKeys + "~");
            }
            else
            {
                ProcessStartInfo startinfo = new ProcessStartInfo();
                startinfo.FileName = "Path_To_Folder\\HandleFileUpload.exe";
                if(getBrowserName(driver).Equals("firefox"))
                {
                    startinfo.Arguments = "\"File Upload\"" + strKeys + "\"\" \"1\" ";
                }
                else if (getBrowserName(driver).Equals("internet explorer"))
                {
                    startinfo.Arguments = "\"Choose File to Upload\"" + strKeys + "\"\" \"1\" ";
                }
                try
                {
                    using (Process exeProcess = Process.Start(startinfo))
                    {
                        exeProcess.WaitForExit();
                    }
                }
                catch(Exception e)
                {
                    //log messages or handle otherwise
                }
            }
            
        }

/* Function to get the browser name as a string from the webdriver instance */
        public static String getBrowserName(IWebDriver driver)
        {
            ICapabilities cap = ((RemoteWebDriver)driver).Capabilities;
            String strBrowserName = cap.BrowserName.ToLower();
            return strBrowserName;
        }

Note: The Chrome browser has to be handled using windows default support in C# as it does not get handled by the AutoIt code

Thanks for reading.

Advertisements
Selenium WebDriver – Handling Upload Dialogs using AutoIt

Monitoring WebDriver Actions – Using WebDriverEventListener and EventFiringWebDriver

WebDriver performs a sequence of complex actions in the background even for tasks which seem as trivial as navigating to a webpage and entering some test data. Selenium provides useful framework which gives a peep into the busy life of WebDriver. When things become critical, debugging at the level of each event becomes crucial – these events being navigating to a URL, web element’s value changing, script getting executed through webdriver and most useful one, an event just when exception occurs. TestNG provides its own implementations for the root level event tracking like ITestListener and ISuiteListeners, but Selenium has its own way of doing this.

To throw an event, WebDriver gives a class named EventFiringWebDriver, and to catch that event, it provides an interface named WebDriverEventListener. Together, these two can be used to trigger an event, catch it and perform desired action.

There may be more than one listeners waiting for a single event and handle it their own way. It’s done by registering multiple listeners to an EventFiringWebDriver.

The whole flow of things looks like:

  1. Create an EventListener class
  2. Create a WebDriver instance
  3. Create an instance of EventFiringWebDriver by passing driver from step 2
  4. Create an instance of EventListener
  5. Register this EventListener to the EventFiringWebDriver Instance
  6. Done !! Handle the events sent by WebDriver now

An event listener can be created by either:

  1. Implementing WebDriverEventListener interface
  2. Extending AbstractWebDriverEventListener class

I am using the interface in the example here. Implementing the interface makes me define all the methods of Interface. So the code for a class implementing WebDriverEventListener is like:

package org.selenium;
import org.openqa.selenium.*;
import org.openqa.selenium.support.events.WebDriverEventListener;


public class TheEventListener implements WebDriverEventListener{

	@Override
	public void afterChangeValueOf(WebElement arg0, WebDriver arg1) {
		System.out.println("After change of value :" + arg0.toString());
		
	}

	@Override
	public void afterClickOn(WebElement arg0, WebDriver arg1) {
		System.out.println("After click on webelement: " + arg0.getText());
	}

	@Override
	public void afterFindBy(By arg0, WebElement arg1, WebDriver arg2) {
		System.out.println("After find by: " + arg0.toString());
	}

	@Override
	public void afterNavigateBack(WebDriver arg0) {
		System.out.println("After navigating back to : " + arg0.getCurrentUrl());
	}

	@Override
	public void afterNavigateForward(WebDriver arg0) {
		System.out.println("After navigating forward to : "+ arg0.getCurrentUrl());
	}

	@Override
	public void afterNavigateTo(String arg0, WebDriver arg1) {
		System.out.println("After navigating to : "+arg0);
	}

	@Override
	public void afterScript(String arg0, WebDriver arg1) {
		System.out.println("After execution of script : "+ arg0);
	}

	@Override
	public void beforeChangeValueOf(WebElement arg0, WebDriver arg1) {
		System.out.println("Before value change of : " + arg0.toString());
	}

	@Override
	public void beforeClickOn(WebElement arg0, WebDriver arg1) {
		System.out.println("Before clicking on WebElement : " + arg0.getText());
	}

	@Override
	public void beforeFindBy(By arg0, WebElement arg1, WebDriver arg2) {
		System.out.println("Before find by : " + arg0.toString());
	}

	@Override
	public void beforeNavigateBack(WebDriver arg0) {
		System.out.println("Before navigating back from : " + arg0.getCurrentUrl());
	}

	@Override
	public void beforeNavigateForward(WebDriver arg0) {
		System.out.println("Before navigating forward from : "+ arg0.getCurrentUrl());
	}

	@Override
	public void beforeNavigateTo(String arg0, WebDriver arg1) {
		System.out.println("Before navigating to : "+ arg0);
	}

	@Override
	public void beforeScript(String arg0, WebDriver arg1) {
		System.out.println("Before executing the script : " + arg0);
	}

	@Override
	public void onException(Throwable arg0, WebDriver arg1) {
		System.out.println("On exception : " + arg0.getMessage());
		arg1.quit();
	}

}


Now comes the event firing WebDriver. It’s the attention seeking big brother of WebDriver who loves to brag about what he’s going to do and have done. Listeners, tuned-in or, registered with this driver get to know everything and act as they want.

This is the code for an EventFiringWebdriver implementation:

 

package org.selenium;
import org.openqa.selenium.*;
import org.openqa.selenium.firefox.*;
import org.openqa.selenium.support.events.EventFiringWebDriver;
import org.openqa.selenium.support.ui.Select;
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class TheDriver {
	
	WebDriver driver;
	
	@BeforeClass
	public void setupBrowser()
	{
		driver = new FirefoxDriver();
		driver.manage().window().maximize();
	}
	
	@Test
	public void BrowserTest()
	{
		
	EventFiringWebDriver eventFiringDriver = new EventFiringWebDriver(driver); //Get EventFiringWebDriver instance
	
	TheEventListener eventListener = new TheEventListener(); //Get Listener instance
	eventFiringDriver.register(eventListener); // Register listener to driver
	
	eventFiringDriver.navigate().to("https://www.wikipedia.org/");
	eventFiringDriver.findElement(By.id("searchInput")).sendKeys("Lorem Ipsum");
	Select selLanguage = new Select(eventFiringDriver.findElement(By.id("searchLanguage")));
	selLanguage.selectByValue("en");
	eventFiringDriver.findElement(By.xpath(".//body[@id='www-wikipedia-org']/div[2]//input[@type='submit']")).click();
	
	eventFiringDriver.navigate().back();
	}
	
	@AfterClass
	public void exit() throws InterruptedException
	{
		Thread.sleep(2000);
		driver.quit();
	}
	

}

On execution, you get every event listed on the console just as we wanted to be handled. It gives an exception which is also expected.

This is how the output looks like:


Before navigating to : https://www.wikipedia.org/
After navigating to : https://www.wikipedia.org/
Before find by : By.id: searchInput
After find by: By.id: searchInput
Before value change of : [[FirefoxDriver: firefox on WINDOWS (8dc5e6de-b235-452e-8884-dfb1d6ea4d0b)] -> id: searchInput]
After change of value :[[FirefoxDriver: firefox on WINDOWS (8dc5e6de-b235-452e-8884-dfb1d6ea4d0b)] -> id: searchInput]
Before find by : By.id: searchLanguage
After find by: By.id: searchLanguage
Before find by : By.xpath: .//option[@value = “en”]
After find by: By.xpath: .//option[@value = “en”]
Before find by : By.xpath: .//body[@id=’www-wikipedia-org’]/div[2]//input[@type=’submit’]
After find by: By.xpath: .//body[@id=’www-wikipedia-org’]/div[2]//input[@type=’submit’]
Before clicking on WebElement :
On exception : Element not found in the cache – perhaps the page has changed since it was looked up

Thanks for reading !!!

Monitoring WebDriver Actions – Using WebDriverEventListener and EventFiringWebDriver