Selenium WebDriver – Handling Upload Dialogs using AutoIt

Handling an upload dialog while automating test cases with Selenium WebDriver can be a headache. The first problem is that WebDriver only works on the current web page. Things like upload dialogs, alerts, and Save As dialogs are handled by the browser or the operating system rather than by the web page.

So as the execution crosses the web page boundary, things can go out of control. WebDriver has support for handling popup alerts, but when a dialog needs keyboard input instead of a standard OK/Cancel response, there are two ways to do it, the same ways a user would:

  1. Use the mouse to browse – hard to replicate in code, as mouse movements are complex and need lots of assumptions about screen coordinates
  2. Set focus to the file name box and send keys – easier!

I tried emulating the mouse actions using PyAutoGUI, a Python library, and I am still working on it. Until that works, in the interest of time, I have implemented the second approach using another tool, “AutoIt”.

AutoIt lets me interact with Windows components such as dialogs. I can control focus on the dialogs and click on any element I want.

A simple AutoIt program for this is:

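Local $hwnd = WinWait($CmdLine[1], "", 10) ; wait up to 10 seconds for the dialog
WinActivate($hwnd)
Send($CmdLine[2]) ; assumed step: type the second argument into the focused file name box
ControlClick($hwnd, $CmdLine[3], Int($CmdLine[4])) ; click the button by its ID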

I saved this file as “HandleFileUpload.au3”; .au3 is the AutoIt script format. It can be executed directly by double-clicking, but since it reads command-line arguments, it only makes sense to execute it from the command line. Further, AutoIt scripts can easily be compiled into Windows executables (.exe). Refer to the AutoIt tutorials for further details.

The program above takes 4 arguments:

HandleFileUpload.exe <Window_Title_for_searching_window> <Keys_to_send_to_dialog> <Button_Text> <Button_Id>

The first argument identifies the window/dialog we need to bring focus to (for Firefox, this text is “File Upload”; for IE, it is “Choose File to Upload”). The second argument is the string to be sent to the dialog. The third argument is the text on the button; it can be “Open” or “Cancel”. The fourth argument is a numeric value, the ID assigned to a button on the active window. It can be determined using a built-in AutoIt tool named “AutoIt Window Info”.
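
To make the argument order concrete, here is a minimal sketch in Python rather than C#, assuming only the two dialog titles mentioned above. The names DIALOG_TITLES, build_upload_args and to_command_line are made up for illustration and are not part of AutoIt or Selenium, and the quoting is naive: it assumes no argument contains a double quote.

```python
# Dialog titles as quoted in this post; other browsers or locales may differ.
DIALOG_TITLES = {
    "firefox": "File Upload",
    "internet explorer": "Choose File to Upload",
}


def build_upload_args(browser_name, keys, button_text="", button_id=1):
    """Return the four HandleFileUpload.exe arguments, in order (hypothetical helper)."""
    title = DIALOG_TITLES[browser_name.lower()]
    # Command-line arguments are always strings; the script converts the ID with Int().
    return [title, keys, button_text, str(button_id)]


def to_command_line(exe_path, args):
    """Quote every part naively; assumes no part contains a double quote."""
    return " ".join('"{}"'.format(part) for part in [exe_path] + list(args))
```

For example, `to_command_line("HandleFileUpload.exe", build_upload_args("firefox", r"C:\temp\report.pdf"))` gives `"HandleFileUpload.exe" "File Upload" "C:\temp\report.pdf" "" "1"`.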

Once we have the exe, we need to call it from the code. The example here is in C#, but it could be Java or any other language:

        // Requires: using System; using System.Diagnostics; using OpenQA.Selenium; using OpenQA.Selenium.Remote;
        /// <summary>
        /// Pass keys input to the modal file upload dialog.
        /// Two modes are used: direct SendKeys to the window for Chrome, and the AutoIt exe for Firefox and IE.
        /// The AutoIt method is more reliable, as it lets AutoIt handle the dialog and keep focus while sending keys. Chrome does not support this method.
        /// The AutoIt exe is in the "ExternalTools" folder; it takes the command:
        ///     HandleFileUpload.exe "Upload dialog title" "Keys to send" "Button text" "Button ID"
        ///     "Upload dialog title" - The title of the upload window: "Choose File to Upload" for IE, "File Upload" for Firefox
        ///     "Keys to send" - The desired keys to send
        ///     "Button text" - Text on the button to click, like "Open" or "Cancel"
        ///     "Button ID" - Can be found using the AutoIt Window Info utility; generally, for both IE and Firefox, the Open button is "1" and the Cancel button is "2". We use 1.
        ///     It is very unlikely that these values change with future versions of browsers
        /// </summary>
        /// <param name="driver">WebDriver instance</param>
        /// <param name="strKeys">String to pass</param>
        public static void sendKeysToUploadDialog(IWebDriver driver, String strKeys)
        {
            String browser = getBrowserName(driver);
            if (browser.Equals("chrome"))
            {
                // "~" is the SendKeys code for the Enter key
                System.Windows.Forms.SendKeys.SendWait(strKeys + "~");
                return;
            }
            ProcessStartInfo startinfo = new ProcessStartInfo();
            startinfo.FileName = "Path_To_Folder\\HandleFileUpload.exe";
            // Each argument is quoted and separated by a space; the button text is left empty
            if (browser.Equals("firefox"))
                startinfo.Arguments = "\"File Upload\" \"" + strKeys + "\" \"\" \"1\"";
            else if (browser.Equals("internet explorer"))
                startinfo.Arguments = "\"Choose File to Upload\" \"" + strKeys + "\" \"\" \"1\"";
            try
            {
                using (Process exeProcess = Process.Start(startinfo))
                    exeProcess.WaitForExit();
            }
            catch (Exception e)
            {
                //log messages or handle otherwise
            }
        }

        /* Function to get the browser name as a string from the WebDriver instance */
        public static String getBrowserName(IWebDriver driver)
        {
            ICapabilities cap = ((RemoteWebDriver)driver).Capabilities;
            String strBrowserName = cap.BrowserName.ToLower();
            return strBrowserName;
        }
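
Since the caller can be any language, the same launch step might look like this in Python. This is only a sketch: run_upload_helper is a hypothetical name, the 30-second timeout is an arbitrary choice, and an exit code of 0 only says the helper ran to completion, which with the short AutoIt script above does not necessarily mean the dialog was found.

```python
import subprocess


def run_upload_helper(command, timeout_seconds=30):
    """Run the helper executable and report whether it exited with code 0.

    `command` is a list such as [exe_path, title, keys, button_text, button_id];
    passing a list lets subprocess take care of quoting each argument.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except (OSError, subprocess.TimeoutExpired) as error:
        # Log or handle otherwise, like the catch block in the C# version.
        print("Upload helper failed:", error)
        return False
    return completed.returncode == 0
```

A call would then look like `run_upload_helper(["Path_To_Folder\\HandleFileUpload.exe", "File Upload", file_path, "", "1"])`, where file_path is whatever path the test wants to upload.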

Note: The Chrome browser has to be handled using the default Windows support in C# (SendKeys), as its upload dialog does not get handled by the AutoIt code.

Thanks for reading.
