Comprehensive Selenium Architecture: Building Enterprise-Grade Test Automation Frameworks

Automation testing for web apps? Well, there are plenty of options. Selenium, Puppeteer, Cypress, Playwright: each has its strengths. Choosing the right framework depends on language support, complexity, scale, and team expertise. Still, Selenium remains one of the most widely used choices among automation testers and developers. Testing web apps isn’t easy. Unexpected challenges arise, and we need a reliable tool to handle them. That’s where Selenium WebDriver comes in, a go-to solution for automation testers worldwide.

Curious about Selenium’s internal architecture? You’re in the right place.

In this blog, we’ll break down Selenium Architecture and Selenium WebDriver. You’ll learn how Selenium WebDriver works, as well as its benefits and limitations.

Let’s dive in and look at what Selenium is.

What is Selenium?

Selenium is a robust automation testing framework. It simplifies web application testing across different browsers. Testers and developers can write automation scripts effortlessly. 

Cross-browser testing? No problem. Selenium runs on Chrome, Safari, Firefox, Edge, and Opera. It also supports cross-platform testing, so the same test cases can run on Windows, Linux, macOS, and Solaris.

Why is Selenium a top choice? Flexibility. Reliability. It lets developers and testers build robust automation test cases with ease.

What are the Components of Selenium?

Selenium isn’t just a test automation framework. It’s a full suite of testing tools. Each tool has its own unique capabilities, making automation framework development and design more efficient. These components can work independently or be combined for even greater functionality.  

The Selenium framework consists of four main components:

  • Selenium IDE
  • Selenium WebDriver
  • Selenium RC (now obsolete and merged with WebDriver)
  • Selenium Grid

Selenium IDE

Selenium IDE stands for Selenium Integrated Development Environment. It is a browser extension (originally a Firefox plugin, now also available for Chrome and Edge) that allows testers and developers to record and play back test scripts. It does not require any prior programming knowledge. Selenium IDE is mostly used as a prototyping tool.

Selenium WebDriver

Selenium WebDriver is a key component of the Selenium framework. It powers browser-based automation tests. Acting as a remote control interface, WebDriver enables test programs to interact with browsers, manipulate DOM elements, and control user agent behavior.  

Simply put, it’s the bridge between the Selenium framework and the browser—where the test cases come to life.

After Selenium 1, Selenium RC was merged with Selenium WebDriver to form Selenium 2. This was later upgraded to Selenium 3 and then to Selenium 4.

Selenium Grid

This component of the Selenium suite is used to run tests in parallel across multiple machines and browsers. Since Selenium supports almost all modern browsers and operating systems, Selenium Grid can run many tests simultaneously on different combinations of operating system and browser.
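To make the idea of "different operating systems with different browsers" concrete, here is a minimal sketch that builds the browser and platform combinations a Grid run might request. The browser and platform lists are assumptions chosen for illustration; `browserName` and `platformName` are standard WebDriver capability names, and a Grid would match each requested combination to a node that can serve it.

```python
from itertools import product

# Hypothetical lists for illustration; use whatever your own Grid nodes provide.
BROWSERS = ["chrome", "firefox", "MicrosoftEdge"]
PLATFORMS = ["windows", "linux", "mac"]


def capability_matrix(browsers, platforms):
    """Return one capabilities dict per browser/platform combination."""
    return [
        {"browserName": browser, "platformName": platform}
        for browser, platform in product(browsers, platforms)
    ]


if __name__ == "__main__":
    # Each entry describes one environment a parallel test run could target.
    for capabilities in capability_matrix(BROWSERS, PLATFORMS):
        print(capabilities)
```

Each dictionary describes one target environment. In a real setup, every entry would be used to open its own remote session so the tests can run side by side.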

Selenium Architecture

Earlier, Selenium WebDriver and Selenium RC were merged into a single unit: Selenium 2.0, also known as Selenium WebDriver 2.0. Over time, it gained new features and upgrades, leading to the release of Selenium 3.0. In this version, the JSON Wire protocol served as the primary communication channel between the automation test script and the web browser.

With Selenium 4.0, things changed. The JSON Wire protocol was replaced with the W3C WebDriver protocol. What does this mean? Client libraries and browser drivers now follow the same standard, so commands no longer need to be translated between two protocol dialects on their way to the browser.

Selenium Architecture of WebDriver in Selenium 3.0

Selenium 3.0 primarily uses the JSON Wire protocol to communicate between the user test script and the browser. This wire protocol represents a RESTful web service using JSON over HTTP. In Selenium 3.0, the Selenium WebDriver architecture consists of four major components:

  • Selenium Client Libraries / Language Bindings
  • JSON Wire Protocol
  • Browser Drivers
  • Real Browsers

Selenium Client Libraries

Automation scripts that interact with Selenium WebDriver can be written in multiple programming languages—Ruby, Java, C#, Python, JavaScript, and more. 

So, what exactly is a Selenium Client Library? Think of it as a language-specific package (a JAR file in Java, for example). It includes the necessary methods and classes from Selenium WebDriver, essential for building test automation scripts.

Installing Selenium core libraries is straightforward—they can be easily set up using the package installers specific to each language. Alternatively, all official Selenium client libraries are available for download from the Selenium website.  

It’s important to note that a Selenium client library is not a testing framework. Instead, it provides an API (Application Programming Interface)—a set of functions that execute Selenium commands from the test script. For instance, Java bindings offer APIs that enable Selenium commands to be written and executed in Java.
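As a rough illustration of what "executing Selenium commands from the test script" looks like, here is a minimal sketch using the Python bindings. It assumes the `selenium` package is installed and that Chrome and a matching driver are available on the machine; the function name `fetch_page_title` is made up for this example.

```python
def fetch_page_title(url):
    """Open `url` in Chrome through Selenium WebDriver and return the page title."""
    # Imported inside the function so this file still loads where Selenium
    # is not installed (an assumption made for this sketch).
    from selenium import webdriver

    driver = webdriver.Chrome()  # starts a browser session through ChromeDriver
    try:
        driver.get(url)          # Selenium command: navigate to the page
        return driver.title      # Selenium command: read the page title
    finally:
        driver.quit()            # always end the session and close the browser
```

Calling `fetch_page_title("https://www.lambdatest.com")` would launch Chrome, load the page, and return its title. Note that nothing here asserts or reports anything; that job belongs to a testing framework layered on top of the client library.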

JSON Wire Protocol

JSON, or JavaScript Object Notation, is a widely used data interchange format based on a subset of the JavaScript programming language. Selenium WebDriver 3.0 uses JSON to communicate between Selenium client libraries and browser drivers. It supports data structures like arrays and objects, which makes reading and writing data easier.

The client serializes each command into JSON and sends it to the server as the body of an HTTP request. The server replies with an HTTP response whose body is also JSON, which the client converts back into native objects. This conversion of data into a transferable format is called serialization. With this approach, the internal logic of the browser is not exposed, and the server can communicate with the Selenium client libraries without knowing anything about the programming language the test was written in.
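The serialization step can be sketched with nothing more than Python’s standard `json` module. The function names below are hypothetical, and the session id is a placeholder. The `/session/{id}/url` path and the `{"url": ...}` body follow the WebDriver wire format for a navigation command.

```python
import json


def build_navigate_request(session_id, url):
    """Serialize a 'navigate to URL' command into an HTTP method, path, and JSON body."""
    method = "POST"
    path = f"/session/{session_id}/url"
    body = json.dumps({"url": url})  # native dict -> JSON text
    return method, path, body


def parse_response(raw_body):
    """Deserialize the driver's JSON reply and return its 'value' field."""
    return json.loads(raw_body)["value"]  # JSON text -> native object


if __name__ == "__main__":
    print(build_navigate_request("abc123", "https://www.lambdatest.com"))
```

The client library does this work behind every call, so the test author only ever sees native method calls and return values.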

Browser Drivers

Browser drivers act as a bridge between the Selenium client libraries and the real browsers. They run Selenium commands on the browser and are the component of Selenium WebDriver responsible for executing user actions, such as mouse clicks, page navigation, and button clicks. Every browser supported by Selenium has its own browser driver. These browser drivers take commands from the Selenium test scripts and pass them to the respective browsers.

Whenever a Selenium automation test is triggered, the following series of actions is performed:

  • Every test command generates a corresponding HTTP request using the JSON Wire Protocol, which is then sent to the browser driver.
  • The HTTP server built into the browser driver receives this request.
  • The browser driver then drives the command execution on the real browser.
  • The browser sends the result back to the driver, whose HTTP server forwards it to the test automation script as an HTTP response.

In this way, browser drivers enable communication between the Selenium automation script and different browsers. The browser driver also ensures that this communication happens without exposing the internal logic of those browsers.
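The request and response flow described above can be imitated with Python’s standard library alone. In this sketch, `FakeDriverHandler` is a made-up stand-in for a browser driver: it does not control any browser, it only records the command and replies with a JSON acknowledgement. The session id `demo` is a placeholder.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeDriverHandler(BaseHTTPRequestHandler):
    """Stand-in for a browser driver: records each command and replies with JSON."""

    received = []  # (path, payload) pairs seen by the fake driver

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        FakeDriverHandler.received.append((self.path, payload))
        # A real driver would now instruct the browser; here we only acknowledge.
        reply = json.dumps({"value": None}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def send_command(host, port, path, payload):
    """Client side: serialize the payload, POST it, and decode the JSON reply."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request(
            "POST",
            path,
            body=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json; charset=utf-8"},
        )
        return json.loads(connection.getresponse().read())
    finally:
        connection.close()


def round_trip(url_to_open):
    """Start the fake driver on a free local port, send one command, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), FakeDriverHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return send_command(
            "127.0.0.1", server.server_port, "/session/demo/url", {"url": url_to_open}
        )
    finally:
        server.shutdown()
        server.server_close()
```

Calling `round_trip("https://www.lambdatest.com")` sends one navigation command and returns the decoded reply. A real driver such as ChromeDriver plays the server role here and, instead of just acknowledging, passes the command on to the browser.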

Some popular browser drivers in Selenium are ChromeDriver, GeckoDriver (for Firefox), SafariDriver, OperaDriver, EdgeDriver, and HtmlUnitDriver.

Real Browsers

A real browser is a software program used for accessing and viewing content on the World Wide Web (WWW). This component of the Selenium WebDriver architecture in Selenium 3.0 is pretty straightforward. The browser receives the command and calls the respective functions or methods to perform the desired automation task.

The Selenium framework supports almost all popular modern browsers, such as Google Chrome, Mozilla Firefox, Microsoft Edge, and Apple’s Safari.

Selenium Architecture of WebDriver in Selenium 4.0

In Selenium 3.0, the JSON Wire Protocol (over HTTP) served as the communication bridge between the Selenium client libraries (C#, Java, Ruby, Python, etc.) and the browser driver. Its main drawback was that browser vendors were moving their drivers to the W3C WebDriver standard, so the client libraries and the drivers did not always speak the same dialect, and commands had to be translated between the two. That extra translation layer could contribute to browser-specific inconsistencies, more exceptions, and an increased risk of flaky tests.

Selenium 4.0 addressed this by adopting the W3C (World Wide Web Consortium) WebDriver Protocol as its standard. Commands still travel as JSON over HTTP, but because client libraries and browser drivers now follow the same specification, requests no longer need to be translated between protocol dialects. The outcome? More consistent, more reliable, and more stable test execution.

In Selenium 4.0, the Selenium WebDriver architecture consists of the following four major components:

  • Selenium Client Libraries / Language Bindings
  • WebDriver W3C Protocol
  • Browser Drivers
  • Real Browsers

Basically, the components in Selenium WebDriver 4.0 are the same as those in Selenium 3.0, except that the JSON Wire protocol has been replaced by the W3C WebDriver protocol.
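One place where the two protocols visibly differ is the body of the request that starts a new session. The sketch below builds both shapes side by side. The function names are made up for illustration, while the `desiredCapabilities` key (JSON Wire) and the `capabilities` / `alwaysMatch` keys (W3C) are the ones the respective protocols define.

```python
def json_wire_new_session(browser_name):
    """Legacy JSON Wire Protocol body for a new-session request."""
    return {"desiredCapabilities": {"browserName": browser_name}}


def w3c_new_session(browser_name):
    """W3C WebDriver body for a new-session request."""
    return {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}


if __name__ == "__main__":
    print(json_wire_new_session("chrome"))
    print(w3c_new_session("chrome"))
```

Both bodies are sent with `POST /session`; only the structure of the JSON differs, which is why the surrounding architecture stays the same.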

WebDriver W3C Protocol

The WebDriver W3C protocol, which Selenium 4.0 uses by default, is a game-changer. It is a standard maintained by the W3C, the organization dedicated to web standards development. The W3C Editor’s Draft and W3C Working Draft are good resources if you want to stay updated on its progress.

So, what’s different? With the WebDriver W3C Protocol, the client and the browser driver exchange commands in one shared format, with no JSON Wire Protocol layer in between. Since both Selenium WebDriver and web browsers now follow the same protocol, automated tests run more consistently across different browsers.

The best part? Developers and testers need far fewer browser-specific tweaks in their automation scripts. With the WebDriver W3C Protocol, testing is more stable and reliable in Selenium 4.0.
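To give a feel for the kind of dialect handling a shared protocol makes unnecessary, consider how a found element is referenced in a response. The legacy JSON Wire Protocol used the key `ELEMENT`, while the W3C protocol uses a fixed identifier string as the key. The helper below is a hypothetical sketch of code that accepts either shape; it is not taken from any Selenium binding.

```python
LEGACY_ELEMENT_KEY = "ELEMENT"
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def extract_element_id(value):
    """Return the element id from a response value in either protocol dialect."""
    for key in (W3C_ELEMENT_KEY, LEGACY_ELEMENT_KEY):
        if key in value:
            return value[key]
    raise ValueError("response value is not an element reference")


if __name__ == "__main__":
    print(extract_element_id({LEGACY_ELEMENT_KEY: "42"}))
    print(extract_element_id({W3C_ELEMENT_KEY: "42"}))
```

When every driver follows the W3C format, only the second shape is needed and this sort of branching can go away.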

Selenium is a powerful and reliable framework for automated web testing. However, running tests solely on local infrastructure presents challenges like maintenance overhead, limited parallel execution, and scalability constraints. 

LambdaTest, a cloud-based AI-powered testing platform, enables seamless cross-browser testing on real devices and browsers, eliminating these limitations. 

With features like parallel execution, geolocation testing, and real-time debugging, LambdaTest optimizes Selenium test efficiency. The result? Faster execution, improved coverage, and a more scalable test automation strategy.

How does Selenium WebDriver Work Internally?

In a real-world scenario, when we run a Selenium script written in any supported language (say Java) using the matching Selenium client library, the browser launches and starts behaving as directed by the script. Now let’s understand what happens internally from the moment the Run button is clicked until the real browser responds.

You can learn more about WebDriver by visiting this article on what is Selenium WebDriver.

1. When we click the Run button, the Selenium client library converts each Selenium command in the automation script into a serialized JSON payload (for example, navigating to https://www.lambdatest.com produces {"url": "https://www.lambdatest.com"}). It sends this payload as an HTTP request, following the JSON Wire protocol, to the browser driver (say ChromeDriver). Every browser driver runs an HTTP server to receive these requests.

2. The JSON Wire Protocol defines how the client and the server exchange this data. The browser driver receives the HTTP request through its HTTP server, works out which action is being requested, and instructs the real browser to perform it, for example to load the URL.

3. After the browser performs the command, it reports the execution status to the browser driver. The driver’s HTTP server then sends that status back to the client library as an HTTP response with a JSON body.

In Selenium 4.0, the JSON Wire protocol is no longer used. The client libraries and the browser driver both speak the W3C WebDriver protocol, so the same request and response flow runs without translating between protocol dialects.
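The mapping from script commands to HTTP requests can be pictured as a small lookup table. The sketch below is an illustration rather than Selenium’s actual implementation: the command labels and the `to_http` helper are assumptions, while the HTTP methods and paths are the endpoints the WebDriver protocol defines for these four operations.

```python
import json

# Illustrative command labels mapped to WebDriver endpoints.
COMMANDS = {
    "newSession": ("POST", "/session"),
    "get": ("POST", "/session/{session_id}/url"),
    "getTitle": ("GET", "/session/{session_id}/title"),
    "quit": ("DELETE", "/session/{session_id}"),
}


def to_http(command, session_id=None, params=None):
    """Translate a command into (HTTP method, path, JSON body or None)."""
    method, template = COMMANDS[command]
    path = template.format(session_id=session_id)
    # Only POST requests carry a JSON body in this sketch.
    body = json.dumps(params or {}) if method == "POST" else None
    return method, path, body


if __name__ == "__main__":
    print(to_http("get", "abc", {"url": "https://www.lambdatest.com"}))
    print(to_http("getTitle", "abc"))
    print(to_http("quit", "abc"))
```

A typical script therefore boils down to a short sequence of such requests: create a session, navigate, read something back, and delete the session.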

Wrapping Up

After reading this blog on Selenium Architecture, you’re now equipped with valuable insights, including:  

  • Selenium WebDriver is the core component of the Selenium suite—essentially the brain behind automation. Other components include Selenium IDE, Selenium Grid, and the now-deprecated Selenium RC.  
  • Selenium WebDriver consists of several key elements: Selenium client libraries, a wire protocol (JSON Wire in Selenium 3.0, W3C WebDriver in Selenium 4.0), browser drivers, and actual web browsers. The browser drivers enable seamless interaction with multiple browsers.
  • In Selenium 3.0, the JSON Wire protocol (over HTTP) was the primary communication channel between Selenium client libraries and the browser driver. Selenium 4.0 replaced it with the WebDriver W3C protocol, making interactions smoother and more consistent.

About Author

Elen Havens