HTTP Fundamentals in Automated Testing

Tackling unexpected deviations requires a good understanding of the surrounding context.

For a QA engineer or software developer, that context is often how the WWW operates in general. In this series of articles, we will explore some fundamental concepts and technologies that every QA professional needs to master to truly understand “what is going on under the hood”.

What to expect in the following paragraphs:

Definition and main attributes of the protocol. Description of the OSI model.

Why is TCP the weapon of choice when HTTP needs to transport its data?

Headers: the HTTP property that makes the protocol extensible.

HTTP Methods

In addition, we will explore the role of cookies as a means to tackle HTTP’s statelessness. In the end, we will move on to the future, that is, HTTP’s next version.

Answering the question “Do we really need a new transport protocol?” will explain what problems the new QUIC protocol is poised to solve and how it will shape HTTP/3.

What is HTTP in reality, how does it work, and where did it all start?

According to the official HTTP specification, published by the IETF (Internet Engineering Task Force), the Hypertext Transfer Protocol is a stateless and extensible application-level request/response protocol that operates by exchanging messages across a reliable transport- or session-layer “connection”.

Now, what does all that mean?

Back to History Class

In the early days of the WWW, when the web was being envisioned and then implemented by Tim Berners-Lee and his team at CERN, HTTP was one of the fundamental pillars that transformed our lives forever. The initial draft of the protocol had no version number; it was later named HTTP/0.9 to differentiate it from the versions that followed. Sometimes referred to as the one-line protocol, HTTP/0.9 was just simple enough to transport text documents (HTML). With GET as its only method, a client could request a document and receive its text, and nothing more.

The next phase of this fascinating evolution, HTTP/1.0, was when extensibility was conceived. The notion of headers was introduced for both requests and responses. By providing metadata about the transmitted content, headers made HTTP flexible and extensible.

The protocol’s features were implemented through a try-and-see approach over the first half of the 90s. There was no actual standard for HTTP/1.0; a document describing the common practices of that version emerged from the stormy early WWW years only in 1996, shortly before HTTP/1.1.

The most notable improvements in 1997’s HTTP/1.1 version are:

1. A connection can be reused, which saves the time needed to reopen it repeatedly in order to fetch the resources embedded in the original document.

2. Pipelining has been added: a second request can be sent before the answer to the first one is fully transmitted, which lowers the latency of the communication.

3. Chunked responses are now also supported.

4. Additional cache control mechanisms have been introduced.

5. Content negotiation, including language, encoding, or type, has been introduced and allows a client and a server to agree on adequate content to exchange.

6. Thanks to the Host header, the ability to host different domains at the same IP address now allows server colocation.
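
To make the chunked responses from point 3 more tangible, here is a minimal sketch of how a chunked body could be decoded. The function name decode_chunked and the sample payload are illustrative assumptions; the sketch ignores trailers and does no error handling, so it is not a substitute for a real HTTP client.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body into the original payload."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, terminated by CRLF.
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field.decode("ascii").strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-sized chunk marks the end of the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


# Hypothetical body as it would appear on the wire: two chunks and a terminator.
wire_body = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(wire_body))
```

The server can start sending such a body before it knows the total length, which is why no Content-Length header is needed for a chunked response.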

The road to HTTP/2 started with SPDY, a Google research project aimed at creating a new application-layer protocol. Although Google would later remove support for SPDY, the project paved the way for HTTP/2. SPDY’s primary focus was to reduce latency. The basic changes made to HTTP/1.1 to create SPDY included: “true request pipelining without FIFO (First In, First Out) restrictions, message framing mechanism to simplify client and server development, mandatory compression (including headers), priority scheduling, and even bi-directional communication”.

Here is a summary of how HTTP/2 differs from its predecessor, HTTP/1.1:

-> It is a binary protocol rather than text. It can no longer be read and created manually. Despite this difficulty, improved optimization techniques can now be implemented.

-> It is a multiplexed protocol. Parallel requests can be handled over the same connection, removing the order and blocking constraints of the HTTP/1.x protocol.

-> It compresses headers. As these are often similar among requests, this removes duplication and overhead of data transmitted.

-> It allows a server to populate data in a client cache in advance of it being required through a server push mechanism.
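
As a rough illustration of what “binary protocol” means here, the sketch below packs and unpacks the fixed 9-byte header that precedes every HTTP/2 frame: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The helper names are assumptions made for this example, and only a few frame types are listed.

```python
import struct

# A small subset of the HTTP/2 frame types, for illustration only.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS"}


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the 9-byte frame header: length(24) type(8) flags(8) R(1)+stream id(31)."""
    length_bytes = length.to_bytes(3, "big")
    rest = struct.pack("!BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return length_bytes + rest


def unpack_frame_header(header: bytes) -> dict:
    """Split a 9-byte frame header back into its fields."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, stream_id = struct.unpack("!BBI", header[3:9])
    return {
        "length": length,
        "type": FRAME_TYPES.get(frame_type, hex(frame_type)),
        "flags": flags,
        "stream_id": stream_id & 0x7FFFFFFF,  # mask out the reserved bit
    }


# A HEADERS frame with a 16-byte payload on stream 1 and END_HEADERS (0x4) set.
header = pack_frame_header(16, 0x1, 0x4, 1)
print(header.hex(), unpack_frame_header(header))
```

The stream identifier in every frame is what makes multiplexing possible: frames from different requests can be interleaved on one connection and sorted out again by the receiver.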

ISO, OSI – Fun Game of Acronyms

Let us start with the stateless part of the definition above.

The most straightforward explanation is that each request message can be understood by the recipient in isolation from other messages. This simplifies the server design because there is no need to dynamically allocate storage for conversations in progress.

In terms of the Open Systems Interconnection model (OSI model), HTTP is an application-layer protocol. In a broader sense, the application layer is the one that end-user applications interact with directly. As the following figure shows, HTTP belongs to the seventh and topmost layer. The OSI model, in a nutshell, is a conceptual model that defines and sets standards for computing and telecommunication systems. If you read the previous sentence aloud, it might remind you of the ISO standards used in all kinds of industries. You could not be closer to the truth: the body responsible for bringing OSI to life is the International Organization for Standardization (ISO). If you dare to dig deeper, check the following link

[Figure: the layers of the OSI model]

Next in line is HTTP’s operational concept.

What Is TCP? Why Does It Matter?

Transmission Control Protocol (TCP) provides communication between an application program and the Internet Protocol (the pair is frequently written as TCP/IP). An application that sends data via TCP does not need to deal with packet fragmentation on the transmission medium or with other delivery mechanics. While IP handles the actual delivery of the data, TCP keeps track of “segments”, the individual units of data transmission that a message is divided into for efficient routing through the network.

Due to unpredictable network behavior, IP packets can be lost or delivered out of order; TCP detects these problems and minimizes their impact by reordering the data or requesting redelivery. This accuracy comes with a trade-off in speed: TCP is more reliable than UDP, for instance, but the extra work can sometimes add a delay of several seconds.

The figure above depicts how a TCP segment relates to the IP packet that carries it.

The data portion of each IP packet is formatted as a TCP segment. Each segment is divided into a header and data. The header itself consists of many fields, shown as the blue-colored sectors in the figure. Without going into too much detail, three of them deserve a closer look: the Checksum, the Acknowledgment number, and the Sequence number.
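
For readers who prefer code to diagrams, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header and exposes the fields mentioned above. The function name and the sample values (ports, sequence number, a zero checksum) are made up for illustration; TCP options and checksum verification are left out.

```python
import struct

# Bit masks of the most common control flags in the TCP header.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10}


def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed 20-byte TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, _urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "sequence_number": seq,
        "acknowledgment_number": ack,
        "header_length": (offset_flags >> 12) * 4,  # data offset is in 32-bit words
        "flags": [name for name, bit in TCP_FLAGS.items() if offset_flags & bit],
        "window": window,
        "checksum": checksum,
    }


# Hypothetical SYN segment from port 50000 to port 80 (checksum left as 0).
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```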

Step 1: Establish connection

When two computers want to send data to each other over TCP, they first need to establish a connection using a three-way handshake (TWH). For simplicity, imagine that those two computers meet on a street in a small town.

“Hey there, can you tell a joke?” – Machine A

“If you are asking me if I am able to tell jokes, I am proud to confirm that I can and I will.” – Machine B

“Great, I am ready to hear it.” – Machine A

In TCP terms, these three messages are the SYN, SYN-ACK, and ACK segments.

Step 2: Send packets of data

When a packet of data is sent over TCP, the recipient must always acknowledge what they received. Sequence and Acknowledgment number fields are responsible for keeping track of whether data was successfully received, lost, or accidentally sent twice.

Step 3: Close the connection

Either computer can close the connection when it no longer wants to send or receive data. To initiate closing, the requesting machine sets the FIN bit to 1. After that, the process resembles the three-way handshake in step 1: the other machine acknowledges the FIN and sends a FIN of its own, which is acknowledged in turn.
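
The three steps can be observed from application code, even though the operating system does the actual handshake work. The following is a minimal sketch over the loopback interface; the function names are assumptions for this example, and it relies on the message being small enough to arrive in a single recv call.

```python
import socket
import threading


def run_upper_server(listener: socket.socket) -> None:
    """Accept one connection, answer once, then close."""
    conn, _addr = listener.accept()  # the connection is already established here
    with conn:
        data = conn.recv(1024)       # step 2: receive data (acknowledged by the OS)
        conn.sendall(data.upper())
    # Leaving the with-block closes the socket, which starts the FIN exchange.


def tcp_round_trip(message: bytes) -> bytes:
    # Port 0 asks the operating system for any free port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        server = threading.Thread(target=run_upper_server, args=(listener,))
        server.start()
        # Step 1: create_connection returns once the three-way handshake is done.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)      # step 2: send data
            reply = client.recv(1024)
        # Step 3: closing the client socket sends its FIN.
        server.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip(b"tell me a joke"))
```

Running a packet capture tool on the loopback interface while this executes is one way to see the SYN, SYN-ACK, ACK, and FIN segments themselves.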

Packet loss and out-of-order delivery are the most common issues with this type of data transportation. UDP, an alternative to TCP, can identify when data is corrupted, but it cannot recover from loss or reordering. In this regard, UDP’s higher speed is often less desirable than TCP’s ability to handle packet loss and unintentional reordering, even though that ability comes at a cost in speed.
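
To show how sequence numbers make this recovery possible, here is a deliberately simplified model of a receiver. It assumes that segments never partially overlap and that the caller knows the initial sequence number; real TCP stacks are considerably more involved.

```python
def reassemble(segments, initial_seq=0):
    """Rebuild a byte stream from (sequence_number, payload) pairs in arrival order.

    Returns the contiguous data delivered so far and the next expected
    sequence number, which is what the receiver would acknowledge.
    """
    buffered = {}
    next_seq = initial_seq
    delivered = bytearray()
    for seq, payload in segments:
        if seq < next_seq or seq in buffered:
            continue  # duplicate: already delivered or already waiting
        buffered[seq] = payload
        # Hand over every segment that has become contiguous.
        while next_seq in buffered:
            chunk = buffered.pop(next_seq)
            delivered += chunk
            next_seq += len(chunk)
    return bytes(delivered), next_seq


# Out-of-order arrival with one duplicate: the stream is still rebuilt correctly.
arrived = [(0, b"HELL"), (8, b"RLD!"), (4, b"O WO"), (4, b"O WO")]
print(reassemble(arrived))

# A lost segment: delivery stops at the gap until the missing data is resent.
print(reassemble([(0, b"HELL"), (8, b"RLD!")]))
```

In the second call the acknowledgment number stays at the start of the gap, which is how the sender learns what to retransmit.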

Extensibility in the Form of Header Metadata

Headers are metadata used in both requests and responses. Numerous headers are part of the HTTP standard. First introduced in HTTP/1.0, they make the protocol easy to extend: new functionality can be introduced with a simple agreement between client and server about a new header’s semantics. Simply put, HTTP headers can be added to carry application-specific logic. This also means that both server and client need to know about the newly added headers. Otherwise, the server will not recognize the header when receiving a request, and the same goes for a client receiving a response.

Headers fall into four groups:

  • General Header

    Headers applying to both requests and responses but with no relation to the data eventually transmitted in the body.

  • Request Header

    Headers containing more information about the resource to be fetched or about the client itself.

  • Response Header

    Headers with additional information about the response, like its location or about the server itself (name and version, etc.).

  • Entity Header

    Headers containing more information about the body of the entity, like its content length or its MIME-type.

Also, if you check the official registry of HTTP headers, you will notice that some of them start with the “X-” prefix. This was a convention for custom headers implemented outside the existing protocol specifications. The convention is now deprecated, so new custom headers should not use it.

[Figure: headers in an HTTP request/response pair]

The figure above shows how headers fit in a request/response pair.
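
The same structure can be sketched in a few lines of code. The example below builds a raw HTTP/1.1 request by hand and parses its headers back; the host name example.test and the custom header Test-Run-Id are hypothetical and exist only to show that an application-specific header travels like any standard one.

```python
def build_request(method: str, path: str, headers: dict) -> str:
    """Serialize a request line and headers the way HTTP/1.1 puts them on the wire."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"  # an empty line ends the header section


def parse_headers(raw_message: str):
    """Return the start line and a dict of headers with lowercased names."""
    head, _, _body = raw_message.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return start_line, headers


raw = build_request("GET", "/orders", {
    "Host": "example.test",            # hypothetical host
    "Accept": "application/json",
    "Test-Run-Id": "nightly-42",       # hypothetical application-specific header
})
print(raw)
print(parse_headers(raw))
```

A server that does not know Test-Run-Id would simply have no logic attached to it, which is why both sides need to agree on what a new header means.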

Action, Verb, or Method

The HTTP protocol defines eight methods that tell the server what action should be performed on the identified resource. They are sometimes referred to as verbs, although the word “verb” is not used anywhere in the protocol’s specification. The first version of HTTP defined only one method, GET. The next version, HTTP/1.0, added HEAD and POST. The third expansion of the method list came with version 1.1, where OPTIONS, PUT, DELETE, TRACE, and CONNECT were introduced.

If you haven’t noticed yet, method names are case-sensitive, and the standard ones are always uppercase.

A natural question is whether new methods can be added. The answer is an absolute yes. An example is Web-based Distributed Authoring and Versioning (WebDAV), a set of extensions to the HTTP protocol. The idea behind it is to provide a means to collaboratively edit and manage files on remote web servers.

The following list comprises the name of every standard HTTP method and a short description of its use case.

  • GET

    The GET method is used to retrieve information from the given server using a given URI. Requests using GET should only retrieve data and should have no other effect on the data.

  • HEAD

    Same as GET, but transfers the status line and header section only.

  • POST

    A POST request is used to send data to the server (for example, customer information or a file upload), typically from an HTML form.

  • PUT

    Replaces all current representations of the target resource with the uploaded content.

  • CONNECT

    Establishes a tunnel to the server identified by a given URI.

  • DELETE

    Removes all current representations of the target resource given by a URI.

  • OPTIONS

    Describes the communication options for the target resource.

  • TRACE

    Performs a message loop-back test along the path to the target resource.
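
The difference between GET and HEAD is easy to verify against a throwaway local server. The sketch below uses only the Python standard library; the handler class, the fetch helper, and the response body are assumptions made for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello from the demo server\n"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        self._send_headers()  # same status line and headers, but no body

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(method: str):
    """Send one request to a temporary local server; return status, length header, body."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request(method, "/")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Length"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch("GET"))
    print(fetch("HEAD"))
```

Both calls report the same status and Content-Length, but only the GET response carries the body, which makes HEAD a cheap way for a test to check that a resource exists.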

Some of these methods are considered safe. What determines whether an HTTP method is safe is the effect it produces: if the method does not alter the state of the server, i.e., it performs a read-only operation, it is safe. GET, HEAD, and OPTIONS are safe. On the other hand, PUT and POST are not safe since they create or alter resources.

Before entering the realm of cookies, let us review two concepts: statelessness and sessions.

If you recall the protocol’s definition, you might remember its stateless property. It means that the server keeps no memory of earlier requests, so every request has to carry all the information needed to process it. In the first versions of the protocol, the client also had to open a new connection for each request.

What is a session?

1. The client establishes a TCP connection (or a connection over another transport-layer protocol).

2. The client sends its request and waits for the answer.

3. The server processes the request and sends back its answer, which provides a status code and the appropriate data (for a detailed list of all status codes, check this formal list or the informal one)

As of HTTP/1.1, the connection is no longer closed after the third phase, so the second and third phases can be repeated any number of times. Even so, data is still not shared across requests during the session.
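
Connection reuse can be made visible with a small experiment. In this sketch a local HTTP/1.1 server replies with the client’s source port; if every response over one connection object reports the same port, the TCP connection was reused. All class and function names are assumptions for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        # The client's source port identifies the underlying TCP connection.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def ports_seen_over_one_connection(request_count: int = 3):
    """Repeat phases 2 and 3 on one connection and collect what the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    ports = []
    try:
        for _ in range(request_count):
            conn.request("GET", "/")
            ports.append(conn.getresponse().read())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print(ports_seen_over_one_connection())
```

Note that the handler still learns nothing about the previous requests: the connection is shared, the state is not.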

On the other hand, we sometimes need to keep data across more than one request. This is where cookies come into play. A cookie is a name/value pair sent to the client browser by the server-side application.

Back in Fig. 4, which shows the metadata of the request, there is a Cookie field. Notice that it holds several key/value pairs separated by semicolons.

The most fundamental use cases of cookies are:

Session management

Keeps track of whether the user is logged in. In the e-commerce world, keeping the data for the items in a shopping cart could be a hideous implementation if cookies did not come to the rescue.

Personalization

User preferences, themes, and any other settings for a particular website.

Tracking

Recording and analyzing user behavior – almost all of the digital advertising industry relies on those cookies.

After receiving an HTTP request, a server can send one or more Set-Cookie headers with the response. The browser usually stores the cookie, and then the cookie is sent with requests made to the same server inside a Cookie HTTP header. An expiration date or duration can be specified, after which the cookie is no longer sent. Additional restrictions to a specific domain and path can be set, limiting where the cookie is sent.
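
Both directions of this exchange can be sketched with the http.cookies module from the Python standard library. The helper names and the cookie names (session_id, theme) are hypothetical; a real application would usually let its web framework produce and parse these headers.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, max_age: int, path: str = "/") -> str:
    """Build the Set-Cookie header line a server would attach to its response."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age  # lifetime in seconds
    cookie[name]["path"] = path        # restrict where the cookie is sent
    return cookie.output(header="Set-Cookie:")


def parse_cookie_header(header_value: str) -> dict:
    """Split the value of a Cookie request header into name/value pairs."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


# Server side: ask the browser to remember a session for one hour.
print(make_set_cookie("session_id", "abc123", max_age=3600))

# Server side, next request: read what the browser sent back.
print(parse_cookie_header("session_id=abc123; theme=dark"))
```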

Where is HTTP Headed?

HTTP/3 is the next iteration in this protocol’s evolution. It brings significant advancements and changes to the underlying transport. First of all, the issue to be fixed is not HTTP/2 itself but rather the way vendors implement it. Because this protocol is often “baked in” to routers, firewalls, and other network devices, any deviation from HTTP/2 is often seen as invalid or, worse, as an attack. These devices are configured to accept only TCP or UDP traffic between servers and their users, within a rigorous, narrow definition of what expected traffic should look like. Any deviation, such as an updated protocol or newly introduced functionality, is almost instantly rejected.

This issue is known as protocol ossification, and it is a considerable obstacle to resolving the underlying issues of HTTP/2. To clarify, ossification happens when new protocol features or changes in behavior end up being treated as bad or illegal by the systems and devices along the path. New TCP options are either severely limited or outright blocked, so fixing HTTP/2 becomes less an issue of “what to fix” and more an issue of “how to implement the fix.”

The second fundamental problem addressed by the Internet Engineering Task Force (IETF), the body responsible for HTTP development and standardization, is TCP’s relatively high latency. TCP’s data safety features prevent it from transporting data as fast as UDP does. This is the most relevant difference between the two.

In a sense, HTTP as a protocol is synchronous: you send a request, then you wait for a response. Of course, HTTP still allows parallelism, since any number of requests can be sent independently over separate connections.

In this world so hostile to protocol innovation, a superhero is expected to play a big part in getting things right. Say hello to QUIC. This is not an acronym; it is pronounced like the plain old English word “quick”.

Based on UDP, it is a transport-layer network protocol initially designed, implemented, and deployed at Google in 2012.

The initial QUIC handshake combines the typical three-way handshake that you get with TCP with the TLS 1.3 handshake, which provides authentication of the end-points and negotiation of cryptographic parameters. For those familiar with the TLS protocol, QUIC replaces the TLS record layer with its own framing format while keeping the same TLS handshake messages.

Not only does this ensure that the connection is always authenticated and encrypted, but it also makes the initial connection establishment faster as a result: the typical QUIC handshake only takes a single round-trip between client and server to complete, compared to the two round-trips required for the TCP and TLS 1.3 handshakes combined.
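
A quick back-of-the-envelope calculation shows what saving one round-trip means in practice. The 80 ms round-trip time below is an assumed value chosen for illustration, and the model ignores processing time and packet loss.

```python
def handshake_delay_ms(rtt_ms: float, round_trips: int) -> float:
    """Time spent on connection setup before the first request can be sent."""
    return rtt_ms * round_trips


ASSUMED_RTT_MS = 80.0  # hypothetical round-trip time between client and server

tcp_plus_tls13 = handshake_delay_ms(ASSUMED_RTT_MS, 2)  # TCP handshake, then TLS 1.3
quic = handshake_delay_ms(ASSUMED_RTT_MS, 1)            # combined QUIC handshake

print(f"TCP + TLS 1.3: {tcp_plus_tls13} ms, QUIC: {quic} ms")
```

The saving equals one full round-trip per new connection, so it matters most on high-latency links such as mobile networks.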

When data is transported over TCP, the loss of a segment blocks all subsequent segments until a retransmission arrives, regardless of which application streams those segments carry. Beyond its faster handshake, QUIC ensures that lost packets carrying data for an individual stream only impact that specific stream. Data received on other streams can continue to be reassembled and delivered to the application.
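
The difference can be illustrated with a toy model. It is not an implementation of either protocol: packets are plain tuples, loss is a set of packet numbers, and retransmission is not modelled at all. It only shows what the application can receive while one packet is still missing.

```python
def deliver_tcp_like(packets, lost):
    """One ordered byte stream: nothing behind the first gap is delivered."""
    delivered = []
    for number, stream, data in packets:
        if number in lost:
            break  # everything after the gap waits for the retransmission
        delivered.append((stream, data))
    return delivered


def deliver_quic_like(packets, lost):
    """Independent streams: only the stream that lost a packet stalls."""
    stalled = set()
    delivered = []
    for number, stream, data in packets:
        if number in lost:
            stalled.add(stream)
        elif stream not in stalled:
            delivered.append((stream, data))
    return delivered


# Two application streams, A and B, interleaved on one connection.
packets = [(1, "A", "a1"), (2, "B", "b1"), (3, "A", "a2"), (4, "B", "b2")]
lost = {2}  # the packet carrying b1 never arrives

print(deliver_tcp_like(packets, lost))   # stream A is held up by B's loss
print(deliver_quic_like(packets, lost))  # stream A continues, only B waits
```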

Summary

To summarize, HTTP is one of the pillars that keep the WWW up and running. Several leaps forward have kept this protocol relevant in the ever-changing tech landscape. From HTTP/0.9 to HTTP/3, it has seen the introduction of various new properties and functionalities, as well as new ways to use its extensibility.

We covered the basic mechanisms and some of the most important attributes, such as headers, HTTP methods, and TCP’s principal role. At the end, the new QUIC protocol was introduced, which is poised to define HTTP/3 in the near future.

Anton Angelov

About the author

Anton Angelov is Managing Director, Co-Founder, and Chief Test Automation Architect at Automate The Planet — a boutique consulting firm specializing in AI-augmented test automation strategy, implementation, and enablement. He is the creator of BELLATRIX, a cross-platform framework for web, mobile, desktop, and API testing, and the author of 8 bestselling books on test automation. A speaker at 60+ international conferences and researcher in AI-driven testing and LLM-based automation, he has been recognized as QA of the Decade and Webit Changemaker 2025.