Efficient Client-Server Communication: An Overview of Protocols and Techniques
Exploring HTTP, Polling, Webhooks, SSE, and WebSockets in the client-server model
Introduction:
In the world of web development and real-time communication, understanding various methods of data exchange is crucial. Modern web applications require efficient and seamless communication between the client and the server to deliver dynamic content and provide real-time updates. In this article, we will delve into the diverse protocols and techniques employed in client-server communication, enabling developers to make informed decisions when implementing these interactions.
Let’s start with the simple HTTP protocol. It is a widely used protocol in today’s world, and to go further we need to understand it because all the examples in this article are based on it.
HTTP
The Hypertext Transfer Protocol (HTTP) is a half-duplex protocol used for communication between web browsers (clients) and web servers. It serves as the foundation for data exchange on the World Wide Web.
HTTP runs on top of TCP¹ (Transmission Control Protocol), a transport-layer protocol. By default, it uses port 80 for unencrypted connections and port 443 for encrypted connections (HTTPS).
HTTP is a classic example of synchronous client-server interaction: the client initiates a request and waits for the result, and the server replies with a corresponding response.
Key components:
Uniform Resource Identifier (URI) : The client includes a URI in the request to specify the desired resource. A URI can take the form of a URL (Uniform Resource Locator) or a URN (Uniform Resource Name).
HTTP Methods : HTTP defines several methods (also known as verbs) that specify the desired action to be performed on the server. The most commonly used methods are:
GET: Retrieves a representation of a resource.
POST: Submits data to be processed by the server, often used for form submissions.
PUT: Updates or replaces a resource with the provided data.
DELETE: Removes a specified resource.
Headers : The client can include additional headers in the request to provide information to the server. Headers can specify things like the client’s user agent, accepted content types, authentication credentials, and more. Example:
User-Agent: Mozilla/5.0
Accept: text/html
The server includes headers in the response to provide information back to the client. Headers can specify things like the content type of the response, caching directives, cookies, and more. Example:
Content-Type: application/json
Cache-Control: max-age=3600
Body : For methods like POST and PUT, the client can include a request body that contains additional data to be sent to the server. The format of the request body depends on the data being transmitted (e.g., form data, JSON payload). The response body contains the actual data returned by the server; its format depends on the requested resource.
Structure of HTTP request:
A request typically includes some form of input or data payload that is needed by the server to perform the requested action.
POST /api/v1/users HTTP/1.1
Host: example.com
Content-Type: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

{
  "name": "Alice",
  "email": "alice@example.com",
  "password": "secret123"
}
In this example, the request uses the HTTP POST method to create a new user resource on the server. The URI or endpoint is /api/v1/users, and the request includes several headers, such as Content-Type and Authorization. Finally, the request body includes a JSON object with data for the new user, including their name, email, and password.
The response would look like this:
HTTP/1.1 201 Created
Content-Type: application/json

{
  "id": "1234567890",
  "name": "Alice",
  "email": "alice@example.com",
  "createdAt": "2023-05-25T10:15:30Z"
}
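To make the message structure more concrete, here is a minimal Python sketch that splits a raw HTTP/1.1 message into its start line, headers, and body. The function name parse_http_message and the shortened sample request are assumptions made for illustration; a real application would rely on an HTTP library rather than parse messages by hand.

```python
import json


def parse_http_message(raw: str) -> dict:
    """Split a raw HTTP/1.1 message into start line, headers, and body."""
    # On the wire, headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return {"start_line": lines[0], "headers": headers, "body": body}


# Shortened version of the request shown above (illustrative only).
raw_request = (
    "POST /api/v1/users HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"name": "Alice", "email": "alice@example.com"}'
)

parsed = parse_http_message(raw_request)
method, target, version = parsed["start_line"].split(" ")
user = json.loads(parsed["body"])
```

Here method holds the HTTP method, target the endpoint, and user the decoded JSON body.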
There are situations where the client needs to know the status of available data, for example, to display it on a web page. To do this, the client will constantly ask for new data. Such a technique is called polling.
Polling is a method in which the client repeatedly asks the server for new data. There are two types of polling: short polling and long polling.
Short polling
In short polling (a.k.a. AJAX polling), the client requests data from the server at regular intervals. If data is available, the server returns it; otherwise, it returns an empty response.
In essence, this is a simple HTTP request, which is called repeatedly.
Process of communication:
Client request. The client makes a request to the server.
Server response. The server responds either with the data itself or an empty response.
Repeat the action. Once the client receives a response from the server, it will wait for the specified interval and repeat the previous actions.
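The three steps above can be sketched in a few lines of Python. This is only an illustration: short_poll is a hypothetical helper, and the fetch callable stands in for a real HTTP request to the server.

```python
import time
from typing import Callable, List, Optional


def short_poll(fetch: Callable[[], Optional[str]],
               interval: float, max_requests: int) -> List[str]:
    """Call `fetch` repeatedly, sleeping `interval` seconds between requests."""
    received = []
    for _ in range(max_requests):
        data = fetch()            # steps 1 and 2: client request, server response
        if data is not None:      # None models an empty response ("nothing new")
            received.append(data)
        time.sleep(interval)      # step 3: wait for the interval, then repeat
    return received


# Hypothetical stand-in for the server: data is ready only on the third request.
responses = iter([None, None, "price=101", None])
updates = short_poll(lambda: next(responses), interval=0.01, max_requests=4)
```

Note that three of the four requests in this run carry no data, which is exactly the wasted traffic discussed in the cons below.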
Pros & cons:
Pros of Short Polling:
Simplicity: Short polling is straightforward to implement, requiring minimal server-side setup. It involves making periodic HTTP requests from the client to the server at regular intervals.
Compatibility: Short polling works well with existing HTTP infrastructure and is compatible with most web servers and browsers. It can be easily implemented using standard AJAX techniques.
Low server overhead: Short polling does not require a persistent connection, which means the server does not need to maintain and manage numerous open connections. This can help reduce server load and resource consumption.
Cons of Short Polling:
Increased network traffic: Short polling involves making frequent HTTP requests even when there is no new data available. This can result in increased network traffic, especially in scenarios with high client concurrency, leading to unnecessary bandwidth usage.
Latency and delay: Short polling introduces inherent latency because new data is only delivered on the next poll. An update that appears right after a response has to wait for up to a full polling interval, so there can be a noticeable delay between data updates and a suboptimal real-time experience.
Server load and scalability: As the number of clients increases, short polling can put a significant load on the server due to the frequent requests it generates. Handling a large number of concurrent requests can affect the scalability and performance of the server.
Long polling
With long polling, the client also repeatedly requests data from the server. But compared to short polling, the server is responsible for the waiting part.
Process of communication:
Client request. The client makes a request to the server.
Server response. If the data is available on the server, it responds with the data itself; otherwise, the server keeps the connection open until the data is available.
Repeat the action. Once the client receives a response from the server, it makes a new request without waiting.
Long polling communication diagram
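As a rough illustration of the "server does the waiting" idea, the following Python sketch models a long-polling server in a single process. LongPollServer, publish, and handle_request are hypothetical names, and a blocking queue stands in for the open HTTP connection.

```python
import queue
import threading
from typing import Optional


class LongPollServer:
    """Toy server: a request blocks until data arrives or the timeout expires."""

    def __init__(self) -> None:
        self._events: "queue.Queue[str]" = queue.Queue()

    def publish(self, data: str) -> None:
        """Make new data available to waiting clients."""
        self._events.put(data)

    def handle_request(self, timeout: float) -> Optional[str]:
        try:
            # Hold the "connection" open until data is available.
            return self._events.get(timeout=timeout)
        except queue.Empty:
            return None  # timed out: respond with "no updates"


server = LongPollServer()
# Data becomes available shortly after the client has started waiting.
threading.Timer(0.05, server.publish, args=["new message"]).start()

first = server.handle_request(timeout=2.0)    # returns as soon as data is published
second = server.handle_request(timeout=0.05)  # nothing new, so this request times out
```

In a real deployment the client would immediately issue the next request after each response, including the timed-out one.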
Pros & cons:
Pros of Long Polling:
Reduced network traffic: Long polling helps minimize unnecessary network traffic compared to short polling. Instead of making frequent requests, the server holds the connection open until there is new data to send. This reduces the number of requests made by the client, resulting in lower bandwidth usage.
Near real-time updates: Long polling allows for relatively immediate updates as compared to traditional polling methods. The server can hold the connection open until new data is available, and once it is, the response is sent to the client immediately. This enables near real-time updates and a more responsive user experience.
Improved server efficiency: Long polling reduces the load on the server by avoiding the need for constant polling. The server only needs to respond when new data is available, leading to improved resource utilization and better server efficiency.
Compatibility: Long polling can be implemented using standard HTTP requests and doesn’t require any specialized protocols or infrastructure. It can work with most web servers and browsers without any additional setup.
Cons of Long Polling:
Increased server overhead: While long polling reduces the number of requests made by the client, it can increase the server’s overhead compared to short polling. Holding connections open for an extended period can consume server resources, especially when there are many long-polling clients.
Scalability Challenges: Long polling can pose scalability challenges for highly concurrent applications. Each long-polling connection consumes server resources, and handling a large number of concurrent connections can impact the server’s ability to scale effectively.
Timeout issues: Long-polling requests are typically given a timeout (the maximum time a request may stay open). If new data is not received within the specified timeout period, the server sends a response indicating no updates. This introduces the possibility of timeouts occurring and the need for the client to re-establish the connection, potentially leading to delays in receiving updates.
Complex implementation: Implementing long polling requires more complexity than traditional short polling. It involves managing open connections, timeouts, and handling potential edge cases related to network connectivity and server resource management.
Next, let’s move from request/response exchange to event exchange. Webhook, Server-Sent Events, and WebSocket are primarily designed for event-driven communication. Instead of constantly requesting information or triggering actions, these techniques allow applications to be notified or listen for specific events.
An event is a message or notification that indicates that something has happened in the system. An event can represent any occurrence that might be of interest to other parts of the system, such as a user action, a sensor reading, or a change in the state of an object.
Webhook
Webhook is a method of communication that allows one application to automatically send data or trigger events to another application in near real-time. It provides easy integration and allows one system to notify or pass information to another system based on specific events or triggers.
You can think about Webhook as a user-defined callback over HTTP.
One important feature of Webhook is the “reverse API”. You don’t have to constantly poll to see if the new information is ready or not. Instead, you register an HTTP call to a specific URI, and when the new information is ready on the server side, it will call your URI and thus notify you.
Process of communication:
Configuration : The system that generates the Webhook provides a configuration mechanism to specify the endpoint or URL of the receiving system. This endpoint is where the Webhook data will be sent.
Event triggering : When a predefined event or trigger occurs in the system generating the Webhook, it initiates the Webhook process.
HTTP request : The system generating the Webhook sends an HTTP request (usually POST, sometimes GET) to the specified endpoint, containing relevant data or payload related to the event.
Receiving and processing : The receiving system, which acts as the Webhook listener or endpoint, receives the HTTP request. It extracts the data from the payload and processes it according to its requirements. This may involve updating databases, triggering actions, or any other desired operation.
Usually, the client here is another system, such as a web server, because an isolated browser environment cannot expose a URL for the provider to call.
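The provider side of this process can be sketched as follows. This is a simplified illustration, not how any particular platform implements Webhooks: WebhookProvider, http_post, the event name, and the example URL are all hypothetical, and the transport is injectable so the sketch runs without a network.

```python
import json
import urllib.request
from typing import Callable, Dict, List


def http_post(url: str, body: bytes) -> None:
    """Default transport: POST the JSON payload to the subscriber's URL."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=5):
        pass


class WebhookProvider:
    def __init__(self, send: Callable[[str, bytes], None] = http_post) -> None:
        self._send = send
        self._subscribers: Dict[str, List[str]] = {}

    def register(self, event: str, url: str) -> None:
        """Configuration: remember where to deliver a given event."""
        self._subscribers.setdefault(event, []).append(url)

    def trigger(self, event: str, data: dict) -> int:
        """Event triggering and HTTP request: notify every registered URL."""
        body = json.dumps({"event": event, "data": data}).encode("utf-8")
        urls = self._subscribers.get(event, [])
        for url in urls:
            self._send(url, body)
        return len(urls)


# Usage with a fake transport that records deliveries instead of sending them.
delivered = []
provider = WebhookProvider(send=lambda url, body: delivered.append((url, json.loads(body))))
provider.register("payment.succeeded", "https://merchant.example/hooks/payments")
count = provider.trigger("payment.succeeded", {"order_id": 42})
```

The receiving system would expose the registered URL as a normal HTTP endpoint and process the JSON body when the request arrives.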
Examples of Webhook:
Payment Notifications: Many payment gateways and platforms utilize Webhooks to provide real-time notifications of payment events. For example, when a successful payment is made or a refund is issued, the payment gateway can send a Webhook to the merchant’s system, allowing them to update their records, trigger order fulfillment processes, or send confirmation emails to customers.
Notifications and Alerting: For example, a messaging or collaboration platform may use Webhooks to deliver notifications to external systems or applications. This could include notifying a chatbot about new messages, updating external dashboards with new metrics, or sending alerts to a monitoring system when certain conditions are met.
With Webhooks, we solve the problem of the additional resources used in polling by allowing the event provider to notify the client side (reverse API). But that may not be enough when it comes to real-time communication. SSE and WebSocket address this problem.
Server-sent events
Server-Sent Events (SSE) is a one-way, server-to-client communication technique where the server sends a continuous stream of data to the client over a single HTTP connection. SSE is commonly used for real-time updates in web applications such as social media feeds and stock market tickers.
SSE follows an event-driven model, where the server sends a stream of events to the client as they occur. This allows asynchronous communication.
The server initiates the SSE connection by responding to the client’s initial request with a specific content type, text/event-stream, indicating that it will be sending a continuous stream of events.
Structure of event:
Here’s an example of a complete SSE structure of an event with event type, event data, event ID and event retry fields:
event: eventType
data: This is the event data
id: eventID12345
retry: 5000
Description of fields:
event: The event type, which provides additional context. If it is omitted, the event is delivered as message (open and error are events raised by the EventSource object itself, not sent in the stream). The server may specify a custom type of event.
data: The data field for the message. Can be plain text, JSON or any other string type.
id: Event identifier, allowing clients to keep track of the order and uniqueness of events.
retry: Reconnection time in milliseconds, used in case of connection loss. The browser waits for the specified time before trying to reconnect.
All other fields will be ignored.
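The field format above can be illustrated with a small Python sketch that serialises and parses a single event. The helper names format_sse and parse_sse are assumptions made for this example, and the parser is deliberately simplified: it handles one event block at a time rather than a continuous stream.

```python
from typing import Dict, Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None, retry: Optional[int] = None) -> str:
    """Serialise one event for a text/event-stream response."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line data is sent as several consecutive "data:" lines.
    lines.extend(f"data: {part}" for part in data.split("\n"))
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if retry is not None:
        lines.append(f"retry: {retry}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def parse_sse(block: str) -> Dict[str, str]:
    """Parse one event block; unknown fields are ignored."""
    fields: Dict[str, str] = {}
    data_lines = []
    for line in block.splitlines():
        if not line or line.startswith(":"):   # blank line or comment line
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):              # one leading space is not part of the value
            value = value[1:]
        if name == "data":
            data_lines.append(value)
        elif name in ("event", "id", "retry"):
            fields[name] = value
    fields["data"] = "\n".join(data_lines)
    return fields


wire = format_sse("This is the event data", event="eventType",
                  event_id="eventID12345", retry=5000)
event = parse_sse(wire)
```

In the browser, the EventSource object performs this parsing and dispatches the event to the handler registered for its type.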
Process of communication:
Client-Side EventSource : The client establishes an EventSource object in JavaScript, which serves as the interface for receiving server-sent events. The client typically creates this object by specifying the URL of the SSE endpoint.
SSE Endpoint : On the server side, a dedicated endpoint is set up to handle SSE requests. When the client establishes a connection to this endpoint, the server keeps the connection open and begins sending events as they occur.
Event Stream : The server sends events to the client as text-based UTF-8 encoded messages within the SSE stream. Each event is encapsulated in a specific format, consisting of one or more lines. The minimum requirement for an event is to include a “data” field, but additional fields like “event” (for specifying the event type) and “id” (for assigning an event identifier) can also be included.
Client Event Handling : As events are received by the client, the EventSource object triggers the appropriate event handlers in JavaScript, allowing the application to process the received data. The client can react to events, update the user interface, or perform any desired actions based on the received information.
Closing the connection : If the connection is closed, the client will automatically reconnect.
If the server wants the browser to stop reconnecting, it should respond with HTTP status 204.
If the browser wants to close the connection, it needs to call the .close() method on the EventSource instance.
In addition, there will be no reconnection if the response contains a Content-Type different from text/event-stream.
SSE offers several advantages over other real-time communication techniques. It utilizes a standard HTTP connection, making it widely compatible with existing infrastructure and browsers. It provides a lightweight and efficient approach, as the connection remains open for as long as necessary, eliminating the need for continuous request-response cycles. SSE is particularly suitable for scenarios where the client only consumes a continuous stream of updates, such as stock tickers, social media feeds, or real-time monitoring data.
Although SSE can be used for most requirements, its main drawback is one-way communication. Next, we will look at a protocol that allows two-way communication in real-time.
WebSocket
WebSocket is a protocol that enables real-time, two-way, full-duplex communication between client and server over a single, long-lived connection. It utilizes the HTTP protocol to establish a connection. WebSocket can be used for a variety of real-time applications such as chat applications, real-time gaming, and live streaming.
WebSocket, like SSE, provides a persistent connection. But in contrast to SSE, it enables full-duplex communication, which means that both client and server can communicate over the same channel. This allows you to implement synchronous and asynchronous communication in real-time.
Structure of message:
In the WebSocket protocol, the information transmitted is called a message.
Once the WebSocket connection is established, the communication thereafter operates using WebSocket frames. WebSocket frames contain the actual data being transmitted and have their own structure, separate from the initial human-readable handshake. WebSocket frames include control frames (e.g., connection close, ping, pong) and data frames (e.g., text or binary messages).
The actual WebSocket frame looks like this:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
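To show how the fields in the frame layout fit together, here is a hedged Python sketch that encodes and decodes a single, unfragmented frame. The function names encode_frame and decode_frame are assumptions for this example, and the sketch ignores fragmentation, control frames, and validation that a real implementation would need.

```python
import os
import struct
from typing import Optional, Tuple

OPCODE_TEXT = 0x1


def encode_frame(payload: bytes, opcode: int = OPCODE_TEXT,
                 mask_key: Optional[bytes] = None) -> bytes:
    """Build a single, unfragmented frame (FIN=1, RSV1-3=0)."""
    first = 0x80 | opcode                          # FIN bit plus 4-bit opcode
    mask_bit = 0x80 if mask_key is not None else 0x00
    length = len(payload)
    if length < 126:
        header = struct.pack("!BB", first, mask_bit | length)
    elif length < 65536:
        header = struct.pack("!BBH", first, mask_bit | 126, length)  # 16-bit length
    else:
        header = struct.pack("!BBQ", first, mask_bit | 127, length)  # 64-bit length
    if mask_key is not None:
        # Masking XORs each payload byte with the 4-byte key, cycling through it.
        payload = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
        header += mask_key
    return header + payload


def decode_frame(frame: bytes) -> Tuple[bool, int, bytes]:
    """Return (fin, opcode, unmasked payload) for one complete frame."""
    fin = bool(frame[0] & 0x80)
    opcode = frame[0] & 0x0F
    masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    offset = 2
    if length == 126:
        (length,) = struct.unpack_from("!H", frame, offset)
        offset += 2
    elif length == 127:
        (length,) = struct.unpack_from("!Q", frame, offset)
        offset += 8
    key = b""
    if masked:
        key = frame[offset:offset + 4]
        offset += 4
    payload = frame[offset:offset + length]
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return fin, opcode, payload


# Frames sent by a client are masked with a random 4-byte key; server frames are not.
client_frame = encode_frame(b"Hello", mask_key=os.urandom(4))
server_frame = encode_frame(b"Hello")
```

Decoding either frame yields the same text payload; only the MASK bit and the masking key differ on the wire.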
Process of communication:
Handshake : WebSocket communication begins with a handshake process. The client sends an initial HTTP request to the server, specifying the WebSocket protocol by upgrading the connection. This request includes a special header called “Upgrade” with the value “websocket” (Upgrade: websocket) and a “Connection” header set to “Upgrade” (Connection: Upgrade). The server supporting the ws protocol responds with the same headers and the HTTP 101 status code, indicating a successful protocol upgrade. Once the handshake is complete, the WebSocket connection is established. The WebSocket protocol utilizes a specific URL format, typically starting with ws:// for unencrypted connections or wss:// for encrypted (secured) connections, similar to the HTTP and HTTPS protocols.
Data Exchange : After the connection is established, the client and the server can exchange data in a full-duplex manner. Each party can send messages to the other party at any time without the need for explicit request-response cycles. Messages can be of various types, including text, binary data, or even structured data like JSON.
Connection Termination : The WebSocket connection can be terminated explicitly by either the client or the server, or it can be closed unexpectedly due to network issues or server-side errors. To close the connection, either party sends a close frame, indicating the intention to terminate the connection. The other party responds with a close frame as well, acknowledging the closure. Both parties then perform cleanup operations and release associated resources.
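The handshake step can be sketched in Python as follows. Besides the Upgrade and Connection headers described above, the WebSocket specification (RFC 6455) also has the client send a Sec-WebSocket-Key header and the server answer with a derived Sec-WebSocket-Accept value; those two headers are not covered in this article and are included here only to keep the example faithful to the protocol. The function names are assumptions, and the request validation is intentionally minimal.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(request_headers: dict) -> str:
    """Build the server's reply to an upgrade request (simplified validation)."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    key = headers.get("sec-websocket-key")
    if headers.get("upgrade", "").lower() != "websocket" or key is None:
        return "HTTP/1.1 400 Bad Request\r\n\r\n"
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(key)}\r\n"
        "\r\n"
    )


# Sample key taken from RFC 6455.
response = handshake_response({
    "Upgrade": "websocket",
    "Connection": "Upgrade",
    "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
})
```

After the 101 response is sent, both sides stop speaking HTTP on that connection and switch to WebSocket frames.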
When dealing with complex applications built on the WebSocket protocol, it is common to utilize an additional protocol layered on top of it. This additional protocol allows for the definition of event structures, such as a routing system, metadata, and data format.
Unlike with HTTP, encoding the route in the URI path is not a good idea: a WebSocket connection is persistent, so opening a separate connection per route increases the use of system resources and takes away the benefits of this protocol (for example, if we open and close a connection for a single request).
Instead, it is better to put the routing mechanism in the event itself. This way we keep a single connection and can still listen for and send notifications for different events. By default, WebSocket doesn’t have a routing mechanism and only supports the open, message, error, and close events.
For example, we can define the following structure:
{
  "route": "/messages/create",
  "payload": {
    "sender": "John",
    "content": "Hello, everyone!"
  }
}
Here, the “route” field specifies the destination or purpose of the event, and the “payload” field contains the actual data associated with the event. On the server side, we can parse each incoming message to determine how to handle the event.
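A server-side dispatcher for this message structure might look like the following Python sketch. The route decorator, the handlers registry, and the create_message handler are hypothetical names used only to illustrate routing inside a single connection; they are not part of the WebSocket protocol or of any specific library.

```python
import json
from typing import Any, Callable, Dict

# Registry mapping a route string to the function that handles it.
handlers: Dict[str, Callable[[dict], Any]] = {}


def route(path: str):
    """Decorator that registers a handler for the given route."""
    def decorator(func):
        handlers[path] = func
        return func
    return decorator


@route("/messages/create")
def create_message(payload: dict) -> str:
    return f"{payload['sender']}: {payload['content']}"


def dispatch(raw_message: str) -> Any:
    """Parse an incoming text message and call the handler for its route."""
    event = json.loads(raw_message)
    handler = handlers.get(event.get("route"))
    if handler is None:
        return {"error": "unknown route"}
    return handler(event.get("payload", {}))


result = dispatch(
    '{"route": "/messages/create",'
    ' "payload": {"sender": "John", "content": "Hello, everyone!"}}'
)
```

All routes share the same connection; adding a new event type only means registering another handler.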
But there are also ready-made solutions. One of them is the Socket.IO protocol.
Conclusion
In this article, we took an overview of popular techniques and protocols used in client-server communication. Understanding these methods of data exchange is essential for creating efficient and interactive client-server communication systems.
We began by delving into the foundation of web communication, the HTTP protocol, which governs the exchange of requests and responses between clients and servers. We learned about the structure of HTTP messages, including headers, methods, and the transmission of data.
Next, we examined different types of polling, namely short polling and long polling. Short polling involves frequent requests from clients, while long polling enables near real-time updates by maintaining connections until new data becomes available. We discussed the advantages and considerations associated with each approach.
Moving on, we explored the concept of Webhooks, which allow servers to notify other systems through HTTP calls. Webhooks enable seamless integration and event-driven communication, making them invaluable in various scenarios such as automated notifications and triggering actions.
We also covered Server-Sent Events (SSE), which provide a unidirectional channel for real-time data streaming from servers to clients. SSE facilitates the delivery of continuous updates, making it an effective solution for applications requiring live data feeds or dynamic content.
Lastly, we examined WebSocket, a bidirectional communication protocol that enables full-duplex, real-time interaction between clients and servers. WebSocket is great for scenarios where instant data exchange, push notifications, and collaboration features are required.
By understanding these protocols and techniques, developers can choose the most suitable method for their specific requirements. Whether it’s the simplicity of HTTP, the responsiveness of long polling, the event-driven nature of Webhooks, the continuous updates of SSE, or the bidirectional communication of WebSockets, each approach offers unique benefits and use cases.
Notes
¹ HTTP/3 does not use a TCP connection; instead, it runs over QUIC, which uses UDP.
Thank you for taking the time to read this article. I appreciate your feedback and would love to hear your thoughts in the comments below.