Reducing Cold Start Delays by 50% in Serverless and FaaS Environments with eBPF
Eliminating FaaS cold starts with eBPF
It was back in 2021, in the midst of the COVID-19 crisis, that I first came across some personal use cases that could benefit from Serverless and FaaS.
What I really liked about the model was that, in theory, you only pay for what you use: when nothing is running, resources scale down to zero until the next event triggers them.
While solutions like AWS Lambda or Google Cloud Run are easy to use, behind this abstraction, they run your code in containers. And with that comes an initial delay—the container needs to initialize before your code can actually execute.
Just think about running a Docker container locally—it always takes a second or so before your code starts executing.
For most users, this delay is acceptable since it’s still better than keeping a service running 24/7.
But when it comes to latency-critical applications like web services, industrial monitoring, or stock trading, these delays can significantly impact performance or user experience.
A major contributor to these delays is the so-called cold start—the time it takes a serverless function or container to initialize when it has been inactive for a while or is launching for the first time.
While exploring this issue during one of my research projects, I started thinking: Could eBPF help mitigate some of these cold start overheads?
At a high level, Serverless usage patterns can be categorized into three types:
Stream pattern: A steady flow of events where FaaS resources scale dynamically based on metrics like memory or CPU usage.
Burst pattern: Predictable events, such as from a scheduled cron job, allowing the provider to anticipate demand and pre-initialize resources just in time.
Random pattern: The most common pattern, where invocations occur sporadically, making deterministic pre-warming quite challenging.
💡 FaaS is a subset of Serverless (computing) that runs event-driven functions, while Serverless is a broader concept where infrastructure management, including databases, storage, and APIs, is abstracted from the user.
And while FaaS providers offer various mitigations, such as (AI-based) prediction or advanced container state caching, none really eliminates the issue without adding extra cost—the very thing we're trying to avoid in the first place.
One of my colleagues confirmed the impact of cold starts in his thesis (in Slovene), benchmarking different FaaS providers by running a simple function that just stored event data in various databases within the same geographical region.
The best way to interpret the following graphs is to observe how both client-side response time (top row) and backend execution time (FaaS function, bottom row) drop significantly for subsequent requests.
💡 Behind the scenes, Vercel Serverless Functions instantiate AWS Lambda functions.
Vercel also provides Vercel Edge Functions, enabling users to deploy FaaS functions in two distinct settings:
Global: Availability across multiple geographical locations for optimal performance and reduced latency
Regional: Deployment within a specific region to optimize performance for users in that region
Another interesting FaaS solution comes from Cloudflare, which, instead of containers, uses so-called Workers—lightweight primitives built on V8 isolates. In theory, these isolates require less time to instantiate than full containers.
💡 Cloudflare employs a pre-warming technique where, based on the SNI field in the first TLS packet, it begins initializing the corresponding function early.
While the graphs clearly show the cold start effect, it's also important to consider the choice of database, as it significantly impacts response time—but this is beyond the scope of this post.
Additionally, FaaS and database providers are not always deployed in the same region. Often, the FaaS provider is positioned closer to the client at the edge, while data is stored in a distant region, e.g. due to regulatory or compliance constraints—further increasing delays.
But when you think about FaaS functions running in the cloud, delays aren't just about container initialization—network latency is also a major factor.
It’s just unlikely that your FaaS function is used exclusively by people near the data center where it is hosted.
💡 Latency data is sourced from Global Ping Statistics, while also considering that TCP + TLS requires more round trips compared to a simple ICMP ping request-response.
In other words, when a client sends an event to a cold FaaS function, basic networking principles tell us that before any application traffic can be exchanged, a TCP handshake must first be completed, followed by a TLS handshake.
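To put rough numbers on it: assuming a 100 ms round trip between client and function (a figure picked purely for illustration), the TCP handshake costs one RTT and the TLS handshake another one to two, depending on whether TLS 1.3 or TLS 1.2 is negotiated. That means roughly 200–300 ms pass between the client's very first SYN and the HTTP request that actually triggers function initialization. If the container also needs a few hundred milliseconds to start, beginning that work at the SYN rather than at the HTTP request lets the two overlap almost entirely, which is what makes savings on the order of 50% plausible.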
Seems unavoidable, right?
I thought so too—until I discovered that by leveraging eBPF, we can cut response times by up to 50%.
eBPF Solution
The core idea behind using eBPF to reduce cold start impact revolves around network latency.
Here’s what typically happens:
The client sends a TCP SYN packet.
The server responds with a TCP SYN+ACK packet.
The client completes the handshake with a TCP ACK packet and initiates a TLS handshake.
Only then is the HTTP request sent.
For a FaaS provider, it's the HTTP request that triggers container (FaaS function) initialization. But what if we could start the process much earlier?
That’s exactly what we can achieve with eBPF.
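To make the idea concrete, here is a minimal XDP sketch of the detection half—an illustration of the general technique, not necessarily either of the solutions described below. It watches for incoming TCP SYN packets on an assumed listener port (443 here) and pushes an event to userspace through a ring buffer, where a pre-warming agent could start initializing the function's container while the TCP and TLS handshakes are still in flight. The port, map size, and event layout are all assumptions.

```c
// prewarm_kern.c — illustrative sketch: flag the first TCP SYN toward the
// FaaS listener so userspace can pre-warm the container before the HTTP
// request arrives. Port, map size, and event layout are assumptions.
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#define FAAS_PORT 443 /* assumed listener port */

struct syn_event {
	__u32 saddr; /* client IPv4 address, network byte order */
};

struct {
	__uint(type, BPF_MAP_TYPE_RINGBUF);
	__uint(max_entries, 1 << 16);
} events SEC(".maps");

SEC("xdp")
int detect_syn(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;

	struct iphdr *ip = (void *)(eth + 1);
	if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
		return XDP_PASS;

	struct tcphdr *tcp = (void *)ip + ip->ihl * 4;
	if ((void *)(tcp + 1) > data_end)
		return XDP_PASS;

	/* A SYN without ACK is the very first packet of the handshake,
	 * two to three round trips (TCP + TLS) before the HTTP request. */
	if (tcp->syn && !tcp->ack && tcp->dest == bpf_htons(FAAS_PORT)) {
		struct syn_event *e =
			bpf_ringbuf_reserve(&events, sizeof(*e), 0);
		if (e) {
			e->saddr = ip->saddr;
			bpf_ringbuf_submit(e, 0);
		}
	}
	return XDP_PASS; /* observe only; never interfere with traffic */
}

char LICENSE[] SEC("license") = "GPL";
```

On the userspace side, a libbpf-based agent would attach this program to the ingress interface, block in ring_buffer__poll(), and start (or resume) the function's container the moment an event arrives. By the time the client has finished the TLS handshake and sent its HTTP request, a couple of round trips' worth of initialization has already happened for free.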
I’ve developed two different solutions to make this possible.