OSMC 2025 | CAPACITY PLANNING WITH THE UNIVERSAL SCALABILITY LAW

22 Mai, 2026

Marcel Hoffmann

Junior Consultant

Marcel hat seine Ausbildung im September 2024 als Fachinformatiker für Systemintegration bei NETWAYS Professional Services gestartet. Seine Aufgaben sind das Professional Services Team als Junior Consultant zu unterstützen. Seine Hobbys sind Fotografieren in der Natur, Reisen und Zocken.

von Marcel Hoffmann | Mai 22, 2026

OSMC

At OSMC 2025, Daniel Mitterdorfer delivered a talk that hit a sore spot for many engineers, SREs, and architects in the room: Capacity Planning with the Universal Scalability Law.

The session tackled one of the most uncomfortable (and common) questions in infrastructure engineering: “We expect 50% more users next quarter. How many more servers do we need?” Instead of relying on gut feelings or optimistic vendor promises, Daniel introduced us to a mathematical model that turns „guessing“ into „calculating“ based on measurable reality.

The Dream: Linear Scalability

Daniel started by addressing the ideal case we all secretly hope for: linear scalability. In a perfect world, if 10 servers handle 10,000 requests per second, 15 servers should handle 15,000. It’s clean, it’s intuitive, and according to Daniel, it simply does not exist in real-world systems.

X(N) = lambda * N

N = number of servers (or threads, CPUs, etc.)
lambda = throughput per unit
X(N) = total throughput

That statement immediately grabbed the audience’s attention. If linear scaling is a myth, what actually happens when we grow?

Why Systems Don’t Scale Linearly

Daniel broke down the two fundamental „villains“ that limit our ability to scale: Contention and Cross Talk.

Contention (Serialization)
Even in highly parallel systems, some work must happen sequentially. Daniel used a great analogy: election vote counting. While counting votes locally is a parallel task, aggregating them at a national level is inherently sequential
This effect is captured by the serialization coefficient (σ). He showed us that if just 5% of your workload is serial, your maximum theoretical speed up is limited to 20x—no matter how many hundreds of servers you throw at it. It was a powerful „aha!“ moment for the room; many of us have likely underestimated how damaging even a tiny amount of serialization can be.

Cross Talk (Coherency Delay)
The second effect is even more dangerous. When distributed nodes have to coordinate shared state, communication overhead grows quadratically. Daniel referenced the „handshake problem“:

N(N−1)/2

Doubling the participants doesn’t double the communication, it explodes. In our world, this manifests as:

Cluster coordination
Metadata synchronization
Distributed locking

As nodes increase, this „cross talk“ eventually dominates the actual work, leading to retrograde throughput, a nightmare scenario where adding more servers actually reduces total performance. You could see plenty of people in the audience nodding (and perhaps wincing) in recognition.
The Universal Scalability Law (USL)

Daniel then brought it all together into the full formula:

X(N)=λN/((1+σ(N−1)+κN(N−1))

σ (sigma): models contention
κ (kappa): models cross talk

While the math might look intimidating at first glance, Daniel did an excellent job building it step by step with intuitive examples, making the logic surprisingly accessible.

Real-World Case Study: Elasticsearch

Since Daniel Mitterdorfer works at Elastic, he shared a fascinating deep dive into Elasticsearch bulk indexing. Using nonlinear regression (specifically curve_fit from SciPy), he derived real coefficients:

Serialization: ~1%
Cross Talk: ~0.00009

Even with these tiny numbers, the system hit a ceiling at around 30 nodes before performance started to dip. Daniel was careful to clarify: this doesn’t mean Elasticsearch can’t scale further, but rather that scalability bottlenecks are often configuration problems. In this case, shard distribution was the culprit, a pain point many of us have experienced firsthand.

Debunking Vendor Claims

One of the highlights was when Daniel analyzed benchmark data from a vendor claiming „almost linear scalability.“ By applying the USL model, he found a 7% contention rate, showing the performance would actually flatten out around 20 threads.

His advice for when a vendor promises linear growth?

„Turn around and run. Or ask for the raw data and apply the USL yourself.“

That line got a good laugh, but the message was clear: Use the model to validate marketing claims before you sign the check.

My Personal Takeaways

This talk shifted my perspective on scaling. Adding nodes is the easy part; understanding when they stop being helpful is the real challenge.

Key Lessons:

Linear scalability is a myth: Stop chasing it and start measuring the real limits.
Teams scale like systems: One of the coolest insights was that the USL applies to organizations too. Contention is „waiting for approvals,“ and cross talk is „too many meetings.“
Budgeting for the Cloud: Just because you can spin up 100 instances in AWS doesn’t mean your software will know what to do with them.

Further Resources

If you want to dive deeper into the rabbit hole, Daniel Mitterdorfer recommended:

Guerrilla Capacity Planning by Neil Gunther
Practical Scalability Analysis with the Universal Scalability Law (a great free ebook)

Final Thoughts: This session is exactly why I love OSMC. Instead of just hearing „add more servers,“ we walked away with a mathematical framework to actually understand our infrastructure. If you missed it, the video recording in the archive is a must-watch!

This year’s Open Source Monitoring Conference is happening from Nov 17 – 19 in Nuremberg. Join us as we celebrate a very special 20th anniversary edition! Stay tuned and get your ticket!

Events