This is just a quick draft that I’ll expand later with some concrete examples.
For simplicity I’ll use “thread” here as synonymous with “process” or “task”: it’s simply a concurrent path of execution, using whatever mechanism your chosen programming language provides.
Many developers, even those with substantial experience, turn to multi-threaded designs to scale out their program’s workload. I submit that when an application’s workload may scale up by many orders of magnitude, this deserves a more circumspect approach.
In such cases, allocating your program’s work to more threads on a per-user-request basis can be counter-productive. I’ll explain with a simplistic example.
Let’s say your program is public-facing and serves users who log on and make requests. You have already dedicated threads to major areas such as the UI and log-writing. Now, when each user comes on and needs your program to serve their requests, you spawn a new thread dedicated to that logged-on user, and retire it afterward. Let’s also assume you have already optimized this somewhat by using a thread-pool, to avoid the overhead of creating and destroying thread objects on every logon.
The problem here is that regardless of what CPU your program runs on, you have only so many cores and thus only so many threads that can truly run concurrently. Your server might be running an Intel i7 with 4 cores and 8 hardware threads available (at most; many cycles will also go to the operating system and its many subsystems). What happens if 900 users suddenly want to log on and make requests? A handful may get served fairly quickly, and then the rest are to some degree frozen, their processing threads blocked while awaiting the others to complete. If those threads block awaiting low-level resources such as the disk or database, you may get race conditions, or a massive logjam that fails completely.
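To make the thread-per-request design concrete, here is a minimal sketch of it in Python. The names (handle_request, resource_lock) are illustrative stand-ins, not any particular library; the lock simulates a shared low-level resource such as a disk or database that every thread must contend for.

```python
import threading

# A shared low-level resource (standing in for the disk or database).
resource_lock = threading.Lock()
results = []

def handle_request(user_id):
    # Every one of the 900 threads funnels through this same lock,
    # so most of them spend their time blocked, not working.
    with resource_lock:
        results.append(user_id)

# One dedicated thread per user request: 900 threads on a machine
# that can actually run perhaps 8 of them at once.
threads = [threading.Thread(target=handle_request, args=(i,))
           for i in range(900)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))  # 900 requests served, at the cost of 900 threads
```

All 900 requests do complete, but the thread count scaled linearly with the load, which is exactly the behavior being questioned here.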
Some tasks do deserve a dedicated thread, because that simplifies your design; those should be few in number.
For those tasks that are truly variable, and can scale up dynamically from zero to any huge number, you may want to contemplate a different approach.
Consider the message-queuing (MQ) pattern. Here you dedicate just one thread to serving all of those users, and yet it may run far more efficiently and much faster.
If the system places the user requests onto a queue as discrete messages, and your service thread runs asynchronously, taking those messages *from* that queue and serving them one by one, you very effectively decouple the rest of the system from this user-servicing job. If you do have more cores readily available, you might factor out tasks such as database updates to their own threads, where they run asynchronously to your main service thread. But the job of servicing messages is still kept to a low, stable thread count, and your server should not become thread-starved as your load scales skyward.
Within the context of this message-servicing task, your program runs synchronously, serving exactly one message at a time. By running each user request synchronously, you can protect your system against race conditions far more easily, and more efficiently in terms of speed. It serves a message, kicks off the write to disk or database, completes it, and then proceeds to the next message. No thread spin-up or spin-down overhead, no unnecessary blocking, and that one thread is now free to run full-bore.
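The pattern just described can be sketched in a few lines with Python’s standard-library queue module. This is a minimal in-process illustration, not a full MQ system; serve_message is a hypothetical stand-in for whatever per-request work your program does, and the None sentinel is just one simple way to shut the consumer down.

```python
import queue
import threading

message_queue = queue.Queue()
served = []

def serve_message(msg):
    # Runs synchronously, one message at a time. Only the consumer
    # thread ever touches `served`, so no locking is needed.
    served.append(msg.upper())

def consumer():
    # The single service thread: pulls messages off the queue
    # and serves them one by one until it sees the sentinel.
    while True:
        msg = message_queue.get()
        if msg is None:
            break
        serve_message(msg)
        message_queue.task_done()

# Exactly one thread serves every user request.
worker = threading.Thread(target=consumer)
worker.start()

# Any number of producers (user logons) can enqueue messages;
# the thread count stays constant no matter how many arrive.
for user_request in ["login:alice", "query:bob", "logout:alice"]:
    message_queue.put(user_request)

message_queue.put(None)   # sentinel: tell the consumer to finish
worker.join()
print(served)  # → ['LOGIN:ALICE', 'QUERY:BOB', 'LOGOUT:ALICE']
```

Note that queue.Queue is already thread-safe, so producers can enqueue from any thread without further synchronization; the design choice is that all *serving* happens on the one consumer thread.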
The key is that the number of threads no longer balloons in response to your increasing load. The system might still fail to serve a massive number of users expeditiously if it simply cannot keep up in raw throughput, but with the right kind of MQ strategy at least it won’t freeze up and crash your system (ideally). It will just run as fast as it can, falling behind perhaps, but then catching up as soon as it can, like a loyal greyhound who stops to sniff the grass but then runs to keep up.
There are several excellent open-source message-queuing components available, and most are intended for communication across networks. But when all components run on the same server, you can implement this strategy with fairly simple code. In fact, I believe this is often a far simpler way to go than most other approaches.
Simplicity is golden. Always strive to achieve simplicity.
Please let me know your thoughts. Thank you.
James W. Hurst