This is just a quick draft that I’ll expand later tonight…
To keep this simple, I’ll use “thread” here in a sense that is synonymous with “process” or “task”: simply a concurrent path of execution, using whatever mechanism your chosen programming-language provides.
Many developers, even those with substantial experience, turn to “multi-threaded” methods in their designs to handle the scaling-out of their program’s workloads. I submit that this ought to be approached more circumspectly.
Allocating your program’s work to more threads can be counter-productive. I’ll explain with a simple example.
Let’s say your program serves users who log on and make requests. You have already dedicated threads to such major areas as the UI and writing a log. Now, when each user logs on and needs the program to serve their requests, you spawn a new thread dedicated to that user, and retire it afterward. Let’s also assume that you have already optimized this somewhat by using a thread-pool, to avoid the overhead of creating and destroying thread-objects for each user-logon.
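To make that design concrete, here is a minimal Python sketch of the thread-per-user pattern described above. The names (`handle_request`, `on_user_logon`) and the pool size are my own illustrative choices, not anything prescribed:

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(user_id, request):
    # Placeholder for the per-user work: parse the request,
    # hit the database, write to the log, and so on.
    return f"served {user_id}: {request}"

# One shared pool for all user logons, so we avoid creating and
# destroying a thread-object per user.
pool = ThreadPoolExecutor(max_workers=8)

def on_user_logon(user_id, request):
    # Each logon borrows a worker from the pool for the duration
    # of that user's request.
    return pool.submit(handle_request, user_id, request)

future = on_user_logon("alice", "load dashboard")
print(future.result())  # → served alice: load dashboard
```

The pool hides the spawn/retire overhead, but as the next paragraph notes, it does not change how many of these workers can actually run at once.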
The problem here is that no matter which CPU your program is running on, you have only a few cores, and thus only a few threads that can truly run concurrently. Your server might be running an Intel i7 that has 4 cores and 8 hardware threads available (at most). What happens if 300 users suddenly want to log on and make requests? A handful of them may get served fairly quickly, and then the rest are frozen, their processing-threads blocked while awaiting the others to complete. If these block awaiting access to low-level resources such as the disk or database, you may have race-conditions, or just a massive logjam that fails completely.
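You can observe this cap directly. In the sketch below (the worker count and sleep duration are arbitrary, chosen just for illustration), 300 simulated requests are thrown at a pool of 8 workers, and we track how many are ever actually in flight at once:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8
active = 0   # requests currently being handled
peak = 0     # highest concurrency ever observed
lock = threading.Lock()

def slow_request(_):
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)  # simulate blocking I/O (disk, database, network)
    with lock:
        active -= 1

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    # 300 "users" arrive at once; the pool queues the overflow.
    list(pool.map(slow_request, range(300)))

print(peak)  # never exceeds MAX_WORKERS
```

The other 292 requests simply sit and wait their turn; adding more pool threads only moves the bottleneck, it does not remove it.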
There are some tasks that deserve to have a thread dedicated to them, which simplifies your design, and those should be few in number.
For those tasks that are truly variable, and can scale up dynamically from zero to any huge number — you need a different approach.
Consider the message-queuing pattern. Here, you actually dedicate just one thread to serving all of those users, and yet it may run far more efficiently and much faster. Consider…
If the system places the user-requests onto a queue, in the form of discrete messages, and your service thread then runs asynchronously taking those messages *from* that queue and serving them one-by-one, you can very effectively decouple the system from this user-servicing job.
Within the context of this message-servicing task, your program is running synchronously, serving exactly one message at a time. By running each user-request synchronously, you can protect your system against race-conditions far more easily. It serves the message, writes to disk or database, etc., completes it, and proceeds on to serve the next message. No thread-spinup or spindown overhead, no blocking, and that one thread is now free to run full-bore.
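The pattern described above can be sketched in a few lines of Python with the standard-library `queue` module. The message shape and names here are assumptions for illustration; the essential points are the single service thread and the one-at-a-time loop:

```python
import queue
import threading

# One queue of discrete user-request messages; producers put, one consumer gets.
requests = queue.Queue()
results = []

def service_loop():
    # The single dedicated service thread. It takes messages one at a
    # time, so all request-handling is serialized -- no races on the
    # resources it touches, and no per-user thread spinup or spindown.
    while True:
        msg = requests.get()
        if msg is None:        # sentinel message: shut down cleanly
            break
        user_id, request = msg
        # Serve the message: write to disk or database, etc.
        results.append(f"served {user_id}: {request}")

worker = threading.Thread(target=service_loop, daemon=True)
worker.start()

# Any number of "users" can enqueue requests without spawning threads.
for i in range(300):
    requests.put((f"user{i}", "logon"))
requests.put(None)  # ask the service thread to stop once it drains the queue
worker.join()
print(len(results))  # → 300
```

Note that the producers never block on each other; under heavy load the queue simply grows, and the service thread catches up as fast as it can, which is exactly the degradation behavior the next paragraph describes.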
The key is that the number of threads now does not balloon in response to load. It might still fall behind when serving a massive number of users, but at least it won’t freeze up and crash your system (ideally). It will just run as fast as it can, falling behind perhaps, but then catching up as efficiently as it can.
There are several excellent message-queuing components available in the open-source marketplace, and most of those are intended for communication across networks. But you can implement this strategy even within a single program, with all components on the same box, using a fairly simple design. In fact, I believe this is often a far simpler way to go about it than most other approaches.
Please let me know your thoughts. Thank you.
James W. Hurst