Before we dive into the considerations for choosing the parallel computing strategy that fits your needs (multithreading vs. multiprocessing, and the difference between the two), I want to start off by discussing Moore’s law. As I’m sure you know, Moore’s law observes that the number of transistors on a chip doubles roughly every two years. For many years this went hand in hand with rising clock frequencies, but clock frequency no longer improves as it used to, and single-core performance gains have slowed. In fact, Moore’s law itself is widely expected to come to an end quite soon (I know, it’s devastating).
Computer manufacturers dealt with these limits by introducing multi-core processors. Multi-core architecture is the present and future way for the market to keep improving performance. Today’s workstations have 4, 8, 16, 32, or 64 cores; leveraging multiple cores to increase application performance and responsiveness is expected, especially from classic high-throughput workloads such as rendering, simulations, machine learning, and other time-consuming, heavy computation problems.
When you write software that needs to benefit from today’s multi-core hardware, it’s crucial to consider the right architecture for the job. For the most part, you’ll have the choice between multithreading, multiprocessing, or both. Your choice will carry implications for your software’s performance, future maintenance, scalability, memory usage, and other factors. There are pros and cons to choosing either one – you just need to get acquainted with them to make the right decision for you. In this post, I’ll review various considerations for choosing the correct multi-core strategy for your application’s requirements. In other words, I’ll investigate the pros and cons of multithreading vs. multiprocessing development according to various scenarios. So… without further ado, let’s jump right in.
Multithreading Development: Pros
The most prominent advantage of multithreading is the ease with which you can share data between threads (by using variables, objects, and others). It’s also very easy to communicate with the thread’s parent process.
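To make the data-sharing point concrete, here is a minimal sketch in Python (my choice of language for illustration; the function name `count_words` and its parameters are hypothetical). Every thread writes into the same dictionary, and a lock guards the merge step. Note that in standard CPython the global interpreter lock limits how much CPU-bound threads actually run in parallel, so treat this as an illustration of the sharing model rather than a speedup recipe.

```python
import threading


def count_words(documents, num_threads=4):
    """Count words across documents using threads that share one dict."""
    totals = {}              # shared by every thread; nothing is copied
    lock = threading.Lock()  # guards updates to the shared dict

    def worker(chunk):
        # Count locally first so the lock is held only for the merge.
        local = {}
        for doc in chunk:
            for word in doc.split():
                local[word] = local.get(word, 0) + 1
        with lock:
            for word, n in local.items():
                totals[word] = totals.get(word, 0) + n

    # Deal the documents out round-robin, one chunk per thread.
    chunks = [documents[i::num_threads] for i in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

The workers never serialize or send anything: they simply see `totals` because they live in the same process.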
If you handle large datasets that can’t be divided into subsets, multithreading is beneficial: duplicating your datasets (as you do in multiprocessing) can take up a lot of time and memory, while sharing them between processes through shared memory adds complexity to the software being developed.
Another well-known benefit of multithreading is that it’s supported by many third-party libraries (open source and commercial). Many available libraries today support multithreaded apps by providing a ‘thread-safe’ interface. Components, classes, functions, etc., with pre-built support for multithreading, enable developers to easily develop multithreaded code.
But it can’t all be good…
Multithreading Development: Cons
One of the major drawbacks of multithreaded code is that if one of the threads crashes, the entire application will crash. This is not the case with multiprocessing, where one process crashing does not necessarily affect other processes.
Another con is that it’s difficult to debug multithreaded applications. This presents a problem since you’ll undoubtedly encounter bugs. Usually, the debugger is not the best tool for multithreaded bugs, and you’ll need to use logs to track the bugs and figure out which thread (or which communication between threads) is causing them. Simply put, it will take you quite a bit of time to debug. It also requires a more experienced developer to develop and debug a multithreaded application correctly, so if you know in advance that your team consists of newbies, you’ll need to take that into consideration.
Another problem arises when too many threads are executed simultaneously. The processor may spend a lot of time context-switching at the expense of actual processing, and memory pressure may force the system to swap memory chunks out to disk, causing I/O bottlenecks that end up slowing down the entire application and clogging the host machine.
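One common way to avoid oversubscribing the machine is to cap the number of threads with a pool instead of starting one thread per task. The following is a rough sketch using Python’s standard `concurrent.futures` module; the helper name `run_bounded` and the default of one worker per CPU core are my own assumptions, not a universal rule.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_bounded(task, items, max_workers=None):
    """Run task over items with at most max_workers threads alive at once."""
    if max_workers is None:
        # Assumption: one worker per core is a reasonable starting point.
        max_workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps input order; extra items wait in the pool's queue
        # instead of each getting a thread of its own.
        return list(pool.map(task, items))
```

With this shape, submitting ten thousand items still results in only `max_workers` threads competing for the processor.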
There’s also the memory issue. All the threads use the same process memory, which is very good if you want to communicate between the threads. But if each thread requires more memory, your threads will be limited to the process memory space. This issue does not exist in multiprocess execution, where each process gets its own allocated memory space.
Speaking of multiprocessing, let’s review its pros.
Multiprocessing Development: Pros
As I mentioned earlier, if one of your processes crashes, it doesn’t mean the entire application crashes, which is a significant advantage (unless this is a kernel-space process we are talking about). So, if something fails and you wrote your application to be resilient, you can easily recover.
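Here is a small sketch of that isolation, assuming a hypothetical worker that hard-crashes on one particular input. I launch the workers with Python’s `subprocess` module to keep the example self-contained; the names `WORKER_SOURCE` and `run_isolated` are made up for illustration.

```python
import subprocess
import sys

# Hypothetical worker program: aborts the whole process for one input.
WORKER_SOURCE = """
import os, sys
value = int(sys.argv[1])
if value == 3:
    os.abort()  # simulate a hard crash (not a catchable Python exception)
print(value * 2)
"""


def run_isolated(values):
    """Run each value in its own process; a crash only loses that task."""
    results, failed = {}, []
    for value in values:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER_SOURCE, str(value)],
            capture_output=True,
            text=True,
        )
        if proc.returncode == 0:
            results[value] = int(proc.stdout)
        else:
            failed.append(value)  # the parent survives and can retry or skip
    return results, failed
```

The parent inspects each exit code and carries on, which is the resilience this section describes; the same abort inside a thread would have taken the whole application down with it.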
Another advantage is debugging, which we now know is a drawback of multithreading. It’s much easier to debug in multiprocessing since it’s easier to reason about a small atomic process than about a multithreaded application where threads run in parallel in the same process memory space.
You’ll also have fewer locking issues. Yes, if the application’s processes are implemented similarly to multithreading (for example, working with the same shared-memory space), you will suffer the same complexities. Still, if the data is replicated (and then merged back when and if needed), the locking issues no longer exist.
Finally, multiprocessing is scalable. You can execute processes elsewhere, meaning offload them to remote machines or the cloud, while threads will always need to remain in the context of the process’s memory space.
But scaling isn’t everything, and there are drawbacks to multiprocessing as well.
Multiprocessing Development: Cons
Communication is a major drawback. Communicating between processes is more complex than communicating between threads. It requires custom development in order to share data and maintain locks and synchronization (where needed).
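To give a feel for that extra work, here is a minimal sketch of two processes exchanging data over pipes. The JSON message format, the `CHILD_SOURCE` program, and the `sum_in_child` helper are all assumptions I invented for this example; real systems might use sockets, shared memory, or a message queue instead.

```python
import json
import subprocess
import sys

# Hypothetical child program: reads one JSON request, writes one JSON reply.
CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
reply = {"total": sum(request["numbers"])}
json.dump(reply, sys.stdout)
"""


def sum_in_child(numbers):
    """Send numbers to a child process and read back its result."""
    # The data must be serialized explicitly; the child cannot see our objects.
    payload = json.dumps({"numbers": list(numbers)})
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with an error
    )
    return json.loads(proc.stdout)["total"]
```

Even this toy version needs a message format, serialization on both sides, and error handling, none of which a thread reading a shared variable has to think about.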
Additionally, the number of supporting libraries (what we call “process-safe” libraries) compared to thread-safe libraries is relatively small.
So now that we’ve reviewed the pros and cons of each strategy – let’s dive into the various considerations of choosing one over the other, as the benefits of each highly depend on the use case.
Multithreading vs. Multiprocessing: List of Considerations
Before deciding which architecture suits you most, there is a list of considerations to take into account:
- Syncing – Do you have a lot of data syncing to do, and do you need to maintain states between processes? Data synchronization is more effortless in multithreading because all threads share the same process memory space. If your parallel elements (either threads or processes) require extensive synchronization and are not encapsulated elements, multithreading can be a benefit in terms of ease of development.
In multiprocessing, synchronization requires custom development: applying your own logic and implementing a channel through which the processes can communicate or sync, such as a server application, shared memory, direct (peer-to-peer) communication, TCP communication, a database, etc.
- Large datasets – Do you have large datasets? In this case, you need to ask yourself whether the data is read-only (in which case using shared memory for multiprocess access is not a big pain) or whether you need complex reads and writes, in which case thread locks provide a more robust architecture for data access. At the same time, loading the dataset into standard objects and classes in the easily shared process space is much easier than implementing shared-memory access across processes.
- Ease of development – Who will be involved in the development? Is it an experienced developer or a newbie?
On the one hand, multithreading includes many pre-made ‘thread-safe’ libraries. On the other hand, it’s easier to debug a single process in a multiprocessing environment. Additionally, code understanding is much easier in multiprocessing. Simply put, multiprocess development is simpler because encapsulation is easier to maintain; less experienced developers can maintain the code with less difficulty.
- Scaling – The scope of the project. How many tasks do you need to execute simultaneously? What will be the scope of the project in the future? How long does it take to execute each task? What are the dependencies between the tasks? How much memory will each task require? Multiprocess code is more scalable because you can easily use clusters, grid solutions, or even the public cloud to get more compute resources on demand. With a multiprocess architecture, you can shift your code from working on a single machine to working on multiple machines. There are even some solutions (hint: Incredibuild) that allow you to scale and use idle machines and cores across your network or the public cloud without having to code anything. This way you can achieve scalability, effectively transforming the machine running your application into a supercomputer with thousands of cores and gigs of memory. Do you think that in the future, your software may require and benefit from such compute power?
So far, I have discussed the difference between multithreading and multiprocessing in theory. But to really understand the difference, I wish to examine real-life scenarios such as Maven vs. Make.
Maven vs. Make
Maven and Make are both popular and common build tools. Maven is usually used to compile Java, while Make is mainly used to compile C/C++. Both are used to build a big project out of small files. Both run many (hundreds or even thousands of) compilation tasks, which are usually atomic and independent of one another. The tasks have minimal communication, and the data set is small.
Now let’s review the parallel processing architecture decision made in each of these tools. Maven is multithreaded within a single build. It was single-threaded years ago, but over time it became multithreaded, which is a positive change because it allows a build to use all the cores of, say, an eight-core machine instead of just one. However, it seems that multiprocess execution would have been more beneficial to users who want faster builds. Why? There are several reasons:
First off, devs who rely on open source often need to compile open-source code as part of their project. Hence, they have a large code base that a single machine, however powerful it may be, is slow to compile. This has implications for their compilation time, productivity, and time-to-market, especially when shifting to agile continuous integration.
Another example is DevOps, which requires a build to include more tasks such as automated testing, code analysis, packaging, etc., which makes compilation time even slower.
Finally, a multiprocess architecture would allow process distribution solutions to break the boundaries of a single machine and distribute the compilation workload to be executed across multiple machines simultaneously, resulting in faster and more scalable builds.
The Make build system architecture, as opposed to Maven, is multiprocess. Process distribution can only be applied to multiprocess solutions, not multithreaded ones: threads can’t be distributed outside their parent process’s memory space, whereas in a multiprocess architecture each process has its own memory space, environment variables, etc. In fact, most modern build tools are multiprocess and not multithreaded (such as Visual Studio’s MSBuild, CMake, SCons, Ninja, JAM, JOM, WAF, and many more). The reason is that a multiprocess architecture offers better memory usage, easier development, and scalability.
For example, it takes around 16 minutes to build Qt on an eight-core machine. But if you’re using a multiprocess architecture, like Make, you can use distributed computing (Incredibuild), executing all compilation tasks remotely (across other machines in the network or the public cloud), reducing the build time to only 1 minute and 40 seconds! Think about it: 1 minute and 40 seconds instead of 16 minutes…
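The multiprocess build model can be sketched roughly as follows: every task is an independent command line, and a scheduler runs a bounded number of them at a time, in the spirit of `make -j`. This is only an illustration under my own assumptions and not how Make or Incredibuild is actually implemented; `run_jobs` and `fake_compile` are hypothetical names, and the "compiler" is a stand-in that just prints a message.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_jobs(commands, jobs=4):
    """Run independent commands as separate processes, `jobs` at a time.

    Returns the exit code of each command, in input order.
    """
    def run_one(command):
        return subprocess.run(command, capture_output=True).returncode

    # These threads only wait on child processes; the real work happens
    # in the children, each with its own memory space and environment.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_one, commands))


def fake_compile(name):
    """Hypothetical stand-in for a compiler invocation."""
    return [sys.executable, "-c", f"print('compiled {name}')"]
```

Because each task is nothing more than a command with inputs and outputs, a distribution layer is free to run it on another machine, which is exactly what cannot be done with a thread.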
Of course, there are other real-life scenarios we can look at.
One such scenario is applications that require all data to be in memory simultaneously (for example, scientific applications, weather forecasting, genetic algorithms, and so on). In these cases, multithreading accompanied by supercomputing is the way to go. You wouldn’t want to replicate massive datasets across remote machines because it would create a lot of network traffic and slow down the entire process. In scenarios like genetic algorithms, changes to the data made by a single thread must be visible immediately to the other threads; making this work in a distributed multiprocess architecture would require frequent synchronization of the data, with a very negative impact on performance.
Another real-life scenario in which multithreading is usually used is real-time applications such as financial transactions, defense systems, and automotive devices. These typically need to provide a real-time response and can’t wait for a new process to initialize.
Database applications (such as CRM, ERP, SAP) are yet another example of multithreading usage in scenarios where most queries are read-only, and the minority are write operations. The reason is that you’d like to easily share the dataset as part of your process space and have it readily accessible by the multiple compute threads working on it.
To sum up the scenarios in which multithreading is optimal, I would say you might prefer multithreading when you:
- Encounter simple scenarios (when there’s no need for complex multiprocessing execution);
- Deal with real-time applications;
- Have a long initialization time (such as dataset loading) with short computation time;
- Need complex communication between the parallel tasks;
- Have only a few compute-intensive tasks in your application.
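The "long initialization, short computation" case from the list above can be sketched like this: the dataset is loaded once, and many threads then answer short queries against the same in-memory object. `load_dataset` is a hypothetical stand-in for an expensive load, and `answer_queries` is a name I made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def load_dataset():
    """Hypothetical stand-in for an expensive load (file, database, ...)."""
    return {n: n * n for n in range(100_000)}


def answer_queries(dataset, keys, max_workers=8):
    """Look up keys concurrently; every thread reads the same dict."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The dataset is shared by reference: nothing is copied or serialized.
        return list(pool.map(dataset.__getitem__, keys))
```

The loading cost is paid once per process, so spreading these same queries over separate processes would mean either reloading the dataset in each one or setting up shared memory.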
Now let’s have a look at some scenarios in which multiprocessing has the advantage.
One such scenario is when you have large independent datasets with few dependencies between data entities, and you don’t need all of the data in memory at the same time.
Let’s take, for example, financial derivatives, where you have end-of-day calculations for each stock. Each stock is a different data set and can have a separate process to compute the model. Another example is rendering, where you can break down the data into smaller chunks, and each movie frame will be computed by a different process.
And there’s the scenario in which you have many calculations made up of small, independent units of work. Let’s take Sarine Technologies, for example. Sarine sells hardware and software that runs tons of simulations on raw diamonds and other gems to find the best way to cut each piece and generate the best revenue out of it. As millions of independent, parallel simulations can be executed to analyze large diamonds, running in a multiprocess manner allows Sarine Technologies to distribute the simulation processing to additional compute power on connected machines in the network or on the public cloud.
Streaming services also tend to be multiprocess. The Nvidia Shield, for example, is a set-top box that allows anyone to play high-quality, graphics-intensive games from the comfort of their couch using a remote farm of high-performance computers. The OnLive service is similar, enabling users to install the OnLive application on any machine. Netflix is another example that requires no elaboration. When the output is remote, we can always use load balancing to ensure enough resources and back-end processes are available to serve the end users according to dynamic demand.
With the unlimited capacity offered by the public cloud today, if your software may require scaling in the future, it is highly advisable to make sure that the architecture is able to scale to multiple hosts. The cloud also offers pricing models that are based on scale and usage, which can provide an additional source of revenue for your software.
So, to conclude, when going over the architectural considerations and choosing multithreading vs. multiprocessing, you can ask yourself the following questions:
- Can my execution be broken down into many independent tasks?
- Can the work be distributed over more than a single host?
- Do I have large datasets I need to work with?
- How much time would it take my customers to execute a big scenario – do I have large scenarios?
- Do I have any special communication and synchronization requirements?
- Dev complexity – will my software require complex locking? Can I avoid that? And how often should I expect problems such as race conditions, timing issues, sharing violations, etc., that will make my software development and maintenance complex and costly?
- Do I want to scale processing performance by using the private or public cloud?
There are no magic answers here. This article is an invitation for a discussion with some relevant points to consider. In more than 25 years of managing and consulting for software organizations, I have found that starting such a discussion before diving into coding will save you a lot of frustration later on. It can even affect the success of your product as it matures and becomes more complex, with greater demand for computing abilities and scalability.
I hope that in bringing some of these aspects for your consideration, I’ve been able to foster these kinds of discussions in the design and development process.
Happy coding 🙂