Exploring multilingual programming

Python is a prominent language in the ML and scientific computing space, and for good reason. It is easy to learn and readable, and it offers a vast selection of libraries: NumPy for numerical computation, Pandas for data manipulation, SciPy for scientific computing, TensorFlow and PyTorch for deep learning, and RDKit and Open Babel for cheminformatics. It is understandably an appealing choice for developers and researchers alike. However, a closer look at many common Python libraries reveals their foundations in C++.

Revisiting C++ Advantages

Many Python libraries, including TensorFlow, PyTorch, and RDKit, rely heavily on C++. C++ gives developers more direct control over memory and CPU resources than Python does, making it a good choice when large volumes of data must be processed quickly. A previous post on this blog discusses C++’s speed, its utility in GPU programming through CUDA, and the complexities of managing its libraries. Despite its steeper learning curve and greater verbosity compared to Python, C++ offers clear performance benefits, especially where execution speed and memory management are critical.
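
As a rough illustration of the memory overhead that compiled code avoids, the sketch below compares a plain Python list of integers with the standard library's array module, which stores values in a contiguous, C-style buffer. It is only a sketch: the helper name list_footprint is made up for this example, and the exact sizes depend on the Python version and platform.

```python
import sys
from array import array


def list_footprint(values):
    """Approximate bytes used by a list plus the int objects it points to."""
    # A list stores pointers; each element is a separate heap object.
    # (Small cached ints are shared, so this slightly overcounts.)
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


n = 100_000
as_list = list(range(n))
# "q" = signed 64-bit integers packed back to back, as C or C++ would store them.
as_array = array("q", range(n))

list_bytes = list_footprint(as_list)
array_bytes = sys.getsizeof(as_array)

print(f"list : {list_bytes / n:.1f} bytes per element")
print(f"array: {array_bytes / n:.1f} bytes per element")
```

Libraries such as NumPy take the same approach at a larger scale, keeping data in compact buffers that compiled code can process without creating a Python object per element.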

Rust: A New Contender for High-Performance Computing

Rust, while newer to the scene than C++, has quickly gained recognition for offering memory safety, speed, and safe concurrency without a garbage collector. Its growing ecosystem and tools like PyO3 and Maturin make it increasingly viable to integrate with Python, offering a safe and fast alternative for the performance-critical parts of an application, typically in the backend. While Rust’s adoption in the ML community is still at an early stage compared to C++ and Python, it has significant potential both for speeding up Python applications and as the primary language of a standalone application, particularly where safety and concurrency are concerns.

Visualising Benchmark Results

To visualise the performance disparities between Python, C++, and Rust, let’s examine some benchmarks. Bugden et al. examined the CPU time and memory usage of a variety of programming languages including Python, C++, and Rust using three different algorithms.
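
For readers who want to collect the same two metrics for their own Python code, here is a minimal sketch using only the standard library. It is not the methodology used in the study: the measure helper and the sum_of_squares workload are placeholders invented for this example, and tracemalloc only tracks memory allocated by Python itself while also slowing the code down.

```python
import time
import tracemalloc


def measure(func, *args):
    """Return (result, cpu_seconds, peak_bytes) for a single call of func."""
    tracemalloc.start()
    start = time.process_time()  # CPU time of this process, not wall-clock time
    result = func(*args)
    cpu_seconds = time.process_time() - start
    _, peak_bytes = tracemalloc.get_traced_memory()  # (current, peak)
    tracemalloc.stop()
    return result, cpu_seconds, peak_bytes


def sum_of_squares(n):
    # Placeholder workload; builds a list so there is memory to observe.
    return sum([i * i for i in range(n)])


result, cpu_seconds, peak_bytes = measure(sum_of_squares, 200_000)
print(f"CPU time   : {cpu_seconds:.4f} s")
print(f"Peak memory: {peak_bytes / 1024:.0f} KiB")
```

A single run like this is noisy, so repeating the measurement several times and comparing medians gives a fairer picture.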

These CPU time benchmark results highlight Python’s considerable lag: Rust, C++, and others (C, Go, Java) display markedly better performance. Rust demonstrates the highest performance among the languages tested, closely followed by C++, and both significantly outperform Python.

These memory usage benchmark results place Java and Python on the less efficient end of the spectrum, with Python outperforming only Java. Rust and C++ exhibit superior memory management, with Rust slightly edging out C++ and closely following C, the most memory-efficient language tested. 

These figures highlight the benefits of incorporating C++ or Rust into projects where Python’s performance may be a limiting factor. By leveraging the strengths of each language, developers can achieve a balance between development efficiency and computational performance.

Considering Alternatives to Python

The choice to incorporate C++ or Rust alongside or instead of Python depends on specific project requirements. C++ and Rust can offer substantial benefits when handling CPU-bound tasks or operating within memory-constrained environments. Existing libraries in C++ and even some newer crates in Rust provide a foundation that ML developers can build on for their own high-performance tasks. As the demands on computational resources expand with the advancement of ML techniques and data collection, especially in the cheminformatics space, the need for a broader programming toolbox becomes clear. Admittedly, Python is far more readable, which makes sharing code significantly easier. However, incorporating C++ and Rust alongside Python allows developers and researchers to tackle a wider array of challenges more effectively. This multilingual approach not only optimises performance but also ensures that the chosen tools are best suited to the task at hand.
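
One quick way to judge whether a task is CPU-bound, and therefore a candidate for C++ or Rust, is to compare CPU time with wall-clock time. The sketch below is one possible heuristic rather than a rule: the cpu_fraction helper and the two toy workloads are invented for illustration, and multi-threaded code can report a fraction above 1.

```python
import time


def cpu_fraction(func, *args):
    """Share of wall-clock time that the process spent on the CPU during func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy(n):
    # CPU-bound: pure arithmetic in a Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def waiting(seconds):
    # Stands in for I/O-bound work: the process mostly sleeps.
    time.sleep(seconds)


print(f"busy   : {cpu_fraction(busy, 1_000_000):.2f}")
print(f"waiting: {cpu_fraction(waiting, 0.2):.2f}")
```

A fraction close to 1 suggests the interpreter itself is the bottleneck, which is where a compiled extension tends to help most; a fraction close to 0 suggests the time is spent waiting, where rewriting in another language is unlikely to help.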
