
New Software Accelerates Complex Calculations by up to 1,000 Times

Quantum Zeitgeist
⚡ Quantum Brief
Researchers from Caltech and the Flatiron Institute developed lrux, a JAX-based software package that accelerates Monte Carlo wavefunction evaluations by up to 1,000× on GPUs. It optimizes low-rank updates for determinants and Pfaffians, slashing computational costs for quantum simulations. The package leverages JAX’s just-in-time compilation and automatic differentiation, enabling seamless integration with existing quantum Monte Carlo workflows. It supports both real and complex data types, broadening its utility across diverse quantum systems. Benchmark tests on NVIDIA A100 GPUs showed 200× faster determinant and 1,000× faster Pfaffian calculations for 1,024×1,024 matrices. Delayed-update strategies further boost performance by 20–40% by reducing memory traffic. Lrux reduces computational scaling from O(n³) to O(n²k) for successive updates, where k is the update rank. This breakthrough enables high-performance simulations of fermionic neural quantum states with low-rank orbital transformations. The team recommends double-precision arithmetic for numerical stability in large-scale simulations. The open-source tool is designed for modern accelerator hardware, paving the way for next-generation quantum sampling algorithms.


Scientists have developed a new software package, lrux, to accelerate a key computational step in Monte Carlo methods. Ao Chen from the Division of Chemistry and Chemical Engineering at the California Institute of Technology, alongside Christopher Roth from the Center for Computational Quantum Physics at the Flatiron Institute, present a JAX-based solution for fast low-rank updates of determinants and Pfaffians.

This research significantly reduces the computational cost of wavefunction evaluations, enabling scalable, high-performance simulation of quantum systems. The core of the achievement lies in the implementation of low-rank updates for both determinants and Pfaffians, alongside delayed-update strategies that optimise performance on modern accelerator hardware. Efficient evaluation of antisymmetric wavefunctions is crucial for modelling complex fermionic systems, and lrux targets exactly this bottleneck.

By leveraging JAX transformations, including just-in-time compilation, vectorisation, and automatic differentiation, lrux integrates seamlessly into existing quantum Monte Carlo (QMC) workflows and makes efficient use of modern GPU architectures. The package supports both real and complex data types, broadening its applicability across diverse quantum simulations, and the approach extends to fermionic neural quantum states provided the orbital transformations admit a low-rank representation. Illustrative examples and recommended environment settings are provided to facilitate adoption, with a strong recommendation to enable double-precision arithmetic for improved numerical stability in large-scale simulations.

Mathematically, the work leverages the matrix determinant lemma to compute determinant ratios using a small k × k matrix, Rt, rather than recalculating the full determinant. Benchmarking on graphics processing units (GPUs) has demonstrated speedups of up to 1,000× at large matrix sizes, signifying a substantial leap in computational efficiency.
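The matrix determinant lemma behind these ratio evaluations can be sketched in plain numpy. This is an illustrative sketch of the mathematical idea only, not lrux's API (lrux itself is JAX-based); the matrix names A, U, V and the sizes are arbitrary choices for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 256, 4  # matrix size and update rank (illustrative values)

A = rng.standard_normal((n, n))
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
A_inv = np.linalg.inv(A)

# Matrix determinant lemma:
#   det(A + U V^T) = det(I_k + V^T A^{-1} U) * det(A)
# so the ratio det(A + U V^T) / det(A) only requires the k x k matrix R,
# costing O(n^2 k) given A^{-1}, instead of an O(n^3) re-factorisation.
R = np.eye(k) + V.T @ A_inv @ U
ratio_fast = np.linalg.det(R)

# Reference value via log-determinants (raw determinants of large random
# matrices overflow double precision, so compare signs and logs instead).
s1, ld1 = np.linalg.slogdet(A + U @ V.T)
s0, ld0 = np.linalg.slogdet(A)
ratio_direct = (s1 * s0) * np.exp(ld1 - ld0)
```

The key point is that `R` is only k × k, so its determinant is essentially free compared with re-factorising the full n × n matrix.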
Benchmarks on an NVIDIA A100-80GB GPU quantify the gains. For a matrix size of 1,024, lrux accelerates determinant calculations by approximately 200× and Pfaffian calculations by approximately 1,000×, reflecting the combination of parallelization and optimized linear algebra operations. The JAX implementation achieves O(n²k) scaling for successive updates, where n is the matrix size and k the update rank; since k is typically much smaller than n, this is effectively O(n²) per update, down from the O(n³) cost of direct recomputation. Time-cost analysis, performed with parallel computations of 1,024 determinants and Pfaffians, confirms this picture: direct computation scales as O(n³), while the time cost for both determinants and Pfaffians using lrux remains below 10⁻² seconds even as the matrix size increases to 1,024.

Further optimization through delayed updates provides an additional speedup of 20% to 40%, trading increased floating-point operations for reduced memory traffic and allowing users to tailor performance to their hardware. These delayed updates were tested using parallel computations of 16,384 determinants and Pfaffians with a matrix size of 128, with the optimal delay parameter determined to be 16 for determinants and 4 for Pfaffians. The study establishes that the efficiency of delayed updates is comparable to that of direct updates, offering a pathway to maximize computational performance when the low-rank update is the primary bottleneck.
Both determinant and Pfaffian updates are supported, alongside delayed-update options that balance computational effort against data transfer on modern processors. While direct matrix-inverse updates offer a more reliable approach if parameter tuning is undesirable, lrux provides a flexible and robust foundation for large-scale simulations, enabling efficient evaluation of antisymmetric wavefunctions and paving the way for next-generation sampling algorithms.

More information: lrux: Fast low-rank updates of determinants and Pfaffians in JAX. ArXiv: https://arxiv.org/abs/2602.05255


Tags

quantum-finance
government-funding
quantum-simulation

Source Information

Source: Quantum Zeitgeist