
From Training to Simulation to Real-Time Feedback: A GPU Tour of Infleqtion’s Q4Bio Project Toward Quantum-Enabled Biomarker Discovery

ColdQuanta
8 min read
⚡ Quantum Brief
- Infleqtion secured a $2M Phase III contract for its Q4Bio project, partnering with UChicago and MIT to develop a hybrid quantum-classical platform for biomarker discovery in cancer data using 12 logical qubits.
- GPUs accelerate quantum model training via NVIDIA’s CUDA-Q and Xanadu’s IQPOpt library, enabling rapid optimization of IQP circuits—critical for identifying high-impact biomarkers from complex datasets.
- CUDA-Q’s GPU-accelerated simulators validate quantum models pre-execution, achieving 50x speedups on Perlmutter’s A100 GPUs and reducing hardware trial-and-error costs for noisy and noiseless scenarios.
- NVQLink enables real-time QPU-GPU feedback loops with 3.96 µs latency, allowing adaptive error correction and calibration—key for scaling fault-tolerant logical qubits in dynamic experiments.
- The integrated workflow achieved 0.04% error in a 12-logical-qubit biomarker test, demonstrating hybrid quantum-classical efficiency in precision medicine applications.

How Infleqtion is integrating GPUs across the lifecycle of logical-qubit experiments for biomarker discovery: design, validation, and real-time control.

In February, Infleqtion was proud to announce our selection for a $2M Phase III contract from Wellcome Leap’s Q4Bio program, alongside our academic partners at UChicago and MIT. Through Q4Bio, we are delivering a hybrid quantum-classical biomarker discovery platform that identifies compact, high-impact feature sets from complex multimodal cancer data, capturing higher-order correlations that are difficult to access with conventional methods and advancing precision diagnosis and treatment.

Our Q4Bio project on quantum-enabled biomarker discovery is built around hybrid quantum computing that leverages GPUs to supercharge our QPU capabilities. In this post, timed to coincide with Infleqtion’s presence at the NVIDIA GTC conference, we are spotlighting how combining GPU acceleration with Infleqtion’s Sqale QPU has force-multiplied our capabilities:

- We use GPUs to discover and train quantum models rapidly.
- We use GPU acceleration through the NVIDIA CUDA-Q platform to simulate and validate what those trained models will do.
- We are using NVIDIA NVQLink to build toward a world where GPUs and QPUs operate in a real-time loop, with microsecond-scale feedback for fault-tolerant execution.

Let’s dive into each of these three areas and how they have culminated in breakthrough results with 12 logical qubits on Infleqtion’s Sqale neutral atom quantum computer.

Infleqtion’s end-to-end workflow for quantum-enabled biomarker discovery leverages GPU and QPU resources together. Our work on Infleqtion’s Sqale neutral atom quantum computer has led to applications with 12 logical qubits.
Training: GPU-accelerated IQP circuit optimization

The crux of our Q4Bio quantum-biomarker discovery application boils down to training a “quantum neural network” to optimize a target and then inferencing from that trained model, just as in classical AI approaches. In the past year, a family of quantum neural networks has emerged, known as Instantaneous Quantum Polynomial (IQP) circuits. These IQP models sit in a “sweet spot” where:

- They support classical estimation of relevant objective functions, including with GPU acceleration, enabling fast training loops.
- Once trained, sampling (inference) from the resulting model is exponentially faster with quantum computing than with classical computing.

Much as modern classical AI has evolved to separate hardware modes for training vs. inference, we find that hybrid IQP model training is well suited to leverage GPUs in the training stage and QPUs in the inference stage. To this end, Infleqtion has leveraged the open-source GPU-accelerated IQPOpt library, developed by Xanadu, to train models relevant to biomarker discovery. Under the hood, IQPOpt uses JAX (automatic differentiation + JIT compilation) to perform efficient training over optimization landscapes. The result is a fast loop for training large IQP instances, up to thousands of qubits, where the heavy lifting is batched linear algebra that maps naturally onto GPUs. In fact, this pattern is not limited to IQP models; it also extends to a variety of other families, such as Recursive-QAOA (RQAOA) circuits, which can be trained via GPU acceleration.

In Infleqtion’s Q4Bio Phase III, we are using GPU-accelerated model optimization to target instances operating on dozens of logical qubits. In addition to running on our NVIDIA GH200 Grace Hopper superchips purchased with our Q4Bio funding, we have also scaled up our work to 24,000 NVIDIA A100 GPU node hours on the Perlmutter GPU-centric supercomputer at NERSC, with tightly integrated CUDA-Q support.
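To make the shape of that training loop concrete, here is a heavily simplified, pure-Python sketch of gradient descent over IQP-style phase parameters. The hyperedges, weights, and surrogate objective below are invented for illustration only; the actual workflow uses Xanadu’s IQPOpt (JAX-based, GPU-accelerated) against a biomarker-derived objective.

```python
# Toy sketch: gradient descent on one phase parameter per hyperedge.
# `hyperedges`, `weights`, and `objective` are illustrative stand-ins,
# not the real biomarker objective or the IQPOpt API.
import math
import random

random.seed(0)

n_qubits = 12
# Hyperedges of up to three qubits, mirroring the Z / CZ / CCZ structure.
hyperedges = [(0,), (1, 2), (3, 4, 5), (6, 7), (8, 9, 10), (11,)]
weights = [random.uniform(-1.0, 1.0) for _ in hyperedges]

def objective(params):
    # Surrogate: each hyperedge contributes weight * cos(theta), loosely
    # mimicking an expectation value of a commuting-phase circuit.
    return sum(w * math.cos(t) for w, t in zip(weights, params))

def grad(params):
    # Analytic gradient of the surrogate: d/dt [w * cos(t)] = -w * sin(t).
    return [-w * math.sin(t) for w, t in zip(weights, params)]

params = [0.1] * len(hyperedges)
lr = 0.2
for _ in range(200):
    g = grad(params)
    params = [p - lr * gi for p, gi in zip(params, g)]

print(round(objective(params), 4))
```

In IQPOpt the same pattern runs as batched, JIT-compiled linear algebra under JAX, which is what lets the heavy lifting map onto GPUs.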
Here is an example of one of the trained IQP models produced by our workflow:

Hypergraph view of a trained 12-logical-qubit IQP model. Each hyperedge/triplet corresponds to a CCZ interaction; the learned structure highlights how the optimizer “chooses” multi-qubit correlations rather than only pairwise couplings.

In this diagram, known as a hypergraph (generated by HyperNetX), each of the 12 black dots represents a logical qubit, and the colored groups correspond to Z, CZ, and CCZ gates for singletons, pairs, and triplets, respectively. This particular trained IQP model succeeds in optimizing the objective function associated with our root biomarker discovery problem.

Simulation: GPU-accelerated via CUDA-Q for Noiseless Ceilings and Noisy Validation

Training is only part of the story. Before we spend scarce time on our Sqale QPU, we want to answer two pragmatic questions:

- Noiseless: If we had perfect hardware, what is the ceiling?
- Noisy: With our hardware noise model (and target trained IQP model), what should we actually expect to see?

Integration with GPU-accelerated simulation via CUDA-Q has been central to our ability to build an end-to-end workflow where the same code path can be simulated and then executed on real QPUs. In our prior logical-qubit collaboration (publication) with NVIDIA, CUDA-Q’s GPU-accelerated simulators, parameterized kernels, and support for custom gates/noise models and mid-circuit measurement/conditional logic were key enablers of fast iteration. For Q4Bio, we have taken that same approach:

- Use GPU simulation to explore ansatz choices and parameter regimes rapidly.
- Use noisy simulation to de-risk experiments and interpret results.
- Keep the “handoff” from simulation to execution as frictionless as possible.
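One simple way to quantify the gap between a noiseless ceiling and a noisy validation run is the total-variation distance between the two shot histograms. A minimal sketch, assuming results have been collected into plain bitstring-count dictionaries (the counts below are invented):

```python
# Hedged sketch: comparing noiseless vs. noisy sample histograms with
# total-variation distance. The counts are invented; in practice they
# would come from the ideal and noisy simulation results.
def total_variation_distance(counts_a, counts_b):
    """TVD between two shot-count histograms (0 = identical, 1 = disjoint)."""
    shots_a = sum(counts_a.values())
    shots_b = sum(counts_b.values())
    keys = set(counts_a) | set(counts_b)
    return 0.5 * sum(
        abs(counts_a.get(k, 0) / shots_a - counts_b.get(k, 0) / shots_b)
        for k in keys
    )

ideal_counts = {"0000": 600, "1111": 400}
noisy_counts = {"0000": 540, "1111": 380, "0101": 80}

print(total_variation_distance(ideal_counts, noisy_counts))
```

A small distance suggests the hardware run should land close to the noiseless ceiling; a large one flags an experiment worth redesigning before it consumes QPU time.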
Our typical workflow for this looks like the following:

```python
# Pseudocode: 1) noiseless ceiling  2) noisy validation
import cudaq

# Define a parameterized kernel (IQP-like / commuting structure)
@cudaq.kernel
def ansatz(params: list[float]):
    q = cudaq.qvector(N)
    h(q)
    # apply learned commuting phases (e.g., CZ/CCZ pattern)
    for (i, j, k), theta in zip(ccz_triplets, params):
        # conceptual placeholder for a diagonal 3-body phase / CCZ-like gate
        ccz(q[i], q[j], q[k])
    h(q)
    mz(q)

# 1) Noiseless: set an upper bound
ideal = cudaq.sample(ansatz, trained_params, shots=SHOTS)

# 2) Noisy: validate with a device-calibrated noise model
cudaq.set_noise_model(my_device_noise)
noisy = cudaq.sample(ansatz, trained_params, shots=SHOTS)
```

As referenced previously, many of our largest simulations have been performed on A100 GPU instances on the Perlmutter supercomputer at NERSC. We have routinely observed >50x speedups in simulation runtimes via these GPU-centric workflows, which builds our confidence prior to execution on hardware.

GPU-accelerated simulation results validating performance of a biomarker discovery model across a range of problem sizes and model depths. We observe >50x speedups for simulation via GPU acceleration with CUDA-Q. These results were collected on the Perlmutter supercomputer.

Real-time Feedback: Synchronizing QPU Execution with GPU Co-processing via NVQLink

Training and simulation are “offline” loops. But for logical qubits at scale, the most demanding classical work is often an “online” loop: decoding, feedback, calibration, and adaptive execution. That is why Infleqtion’s pioneering announcement of NVQLink integration is important: it is designed as a standardized, low-latency connection between quantum control systems and GPU-accelerated compute, and it has been evaluated at microsecond-scale round-trip latencies (reported maximum round-trip latency of 3.96 μs).
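To build intuition for what such a microsecond-scale loop does, here is a toy, pure-Python caricature of one QPU-to-GPU-to-PPU feedback cycle. Every function here is an invented stand-in, not an Infleqtion or NVIDIA API: the real loop runs the decoding on a GPU kernel with deterministic low-latency transfers.

```python
# Toy caricature of the QPU -> GPU -> PPU feedback loop that
# NVQLink-style integration enables. All functions are stand-ins.
import random

random.seed(1)

def fake_qpu_cycle(n_qubits):
    """Stand-in for one QEC cycle: returns a random syndrome bitstring."""
    return [random.randint(0, 1) for _ in range(n_qubits)]

def gpu_decode(syndrome):
    """Stand-in for a GPU decoder kernel: flag qubits whose syndrome bit is set."""
    return [i for i, bit in enumerate(syndrome) if bit]

def dispatch_to_ppu(corrections):
    """Stand-in for pulse processing units applying corrective pulses."""
    return [f"X on qubit {i}" for i in corrections]

applied = []
for _cycle in range(3):
    syndrome = fake_qpu_cycle(12)                 # QPU produces measurements
    corrections = gpu_decode(syndrome)            # data to GPU, decode
    applied.extend(dispatch_to_ppu(corrections))  # PPUs apply corrections

print(len(applied))
```

The point of the sketch is the shape of the loop, not its contents: measurement out, classical decision in, correction applied, all within a single round trip.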
What this ultimately looks like at Infleqtion for real-time integration of GPUs with our Sqale QPU:

1. Our QPU executes a cycle (or a segment of a circuit) and produces measurement outcomes.
2. Using its tightly integrated NVQLink interface, the QPU transfers measurement data directly to GPU memory and triggers execution of a callback function with deterministic low latency.
3. A GPU kernel performs compute-heavy work (e.g., syndrome decoding, inference, or parameter updates).
4. Infleqtion’s CUDA-Q backend translates GPU kernel results into QPU actions.
5. These actions are immediately distributed to pulse processing units (PPUs) for execution of corrective operations, feedback actuation, and adaptive branching.

This concrete control flow maps cleanly to error correction and adaptive protocols. We emphasize that scaling logical qubits is not only a quantum hardware problem; it is also a throughput and latency problem for classical compute. NVQLink is explicitly motivated by workloads that need fast QPU↔GPU feedback for calibration and error correction, and aims to make those feedback loops practical through standardization and low latency.

Infleqtion’s Sqale quantum computer on demo at GTC DC, where we debuted as an NVQLink launch partner.

For those at GTC 2026, check out our Sqale demonstration at the NVIDIA booth (#345).

Conclusion

In Infleqtion’s Q4Bio project, GPUs are not just an accessory; they are the backbone of the workflow. The “tour de GPU” highlighted in this post exemplifies how hybrid quantum-classical computing supports the entire lifecycle of a quantum-accelerated application: from training to simulation to real-time execution shared across GPU and QPU. Finally, to tease how all of this comes together in a real-world demonstration: Infleqtion combined these core ingredients into an end-to-end workflow that ran on our Sqale quantum computer.
After GPU-accelerated training of IQP models for biomarker discovery and validation via classical simulation, we ran trained models on Sqale with 12 logical qubits. Upon collecting data, we sampled a logical bitstring that achieved just 0.04% relative error versus the optimal solution to the underlying biomarker discovery problem! We plan to share more in the coming weeks as the Q4Bio program reaches the finish line.

At NVIDIA GTC in San Jose, we encourage attendees to check out the Sqale + NVQLink demo at the NVIDIA booth (#345), where we are highlighting our integration. In addition, at the Infleqtion booth nearby (#438), attendees can dive deeper into Infleqtion’s applications across quantum computing, sensing, and software.

Funding acknowledgment: The GPU-accelerated results described here used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, using NERSC awards DDR-ERCAP0032212, DDR-ERCAP0030280, and DDR-ERCAP0037400.


Tags

neutral-atom
quantum-machine-learning
quantum-computing
quantum-hardware
coldquanta

Source Information

Source: ColdQuanta