GPU-Accelerated Edge Inference Enables Real-Time ISAC on NVIDIA ARC-OTA, with 75% of Localisation Predictions Within One Metre

The convergence of communication and sensing represents a significant step towards future 6G networks, yet realising this potential demands innovative approaches to signal processing and data analysis. Davide Villa, Mauro Belgiovine, Nicholas Hedberg, Michele Polese, Chris Dick, and Tommaso Melodia present a programmable framework that accelerates real-time, artificial intelligence-powered inference directly on edge radio access network infrastructure. This work addresses the critical challenge of integrating sensing capabilities without requiring dedicated hardware or altering existing network protocols, achieving latency under 0.5 milliseconds for complex data extraction. Demonstrating the framework’s capabilities, the team developed cuSense, an indoor localisation application that, using only standard 5G signals, achieves a mean localisation error of 77cm, with 75% of predictions falling within one metre, paving the way for truly AI-native radio access networks and advanced integrated sensing and communication applications.

Summary of the Research Paper: dApps for Real-Time RAN Control: Use Cases and Requirements and Related Works

This document summarizes research focused on dApps (Distributed Applications) for real-time control within Open RAN (O-RAN) environments, emphasizing the use of GPU acceleration and artificial intelligence for improved performance. These dApps are designed to run flexibly across the network, enabling faster response times, integrating machine learning for tasks like interference management, and allowing for easier deployment of new features. A central theme is leveraging GPUs to accelerate computationally intensive tasks, often utilizing NVIDIA’s Triton Inference Server for efficient AI model deployment. Research explores diverse use cases, including interference detection and mitigation, spectrum classification, anomaly detection, indoor localization, and integrated sensing and communication.
Researchers are developing open testbeds such as X5G, Open AI Cellular (OAIC), and NVIDIA’s Aerial RAN CoLab Over-The-Air (ARC-OTA) to facilitate O-RAN research. A significant trend is the development of Integrated Sensing and Communication (ISAC), which combines communication and sensing functionalities within the same system and is seen as a key enabler for 6G and beyond.
This research highlights the crucial role of GPU acceleration and open-source testbeds in driving innovation, suggesting a future where wireless networks provide both communication and valuable sensing capabilities.

Real-Time AI on Open RAN for Sensing

This research pioneers a programmable framework for integrating sensing capabilities into future 6G cellular networks, addressing the challenge of extracting information from limited bandwidth communication signals.
The team engineered a system that processes real-time, GPU-accelerated artificial intelligence applications directly on the edge of the Radio Access Network, building upon the Open RAN dApp architecture. This innovative approach interfaces with a GPU-accelerated base station, feeding physical layer and medium access control data to custom AI logic with a latency of under 0.5 milliseconds, enabling rapid channel state information extraction. To demonstrate the framework’s capabilities, scientists developed cuSense, an indoor localization dApp that operates without dedicated sensing hardware or modifications to existing network infrastructure. The system consumes uplink demodulation reference signal channel estimates, effectively removing static multipath components, and then employs a neural network to infer the position of a moving person. Experiments were conducted using a 3GPP-compliant 5G New Radio deployment, involving a custom experimental setup with a Foxconn radio unit and a Samsung user equipment. Data collection involved meticulously synchronized measurements of channel state information and video recordings, capturing over 400,000 CSI records and 30,000 video frames.
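The article does not say how cuSense removes static multipath components from the channel estimates. As a minimal sketch of one common approach, the example below subtracts the per-subcarrier temporal mean from a window of CSI snapshots, so that only the time-varying part (for example, reflections from a moving person) remains. The function names, array shapes, and synthetic signal are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def remove_static_component(csi: np.ndarray) -> np.ndarray:
    """Subtract the per-subcarrier temporal mean from a CSI time series.

    csi: complex array of shape (num_snapshots, num_subcarriers).
    Mean subtraction is an assumed stand-in for the static multipath
    removal step; the actual cuSense method may differ.
    """
    static_estimate = csi.mean(axis=0, keepdims=True)
    return csi - static_estimate


def to_features(residual: np.ndarray) -> np.ndarray:
    """Stack real and imaginary parts into a real-valued feature matrix
    (hypothetical input layout for a neural network)."""
    return np.concatenate([residual.real, residual.imag], axis=-1)


# Synthetic demonstration: a fixed channel plus a small time-varying path.
rng = np.random.default_rng(0)
num_snapshots, num_subcarriers = 200, 64
static_path = rng.normal(size=num_subcarriers) + 1j * rng.normal(size=num_subcarriers)
t = np.arange(num_snapshots)[:, None]
k = np.arange(num_subcarriers)[None, :]
# The dynamic path completes exactly 10 periods over the window, so its mean is zero.
dynamic_path = 0.1 * np.exp(1j * 2 * np.pi * (0.05 * t + 0.02 * k))
csi = static_path[None, :] + dynamic_path

residual = remove_static_component(csi)
features = to_features(residual)
```

In this synthetic case the residual matches the dynamic path because the moving component averages to zero over the window. With real measurements the window length trades off how quickly the static estimate adapts against how much of a slow-moving target it absorbs.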
The team accounted for differences between International Atomic Time and Coordinated Universal Time, compensating for both a constant offset and residual clock skew. A computer vision pipeline, based on YOLOv8, extracted ground-truth 2D trajectories from the video, enabling accurate labeling of the collected data. The resulting dataset trained and tested the cuSense neural network, ultimately achieving a mean localization error of 77 centimeters, with 75% of predictions falling within a one-meter radius. This demonstrates the potential of the framework to deliver accurate and reliable sensing capabilities within existing cellular infrastructure.

Low-Latency Signal Processing for 6G Networks

Scientists have developed a programmable framework for processing signals from future cellular networks, achieving latency under 0.5 milliseconds for complex channel state information extraction. This work centers on integrating sensing capabilities, crucial for 6G standardization, with existing communication infrastructure while maintaining high reliability and performance.
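The clock alignment mentioned in the data-collection step, a constant offset between International Atomic Time (TAI) and Coordinated Universal Time (UTC) plus residual skew, can be sketched as follows. The 37-second TAI minus UTC offset has applied since 1 January 2017. Everything else here is an assumption for illustration: the helper names, the use of paired synchronization events, and the linear skew model are not described in the article.

```python
import numpy as np

# TAI has been 37 s ahead of UTC since 1 January 2017.
TAI_MINUS_UTC_S = 37.0


def tai_to_utc(t_tai):
    """Remove the constant TAI-UTC offset from timestamps given in seconds."""
    return np.asarray(t_tai, dtype=float) - TAI_MINUS_UTC_S


def fit_skew(t_src, t_ref):
    """Fit t_ref ~ a * t_src + b by least squares from paired sync events.

    Returns a function mapping source-clock times onto the reference clock.
    Times are shifted by the first source timestamp to keep the fit
    numerically well conditioned.
    """
    t_src = np.asarray(t_src, dtype=float)
    t_ref = np.asarray(t_ref, dtype=float)
    t0 = t_src[0]
    a, b = np.polyfit(t_src - t0, t_ref - t0, deg=1)
    return lambda t: a * (np.asarray(t, dtype=float) - t0) + b + t0


# Hypothetical sync events seen by both the camera (UTC) and the CSI logger (TAI).
camera_utc = np.array([1000.0, 1100.0, 1200.0, 1300.0])
# Simulated CSI clock: 20 ppm fast, 3 ms residual offset, and on the TAI scale.
csi_tai = (camera_utc - 1000.0) * (1 + 20e-6) + 1000.0 + 0.003 + TAI_MINUS_UTC_S

csi_utc = tai_to_utc(csi_tai)              # step 1: constant offset
to_camera_clock = fit_skew(csi_utc, camera_utc)
aligned = to_camera_clock(csi_utc)         # step 2: residual offset and skew
```

Once both streams share a time base, each CSI record can be paired with the nearest video frame to obtain its ground-truth position label.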
The team built a system leveraging GPU acceleration and the Open RAN dApp architecture, interfacing with a GPU-accelerated base station to feed data to custom artificial intelligence logic. Experiments demonstrate the framework’s capabilities through cuSense, an indoor localization application. Utilizing uplink data, cuSense removes static multipath components and employs a neural network to determine the position of a moving person. Evaluated on a standard 5G network deployment, cuSense achieves a mean localization error of 77 centimeters, with 75% of predictions falling within a 1-meter radius. This level of accuracy was achieved without requiring dedicated sensing hardware or modifications to the existing network infrastructure.
The research team implemented a data sharing mechanism using ping-pong buffers in memory, allowing continuous data collection without interruption. This system copies selected data from device memory to pinned host memory via asynchronous CUDA transfers, minimizing impact on real-time signal processing. The framework supports multiple agents, each dedicated to a specific RAN function, and allows dApps to subscribe to data streams at different sampling rates. Data is shared via shared memory, enabling zero-copy access for dApps regardless of their execution location, even across isolated memory partitions. This approach ensures portability and supports concurrent access by multiple applications while meeting the stringent latency requirements of integrated sensing and communication.
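The ping-pong buffering idea can be illustrated with a small host-side sketch. This is not the framework's implementation, which according to the article relies on asynchronous CUDA transfers into pinned host memory. It only shows the swapping logic: a producer fills one buffer while a consumer is free to read the other. The class name, method name, and record shape below are hypothetical.

```python
import numpy as np


class PingPongBuffer:
    """Two fixed-size buffers that alternate between writer and reader roles."""

    def __init__(self, capacity, record_shape, dtype=np.complex64):
        self._buffers = [
            np.zeros((capacity, *record_shape), dtype=dtype) for _ in range(2)
        ]
        self._capacity = capacity
        self._write_idx = 0  # which buffer the producer is currently filling
        self._fill = 0       # number of records written to that buffer

    def push(self, record):
        """Store one record.

        Returns the buffer that just became full (now safe to read), or None
        if the active buffer still has room. The reader must finish with a
        returned buffer before the writer cycles back to it.
        """
        buf = self._buffers[self._write_idx]
        buf[self._fill] = record
        self._fill += 1
        if self._fill < self._capacity:
            return None
        # Swap roles: subsequent writes go to the other buffer.
        self._write_idx ^= 1
        self._fill = 0
        return buf


# Demonstration with 10 dummy records of 8 values each and a capacity of 4.
pp = PingPongBuffer(capacity=4, record_shape=(8,))
batches = []
for i in range(10):
    full = pp.push(np.full(8, i))
    if full is not None:
        batches.append(full.copy())  # copy stands in for the consumer's work
```

After ten pushes the consumer has received two complete batches (records 0 to 3 and 4 to 7), while the last two records are still accumulating in the active buffer. Collection never pauses, because writing and reading always target different buffers.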

Real-Time Localization Within Existing 5G Networks

This research presents a programmable framework and accompanying application, cuSense, that demonstrate real-time integrated sensing and communication capabilities within existing 5G networks.
The team successfully implemented an indoor localization system that operates directly on a 3GPP-compliant 5G base station, utilizing readily available signals and GPU acceleration to achieve sub-meter accuracy. Specifically, cuSense infers the position of a moving person by processing channel estimates from uplink communication signals, removing static interference, and applying a neural network, all without requiring dedicated sensing hardware or modifications to the network infrastructure. The framework and cuSense application represent a significant advancement by enabling real-time data processing on the edge of the radio access network with low latency, under 500 microseconds. Experiments confirm consistent and robust performance, demonstrating the potential for wider deployment of similar sensing applications within future networks. While the current system relies on a collaborative uplink transmission and requires environment-specific calibration, the authors acknowledge these limitations and plan to extend the work to dynamic environments, multi-cell configurations, and scenarios requiring fewer collaborative devices. To facilitate further research, the team intends to release both the framework and cuSense pipelines as open-source resources for the wider scientific community.

More information: Programmable and GPU-Accelerated Edge Inference for Real-Time ISAC on NVIDIA ARC-OTA
ArXiv: https://arxiv.org/abs/2512.06493
