
Quantum Machine Learning Improves Data Grouping with New Distance Measures

Quantum Zeitgeist
⚡ Quantum Brief
Researchers led by Syed M. Abdullah developed a quantum-enhanced k-means clustering method that replaces Euclidean distance with quantum kernels, achieving 91.0% accuracy on breast cancer datasets—surpassing classical techniques. The approach embeds classical data into higher-dimensional quantum Hilbert space using feature maps like SU2, improving cluster separation for complex, high-dimensional datasets where traditional metrics fail. Tested on NISQ-compatible shallow circuits, the hybrid algorithm demonstrated 88.6% accuracy on the Iris dataset, proving feasibility on current quantum hardware with limited qubits and noise resilience. Quantum superposition and entanglement enable non-linear decision boundaries, offering superior performance in medical diagnostics like cancer subtype identification compared to computationally expensive classical methods. While promising, scaling to larger datasets and mitigating quantum decoherence remain challenges before real-world deployment, with further research focused on optimization and error correction.

Syed M. Abdullah at Lahore University and colleagues present a new approach to k-means clustering. The method addresses limitations of the classical algorithm, including sensitivity to initial conditions and difficulty with complex data, by integrating quantum computing techniques. Specifically, the team replaces the standard Euclidean distance metric with a quantum kernel derived from feature-mapped quantum states, effectively embedding classical data into a higher-dimensional space to improve cluster separation.
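The paper's exact formulation is not reproduced here, but the core substitution can be sketched: in the k-means assignment step, the Euclidean distance to a centroid can be replaced by the distance induced by any kernel k, since ||φ(x) − φ(c)||² expands into kernel evaluations alone. A classical RBF kernel stands in below as a placeholder; a quantum fidelity kernel plugs into the same slot.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Classical stand-in; a quantum fidelity kernel slots in here unchanged.
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_distance_sq(x, c, kernel=rbf_kernel):
    # Squared distance in the kernel's implicit feature space:
    # ||phi(x) - phi(c)||^2 = k(x, x) - 2 k(x, c) + k(c, c)
    return kernel(x, x) - 2 * kernel(x, c) + kernel(c, c)

# k-means assignment step with the kernelised distance:
point = np.array([0.0, 0.0])
centroids = [np.array([0.1, 0.0]), np.array([1.0, 1.0])]
nearest = min(range(len(centroids)),
              key=lambda i: kernel_distance_sq(point, centroids[i]))
# nearest == 0: the closer centroid in feature space
```

The rest of the k-means loop (centroid update, convergence check) is untouched by this substitution, which is what makes the approach a drop-in hybrid rather than a wholly new algorithm.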

Results demonstrate improved clustering stability and competitive accuracy on both the Iris and breast cancer datasets, with the SU2 feature map achieving 88.6% and 91.0% accuracy respectively, suggesting quantum kernels offer a key pathway towards more robust unsupervised learning on currently available quantum hardware.

Quantum kernels surpass classical methods in breast cancer subtype identification

A 91.0% accuracy rate was achieved on the breast cancer dataset using the SU2 feature map, a result previously unattainable with classical k-means clustering. Traditional algorithms often struggle with the high dimensionality and complex relationships within medical datasets, necessitating extensive manual analysis or computationally expensive techniques. The challenge lies in accurately representing the similarity between data points when each point is defined by many features, often gene expression levels or imaging characteristics. Classical distance metrics such as Euclidean distance become less meaningful in these high-dimensional spaces, leading to poor clustering performance. By embedding classical data into a higher-dimensional Hilbert space via quantum kernels, the algorithm creates a richer similarity structure, enabling more accurate cluster separation and potentially transforming diagnostic procedures. This embedding leverages quantum superposition and entanglement to create non-linear decision boundaries that are difficult to achieve with classical methods. The potential impact on diagnostics stems from the ability to identify breast cancer subtypes more accurately, supporting more personalised and effective treatment strategies.
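The Hilbert-space embedding can be pictured with a minimal classical simulation: angle-encode each feature into a single-qubit RY rotation and take the squared state overlap as the kernel. This is a simplified stand-in for the SU2 and ZZ feature maps used in the paper, not the paper's circuit, but it shows how a 4-feature vector becomes a state in a 16-dimensional Hilbert space.

```python
import numpy as np

def ry(theta):
    # Single-qubit RY rotation gate.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def encode(x):
    # Angle-encode a feature vector as an n-qubit product state |phi(x)>,
    # one RY rotation per feature; the state lives in a 2**n-dim Hilbert space.
    psi = np.array([1.0])
    for xi in x:
        psi = np.kron(psi, ry(xi) @ np.array([1.0, 0.0]))
    return psi

def fidelity_kernel(x, y):
    # Quantum kernel as squared state overlap: k(x, y) = |<phi(x)|phi(y)>|^2.
    return abs(np.vdot(encode(x), encode(y))) ** 2
```

For this product-state encoding the kernel reduces to a simple closed form, ∏ cos²((xᵢ − yᵢ)/2); entangling maps such as SU2 or ZZ yield kernels without such easy classical shortcuts, which is where any quantum advantage would have to come from.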
The breast cancer dataset used in this study likely contained features representing biomarkers and clinical characteristics of patients, allowing the algorithm to differentiate cancer subtypes by their feature profiles. The SU2 feature map, which employs entangled quantum circuits to transform data, also achieved 88.6% accuracy on the widely used Iris dataset, demonstrating its versatility beyond medical applications. The Iris dataset, a classic machine learning benchmark, consists of sepal length, sepal width, petal length, and petal width measurements for three iris species. Achieving 88.6% accuracy here shows the algorithm can also cluster data with lower dimensionality and simpler relationships, suggesting general applicability. The ZZ circuit, another quantum feature map tested, achieved comparable results, indicating multiple pathways to enhanced clustering performance. Different quantum feature maps use varying gate arrangements and entanglement structures to map classical data into the quantum Hilbert space. That both the SU2 and ZZ circuits yielded similar performance suggests the choice of feature map is not critical, and that several circuit designs can improve clustering accuracy. This gives researchers flexibility to choose circuit architectures suited to the capabilities of available quantum hardware. The hybrid quantum-classical approach thus enhances a widely used machine learning tool, offering a promising pathway towards robust and dependable unsupervised learning for diverse datasets.
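What distinguishes a ZZ-style map from a plain product-state encoding is an entangling phase driven by products of features. A minimal two-qubit classical simulation, in the spirit of the widely used ZZ feature map construction (and not necessarily the paper's exact circuit), looks like this:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)       # Hadamard
CX = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
               [0, 0, 0, 1], [0, 0, 1, 0]])        # CNOT, first qubit controls

def P(phi):
    # Single-qubit phase gate diag(1, e^{i*phi}).
    return np.diag([1.0, np.exp(1j * phi)])

def zz_state(x):
    # One repetition of a ZZ-style feature map on two features:
    # Hadamards, feature-driven single-qubit phases, then an entangling
    # phase on the feature product, sandwiched between CNOTs.
    x1, x2 = x
    psi = np.zeros(4, dtype=complex)
    psi[0] = 1.0
    psi = np.kron(H, H) @ psi
    psi = np.kron(P(2 * x1), P(2 * x2)) @ psi
    psi = CX @ psi
    psi = np.kron(np.eye(2), P(2 * (np.pi - x1) * (np.pi - x2))) @ psi
    psi = CX @ psi
    return psi

def zz_kernel(x, y):
    # Kernel entry as squared overlap of the feature-mapped states.
    return abs(np.vdot(zz_state(x), zz_state(y))) ** 2
```

Note the circuit depth: one layer of Hadamards, one of phases, and two CNOTs per repetition. That shallowness is exactly what makes such maps NISQ-friendly, as the next section discusses.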
The algorithm operates on NISQ-feasible shallow circuits, meaning it can run on near-term quantum computers with a limited number of qubits, a significant advantage over many quantum machine learning proposals. NISQ (Noisy Intermediate-Scale Quantum) computers are available today but are limited in qubit count and coherence time. Shallow circuits, consisting of relatively few quantum gates, are less susceptible to errors from noise and decoherence, making them suitable for NISQ hardware. This matters because many quantum machine learning algorithms require deep circuits with many qubits, beyond the capabilities of current machines. Using shallow circuits allows this algorithm to be tested and validated on existing hardware, paving the way for practical applications. These promising results, however, do not yet demonstrate performance gains on datasets much larger than those tested, nor do they account for the practical challenge of running quantum circuits with sufficient fidelity for real-world deployment. The Iris dataset contains only 150 data points, and the breast cancer dataset is small compared with many real-world datasets. Scaling the method will require further optimisation of the quantum circuits and potentially new quantum algorithms, as well as strategies for coping with decoherence and gate errors. Further research will focus on scaling to larger datasets and on error mitigation to improve the robustness of the circuits and facilitate wider adoption.
Quantum kernels enhance k-means clustering accuracy for complex biomedical datasets

Clustering remains a cornerstone of data science, enabling insights across fields from healthcare to finance, but traditional k-means struggles with complex, high-dimensional data. The k-means algorithm partitions data points into k clusters, assigning each point to the cluster with the nearest mean (centroid). The algorithm is sensitive to the initial placement of the centroids, however, and can converge to suboptimal solutions. Furthermore, Euclidean distance becomes problematic as a similarity metric in high-dimensional spaces, where distances between points grow more uniform and clusters become harder to distinguish. Previous attempts to refine k-means focused on better starting points and faster distance calculations; this latest work instead makes a more fundamental change, altering the way data similarity is measured. Improved cluster separation was achieved by embedding data into a higher-dimensional space, replacing a standard calculation within the k-means algorithm with a 'quantum kernel'. Quantum kernels are functions that compute the similarity between data points in a quantum feature space: classical data is encoded into quantum states, and the overlap between those states is measured. This effectively maps the data into a higher-dimensional space where clusters may be more easily separated. The resulting hybrid approach yielded 91.0% accuracy on a breast cancer dataset, suggesting a more nuanced and effective way to assess data similarity than traditional distance metrics and demonstrating the potential of quantum machine learning for biomedical applications. The improvement in accuracy suggests the quantum kernel captures more complex relationships between data points than Euclidean distance can, leading to more accurate clustering.
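Putting the pieces together, a kernel k-means loop needs only kernel entries, never explicit centroids in the original feature space, so a matrix of quantum kernel values drops in directly. The sketch below uses the classical closed form of a product-state fidelity kernel and standard kernel k-means distances; the paper's hybrid scheme may differ in its details.

```python
import numpy as np

def fidelity_kernel_matrix(X):
    # Closed form for single-qubit RY angle encoding:
    # |<phi(x)|phi(y)>|^2 = prod_i cos^2((x_i - y_i) / 2)
    diff = X[:, None, :] - X[None, :, :]
    return np.prod(np.cos(diff / 2) ** 2, axis=-1)

def kernel_kmeans(K, k, n_iter=50):
    # Lloyd-style kernel k-means: every distance is computed from kernel
    # entries alone, so any (quantum) kernel matrix can be supplied.
    n = K.shape[0]
    diag = np.diag(K)
    # Deterministic farthest-point seeding in feature space.
    seeds = [0]
    while len(seeds) < k:
        dmin = np.min([diag + diag[s] - 2 * K[:, s] for s in seeds], axis=0)
        seeds.append(int(np.argmax(dmin)))
    labels = np.argmin([diag + diag[s] - 2 * K[:, s] for s in seeds], axis=0)
    for _ in range(n_iter):
        d = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x) - cluster mean||^2 written with kernel entries only.
            d[:, c] = (diag - 2 * K[:, mask].sum(axis=1) / m
                       + K[np.ix_(mask, mask)].sum() / m**2)
        new = d.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

# Toy data: two tight groups of angle-encoded feature vectors.
X = np.vstack([np.full((10, 2), 0.2), np.full((10, 2), 2.8)])
labels = kernel_kmeans(fidelity_kernel_matrix(X), k=2)
```

On a real device the kernel matrix would be estimated circuit by circuit from measurement statistics rather than computed in closed form, which is where shot noise and decoherence enter the picture.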
The research demonstrated improved clustering accuracy using a quantum-enhanced k-means algorithm. By employing quantum kernels derived from feature-mapped quantum states, the study replaced the standard Euclidean distance metric within the k-means process, achieving 88.6% accuracy on the Iris dataset and 91.0% on the breast cancer dataset and indicating a more effective assessment of data similarity. The authors intend to test the approach on additional datasets to validate its performance.

More information: Hybrid Quantum–Classical k-Means Clustering via Quantum Feature Maps, arXiv: https://arxiv.org/abs/2604.07873


Tags

quantum-machine-learning
quantum-investment
quantum-computing
quantum-hardware

Source Information

Source: Quantum Zeitgeist