\[ \big|\langle \tilde{x}_i, \tilde{x}_j\rangle - \langle x_i, x_j\rangle\big| \le \epsilon \,\|x_i\|\,\|x_j\| \]
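The bound can be checked empirically on small data. The following is a minimal sketch, assuming a dense Gaussian projection scaled by 1/√S; the source does not state which projection matrix XFREDHD uses, and the names `gaussian_sketch` and `max_relative_inner_product_error` are hypothetical.

```python
import numpy as np

def gaussian_sketch(X, S, seed=0):
    """Project the rows of X (N x D) down to S dimensions.

    Assumption: a dense Gaussian matrix is used purely for illustration.
    """
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    # The 1/sqrt(S) scaling makes inner products unbiased in expectation.
    R = rng.standard_normal((D, S)) / np.sqrt(S)
    return X @ R

def max_relative_inner_product_error(X, X_tilde):
    """Largest |<x~_i, x~_j> - <x_i, x_j>| / (||x_i|| ||x_j||) over all pairs,
    including i = j."""
    G = X @ X.T
    G_tilde = X_tilde @ X_tilde.T
    norms = np.linalg.norm(X, axis=1)
    scale = np.outer(norms, norms)
    return float(np.max(np.abs(G_tilde - G) / scale))

# Toy data: 50 points in 5000 dimensions, sketched to S = 1000.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 5000))
X_tilde = gaussian_sketch(X, S=1000)
eps_hat = max_relative_inner_product_error(X, X_tilde)
print(f"empirical epsilon: {eps_hat:.3f}")
```

The printed value is the smallest ε for which the inequality holds on this sample; it should shrink roughly as 1/√S when S grows.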
The total loss:
a.sanchez@uv.es

Abstract

The exponential growth of data in scientific, industrial, and social domains has made the analysis of ultra-high-dimensional (UHD) datasets a pressing challenge. Conventional dimensionality-reduction techniques (e.g., PCA, t-SNE, UMAP) either suffer from prohibitive computational costs or fail to preserve intricate feature-level relationships when the dimensionality exceeds 10⁶. We introduce XFREDHD (eXtreme Feature-Rich Embedding for Data in High Dimensions), a scalable, end-to-end framework that couples a feature-wise random projection with an adaptive hierarchical auto-encoder and a graph-preserving regularizer. XFREDHD reduces dimensionality by up to three orders of magnitude while maintaining >95% of pairwise cosine similarity and enabling downstream tasks (classification, clustering, anomaly detection) to achieve state-of-the-art performance. Extensive experiments on synthetic benchmarks, genomics, hyperspectral imaging, and large-scale recommender-system logs demonstrate that XFREDHD outperforms existing baselines in both accuracy and runtime (up to 12× speed-up on a 64-GPU cluster). We release the open-source implementation (Apache 2.0) and a curated suite of UHD datasets to foster reproducibility.

1. Introduction

High-dimensional data arise in numerous domains:
The resulting sketch \(\tilde{X} \in \mathbb{R}^{N \times S}\) can be computed on the fly and fits comfortably in GPU memory for S ≈ 10³–10⁴.
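One way to compute such a sketch on the fly is to stream over blocks of features and generate the matching rows of the projection matrix only when they are needed. The code below is a hedged illustration, not the authors' implementation: the Gaussian projection, the block layout, and the name `streaming_sketch` are all assumptions.

```python
import numpy as np

def streaming_sketch(feature_blocks, S, seed=0):
    """Accumulate X @ R block by block without storing the full D x S matrix R.

    feature_blocks: iterable of arrays of shape (N, d_b), holding the columns
    of X in order. Assumption: R is Gaussian with 1/sqrt(S) scaling.
    """
    rng = np.random.default_rng(seed)
    sketch = None
    for block in feature_blocks:
        d_b = block.shape[1]
        # Only the rows of R that belong to this feature block are generated.
        R_block = rng.standard_normal((d_b, S)) / np.sqrt(S)
        partial = block @ R_block
        sketch = partial if sketch is None else sketch + partial
    return sketch

# Toy data: 20 points, 4000 features split into 8 blocks, sketched to S = 500.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 4000))
blocks = np.array_split(X, 8, axis=1)
X_tilde = streaming_sketch(blocks, S=500)
print(X_tilde.shape)  # (20, 500)
```

Peak memory is one feature block, one d_b × S slice of the projection, and the N × S accumulator, so the full D-dimensional input never has to be resident at once.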
\[ \mathcal{L}_{\text{GPR}} = \frac{1}{|E|}\sum_{(i,j)\in E}\bigl(\text{sim}_Z(z_i, z_j) - \text{sim}_{\tilde{X}}(\tilde{x}_i, \tilde{x}_j)\bigr)^2 \]
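The regularizer above can be sketched in NumPy as follows. Two assumptions are made here and are not confirmed by the source: the similarity function is cosine similarity (the abstract mentions pairwise cosine similarity), and the sum is normalized by the number of edges. The names `cosine_sim` and `gpr_loss` are hypothetical.

```python
import numpy as np

def cosine_sim(A, B):
    """Cosine similarity between matching rows of A and B."""
    num = np.sum(A * B, axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return num / np.maximum(den, 1e-12)  # guard against zero-norm rows

def gpr_loss(Z, X_tilde, edges):
    """Mean squared gap between edge similarities in Z and in the sketch.

    edges: integer array of shape (|E|, 2) holding (i, j) index pairs.
    Assumption: sim is cosine similarity and the sum is divided by |E|.
    """
    edges = np.asarray(edges)
    i, j = edges[:, 0], edges[:, 1]
    sim_z = cosine_sim(Z[i], Z[j])
    sim_x = cosine_sim(X_tilde[i], X_tilde[j])
    return float(np.mean((sim_z - sim_x) ** 2))

# Tiny worked example with three points and two edges.
X_tilde = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Z = np.array([[1.0], [1.0], [-1.0]])
edges = np.array([[0, 1], [0, 2]])
loss = gpr_loss(Z, X_tilde, edges)
print(loss)
```

For edge (0, 1) the similarities are 1 in Z and 0 in the sketch; for edge (0, 2) they are -1 and 1/√2, so the loss is the mean of 1 and (1 + 1/√2)². An embedding that reproduces the sketch similarities exactly gives a loss of zero.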
XFREDHD: A Novel Framework for Extreme-Scale Feature-Rich Embedding and Dimensionality Reduction in High-Dimensional Data

Authors: Dr. A. M. Sanchez¹, Prof. L. K. Rao², Dr. J. H. Miller³
¹ Department of Computer Science, University of Valencia, Spain
² Department of Electrical Engineering, Indian Institute of Technology Delhi, India
³ Data Science Lab, Stanford University, USA