Parallel Retrieval of Dense Vectors in the Vector Space Model
Keywords:
Vector space model, symmetric multiprocessing, dense vector computations, message passing interface
Abstract
Modern information retrieval systems use distributed and parallel algorithms to meet their operational requirements and commonly operate on sparse vectors. Dimensionality-reducing techniques, however, produce dense and relatively short feature vectors. Motivated by the relevance of dense vectors, we have parallelized the vector space model for dense matrices and vectors. Our algorithm uses a hybrid partitioning that splits both documents and features, and operates on a mesh of hosts holding a block-partitioned corpus matrix. We show that the theoretical speed-up is optimal. The empirical evaluation of an MPI-based implementation reveals a super-linear speed-up on a cluster of Nehalem Xeon CPUs.
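The hybrid partitioning described in the abstract can be illustrated with a small single-process sketch. It is not the authors' MPI implementation: the mesh of hosts is simulated by nested loops, the similarity is assumed to be a plain inner product, and the names split_ranges and mesh_scores are hypothetical helpers introduced only for this illustration.

```python
# Illustrative sketch only: simulates a mesh of hosts inside one process.
# split_ranges and mesh_scores are hypothetical names, not from the paper.

def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous, near-equal (start, stop) ranges."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def mesh_scores(corpus, query, doc_parts, feat_parts):
    """Score all documents against `query` on a doc_parts x feat_parts mesh.

    `corpus` is a dense matrix with one row per document and one column
    per feature. Assumption: similarity is the plain inner product.
    """
    n_docs, n_feats = len(corpus), len(query)
    doc_ranges = split_ranges(n_docs, doc_parts)
    feat_ranges = split_ranges(n_feats, feat_parts)
    scores = [0.0] * n_docs
    for d0, d1 in doc_ranges:        # mesh row: one slice of the documents
        for f0, f1 in feat_ranges:   # mesh column: one slice of the features
            # Work of the simulated host at this mesh position: partial
            # inner products over its block of the corpus matrix.
            for d in range(d0, d1):
                partial = sum(corpus[d][f] * query[f] for f in range(f0, f1))
                # Stand-in for a sum-reduction across the hosts of a mesh row.
                scores[d] += partial
    return scores


corpus = [
    [1.0, 0.0, 2.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
    [0.0, 3.0, 0.0, 1.0],
]
query = [1.0, 2.0, 0.0, 1.0]
scores = mesh_scores(corpus, query, doc_parts=2, feat_parts=2)
print(scores)  # [2.0, 2.0, 7.0]
```

Each simulated host touches only its own block of the corpus matrix and the matching slice of the query, so in a real message-passing setting the only communication would be the sum of partial scores along each mesh row. A 1 x 1 mesh yields the same scores as the 2 x 2 mesh used here.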
Published
2012-01-26
How to Cite
Berka, T., & Vajteršic, M. (2012). Parallel Retrieval of Dense Vectors in the Vector Space Model. Computing and Informatics, 30(2), 247–265. Retrieved from http://147.213.75.17/ojs/index.php/cai/article/view/164
Section
Special Section Articles