Thursday , June 21 2018

Using Graphics Processing Units for
Accelerated Information Retrieval

Ştefan MOCANU, Ramona DIN, Daniela SARU, Cosmin POPA
Faculty of Automatic Control and Computer Science, University Politehnica of Bucharest, 060042, Romania,

Abstract: The Parallel computing platforms enable dramatic increases in computing performance by harnessing the power of Graphics Processing Units (GPUs). Their design, based on a high level of hardware parallelization achieved through a big number of processing cores, made GPUs serious competitors for CPU based processing architectures. This fact is most obvious when it comes to processing huge amount of data. Many recent studies aimed at the development of GPU based implementations for various fields such as: astronomy, medicine, image processing, data compression and others. However, very few of them aimed at achieving information retrieval improvement based on GPU. Considering this, along with the latest stage of content based information retrieval algorithms and their practical efficiency for a wide variety of applications, this paper focuses on emphasizing parallel GPGPU algorithms performances against their CPU equivalents.

Keywords: content based information retrieval, graphic processing unit, GPU, GPGPU, inverted file, parallel computing.

>>Full Text
Ștefan MOCANU, Ramona DIN, Daniela SARU, Cosmin POPA, Using Graphics Processing Units for Accelerated Information Retrieval, Studies in Informatics and Control, ISSN 1220-1766, vol. 23 (3), pp. 249-256, 2014.

1. Introduction

Due to continuous and massive growth of heterogeneous data collections, new information retrieval techniques must be developed or, at least, present techniques should be improved. Information retrieval applications are currently widespread and millions of people are relying on them to accomplish their professional, academic or personal targets. Some of traditional (relational) database searching techniques are being replaced with data mining techniques dedicated to various applications such as: text documents search; image, audio and video content search; web search engines. Several challenges of digital information retrieval and solutions are presented in [3].

While search is the central topic within the area of information retrieval, researchers and IT engineers pay attention to a wide variety of other interrelated problems as well. These cover storage and data manipulation, document routing, filtering, and selective dissemination (e.g. news aggregators, e-mail spam filters), text clustering and categorization, topic detection and tracking, information extraction, summarization, question answering systems, expert/knowledge search, and other wide studied fields [1].

The volume of digital data available to be processed online is staggering. Large information service providers have massed databases of petabytes, the biggest intranets contain over a million web pages, and even private document collections showed an exponential growth lately. Internet itself was proven to closely follow a Moore like growth law [2]. As the amount of data grows, searching and indexing costs rise up altogether with penalties in response time. Parallel algorithms and distributed processing and storing techniques may represent an efficient solution for enhancing modern search systems.

Although parallel computing algorithms eventually produce same results as their sequential equivalents, they have some advantages over the latter: shorter processing time, capability of processing higher amounts of data or solving a more difficult problem. Though, parallel algorithms usually have a higher complexity which requires more complex or additional hardware architectures. For years, all these were challenges for the central processing unit (CPU) and the performance improvement was highly dependent on the number of cores, being, of course, limited by the Amdahl’s law [27].

During the last years, processing units’ design has split into two major directions. On one hand, we have multi-core processors (2-10 cores per chip, such as AMD Phenom II, Intel I3, I5, I7 or Xeon E7) which are mostly dedicated to general purpose application for which they offer improved performance due to increased working frequency and small degree of parallelization. On the other hand, we have many-core microprocessors (processors with huge number of cores, hundreds or even thousands, such as Nvidia GTX590 – 1024 cores or Radeon HD 7970 – 2048 cores) which were mainly developed by graphics card manufacturers in their attempt to improve video performance with the support of the highly parallelized GPU.

For the past years, in certain applications, GPGPUs have shown far better performances than multi-core microprocessors available on the market and, also, their performance improvements from generation to generation are more consistent than those of CPUs. For instance, medium and peak floating-point calculation throughput is about 10 times better in case of GPGPU as opposed to CPU while the operating bandwidths of GPGPU and CPU follow the same ratio. A practical implementation of a non optimum algorithm in [6] achieved performances of almost 20 times better whiles the GPU’s manufacturer states [7] that certain algorithms may achieved performances up to 400 times better.

The high parallelism of the GPUs’ hardware design recommends them not only for video or graphic applications but for general purpose applications that require massive data processing. Development extensions (such as CUDA or C++ AMP) allow programmers to interact directly with the GPU and write and run applications that are not related to video or graphical purposes. In fact, this is the reason for which particular GPUs are also known as GPGPUs (General Purpose GPUs) which will be addressed from now on in this paper.

All these can justify once more the interest for exploiting GPGPUs in other areas than video or graphic rendering especially when it comes to implementing applications with a considerable amount of parallel sections. Furthermore, parallel algorithms as a concern of programming paradigm, have as long of a tradition as the one of sequential ones [4] and, although

future processor generations promise to come along with hundreds of cores per socket [5], an application can only benefit from this if it`s designed for parallel execution.

As already stated, information retrieval is the foundation of any modern search process. Considering this along with the great potential of graphic processors, especially CUDA enabled platforms, and advanced researches in the field of parallel information retrieval algorithms, this paper aims to analyze and compare performances of GPGPU specific algorithms against their similar CPUs implementations. The results will be compared and presented against each other, and further used in the design process of a GPU search engine.


  1. MANNING, C. D., P. RAGHAVAN, H. SCHUTZE, Introduction to Information Retrieval, Cambridge University Press, England, 2009.
  2., 2012.
  3. MITEA, A.-C., D. I. MORARIU, M. BREAZU, D. VOLOVICI, Digital Information Retrieval, Studies in Informatics and Control, vol. 19(2), 2010, pp. 185-192.
  4. MOCANU, Ş., R. DOBRESCU, D. SARU, R. DIN, A. GRUMĂZESCU, Arhitecturi complexe folosite în prelucrarea paralelă a imaginilor (in Romanian), Revista Română de Informatică şi Automatică,   vol. 20(1), 2010, pp. 97-105.
  5. FEINBUBE, F., P. TROEGER, A. POLZE, Joint Forces: From Multithreaded Programming to GPU Computing, Software, IEEE, vol. 28(1), 2011, pp. 51-57.
  6. DOBRESCU, R., Ş. MOCANU, D. SARU, A. GRUMĂZESCU, R. DIN, Aplicaţii pentru procesarea de imagini pe platforma CUDA – studiu de caz (in Romanian), Revista Română de Informatică şi Automatică, vol. 21(2), 2011, pp. 81-86.
  7. *** NVIDIA – GPU Computing,
  8. OWENS, J. D., M. HOUSTON, D. LUEBKE, S. GREEN, J. E. STONE, J. C. PHILLIPS, GPU Computing, Proc. IEEE, vol. 96(5), 2008, pp. 879-899.
  9. KIRK, D. B., W. W. HWU, Programming Massively Parallel Processors. A Hands-on Approach, Elsevier, USA, 2010.
  10. SALTON, G., A. WONG, C. S. YANG, A Vector Space Model for Automatic Indexing, Comm. ACM, vol. 18(11), 1975, pp. 613-6205
  11. BUETTCHER, S., C. CLARCKE, G. CORMACK, Information Retrieval. Implementing and Evaluating Search Engines, MIT Press, USA, 2010
  12. /we-knew-web-was-big.html, 2014.
  13. CROFT, B., D. METZLER, T. STROHMAN, Search Engines: Information Retrieval in Practice, Addison Wesley, First Edition, Feb 2009.
  14. QUAN, L. H., E. I. SICILIA-GARCIA, J. MING, F. J. SMITH, Extension of Zipf`s Law to Words and Phrases, Proc. of the 19th Intl. Conf. on Comp. Linguistics, COLING-2002, pp. 315-320.
  15. COHEN, J., Novel Architectures: Solving Computational Problems with GPU Computing, Computing in Science & Engineering, IEEE, vol. 11(5), Nvidia, Santa Clara, CA, USA, 2009, pp. 58-63.
  16. DING, S., J. HE, H. YAN, S. TORSTEN, Using Graphics Processors for High Performance IR Query Processing, WWW `09, Proc. 18th Intl. Conf. on WWW, New York, USA, 2009, pp. 421-430.
  17. SULLIVAN, T., H. NELSON, T. McBEE, M. ALVINO, General-purpose Computing on Graphics Processing Units: GPU Processing of Protein Structure Comparisons, Technical report, 2007.
  18. TEODORO, G., R. SACHETTO, O. SERTEL, M. N. GURCAN, W. MEIRA, U. CATALYUREK, R. FERREIRA, Coordinating the Use of GPU and CPU for Improving Performance of Compute Intensive Applications, Cluster Computing and Workshops, CLUSTER `09. IEEE Intl. Conf., New Orleans, LA, 2009, pp. 1-10.
  19. AKENINE-MOLLER, T., J. STROM, Graphics Processing Units for Handhelds, in Proc. of the IEEE, vol. 96(5), 2008, pp. 779-789.
  20. ZAKI, M. J., Data Mining Parallel and Distributed Association Mining: A Survey, IEEE Concurrency, vol. 7(4), 1999, pp. 14-25.
  21. WENBIN, F., K. K. LAU, M. LU, X. XIANGYE, C. K. LAM, P. Y. YANG, B. HE, Q. LUO, P. V. SANDER, K. YANG, Parallel Data Mining on Graphics Processors, Technical Report HKUST-CS08-07, Oct 2008.
  22. ***, Death and DALY Estimates for 2004 by Cause for WHO Member States,, published 2009
  23. SPOERK, J., C. GENDRIN, C. WEBER., M. FIGL, S. A. PAWIRO, H. FURTADO, D. FABRI, High-performance GPU-based Rendering for Real-time, rigid 2D/3D Image Registration and Motion Prediction in Radiation Oncology, Zeitschrift für Medizinische Physik, vol. 22(1), Elsevier GmbH., 2012, pp. 13-20.
  24. BROWN, K., UCSD Hospital Tests GPU-accelerated Cancer Treatment, Nvidia Blog, 2012,
  25. *** NVIDIA, Improving the Quality of Healthcare, Many Steps at a Time; Medical Imaging Technologies Running on GPU Produce better, Safer Results in Less Time,, 2012.
  26. HUANG, C. H., D. RACOCEANU, L. ROUX, T. C. PUTTI, Bio-inspired Computer Visual System using GPU and Visual Pattern Assessment Language (ViPAL): Application on Breast Cancer Prognosis, Neural Networks (IJCNN), The 2012 Intl. Joint Conf., Barcelona, Spain, July 2010, pp. 1-8.
  27. AMDHAHL, G. M., Validity of the Single-Processor Approach to Achieving Large Scale Computing Capabilities, AFIPS Conf. Proceedings, 1967.
  28., 2014
  29. DIN, R., Techniques for Medical Images Analysis using Parallel Processing, PhD Thesis, UPB, 2012
  30. SIROMASCENKO, AL. – AL., I. LUNGU, A Massive Multilevel-parallel Microscopic Traffic Simulator with Gridlock Detection and Solving, Studies in Informatics and Control, vol. 22(3), 2013, pp. 279-288.