Author: Abd El-Hady, Esraa Abd El-Raouf Hamed / Title: An Efficient Ranking Module for an Arabic Search Engine

Title

An Efficient Ranking Module for an Arabic Search Engine

Author

Abd El-Hady, Esraa Abd El-Raouf Hamed

Committee

Researcher / Esraa Abd El-Raouf Hamed Abd El-Hady

Supervisor / Mohamed F. Tolba

Supervisor / Nagwa Badr

Supervisor / Mohamed Abdeen

Subject

Arabic Search Engine

Publication date

2011

Number of pages

98 p.

Language

English

Degree

Master's

Specialization

Computer Science Applications

Date of approval

1/1/2011

Place of approval

Egyptian Universities Libraries Consortium - Computer and Information Sciences

Abstract

The plentiful content of the World Wide Web is useful to millions. Some simply
browse the Web through entry points. But many information seekers use a search
engine to begin their Web activity. In this case, users submit a query, typically a list
of keywords, and receive a list of Web pages that may be relevant, typically pages
that contain the keywords. Search engines have become essential information
resources for Internet users, and they form an important commercial industry.
Searching online provides you with a wealth of information, but not all of it will be
useful or of the highest quality. Search engines are distributed programs that dive
into the World Wide Web to find relevant information for a given search query.
Their fundamental components are: the crawlers, the indexer module, the collection
analysis module, the query engine, and the ranking module.
The ranking module represents a significant component in web search engines. The
main function of the ranking module is to sort the search results by relevance or
importance using information retrieval (IR) algorithms.
There are two kinds of methods in information retrieval: content-based and
hyperlink-based. Content-based systems require a very large amount of computation,
and the precision of systems based on hyperlinks alone is not ideal. A technique
that combines the advantages of both kinds of system is therefore needed.
Many web users are interested in browsing the Arabic web, whether for academic or
commercial reasons, but struggle to find what they are looking for with Arabic
search engines. Because existing web search engines are designed for English web
searches, they do not generate morphological variations of Arabic words and simply
match the word as it is. Their results therefore contain only the pages that exactly
match the user query. They also do not consider the different meanings of a word, so
search results contain pages unrelated to the user query.
In this research, we focus on implementing an enhanced ranking algorithm for Arabic
search engines that combines page content with hyperlink structure. The algorithm
takes into account the stem and the context of the Arabic word: the rank of a page is
calculated from both the count of query-related words in the page and the count of
query-related words in the pages it links to (its outlinks), using an external
database that holds the morphological meanings of most Arabic words. The pages are
then sorted according to their rank.
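
The combination of the two counts described above can be illustrated with a minimal
Python sketch. It is not the thesis's implementation: the abstract does not say how
the two counts are weighted, so the `link_weight` parameter, the function names, and
the toy data are all assumptions made for illustration. The set `RELATED_FORMS`
stands in for the result of a lookup in the external morphological database.

```python
def related_count(words, related_forms):
    """Count the words of a page that belong to the query's related forms."""
    return sum(1 for w in words if w in related_forms)


def rank_pages(pages, outlinks, related_forms, link_weight=0.5):
    """Return page ids sorted by rank, highest first.

    rank = related words in the page
           + link_weight * related words in the pages it links to.
    link_weight is an assumption; the abstract gives no weighting.
    """
    scores = {}
    for page_id, words in pages.items():
        own = related_count(words, related_forms)
        linked = sum(
            related_count(pages[target], related_forms)
            for target in outlinks.get(page_id, [])
            if target in pages  # ignore links that leave the collection
        )
        scores[page_id] = own + link_weight * linked
    # Ties are broken by page id so the ordering is deterministic.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical lookup result: inflected forms of the query word "كتاب" (book).
RELATED_FORMS = {"كتاب", "الكتاب", "كتب", "كتابان"}

# Toy collection: page id -> tokenized page text.
PAGES = {
    "a": ["كتاب", "جديد"],
    "b": ["الكتاب", "كتب", "قديم"],
    "c": ["طقس"],
}

# Toy link graph: page id -> ids of the pages it links to.
OUTLINKS = {"a": ["b"], "c": ["a", "b"]}

ranking = rank_pages(PAGES, OUTLINKS, RELATED_FORMS)
```

In this toy collection, page "c" contains no query-related word itself but still
receives a non-zero rank because it links to pages that do, which is the effect of
adding the hyperlink component to a purely content-based count.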
If the user submits a single-word query and that word has more than one meaning, the
user may choose the meaning he or she wishes to search for. The search results will
then largely contain the inflected forms of the word that belong to that meaning.
This helps reduce the redundancy that results from morphological search alone.
Distributed page ranking is needed because the web grows at a remarkable speed and
centralized page ranking is not scalable.
To speed up the ranking module, this thesis proposes a parallel technique for the
Arabic ranking module. We applied the module to a dataset of 10,000 Arabic web
pages. The experiments showed that the optimal number of processors for this
parallelization is 10. With this number of processors, the proposed parallel
algorithm is very efficient and achieves perfect speedup.
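
A rough Python sketch of how such a parallelization and its evaluation could look is
given below. The abstract does not describe the parallel technique itself, so the
partitioning scheme, the function names, and the use of `ThreadPoolExecutor` are
assumptions. The thread pool is only a stand-in for separate processors: for
CPU-bound scoring in CPython, a process pool or a cluster would be needed to get real
parallelism. The sketch also reads "perfect speedup" as linear speedup, that is, a
speedup equal to the number of processors and an efficiency of 1.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_workers):
    """Split items into n_workers contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_rank(page_ids, score_fn, n_workers):
    """Score each chunk in its own worker, then merge and sort by score."""

    def score_chunk(chunk):
        return [(pid, score_fn(pid)) for pid in chunk]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(score_chunk, partition(page_ids, n_workers)))
    merged = [pair for part in parts for pair in part]
    # Highest score first; ties broken by page id.
    return sorted(merged, key=lambda pair: (-pair[1], pair[0]))


def speedup(t_serial, t_parallel):
    """Speedup = serial running time / parallel running time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_workers):
    """Efficiency = speedup / number of workers; 1.0 means linear speedup."""
    return speedup(t_serial, t_parallel) / n_workers
```

Under this reading, a run that takes 100 time units on one processor and 10 time
units on 10 processors has a speedup of 10 and an efficiency of 1.0. These timings
are illustrative only and are not measurements from the thesis.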