Abstract The plentiful content of the World Wide Web is useful to millions. Some users simply browse the Web through entry points, but many information seekers begin their Web activity with a search engine. In this case, users submit a query, typically a list of keywords, and receive a list of Web pages that may be relevant, typically pages that contain the keywords. Search engines have become essential information resources for Internet users, and they form a very important commercial industry. Searching online provides a wealth of information, but not all of it is useful or of the highest quality. Search engines are distributed programs that dive into the World Wide Web to find information relevant to a given search query. Their fundamental components are the crawlers, the indexer module, the collection analysis module, the query engine, and the ranking module. The ranking module is a significant component of a web search engine. Its main function is to sort the search results by relevance or importance using information retrieval (IR) algorithms. There are two kinds of methods in information retrieval: those based on content and those based on hyperlinks. Content-based systems require a very large amount of computation, while the precision of systems based on hyperlinks alone is not ideal. It is therefore necessary to develop a technique that combines the advantages of the two approaches. Many web users are interested in browsing the Arabic web, whether for academic or commercial reasons… but struggle to find what they are searching for through Arabic search engines, because the existing web search engines are designed to perform English web searches. They do not generate morphological variations of Arabic words; they simply match the word as it is. Their results therefore contain only the pages that exactly match the user query. They also do not consider the different meanings of a word, so the search results contain pages unrelated to the user query.
In this research, we focus on implementing an enhanced ranking algorithm for Arabic search engines that combines page content with hyperlink structure. The algorithm takes into account the stem and the context of each Arabic word, using an external database that holds the morphological meanings of most Arabic words. To calculate the rank of a page, it combines the count of query-related words in the page with the count of query-related words in the pages that the page links to (its outlink pages). The pages are then sorted according to their ranks. If the user submits a one-word query and that word has more than one meaning, the user may choose the meaning he or she wishes to search for. The search results will then largely contain the inflected forms of the word that belong to that meaning. This helps reduce the redundancy that results from morphological search alone. Distributed page ranking is needed because the Web grows at a remarkable speed and centralized page ranking is not scalable. To speed up the ranking process, this thesis proposes a parallel technique for the Arabic ranking module. We applied the Arabic ranking module to a dataset of 10000 Arabic web pages. This research found that the optimal number of processors for this parallelization is 10. With this number of processors, the proposed parallel algorithm is very efficient and achieves perfect speedup.
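The ranking idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not state how the two counts are combined, so a weighted sum with a hypothetical outlink_weight parameter is assumed here. The STEMS table is a toy stand-in for the external morphological database, and the names stem, related_count, rank_pages, PAGES and LINKS are illustrative only.

```python
from typing import Dict, List, Set, Tuple

# Toy stand-in (assumption) for the external morphological database:
# maps an inflected Arabic word form to its stem.
STEMS: Dict[str, str] = {
    "كتاب": "كتاب",
    "الكتاب": "كتاب",
    "كتابان": "كتاب",
    "كتب": "كتاب",
    "المدرسة": "مدرسة",
    "مدارس": "مدرسة",
}


def stem(word: str) -> str:
    """Return the stem of a word, or the word itself if it is unknown."""
    return STEMS.get(word, word)


def related_count(tokens: List[str], query_stems: Set[str]) -> int:
    """Count tokens whose stem matches one of the query stems."""
    return sum(1 for token in tokens if stem(token) in query_stems)


def rank_pages(
    pages: Dict[str, List[str]],
    links: Dict[str, List[str]],
    query: str,
    outlink_weight: float = 0.5,  # assumed weighting, not from the source
) -> List[Tuple[str, float]]:
    """Score each page by its own related-word count plus a weighted
    related-word count of the pages it links to, then sort by score."""
    query_stems = {stem(word) for word in query.split()}
    own = {pid: related_count(tokens, query_stems) for pid, tokens in pages.items()}
    scores = {}
    for pid in pages:
        out = sum(own.get(target, 0) for target in links.get(pid, []))
        scores[pid] = own[pid] + outlink_weight * out
    # Highest score first; ties are broken by page id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical three-page collection: page id -> tokens, page id -> outlinks.
PAGES = {
    "a": ["كتاب", "المدرسة"],
    "b": ["الكتاب", "كتابان", "كتب"],
    "c": ["مدارس"],
}
LINKS = {"a": ["b"], "b": [], "c": ["a", "b"]}

ranked = rank_pages(PAGES, LINKS, "كتاب")
```

In this toy collection, page c contains no form of the query word, yet it still receives a nonzero score because it links to pages that do. An exact-match engine would also miss the inflected forms on page b, which is the gap the stem lookup is meant to close.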