Searching has become a critical part of daily life. The amount of information in the world is increasing exponentially: new books, journal articles and conference proceedings appear every year, so searching within this flood of information has become a challenge. Millions of users interact with search engines such as Google, Yahoo and Bing every day, and they often struggle to decide which engine is best and which has the algorithms that return the best results. Each engine has its own weaknesses in searching. This suggests that it is time to create new technologies that address those weaknesses. Such technologies can also help sift through all the available information to find what is most valuable to users. In addition, they could serve the different languages currently spoken around the world. Furthermore, they could serve the needs and concerns of all users, since each user on the Internet has a distinct background and a specific goal when searching for information on the web. Information Retrieval Systems, as a science, is built on these ideas, and it plays a critical role in obtaining relevant information resources for search engines.
This thesis takes up the challenge of improving the performance of web search engines and Information Retrieval technologies by proposing several new hybrid technologies in Stemming, Ranking and Personalization. These technologies are core aspects of supporting the search results returned by web search engines. The evaluation of the proposed system shows that each of these technologies offers a significant benefit and improves the performance of search engines and Information Retrieval systems.
In Stemming, a new technique called the Enhanced Porter’s Stemming Algorithm (EPSA) overcomes the drawbacks of the Porter algorithm and improves web searching. The technique addresses the errors in word forms that stemming can cause. EPSA is shown to have a good stemming weight, meaning that it is a stronger stemmer than the Porter algorithm. In addition, EPSA improves the performance of the Information Retrieval system with respect to the recall and precision measures: it improves precision over the Porter algorithm by about 2.3% while achieving approximately the same recall.
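The precision and recall measures referred to above can be illustrated with a minimal sketch. It uses the standard set-based definitions; the function name `precision_recall` and the document identifiers are hypothetical and are not taken from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids returned by the system (illustrative data)
    relevant:  document ids judged relevant for the query (illustrative data)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant documents that were retrieved

    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

A stemmer that conflates fewer unrelated words would typically raise precision in such a computation without necessarily changing recall, which is the pattern reported for EPSA.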
For Ranking, a hybrid ranking algorithm that utilizes usage data, called EHURA (Efficient Hybrid Usage-based Ranking Algorithm), is proposed. This algorithm improves the ranked list provided by Arabic/English search engines that rely only on Content-Based ranking. The results show a good improvement in the precision measure: the improvement of EHURA over the Content-Based Ranking Algorithm is about 10% for Arabic and 15% for English.
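The general idea of combining content-based scores with usage data can be sketched as follows. This is not the EHURA algorithm, whose details are not given here; it is only an assumed linear blend, and the function `hybrid_rank`, the weight `alpha`, and the sample scores and click counts are all hypothetical.

```python
def hybrid_rank(content_scores, usage_counts, alpha=0.7):
    """Rank documents by a weighted blend of content score and usage.

    content_scores: doc id -> content-based relevance in [0, 1] (assumed)
    usage_counts:   doc id -> click count from usage logs (assumed)
    alpha:          weight of the content score (hypothetical parameter)
    """
    max_usage = max(usage_counts.values(), default=0)
    combined = {}
    for doc, content in content_scores.items():
        # Normalize usage to [0, 1]; documents with no usage data get 0.
        usage = usage_counts.get(doc, 0) / max_usage if max_usage else 0.0
        combined[doc] = alpha * content + (1 - alpha) * usage
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


content_scores = {"a": 0.9, "b": 0.8, "c": 0.4}
usage_counts = {"b": 50, "c": 10}
ranking = hybrid_rank(content_scores, usage_counts)
print(ranking)
```

In this illustrative data, document "b" overtakes "a" because it is frequently used even though its content score is slightly lower; with no usage data the order falls back to content-based ranking.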
Finally, a new technology for Personalization is proposed. It is based on a new hybrid automatic semantic personalization re-ranking algorithm that utilizes usage features, called NASPR (New Automatic Semantic Personalization Re-ranking). This hybrid personalized re-ranking algorithm is based on semantic analysis of words. NASPR demonstrates its effectiveness over EHURA’s ranking results, improving them by 1% in precision and 3% in recall.