A COMPARATIVE EVALUATION OF POPULAR SEARCH ENGINES ON FINDING TURKISH DOCUMENTS FOR A SPECIFIC TIME PERIOD
Yıltan Bitirim, Abdül Kadir Görür
Original scientific paper
This study evaluates the popular search engines Google, Yahoo, Bing, and Ask on finding Turkish documents by comparing their current performance with their performance measured six years ago. The study also reports the current information retrieval effectiveness of the search engines. First, the Turkish queries were run on each search engine separately. Each retrieved document was classified, and precision ratios were calculated at various cut-off points for each query and engine pair. These ratios were then compared with the ratios measured six years ago. In addition to descriptive statistics, the Mann-Whitney U and Kruskal-Wallis H tests were used to identify statistically significant differences. All search engines except Google perform better today than they did six years ago, and Bing shows the largest improvement. At present, Yahoo has the highest mean precision ratios at the various cut-off points; all search engines have their highest mean precision ratios at cut-off point 5; dead links were encountered in Google, Bing, and Ask; and repeated documents were encountered in Google and Yahoo.
Keywords: information retrieval; performance evaluation; search engine; Turkish
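The precision ratios at cut-off points described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. The judgment labels, the cut-off values other than 5, the sample data, and the function names `precision_at` and `mean_precision` are all hypothetical, and the sketch assumes that dead links, repeated documents, and missing results count as non-relevant.

```python
from statistics import mean

# Illustrative cut-off points; only cut-off point 5 is named in the abstract.
CUTOFFS = (5, 10, 15, 20)


def precision_at(judgments, k):
    """Return the fraction of the first k results judged relevant.

    Assumption: anything not labelled "relevant" (dead links, repeated
    documents, or results missing because fewer than k were retrieved)
    counts as non-relevant, so the denominator is always k.
    """
    top = judgments[:k]
    return sum(1 for label in top if label == "relevant") / k


def mean_precision(runs, k):
    """Average precision at cut-off k over all queries of one engine."""
    return mean(precision_at(judgments, k) for judgments in runs)


# Hypothetical judgments for two queries run on one engine (10 results each).
runs = [
    ["relevant", "relevant", "dead", "relevant", "non-relevant",
     "relevant", "repeated", "non-relevant", "relevant", "non-relevant"],
    ["relevant", "non-relevant", "relevant", "relevant", "relevant",
     "dead", "non-relevant", "non-relevant", "relevant", "non-relevant"],
]

if __name__ == "__main__":
    for k in CUTOFFS:
        print(k, round(mean_precision(runs, k), 3))
```

Keeping the denominator fixed at the cut-off value penalizes an engine that returns fewer results than the cut-off. Dividing by the number of results actually retrieved would be an equally plausible convention, and the abstract does not say which one was used.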