MoL-2006-04: Link-Based Methods for Web Information Retrieval

MoL-2006-04: Nettey, Clive (2006) Link-Based Methods for Web Information Retrieval. [Report]


Abstract

Although commercial search engine companies have reported considerable success in adopting link-based methods, these methods have struggled to demonstrate significant performance improvements over content-only retrieval methods in several off-line Web IR evaluations. In this thesis, the effectiveness of link-based methods is assessed against content-only retrieval baselines. Algorithms embodying the established HITS, in-degree, realised in-degree, and sibling score propagation techniques are evaluated alongside variants of those algorithms. The variant algorithms are devised to support three secondary lines of investigation relating to link-based methods: the effects of link randomisation, the utility of sibling relationships, and the influence of link densities. All established link-based algorithms are shown to improve on the content-only retrieval baselines on several performance metrics, with the realised in-degree algorithm proving particularly effective across all considered metrics. In relation to the secondary lines of investigation, the experiments reveal that leveraging sibling relationships does not lead to significant performance improvements, that higher link densities do not afford performance improvements, and that the algorithms are susceptible to link randomisation.
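
For readers unfamiliar with two of the techniques named in the abstract, the following is a minimal sketch of plain in-degree counting and the textbook HITS iteration on a small link graph. It is not taken from the thesis: the function names `in_degree` and `hits`, the edge-list input format, and the fixed iteration count are assumptions made for illustration, and the realised in-degree and sibling score propagation variants evaluated in the thesis are not reproduced here.

```python
import math


def in_degree(links):
    """Count incoming links per page.

    `links` is a list of (source, target) pairs (an assumed input format).
    """
    scores = {}
    for source, target in links:
        scores.setdefault(source, 0)  # pages with no in-links still appear
        scores[target] = scores.get(target, 0) + 1
    return scores


def hits(links, iterations=50):
    """Textbook HITS: return (hub, authority) score dictionaries."""
    pages = {page for edge in links for page in edge}
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        new_auth = {page: 0.0 for page in pages}
        for source, target in links:
            new_auth[target] += hub[source]

        # Hub score: sum of the updated authority scores of pages linked to.
        new_hub = {page: 0.0 for page in pages}
        for source, target in links:
            new_hub[source] += new_auth[target]

        # L2-normalise so the scores stay bounded across iterations.
        auth_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        hub_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {page: v / auth_norm for page, v in new_auth.items()}
        hub = {page: v / hub_norm for page, v in new_hub.items()}

    return hub, auth


if __name__ == "__main__":
    example_links = [("a", "c"), ("b", "c"), ("a", "b")]  # toy graph
    print(in_degree(example_links))
    hub_scores, auth_scores = hits(example_links)
    print(hub_scores)
    print(auth_scores)
```

In a retrieval setting, scores of this kind would typically be combined with a content-only ranking rather than used alone; how that combination is done in the thesis is described in the full text, not in this sketch.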

Item Type: Report
Report Nr: MoL-2006-04
Series Name: Master of Logic Thesis (MoL) Series
Year: 2006
Uncontrolled Keywords: Information Retrieval, World Wide Web, Hypertext Algorithms, Web Information Retrieval, Link-based Methods
Date Deposited: 12 Oct 2016 14:38
Last Modified: 12 Oct 2016 14:38
URI: https://eprints.illc.uva.nl/id/eprint/767
