MoL-2006-04: Link-Based Methods for Web Information Retrieval

MoL-2006-04: Nettey, Clive (2006) Link-Based Methods for Web Information Retrieval. [Report]



Although commercial search engine companies have reported a great deal
of success in appropriating link-based methods, these methods have
struggled to demonstrate significant performance improvements over
content-only retrieval methods in several off-line Web IR
evaluations. In this thesis the effectiveness of link-based methods is
assessed against content-only retrieval baselines. Algorithms
embodying established HITS, in-degree, realised in-degree, and sibling
score propagation techniques are evaluated alongside variants of those
algorithms. The variant algorithms are devised to aid in three
secondary lines of investigation relating to link-based methods: the
effects of link randomisation, the utility of sibling relationships
and the influence of link densities.

All established link-based algorithms are demonstrated to improve on
several content-only retrieval baseline performance metrics with the
realised in-degree algorithm proving to be particularly effective
across all considered metrics. In relation to the other lines of
investigation, the experimentation reveals that leveraging sibling
relationships does not lead to significant performance improvements,
that higher link densities do not afford performance improvements, and
that the algorithms are susceptible to link randomisation.
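
For readers unfamiliar with two of the techniques named above, the following is a minimal sketch of plain in-degree scoring and of HITS in its standard textbook form. It is not taken from the thesis: the function names, the toy link list, and the fixed iteration count are assumptions made purely for illustration, and the thesis-specific variants (realised in-degree, sibling score propagation) are deliberately not modelled here.

```python
import math
from collections import defaultdict


def in_degree(links):
    """Count incoming links per page.

    `links` is a list of (source, target) pairs (a hypothetical input format).
    """
    scores = defaultdict(int)
    for source, target in links:
        scores[source] += 0  # make sure pages with no in-links still appear
        scores[target] += 1
    return dict(scores)


def _normalise(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Textbook HITS: mutually reinforcing hub and authority scores.

    The iteration count is an arbitrary illustrative choice.
    """
    pages = {page for link in links for page in link}
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        new_auth = {page: 0.0 for page in pages}
        for source, target in links:
            new_auth[target] += hub[source]
        auth = _normalise(new_auth)
        # Hub score: sum of authority scores of the pages linked to.
        new_hub = {page: 0.0 for page in pages}
        for source, target in links:
            new_hub[source] += auth[target]
        hub = _normalise(new_hub)
    return hub, auth


# Toy link graph, invented for this sketch only.
TOY_LINKS = [("a", "c"), ("b", "c"), ("a", "d"), ("c", "d")]

if __name__ == "__main__":
    print(in_degree(TOY_LINKS))
    hub_scores, auth_scores = hits(TOY_LINKS)
    print(hub_scores)
    print(auth_scores)
```

In a retrieval setting such scores would typically be combined with a content-only relevance score to re-rank results; how the thesis performs that combination is not stated in this abstract.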

Item Type: Report
Report Nr: MoL-2006-04
Series Name: Master of Logic Thesis (MoL) Series
Year: 2006
Uncontrolled Keywords: Information Retrieval, World Wide Web, Hypertext Algorithms, Web Information Retrieval, Link-based Methods
Date Deposited: 12 Oct 2016 14:38
Last Modified: 12 Oct 2016 14:38
