
New Perspectives on Query Performance Prediction

Thesis, posted on 2024-06-26, authored by Oleg Zendel
Effectively navigating the vast landscape of online data has become essential to daily life in the digital era. The internet is our primary hub for work and personal interaction, where we both consume and produce information, yet the sheer volume of online content makes it difficult to find accurate and relevant material. Search engines, the principal means of accessing this information, face the critical challenge of delivering pertinent results quickly, especially for short and sometimes ambiguous queries. At the core of effective search lies information retrieval: providing users with relevant information in response to their queries. Traditional evaluation based on user feedback is impractical at scale because of time constraints and cost. This thesis addresses these challenges through Query Performance Prediction (QPP), which estimates the quality of search results without relying on human judgments, offering insights that can enhance search effectiveness. Since even well-performing search systems sometimes fail, there is a growing imperative to predict search effectiveness: a system that anticipates it will fall short of a user's information need can pivot to alternative retrieval strategies or alert the user, adapting to users' preferences and needs. This thesis inspects QPP comprehensively from both user and system perspectives, examines diverse evaluation methods, identifies critical aspects of the task, and introduces two distribution-based evaluation frameworks for QPP methods, with statistical analysis, that enhance the evaluation process.
We begin by investigating the interaction between query variations and QPP from the user's perspective, aiming to understand user assessments and expectations. Using a crowd-sourced user study, we examine users' ability to assess the usefulness of diverse queries for a predefined information need. Notably, users assess different query variants consistently across varying information needs, a finding that advances our understanding of the user perspective and supports more user-centric evaluation of information retrieval systems. We then categorize information needs into cognitive complexity categories, testing the efficacy of Large Language Models (LLMs) against human experts. Our findings indicate that LLMs are comparable to human experts at classifying cognitive complexity, contributing to more automated and scalable methods for data annotation and classification in information retrieval. Next, we study the dynamics of QPP, investigating the factors that drive variance in prediction quality across different information needs and query variations. We propose a new evaluation method based on pairwise comparisons, which reveals that the variance in prediction quality stems primarily from inherent differences between tasks rather than from the introduction of query variations, enhancing our understanding of QPP dynamics. We further introduce a distribution-based framework for evaluating QPP methods that generates a distribution for both the predicted and the actual search result quality; using this framework, we study how prediction quality varies with different factors in the retrieval pipeline, advancing the development of more accurate and robust evaluation methods for QPP models.
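To make the evaluation setting concrete: QPP methods are conventionally scored by how well their predicted query difficulty agrees, over all pairs of queries, with the actual retrieval effectiveness (e.g. average precision) — Kendall's tau is the classic pairwise-agreement measure. The sketch below is a generic illustration of this standard setup, not the specific evaluation method proposed in the thesis, and all scores in it are hypothetical.

```python
from itertools import combinations

def kendall_tau(pred, actual):
    """Kendall's tau via pairwise comparisons: the normalised difference
    between query pairs ordered the same way by the predictor and by the
    true effectiveness, and pairs ordered oppositely (ties skipped)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(pred)), 2):
        a = pred[i] - pred[j]
        b = actual[i] - actual[j]
        if a * b > 0:
            concordant += 1
        elif a * b < 0:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0

# Hypothetical QPP scores and per-query average precision for four queries.
pred = [0.9, 0.4, 0.7, 0.2]
actual = [0.65, 0.30, 0.50, 0.35]
print(round(kendall_tau(pred, actual), 3))  # → 0.667
```

A tau near 1 means the predictor ranks queries by difficulty almost exactly as the ground-truth effectiveness does; a tau near 0 means it carries little ranking signal.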
Finally, we present an entropy-based QPP method for neural information retrieval models and evaluate its performance thoroughly. Our findings show that it is comparable to current unsupervised state-of-the-art QPP methods, and we identify its strengths and weaknesses, underscoring both the potential of simple score-based QPP methods and the importance of thorough evaluation. In conclusion, this thesis advances research in information retrieval, and QPP in particular, by offering insights into user assessments, cognitive complexity evaluation, and the key factors affecting QPP, together with improved evaluation frameworks. These contributions support the development of more accurate and adaptable information retrieval systems and open avenues for future work on the evolving challenges of the digital information landscape.
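One plausible form of a simple score-based, entropy-driven predictor — a sketch under assumptions, not necessarily the exact method developed in the thesis — computes the entropy of the softmax-normalised top-k retrieval scores: a peaked score distribution suggests the model confidently separates one result from the rest, while a flat distribution suggests an ambiguous or difficult query. The scores below are hypothetical.

```python
import math

def score_entropy(scores, k=10):
    """Shannon entropy of the softmax-normalised top-k retrieval scores.
    Lower entropy (peaked distribution) suggests confident retrieval;
    higher entropy (flat distribution) suggests a harder query."""
    top = sorted(scores, reverse=True)[:k]
    exps = [math.exp(s) for s in top]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical neural re-ranker scores for two queries.
confident = [9.1, 3.2, 2.8, 2.5, 2.4]   # one clearly dominant document
ambiguous = [4.1, 4.0, 3.9, 3.9, 3.8]   # near-uniform scores
print(score_entropy(confident) < score_entropy(ambiguous))  # True
```

Used as a predictor, the (negated) entropy can then be correlated against actual per-query effectiveness, exactly as in the standard QPP evaluation setup.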

Degree Type

Doctorate by Research

Copyright

© Oleg Zendel 2024

School name

Computing Technologies, RMIT University