Several methodologies have been proposed to extend the evaluation of search systems to include the fairness of system decisions. These metrics often consider whether documents' authors belong to particular groups, defined by protected attributes such as gender or ethnicity. To date, these metrics typically assume that protected attribute labels are available and complete for all authors. However, due to privacy or policy reasons, the protected attributes of individuals may not always be available, limiting the application of fair ranking metrics in large-scale systems. To address this problem, we propose two sampling strategies and an estimation technique for four different fair ranking metrics. We formulate a robust and unbiased estimator which can operate even with a very limited number of labeled items. We evaluate our approach using both simulated and real-world data. Our experimental results demonstrate that our method can estimate this family of fair ranking metrics and provides a robust, reliable alternative to exhaustive or random data annotation.
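To illustrate the general idea of estimating a fair ranking metric from a small labeled sample, the sketch below shows a Horvitz-Thompson-style unbiased estimator of a group's exposure share in a ranking. This is a hypothetical, simplified example for intuition only: the sampling scheme (independent annotation with probability `p`), the exposure-based metric, and all function names are assumptions, not the paper's actual estimator or metrics.

```python
import math
import random

def exposure(rank):
    # standard logarithmic position discount (rank is 1-indexed)
    return 1.0 / math.log2(rank + 1)

def true_group_exposure(labels, group):
    # exposure share of `group`; requires every author label (full annotation)
    total = sum(exposure(i + 1) for i in range(len(labels)))
    g = sum(exposure(i + 1) for i, l in enumerate(labels) if l == group)
    return g / total

def estimated_group_exposure(labels, group, p, rng):
    # Horvitz-Thompson estimate: annotate each item independently with
    # probability p and weight each labeled item by 1/p, so the estimate
    # is unbiased despite labeling only a fraction of the ranking
    total = sum(exposure(i + 1) for i in range(len(labels)))
    g = 0.0
    for i, label in enumerate(labels):
        if rng.random() < p:          # item selected for annotation
            if label == group:
                g += exposure(i + 1) / p  # inverse-probability weight
    return g / total
```

With `p = 1.0` every item is labeled and the estimate coincides with the exhaustive metric; for smaller `p` individual estimates are noisy but unbiased, which is the property that makes sampling a viable alternative to full annotation.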
