
Please use this identifier to cite or link to this item:
http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4945

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Gupta, S. | - |
| dc.date.accessioned | 2025-10-24T12:52:11Z | - |
| dc.date.available | 2025-10-24T12:52:11Z | - |
| dc.date.issued | 2025-04 | - |
| dc.identifier.uri | http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4945 | - |
| dc.description.abstract | With the advent of technology, the adoption of Artificial Intelligence (AI) and Machine Learning (ML) based decision systems into daily human life has significantly increased. Recent studies have exposed the prejudiced outcomes (bias) of ML models towards individuals and groups of individuals characterized by protected attributes such as race and gender. These decisions have a direct and long-lasting impact on the humans involved. Fairness has gained considerable attention from the research community when data labels are available for prediction modelling, i.e., supervised learning. However, in real-life scenarios, data may lack labels, and obtaining manual labels requires proper incentives or expertise. Consequently, researchers have started exploring fairness issues in unsupervised learning, which forms the focus of this thesis. In particular, the primary focus of this thesis is to address both the theoretical underpinnings and the practical implications of fair algorithms for unsupervised learning in the context of clustering and recommender systems. The contributions of the thesis include: 1. Group Fair Notions and Algorithms in Offline Clustering: The thesis first theoretically establishes relationships between different existing discrete group fairness notions and then proposes a generalized notion of group fairness for multi-valued group attributes. We propose two simple and efficient round-robin-based algorithms for satisfying group fairness guarantees. We then prove that the proposed algorithms achieve a two-approximation to the optimal clustering and show that the bounds are tight. The efficacy of the proposed algorithms is also demonstrated via extensive simulations. 2. Nash Social Welfare for Facility Location: To investigate the problem of satisfying multiple fairness levels simultaneously, the thesis extends the fair clustering problem to the facility location problem. The thesis proposes a first-of-its-kind formulation of Nash Social Welfare for facility location that targets multiple fairness notions while minimizing the distance between individuals and facilities. The proposed polynomial-time algorithm works for any h-dimensional metric space and allows facilities to be opened at a specified set of locations rather than solely at the individuals' own locations, as in most previous literature. The proposed algorithm provides a solution that satisfies group fairness constraints and achieves a good approximation for individual fairness. The proposed method is tested on the real-world United States (US) census dataset, with road maps providing the actual car road distances between individuals and facilities. 3. Group Fairness in Online Clustering: To tackle the challenge of handling group fairness requirements in an online model, the thesis proposes a randomized algorithm that prevents the over-representation of any protected group by applying capacity constraints on the number of data points from each group that can be assigned to a particular cluster. The proposed method achieves a constant-factor approximation to the optimal offline clustering cost and also handles the challenge of an a priori unknown total number of data points using a doubling trick. Empirical results demonstrate the proposed algorithms' efficacy against baseline methods on synthetic and real-world datasets. 4. Fairness in Federated Data Clustering: To address fairness in distributed settings, the thesis analyzes federated data clustering to ensure privacy-preserving clustering in a distributed environment. The proposed method produces cluster centers with lower cost deviation across clients, leading to a fairer and more personalized solution. The method is validated on different synthetic and real-world datasets, with results demonstrating effective performance against state-of-the-art methods. 5. Popularity Bias in Recommender Systems: While the first four contributions focus on clustering, this contribution analyzes the fairness aspects of recommender systems. The thesis proposes a novel metric that measures popularity bias as the difference in the Mean Squared Error (MSE) between popular and non-popular items. Further, we propose a novel technique that solves the optimization problem of minimizing the overall loss with a penalty on popularity bias. It requires no heavy pre-training, and extensive experiments on real-world datasets show that it outperforms existing methods on recommendation accuracy, quality, and fairness. | en_US |
| dc.language.iso | en_US | en_US |
| dc.subject | Fairness | en_US |
| dc.subject | Unsupervised Learning | en_US |
| dc.subject | Clustering | en_US |
| dc.subject | Group Fairness | en_US |
| dc.subject | Online Algorithms | en_US |
| dc.subject | Federated Learning | en_US |
| dc.subject | Recommender Systems | en_US |
| dc.subject | Matrix Factorization | en_US |
| dc.title | Fair algorithms for clustering and recommender systems in unsupervised learning | en_US |
| dc.type | Thesis | en_US |
| Appears in Collections: | Year-2025 | |
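The round-robin idea behind the group-fair clustering algorithms in contribution 1 can be illustrated with a minimal sketch. The function name, signature, and the exact assignment rule below are assumptions for illustration only, not taken from the thesis:

```python
import numpy as np

def round_robin_fair_assign(points, groups, centers):
    """Hypothetical sketch: assign the points of each protected group to
    clusters in round-robin order, giving the current cluster its nearest
    still-unassigned point from that group. Each cluster then receives an
    almost equal share of every group."""
    k = len(centers)
    assignment = -np.ones(len(points), dtype=int)
    # pairwise distances from every point to every cluster center
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    for g in np.unique(groups):
        unassigned = np.where(groups == g)[0].tolist()
        c = 0  # start each group at cluster 0
        while unassigned:
            # nearest remaining point of group g to the current cluster c
            j = min(unassigned, key=lambda i: dists[i, c])
            assignment[j] = c
            unassigned.remove(j)
            c = (c + 1) % k  # move to the next cluster
    return assignment
```

By construction, the per-cluster counts of any single group differ by at most one, which is the kind of balance guarantee group fairness notions aim for; the thesis additionally proves approximation guarantees on clustering cost that this sketch does not capture.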
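The popularity-bias metric of contribution 5, the gap in MSE between popular and non-popular items, and the penalized training objective can be sketched as follows. The function names, the sign convention, and the penalty weight are illustrative assumptions, not the thesis's actual formulation:

```python
import numpy as np

def popularity_bias(ratings, preds, item_ids, popular_items):
    """Hypothetical sketch of the metric: popularity bias as the gap
    between MSE on non-popular items and MSE on popular items.
    A large positive value means non-popular items are modelled worse."""
    pop = np.isin(item_ids, list(popular_items))
    mse_pop = np.mean((ratings[pop] - preds[pop]) ** 2)
    mse_nonpop = np.mean((ratings[~pop] - preds[~pop]) ** 2)
    return mse_nonpop - mse_pop

def penalized_loss(ratings, preds, item_ids, popular_items, lam=0.5):
    """Overall MSE plus a penalty (weight `lam`, illustrative) on the
    absolute popularity-bias gap, as in the optimization described
    in the abstract."""
    mse = np.mean((ratings - preds) ** 2)
    gap = popularity_bias(ratings, preds, item_ids, popular_items)
    return mse + lam * abs(gap)
```

Minimizing `penalized_loss` instead of plain MSE trades a small amount of overall accuracy for a smaller error gap between popular and non-popular items, which is the fairness trade-off the abstract describes.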
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| Full_text.pdf.pdf | | 107.45 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.