Please use this identifier to cite or link to this item:
http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4565
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Makkar, J | - |
dc.contributor.author | Bhumika | - |
dc.contributor.author | Jain, Shweta | - |
dc.contributor.author | Gupta, S | - |
dc.date.accessioned | 2024-05-29T14:00:15Z | - |
dc.date.available | 2024-05-29T14:00:15Z | - |
dc.date.issued | 2024-05-29 | - |
dc.identifier.uri | http://dspace.iitrpr.ac.in:8080/xmlui/handle/123456789/4565 | - |
dc.description.abstract | Abstract The work explores the federated data clustering problem. The primary goal is to perform k-means clustering of data distributed over multiple clients while preserving privacy during an exchange with the central server. Existing solutions to unsupervised federated data clustering are either computationally challenging or effective only in heterogeneous regimes, i.e., when the number of clusters per client (kz) is less than the total number of clusters (k) (specifically, kz ≤ unmapped: inline-formula unmapped: math unmapped: msqrt unmapped: mrow unmapped: mi k). Moreover, existing one-shot approaches assume that the information about kz is available for each client. In this paper, we propose two multi-shot approaches which we call MFC and MFCH, that perform well on both heterogeneous and non-heterogeneous regimes, i.e., are independent of the underlying client data distribution. Both MFC and MFCH stand out as they do not rely on prior knowledge about kz. We theoretically bound the closeness of the local centers obtained by MFC to that of the optimal global centers and prove that under some well-separability assumption, the centers will be close enough. MFCH improvises MFC by only sharing a single cluster center from each client, thus ensuring more privacy. Our theoretical analysis shows that when at least O(k2log k) clients are involved, centers obtained by MFCH will closely approximate optimal global centers. Experiments on synthetic and real-world datasets validate the proposed approaches’ efficacy showcasing lower objective costs in non-heterogeneous regimes while having comparable performance in heterogeneous regimes. In addition, as a byproduct MFC exhibits higher device-level fairness in terms of the individual objective cost compared to existing state-of-the-art algorithms. The code is publicly available at https://github.com/shivi98g/MFC. | en_US |
dc.language.iso | en_US | en_US |
dc.title | MFC: A Multishot Approach to Federated Data Clustering | en_US |
dc.type | Article | en_US |
Appears in Collections: | Year-2023 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Full Text.pdf | 769.54 kB | Adobe PDF | View/Open Request a copy |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.