Multi-view clustering has been widely studied in machine learning, which uses complementary information to improve clustering performance. However, challenges remain when handling large-scale multi-view data due to the traditional approaches’ high time complexity. Besides, the existing approaches suffer from parameter selection. Due to the lack of labeled data, parameter selection in practical clustering applications is difficult, especially in big data. In this paper, we propose a novel approach for large-scale multi-view clustering to overcome the above challenges. Our approach focuses on learning the low-dimensional binary embedding of multi-view data, preserving the samples’ local structure during binary embedding, and optimizing the embedding and clustering in a unified framework. Furthermore, we proposed to learn the parameters using a combination of data-driven and heuristic approaches. Experiments on five large-scale multi- view datasets show that the proposed method is superior to the state-of-the-art in terms of clustering quality and running time.

2021 THE WEB CONFERENCE NEWSLETTER
The Web Conference is announcing latest news and developments biweekly or on a monthly basis. We respect The General Data Protection Regulation 2016/679.