ADGC: Awesome Deep Graph Clustering

ADGC is a collection of state-of-the-art (SOTA) and novel deep graph clustering methods, including papers, code, and datasets. Contributions of other interesting papers and code are welcome. If you run into any problems, please contact [email protected].

What’s Deep Graph Clustering?

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years.

Important Survey Papers


  1. K-Means: “Algorithm AS 136: A k-means clustering algorithm” [pdf|code]
  2. DCN (ICML17): “Towards k-means-friendly spaces: Simultaneous deep learning and clustering” [pdf|code]
  3. DEC (ICML16): “Unsupervised Deep Embedding for Clustering Analysis” [pdf|code]
  4. IDEC (IJCAI17): “Improved Deep Embedded Clustering with Local Structure Preservation” [pdf|code]
  5. GAE/VGAE: “Variational Graph Auto-Encoders” [pdf|code]
  6. DAEGC (IJCAI19): “Attributed Graph Clustering: A Deep Attentional Embedding Approach” [pdf|code]
  7. ARGA/ARVGA (TCYB19): “Learning Graph Embedding with Adversarial Training Methods” [pdf|code]
  8. SDCN/SDCN_Q (WWW20): “Structural Deep Clustering Network” [pdf|code]
  9. DFCN (AAAI21): “Deep Fusion Clustering Network” [pdf|code]
  10. MVGRL (ICML20): “Contrastive Multi-View Representation Learning on Graphs” [pdf|code]

Benchmark Datasets

We divide the datasets into two categories: graph datasets and non-graph datasets. Graph datasets are real-world graphs, such as citation networks and social networks. Non-graph datasets have no graph structure; if necessary, an “adjacency matrix” can be constructed for them with the K-Nearest Neighbors (KNN) algorithm.
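To make the KNN construction concrete, here is a minimal sketch of how an adjacency matrix could be built from a non-graph feature matrix. It is an illustration under stated assumptions, not the repository's own code: the function name `knn_adjacency`, the Euclidean metric, the default `k`, and the binary, symmetrized output are all choices made for this example.

```python
import numpy as np


def knn_adjacency(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Build a symmetric, binary KNN adjacency matrix.

    Hypothetical helper for illustration; `features` holds one sample per row.
    """
    n = features.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of samples")

    # Pairwise squared Euclidean distances: ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(features ** 2, axis=1)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (features @ features.T)

    # A sample must not be selected as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Column indices of the k nearest neighbors of every sample (unordered).
    neighbors = np.argpartition(dist, k - 1, axis=1)[:, :k]

    adj = np.zeros((n, n), dtype=np.float32)
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1.0

    # Symmetrize: keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)
```

Calling `knn_adjacency(x, k=5)` on an array `x` of shape `(num_samples, num_features)` returns a `(num_samples, num_samples)` matrix with zeros on the diagonal. The dense distance matrix needs memory quadratic in the number of samples, so for larger datasets a sparse or approximate neighbor search may be preferable.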

Quick Start

  • Step1: Download all datasets from [Google Drive|Baidu Netdisk]. Alternatively, download individual datasets from the URLs in the tables below (Google Drive).

  • Step2: Unzip them to ./dataset/

  • Step3: Run the ./dataset/

    Two functions load_graph_data and load_data are provided in ./dataset/ to load graph datasets and non-graph datasets, respectively.

Datasets Details

  1. Graph Datasets

    Dataset    Samples  Dimension  Edges   Classes  URL
    DBLP       4057     334        3528    4
    CITE       3327     3703       4552    6
    ACM        3025     1870       13128   3
    AMAP       7650     745        119081  8
    AMAC       13752    767        245861  10
    PUBMED     19717    500        44325   3
    CORAFULL   19793    8710       63421   70
    CORA       2708     1433       6632    7
    CITESEER   3327     3703       6215    6

  2. Non-graph Datasets

    Dataset    Samples  Dimension  Type    Classes  URL
    USPS       9298     256        Image   10
    HHAR       10299    561        Record  6
    REUT       10000    2000       Text    4

If you find this repository useful for your research or work, we would really appreciate a star. ❤️
