ADGC: Awesome Deep Graph Clustering

ADGC is a collection of state-of-the-art (SOTA) and novel deep graph clustering methods, including papers, code, and datasets. Suggestions for other interesting papers and code are welcome. If you have any problems, please contact [email protected].


What’s Deep Graph Clustering?

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years.

Important Survey Papers

Papers

  1. K-Means: “Algorithm AS 136: A k-means clustering algorithm” [pdf|code]
  2. DCN (ICML17): “Towards k-means-friendly spaces: Simultaneous deep learning and clustering” [pdf|code]
  3. DEC (ICML16): “Unsupervised Deep Embedding for Clustering Analysis” [pdf|code]
  4. IDEC (IJCAI17): “Improved Deep Embedded Clustering with Local Structure Preservation” [pdf|code]
  5. GAE/VGAE: “Variational Graph Auto-Encoders” [pdf|code]
  6. DAEGC (IJCAI19): “Attributed Graph Clustering: A Deep Attentional Embedding Approach” [pdf|code]
  7. ARGA/ARVGA (TCYB19): “Learning Graph Embedding with Adversarial Training Methods” [pdf|code]
  8. SDCN/SDCN_Q (WWW20): “Structural Deep Clustering Network” [pdf|code]
  9. DFCN (AAAI21): “Deep Fusion Clustering Network” [pdf|code]
  10. MVGRL (ICML20): “Contrastive Multi-View Representation Learning on Graphs” [pdf|code]

Benchmark Datasets

We divide the datasets into two categories: graph datasets and non-graph datasets. Graph datasets are real-world graphs, such as citation networks and social networks. Non-graph datasets have no graph structure of their own. If necessary, an “adjacency matrix” can be constructed for them with the K-Nearest Neighbors (KNN) algorithm.
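The KNN construction mentioned above can be sketched as follows. This is a minimal illustration, not the code used in this repository: the function name knn_adjacency, the Euclidean distance, and the symmetrization step are all assumptions made for the example.

```python
import math


def knn_adjacency(features, k):
    """Return a symmetric 0/1 adjacency matrix (list of lists).

    Hypothetical helper: links each sample to its k nearest
    neighbors under Euclidean distance.
    """
    n = len(features)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of samples")

    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Distances from sample i to every other sample (self excluded).
        dists = [
            (math.dist(features[i], features[j]), j)
            for j in range(n)
            if j != i
        ]
        dists.sort()
        for _, j in dists[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # symmetrize so the graph is undirected
    return adj
```

Because of the symmetrization, a node can end up with more than k neighbors. This pure-Python version takes quadratic time in the number of samples, so a vectorized implementation would be preferable for datasets of the size listed below.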

Quick Start

  • Step1: Download all datasets from [Google Drive|Baidu Netdisk]. Alternatively, download individual datasets from the URLs in the tables below (Google Drive).

  • Step2: Unzip them to ./dataset/

  • Step3: Run ./dataset/utils.py

    Two functions, load_graph_data and load_data, are provided in ./dataset/utils.py to load graph datasets and non-graph datasets, respectively.
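Step2 above can also be scripted. The sketch below assumes the downloaded .zip archives sit together in one folder. The helper name unzip_datasets and the ./downloads folder are hypothetical, and the layout inside each archive is not specified here, so the files are extracted as-is.

```python
import zipfile
from pathlib import Path


def unzip_datasets(archive_dir, target_dir="./dataset"):
    """Extract every .zip archive found in archive_dir into target_dir.

    Hypothetical helper; returns the names of the archives extracted.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)  # keeps the archive's internal layout
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # "./downloads" is an assumed location for the downloaded archives.
    print(unzip_datasets("./downloads"))
```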

Dataset Details

  1. Graph Datasets

    Dataset   Samples  Dimension  Edges   Classes  URL
    DBLP      4057     334        3528    4        dblp.zip
    CITE      3327     3703       4552    6        cite.zip
    ACM       3025     1870       13128   3        acm.zip
    AMAP      7650     745        119081  8        amap.zip
    AMAC      13752    767        245861  10       amac.zip
    PUBMED    19717    500        44325   3        pubmed.zip
    CORAFULL  19793    8710       63421   70       corafull.zip
    CORA      2708     1433       6632    7        cora.zip
    CITESEER  3327     3703       6215    6        citeseer.zip

  2. Non-graph Datasets

    Dataset  Samples  Dimension  Type    Classes  URL
    USPS     9298     256        Image   10       usps.zip
    HHAR     10299    561        Record  6        hhar.zip
    REUT     10000    2000       Text    4        reut.zip

If you find this repository useful for your research or work, a star would be really appreciated. ❤️
