Skip to content

mkadiri3/link-prediction

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Link Prediction Experiments

This repository contains a series of machine learning experiments for link prediction within social networks.

We first implement and apply a variety of link prediction methods to each of the ego networks contained within the SNAP Facebook dataset and to various random networks generated using networkx, and then calculate and compare the ROC AUC, Average Precision, and runtime of each method.

Link Prediction Methods Tested:

Requirements

Pre-Use Installation

python setup.py install

Included Files

Network Data

  • facebook/: Original Facebook ego networks dataset, with added .allfeats files (with both ego and alter features)
  • fb-processed/: Pickle dumps of (adjacency_matrix, feature_matrix) tuples for each ego network, and for combined network
  • visualizations/: Visualizations of each network generated by networkx and matplotlib
  • network-statistics/: .txt and .pkl files of pre-calculated network characteristics (with info on connectivity, network size, etc.) for each network
  • train-test-splits/: Pickle dumps of pre-processed train-test splits for Facebook ego networks, with varying degrees of visibility (i.e. how many edges are hidden). Includes: adj_train, train_edges, train_edges_false, val_edges, val_edges_false, test_edges, test_edges_false
  • process-ego-networks.py: Script used to process raw Facebook data and generate pickle dumps
  • process-combined-network.py: Script used to combine Facebook ego networks and generate complete network pickle dump
  • fb-train-test-splits.py: Script used to generate and store train-test splits for each Facebook ego network

Annotated Link Prediction IPython Notebooks

  • link-prediction-baselines.ipynb: Adamic-Adar, Jaccard Coefficient, Preferential Attachment
  • spectral-clustering.ipynb: Using spectral embeddings for link prediction
  • node2vec.ipynb: Skip-gram based representation learning for node/edge embeddings
  • graph-vae.ipynb: (Variational) Graph Autoencoder, learns node embeddings to recreate adjacency matrix

Link Prediction Helper Scripts

  • link_prediction_scores.py: Utility functions for running various link prediction tests

Exploratory Analysis

  • network-visualizations.ipynb: Generate .pdf visualizations for each network, in addition to calculating and storing a variety of network metrics (e.g. transtivity, avg. clustering coefficient, etc.)

Full Link Prediction Experiments

  • nx-graph-experiments.ipynb: Run all link prediction tests on various types of random networks (Erdos-Renyi, Barabasi-Albert, etc.)
  • fb-graph-experiments.ipynb: Run all link prediction tests on each Facebook ego network
  • run-all-experiments.py: Run all link prediction experiments (on both Facebook networks and random networkx networks), save results as pickle dumps in results

Results

  • results/: Pickle dumps of experiment results, with results (ROC AUC, ROC Curve, Avg. Precision, Runtime) stored for each link prediction method in Python dictionary form
  • investigate-results.ipynb: Generate and save bar plots/ROC curve plots for each method and graph type, save network characteristics to .txt files
  • result-plots/: Bar plots for the results of each experiment (ROC AUC, AP, Minimum Runtime), in .pdf form

About

Representation learning for link prediction within social networks

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 83.8%
  • Python 16.2%