Link Prediction Experiments

This repository contains a series of machine learning experiments for link prediction within social networks.

We implement a variety of link prediction methods, apply them to each ego network in the SNAP Facebook dataset and to several random networks generated with networkx, and then compare the ROC AUC, Average Precision, and runtime of each method.
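
For readers unfamiliar with the two ranking metrics, the following is a minimal pure-Python sketch of how ROC AUC and Average Precision can be computed from a list of true labels (1 = real edge, 0 = non-edge) and predicted scores. The function names are illustrative and not taken from this repository; in practice a library routine such as scikit-learn's roc_auc_score and average_precision_score would normally be used instead.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive appears.

    Simplification: tied scores are ordered arbitrarily by the sort.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Toy example: two true edges and two non-edges.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))  # 0.75
print(average_precision(labels, scores))
```

The pairwise AUC loop is quadratic in the number of examples, which is fine for illustration but too slow for large test sets.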

Link Prediction Methods Tested:

(Variational) Graph Auto-Encoders: An end-to-end trainable convolutional neural network model for unsupervised learning on graphs
Node2Vec/DeepWalk: A skip-gram based approach to learning node embeddings from random walks within a given graph
Spectral Clustering: Using spectral embeddings to create node representations from an adjacency matrix
Baseline Indexes: Adamic-Adar, Jaccard Coefficient, Preferential Attachment
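
To make the three baseline indexes concrete, here is a small sketch that scores a candidate node pair from a plain adjacency dictionary. It is an illustration under assumptions, not the code used in the notebooks: the helper names (build_adjacency, jaccard, adamic_adar, preferential_attachment) are hypothetical. networkx also ships equivalents (jaccard_coefficient, adamic_adar_index, preferential_attachment), which are likely the more convenient choice on real graphs.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency dict {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over shared neighbors; rare neighbors weigh more."""
    return sum(
        1.0 / math.log(len(adj[w]))
        for w in adj[u] & adj[v]
        if len(adj[w]) > 1  # guard against log(1) = 0
    )


def preferential_attachment(adj, u, v):
    """Product of the two node degrees."""
    return len(adj[u]) * len(adj[v])


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = build_adjacency([(0, 1), (0, 2), (1, 2), (2, 3)])
for name, fn in [
    ("jaccard", jaccard),
    ("adamic_adar", adamic_adar),
    ("preferential_attachment", preferential_attachment),
]:
    print(name, fn(adj, 0, 3))
```

Each score is computed only from the visible (training) edges; higher values indicate a pair that the index considers more likely to be linked.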

Requirements

Pre-Use Installation

python setup.py install

Included Files

Network Data

facebook/: Original Facebook ego networks dataset, with added .allfeats files (with both ego and alter features)
fb-processed/: Pickle dumps of (adjacency_matrix, feature_matrix) tuples for each ego network, and for the combined network
visualizations/: Visualizations of each network generated by networkx and matplotlib
network-statistics/: .txt and .pkl files of pre-calculated network characteristics (with info on connectivity, network size, etc.) for each network
train-test-splits/: Pickle dumps of pre-processed train-test splits for Facebook ego networks, with varying degrees of visibility (i.e. how many edges are hidden). Includes: adj_train, train_edges, train_edges_false, val_edges, val_edges_false, test_edges, test_edges_false
process-ego-networks.py: Script used to process raw Facebook data and generate pickle dumps
process-combined-network.py: Script used to combine Facebook ego networks and generate complete network pickle dump
fb-train-test-splits.py: Script used to generate and store train-test splits for each Facebook ego network
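
The idea behind the train-test splits can be sketched as follows: a fraction of the true edges is hidden for validation and testing, and an equal number of node pairs that are not edges ("false" edges) is sampled for each set. This is a simplified illustration, not the logic of fb-train-test-splits.py. The function split_edges and its parameters are hypothetical, the output keys mirror the names listed above, and adj_train is omitted. Unlike a careful implementation, this sketch does not try to keep the training graph connected.

```python
import random
from itertools import combinations


def split_edges(nodes, edges, test_frac=0.2, val_frac=0.1, seed=0):
    """Hide a fraction of edges and sample matching negative (false) edges."""
    rng = random.Random(seed)

    # Normalize undirected edges to sorted tuples and drop duplicates.
    edges = sorted({tuple(sorted(e)) for e in edges})
    rng.shuffle(edges)

    n_test = int(len(edges) * test_frac)
    n_val = int(len(edges) * val_frac)
    test_edges = edges[:n_test]
    val_edges = edges[n_test:n_test + n_val]
    train_edges = edges[n_test + n_val:]

    # Enumerate all non-edges; O(n^2), acceptable only for small graphs.
    edge_set = set(edges)
    non_edges = [p for p in combinations(sorted(nodes), 2) if p not in edge_set]
    if len(non_edges) < len(edges):
        raise ValueError("graph is too dense to sample enough false edges")
    rng.shuffle(non_edges)

    return {
        "train_edges": train_edges,
        "train_edges_false": non_edges[n_test + n_val:len(edges)],
        "val_edges": val_edges,
        "val_edges_false": non_edges[n_test:n_test + n_val],
        "test_edges": test_edges,
        "test_edges_false": non_edges[:n_test],
    }


# Toy graph: a 7-node ring plus three chords (10 edges in total).
ring = [(i, (i + 1) % 7) for i in range(7)]
toy_edges = ring + [(0, 2), (0, 3), (1, 4)]
splits = split_edges(range(7), toy_edges)
print({name: len(pairs) for name, pairs in splits.items()})
```

Lowering test_frac and val_frac corresponds to higher visibility, i.e. fewer hidden edges.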

Annotated Link Prediction IPython Notebooks

link-prediction-baselines.ipynb: Adamic-Adar, Jaccard Coefficient, Preferential Attachment
spectral-clustering.ipynb: Using spectral embeddings for link prediction
node2vec.ipynb: Skip-gram based representation learning for node/edge embeddings
graph-vae.ipynb: (Variational) Graph Auto-Encoder, learns node embeddings that reconstruct the adjacency matrix
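
Once a method has produced node embeddings, they still have to be turned into edge scores. The sketch below shows two common options, stated here as assumptions rather than as what the notebooks do: the sigmoid of the inner product of two node embeddings (the decoder used in the standard graph auto-encoder formulation), and the Hadamard (element-wise) product as an edge embedding, one of the operators proposed for node2vec, which would then be fed to a separate classifier. The embedding values and function names are made up for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def inner_product_score(z_u, z_v):
    """Edge probability as sigmoid(z_u . z_v)."""
    return sigmoid(sum(a * b for a, b in zip(z_u, z_v)))


def hadamard_edge_embedding(z_u, z_v):
    """Element-wise product of two node embeddings, used as edge features."""
    return [a * b for a, b in zip(z_u, z_v)]


# Hypothetical 2-dimensional embeddings for three nodes.
embeddings = {
    0: [1.0, 0.0],
    1: [1.0, 0.5],
    2: [-1.0, 0.0],
}

print(inner_product_score(embeddings[0], embeddings[1]))  # similar nodes, score above 0.5
print(inner_product_score(embeddings[0], embeddings[2]))  # opposed nodes, score below 0.5
print(hadamard_edge_embedding(embeddings[0], embeddings[1]))
```

The resulting scores can be ranked against held-out true and false edges to obtain ROC AUC and Average Precision.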

Link Prediction Helper Scripts

link_prediction_scores.py: Utility functions for running various link prediction tests

Exploratory Analysis

network-visualizations.ipynb: Generate .pdf visualizations for each network, and calculate and store a variety of network metrics (e.g. transitivity, average clustering coefficient)

Full Link Prediction Experiments

nx-graph-experiments.ipynb: Run all link prediction tests on various types of random networks (Erdos-Renyi, Barabasi-Albert, etc.)
fb-graph-experiments.ipynb: Run all link prediction tests on each Facebook ego network
run-all-experiments.py: Run all link prediction experiments (on both Facebook networks and random networkx networks), and save the results as pickle dumps in results/

Results

results/: Pickle dumps of experiment results, with results (ROC AUC, ROC Curve, Avg. Precision, Runtime) stored for each link prediction method in Python dictionary form
investigate-results.ipynb: Generate and save bar plots/ROC curve plots for each method and graph type, save network characteristics to .txt files
result-plots/: Bar plots for the results of each experiment (ROC AUC, AP, Minimum Runtime), in .pdf form
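
As a rough illustration of how per-method results could be timed, stored in a dictionary, and pickled, consider the sketch below. The dictionary keys, the method name, the metric values, and the time_method helper are all hypothetical placeholders; the actual layout of the files in results/ may differ. Taking the minimum over several runs is one common way to reduce timing noise.

```python
import pickle
import time


def time_method(fn, repeats=3):
    """Run fn several times; return its last result and the minimum runtime in seconds."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - start)
    return result, best


# Stand-in for a link prediction method; a real run would return edge scores.
value, runtime = time_method(lambda: sum(range(1000)))

# Hypothetical result layout: one entry per method (metric values are placeholders).
results = {
    "example_method": {
        "roc_auc": 0.75,
        "avg_precision": 0.83,
        "runtime": runtime,
    }
}

# Round-trip through pickle in memory; writing to a file works the same way
# with pickle.dump(results, f) and pickle.load(f).
restored = pickle.loads(pickle.dumps(results))
print(restored["example_method"]["roc_auc"])  # 0.75
```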

mkadiri3/link-prediction
