clmb/WikiDAT

 
 


WikiDAT

Wikipedia Data Analysis Toolkit

Authors: Felipe Ortega, Aaron Halfaker. License: GPLv3 (http://www.gnu.org/licenses/gpl.txt).

The aim of WikiDAT is to create an extensible toolkit for Wikipedia Data Analysis, based on MySQL, Python and R.

Each module implements a different type of analysis and stores its output in the subdirectories results, figs or traces, which are created inside the module's directory. Module source code includes Python and R code, with inline comments, for both the data preparation/cleaning and the data analysis steps. An important goal is to illustrate interesting analyses of Wikipedia data through case examples, following a didactic approach.
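The output layout described above can be sketched in Python. This is only an illustration, not code taken from WikiDAT: the function name `prepare_output_dirs` and the constant `OUTPUT_SUBDIRS` are hypothetical, and the sketch assumes a module may simply create all three subdirectories up front.

```python
from pathlib import Path

# Subdirectory names as listed in the README text.
OUTPUT_SUBDIRS = ("results", "figs", "traces")


def prepare_output_dirs(module_dir):
    """Create the output subdirectories inside module_dir.

    Hypothetical helper (not part of WikiDAT). Returns a dict that maps
    each subdirectory name to its Path.
    """
    base = Path(module_dir)
    paths = {}
    for name in OUTPUT_SUBDIRS:
        path = base / name
        # exist_ok=True keeps repeated runs from failing on existing dirs.
        path.mkdir(parents=True, exist_ok=True)
        paths[name] = path
    return paths
```

A module script could call `prepare_output_dirs(Path(__file__).parent)` once at startup and then write data files under `paths["results"]` and plots under `paths["figs"]`.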

The long-term goal is to add case examples progressively, so as to cover many of the usual quantitative analyses that can be undertaken with Wikipedia data. In the future, this may also include distributed computing tools to support the analysis of very large data sets in high-resolution studies.
