An API script for Stack Overflow for Teams that creates a JSON export of users, user groups, tags, articles, questions, answers, and comments. It uses a combination of both versions of the API (i.e., 2.3 and 3) in order to create the most comprehensive export possible.
- A Stack Overflow for Teams instance (Basic, Business, or Enterprise)
- Python 3.x
- Operating system: Linux, macOS, or Windows
Download and unpack the contents of this repository
Installing Dependencies
- Open a terminal window (or, for Windows, a command prompt)
- Navigate to the directory where you unpacked the files
- Install the dependencies:
pip3 install -r requirements.txt
API Authentication
For the Business tier, you'll need a personal access token (PAT). For Enterprise, you'll need an API key and an access token. Documentation for creating an Enterprise key and token can be found within your instance at this URL: https://[your_site]/api/docs/authentication
Before proceeding, please note a critical step when creating your API Application in Stack Overflow Enterprise for Access Token generation:
Generating an Access Token
To generate an Access Token for Enterprise, you must first ensure your API Application is correctly configured:
- API Application "Domain" Field Requirement: When creating your API Application (where you obtain your Client ID and Client Secret), the "Domain" field must be populated with the base URL of your Stack Overflow Enterprise instance (e.g., https://your.so-enterprise.url). Although the UI may mark this field as 'Optional,' failure to populate it will prevent Access Token generation and lead to a "redirect_uri is not configured" error during the OAuth flow.
Once your API Application is configured with a valid Domain, follow these steps to generate your Access Token:
- Go to the page where you created your API key. Take note of the "Client ID" associated with your API key.
- Go to the following URL, replacing the base URL, the client_id, and the base URL of the redirect_uri with your own: https://YOUR.SO-ENTERPRISE.URL/oauth/dialog?client_id=111&redirect_uri=https://YOUR.SO-ENTERPRISE.URL/oauth/login_success
- You may be prompted to log in to Stack Overflow Enterprise if you're not already logged in. Either way, you'll be redirected to a page that simply says "Authorizing Application."
- In the URL of that page, you'll find your access token. Example:
https://YOUR.SO-ENTERPRISE.URL/oauth/login_success#access_token=YOUR_TOKEN
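The URL-building and token-extraction steps above can be sketched in Python with only the standard library. This is an illustration, not part of the export script: the helper names `build_authorize_url` and `extract_access_token` are hypothetical, while the `/oauth/dialog` and `/oauth/login_success` paths and the `access_token` fragment parameter are taken from the steps above. Note that `urlencode` percent-encodes the `redirect_uri` value, which is the conventional form for a query parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_authorize_url(base_url: str, client_id: str) -> str:
    """Build the OAuth dialog URL for a given instance (hypothetical helper)."""
    base = base_url.rstrip("/")
    query = urlencode({
        "client_id": client_id,
        # The redirect target lives on the same instance as the dialog.
        "redirect_uri": f"{base}/oauth/login_success",
    })
    return f"{base}/oauth/dialog?{query}"


def extract_access_token(redirected_url: str) -> str:
    """Read access_token from the URL fragment (the part after '#')."""
    fragment = urlsplit(redirected_url).fragment
    values = parse_qs(fragment).get("access_token")
    if not values:
        raise ValueError("no access_token found in the URL fragment")
    return values[0]
```

Open the URL returned by `build_authorize_url` in a browser, then paste the address of the "Authorizing Application" page into `extract_access_token` to recover the token.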
Note on Access Token Requirements:
While API v3 now generally allows querying with just an API key for most GET requests, certain paths and data (e.g., /images and the email attribute on a User object) still specifically require an Access Token for access. If you encounter permissions errors on such paths, ensure you are using an Access Token.
In a terminal window, navigate to the directory where you unpacked the script. Run the script using the following format, replacing the URL, token, and/or key with your own:
- For Basic and Business:
python3 so4t_data_export.py --url "https://stackoverflowteams.com/c/TEAM-NAME" --token "YOUR_TOKEN"
- For Enterprise:
python3 so4t_data_export.py --url "https://SUBDOMAIN.stackenterprise.co" --key "YOUR_KEY" --token "YOUR_TOKEN"
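If you want to launch the export from another Python program, the two invocations above can be assembled as an argument list. This is a minimal sketch: `build_export_command` is a hypothetical helper, and it assumes only the `--url`, `--key`, and `--token` flags shown above.

```python
import sys
from typing import List, Optional


def build_export_command(url: str, token: str, key: Optional[str] = None) -> List[str]:
    """Assemble the argument list for so4t_data_export.py (hypothetical helper)."""
    # sys.executable is the interpreter running this code, used in place of "python3".
    command = [sys.executable, "so4t_data_export.py", "--url", url]
    if key is not None:
        # Enterprise: the API key is passed alongside the access token.
        command += ["--key", key]
    command += ["--token", token]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory where the script was unpacked. Passing arguments as a list avoids shell quoting issues with the URL and token.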
The script can take several minutes to run, particularly as it gathers data via the API. As it runs, it will update the terminal window with the tasks it performs.
When the script completes, it reports that the JSON files have been created in the same directory where the script is located. The files are named articles.json, questions_answers_comments.json, tags.json, user_groups.json, and users.json.
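As a quick sanity check after a run, the exported files can be loaded and counted. The sketch below is not part of the script: `summarize_exports` is a hypothetical helper, and it assumes each file holds a top-level JSON array or object, which this README does not specify.

```python
import json
from pathlib import Path

# File names as listed above.
EXPORT_FILES = [
    "articles.json",
    "questions_answers_comments.json",
    "tags.json",
    "user_groups.json",
    "users.json",
]


def summarize_exports(directory: str = ".") -> dict:
    """Return {filename: item count}; missing files map to None."""
    summary = {}
    for name in EXPORT_FILES:
        path = Path(directory) / name
        if not path.exists():
            summary[name] = None
            continue
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: top-level value is a list or dict; anything else counts as 1.
        summary[name] = len(data) if isinstance(data, (list, dict)) else 1
    return summary
```

Calling `summarize_exports()` from the script's directory returns one entry per expected file, which makes a missing or empty export easy to spot.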
- Images are not exported
- Collections and Communities do not have an API endpoint, so they are not exported
If you encounter problems using the script, please leave feedback in the GitHub Issues. You can also clone and change the script to suit your needs. It is provided as-is, with no warranty or guarantee of any kind.
All data obtained via the API is handled locally on the device from which the script is run. The script does not transmit data to other parties like Stack Overflow. All API calls performed are read-only, so there is no risk of editing or adding content to your Stack Overflow for Teams instance.