Auditing Prompt Caching in Language Model APIs

This repository contains code and data for the paper "Auditing Prompt Caching in Language Model APIs" by Chenchen Gu, Xiang Lisa Li, Rohith Kuditipudi, Percy Liang, and Tatsunori Hashimoto.

This code is intended solely for research purposes, and should not be used for any malicious or harmful purposes.

Installation

Create a conda environment:

conda create -n auditing-prompt-caching python=3.12
conda activate auditing-prompt-caching

Install packages:

pip install -r requirements.txt

(Optional) Install pre-commit hooks (ruff linting and formatting):

pre-commit install

Running Audits

Usage

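Example usage to run audits (pick one value from each {...} set of choices and replace <provider> and <model> with actual names):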

python audit.py \
    --sharing_level {per_user,per_org,global} \
    --provider <provider> \
    --model <model> \
    --endpoint {chat,embeddings} \
    --n_samples 250 \
    --n_prompt_tokens 5000 \
    --prefix_fraction 0.95 \
    --n_victim_requests 1 \
    --sleep_time 1.0 \
    --base_output_dir data

To view a help message showing all arguments:

python audit.py -h

Provider names are listed in clients/client_factory.py.

API Keys

API keys are loaded as environment variables from the .env and .env.<provider> (e.g., .env.openai) files. Alternatively, env files can be specified with the --env_files argument, in which case these two default files are not loaded automatically.

The base API_KEY_NAME for each provider is defined in the corresponding <provider>_client.py file in the clients directory. These names follow the format <PROVIDER>_API_KEY, e.g., OPENAI_API_KEY. The environment variable name for the victim's API key is the base name with the suffix _VICTIM. The environment variable names for the attacker's API key at the per-organization and global sharing levels are the base name with the suffixes _ATTACKER_PER_ORG and _ATTACKER_GLOBAL, respectively. (At the per-user level, the attacker and victim are the same user, so they use the same API key.)

For example, to set the API keys for OpenAI, .env or .env.openai would look like this:

OPENAI_API_KEY_VICTIM="sk-..."
OPENAI_API_KEY_ATTACKER_PER_ORG="sk-..."
OPENAI_API_KEY_ATTACKER_GLOBAL="sk-..."
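
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not the repository's code: the helper names api_key_env_names and missing_keys are hypothetical, the base name is assumed to be the upper-cased provider name plus _API_KEY (the authoritative value is the API_KEY_NAME in each client file), and the per-user attacker is assumed to reuse the _VICTIM variable.

```python
import os

# Suffix for the attacker's key at each sharing level.
# Assumption: at the per-user level the attacker reuses the victim's variable.
ATTACKER_SUFFIXES = {
    "per_user": "_VICTIM",
    "per_org": "_ATTACKER_PER_ORG",
    "global": "_ATTACKER_GLOBAL",
}


def api_key_env_names(provider: str, sharing_level: str) -> dict[str, str]:
    """Return the environment variable names for the victim and attacker keys."""
    if sharing_level not in ATTACKER_SUFFIXES:
        raise ValueError(f"unknown sharing level: {sharing_level!r}")
    # Assumption: base name is <PROVIDER>_API_KEY with the provider upper-cased.
    base = f"{provider.upper()}_API_KEY"
    return {
        "victim": base + "_VICTIM",
        "attacker": base + ATTACKER_SUFFIXES[sharing_level],
    }


def missing_keys(provider: str, sharing_level: str) -> list[str]:
    """List the expected variable names that are not set in the environment."""
    names = set(api_key_env_names(provider, sharing_level).values())
    return sorted(name for name in names if not os.environ.get(name))
```

Calling missing_keys("openai", "per_org") before an audit would, under these assumptions, report which of OPENAI_API_KEY_VICTIM and OPENAI_API_KEY_ATTACKER_PER_ORG still need to be set.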

Data

audit-data.zip contains the most important data fields from the paper's audits: the response times, prompt and completion lengths, and timestamps. When unzipped, the directory is organized as:

audit-data/<provider>/<sharing_level>/<filename>.json

The filenames are formatted as:

<provider>_<model>_<sharing_level>_n<n_samples>_p<n_prompt_tokens>_pf<prefix_fraction>_v<n_victim_requests>_<timestamp>.json

For example:

audit-data/openai/per_org/openai_gpt-4o-mini_per_org_n250_p5000_pf0-95_v1_2024-09-11T233151Z.json
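
For scripts that select audit files by their parameters, the filename pattern can be parsed with a regular expression. The sketch below is not part of the repository; parse_audit_filename is a hypothetical helper. It assumes that provider names contain no underscores, that the decimal point in prefix_fraction is written as a hyphen (as in pf0-95 in the example above), and that timestamps are in UTC.

```python
import re
from datetime import datetime, timezone
from pathlib import PurePath

# Pattern inferred from the filename format and the single example above.
FILENAME_RE = re.compile(
    r"^(?P<provider>[^_]+)_(?P<model>.+)"
    r"_(?P<sharing_level>per_user|per_org|global)"
    r"_n(?P<n_samples>\d+)_p(?P<n_prompt_tokens>\d+)"
    r"_pf(?P<prefix_fraction>[\d-]+)_v(?P<n_victim_requests>\d+)"
    r"_(?P<timestamp>[^_]+)\.json$"
)


def parse_audit_filename(path: str) -> dict:
    """Split an audit filename (or a path ending in one) into its fields."""
    match = FILENAME_RE.match(PurePath(path).name)
    if match is None:
        raise ValueError(f"unrecognized audit filename: {path!r}")
    fields = match.groupdict()
    return {
        "provider": fields["provider"],
        "model": fields["model"],
        "sharing_level": fields["sharing_level"],
        "n_samples": int(fields["n_samples"]),
        "n_prompt_tokens": int(fields["n_prompt_tokens"]),
        # Assumption: "0-95" encodes 0.95.
        "prefix_fraction": float(fields["prefix_fraction"].replace("-", ".")),
        "n_victim_requests": int(fields["n_victim_requests"]),
        # Assumption: the trailing "Z" means UTC.
        "timestamp": datetime.strptime(
            fields["timestamp"], "%Y-%m-%dT%H%M%SZ"
        ).replace(tzinfo=timezone.utc),
    }
```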

Each file is structured as follows:

{
  "cache_hit": [
    {
      "client_time": 0.561403,  // client-side response time, in seconds
      "server_time": 0.156,  // server-side processing time, in seconds (null if not available)
      "n_prompt_tokens": 5007,  // number of prompt tokens
      "sent_timestamp": 1726097516.6008322,  // Unix timestamp of when the API request was sent
      "n_completion_tokens": 1  // number of completion tokens (only for chat models)
    },
    ...
  ],
  "cache_miss": [
    {
      // same fields as above
    },
    ...
  ],
  "victim": [
    [
      // each inner list contains the victim requests for one prompt
      {
        // same fields as above
      },
      ...
    ],
    [
      ...
    ],
    ...
  ],
  "stats": {
    // statistics, e.g., p-values, medians, means
  },
  "config": {
    // configuration parameters, e.g., model, prefix_fraction
  }
}
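
As a quick way to inspect one of these files, the sketch below loads it and compares median client-side response times between the cache_hit and cache_miss groups. It is not the paper's analysis: the function names are hypothetical, only fields shown in the structure above are used, and the "stats" field in each file remains the source for the paper's reported statistics.

```python
import json
from statistics import median


def load_audit(path: str) -> dict:
    """Read one audit JSON file from the unzipped audit-data directory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def client_times(records: list[dict]) -> list[float]:
    """Extract the client-side response time (seconds) from each request."""
    return [record["client_time"] for record in records]


def summarize(audit: dict) -> dict:
    """Return request counts and median client times for hits and misses."""
    hit = client_times(audit["cache_hit"])
    miss = client_times(audit["cache_miss"])
    # "victim" is a list of lists: one inner list of requests per prompt.
    victim = [record for group in audit["victim"] for record in group]
    return {
        "n_cache_hit": len(hit),
        "n_cache_miss": len(miss),
        "n_victim_requests": len(victim),
        "median_hit_s": median(hit),
        "median_miss_s": median(miss),
        # A positive value means cache-miss requests were slower.
        "median_diff_s": median(miss) - median(hit),
    }
```

For example, summarize(load_audit(path)) on one unzipped file gives a rough sense of whether the cache-hit requests were faster; it does not replace a statistical test.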

This Google Drive folder contains the full data for each API request, including the full prompts, completions/embeddings, and HTTP requests and responses. See clients/timing_data.py for information about all data fields. The folder also contains data from the paper's ablations.

Citation

Please cite this work using this BibTeX entry:

@article{gu2025auditing,
  title={Auditing Prompt Caching in Language Model APIs},
  author={Gu, Chenchen and Li, Xiang Lisa and Kuditipudi, Rohith and Liang, Percy and Hashimoto, Tatsunori},
  journal={arXiv preprint arXiv:2502.07776},
  year={2025},
  url={https://arxiv.org/abs/2502.07776},
}
