Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion MANIFEST.in
Original file line number Diff line number Diff line change
@@ -1 +1,3 @@
include src/web/index.html
include src/web/index.html
include src/promptlab/alembic.ini
recursive-include src/promptlab/migrations *
163 changes: 163 additions & 0 deletions docs/database_management.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
# Database Management in PromptLab

This document explains the centralized database management system implemented in PromptLab to solve the multiple initialization issue.

## Problem Solved

Previously, PromptLab had multiple `init_engine` functions that could initialize the database concurrently, leading to:
- Race conditions during startup
- Redundant database initialization
- Potential inconsistent database state
- Performance issues

## Solution Overview

### Centralized Database Manager

The new system implements a singleton `DatabaseManager` class that ensures:
- **Single Point of Initialization**: Only one place handles database setup
- **Thread Safety**: Uses locking mechanisms to prevent race conditions
- **One-Time Operation**: Database is initialized only once, regardless of how many times it's requested
- **Migration Support**: Integrated Alembic support for schema migrations

### Key Components

1. **DatabaseManager** (`src/promptlab/sqlite/database_manager.py`)
- Singleton pattern ensures single instance
- Thread-safe initialization with double-checked locking
- Automatic Alembic migration support
- Logging for debugging and monitoring

2. **Enhanced Session Management** (`src/promptlab/sqlite/session.py`)
- Thread-safe session initialization
- Utility functions for checking initialization state
- Reset functionality for testing

3. **CLI Commands** (`src/promptlab/_cli.py`)
- `promptlab db init`: Initialize database
- `promptlab db migrate`: Run migrations
- `promptlab db revision`: Create new migration

## Usage

### Starting the Studio

The studio startup remains the same:
```bash
promptlab studio start -d /path/to/database.db -p 8000
```

The database will be automatically initialized on first startup.

### Manual Database Operations

Initialize a database:
```bash
promptlab db init -d /path/to/database.db
```

Run migrations:
```bash
promptlab db migrate -d /path/to/database.db
```

Create a new migration:
```bash
promptlab db revision -d /path/to/database.db -m "Add new table"
```

### Programmatic Usage

```python
from promptlab.sqlite.database_manager import db_manager
from promptlab.tracer.local_tracer import LocalTracer

# The database will be automatically initialized
tracer = LocalTracer({"type": "local", "db_file": "/path/to/db.sqlite"})

# Or initialize manually
db_manager.initialize_database("/path/to/db.sqlite")
```

## Migration System

### Alembic Integration

The system now includes full Alembic support for database schema migrations:

- **Automatic Migration Detection**: On startup, the system checks for pending migrations
- **Safe Migration Execution**: Migrations are applied automatically and safely
- **Version Tracking**: Database schema version is tracked in the `alembic_version` table

### Creating Migrations

1. Make changes to your SQLAlchemy models in `src/promptlab/sqlite/models.py`
2. Generate a migration:
```bash
promptlab db revision -d /path/to/database.db -m "Description of changes"
```
3. Review the generated migration file in `migrations/versions/`
4. Apply migrations:
```bash
promptlab db migrate -d /path/to/database.db
```

## File Structure

```
promptlab/
├── alembic.ini # Alembic configuration
├── migrations/ # Migration files
│ ├── env.py # Alembic environment
│ ├── script.py.mako # Migration template
│ └── versions/ # Version files
├── src/promptlab/sqlite/
│ ├── database_manager.py # Centralized database manager
│ ├── session.py # Enhanced session management
│ └── models.py # SQLAlchemy models
└── tests/unit/
└── test_database_initialization.py # Tests for new system
```

## Thread Safety

The new system implements several thread safety mechanisms:

1. **Double-Checked Locking**: Prevents race conditions during initialization
2. **Global State Protection**: Uses threading locks to protect shared state
3. **Singleton Pattern**: Ensures only one DatabaseManager instance exists

## Testing

Run the database initialization tests:
```bash
python -m pytest tests/unit/test_database_initialization.py
```

The tests verify:
- Single initialization across multiple calls
- Thread safety with concurrent access
- Proper LocalTracer integration
- Database file creation

## Benefits

1. **Reliability**: Eliminates race conditions and initialization conflicts
2. **Performance**: Avoids redundant database operations
3. **Maintainability**: Centralized database logic is easier to maintain
4. **Scalability**: Proper migration system supports schema evolution
5. **Debugging**: Comprehensive logging helps troubleshoot issues

## Backwards Compatibility

The changes are backwards compatible:
- Existing `LocalTracer` usage remains unchanged
- CLI commands work as before
- No breaking changes to public APIs

## Future Enhancements

- Database connection pooling for improved performance
- Support for multiple database backends
- Enhanced migration rollback capabilities
- Database health monitoring and metrics
3 changes: 2 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ dependencies = [
"fastapi",
"uvicorn[standard]>=0.18.0",
"sqlalchemy",
"alembic>=1.8.0",
"passlib",
"python-jose",
"python-multipart",
Expand All @@ -46,7 +47,7 @@ Homepage = "https://github.com/imum-ai/promptlab"
Issues = "https://github.com/imum-ai/promptlab/issues"

[tool.setuptools.package-data]
promptlab = ["web/*.html"]
promptlab = ["web/*.html", "alembic.ini", "migrations/*", "migrations/**/*"]

[tool.setuptools.packages.find]
where = ["src"]
Expand Down
25 changes: 0 additions & 25 deletions run_tests.sh

This file was deleted.

1 change: 0 additions & 1 deletion src/promptlab/_cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,5 @@ def start(db, port):

click.echo(f"Running on port: {port}")


if __name__ == "__main__":
promptlab()
71 changes: 71 additions & 0 deletions src/promptlab/alembic.ini
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
# Alembic configuration for PromptLab database migrations

[alembic]
# Path to migration scripts
script_location = migrations

# Template used to generate migration file names
file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s

# Max length of characters to apply to the "slug" field
truncate_slug_length = 40

# Set to 'true' to run the environment during the 'revision' command
revision_environment = false

# Set to 'true' to allow .pyc and .pyo files without a source .py file to be detected
# as revisions in the versions/ directory
sourceless = false

# Version table name
version_table = alembic_version

# Version path separator (default: os.pathsep)
version_path_separator = :

# Set to 'true' to search source files recursively in the versions/ directory
recursive_version_locations = false

# The output encoding used when revision files are written from script.py.mako
output_encoding = utf-8

# Database URL placeholder - will be set programmatically
sqlalchemy.url =

[post_write_hooks]
# Post-write hooks define scripts or Python functions that are run
# on newly-generated revision scripts

[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARN
handlers = console
qualname =

[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
89 changes: 89 additions & 0 deletions src/promptlab/migrations/env.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
"""Alembic environment configuration for PromptLab migrations."""

from logging.config import fileConfig
from pathlib import Path
import sys

from sqlalchemy import engine_from_config
from sqlalchemy import pool
from alembic import context

# Add the src directory to the path so we can import our models
project_root = Path(__file__).parent.parent.parent.parent
src_path = project_root / "src"
sys.path.insert(0, str(src_path))

# Import your models here (after path modification)
from promptlab.sqlite.models import Base # noqa: E402

# This is the Alembic Config object
config = context.config

# Interpret the config file for Python logging
if config.config_file_name is not None:
fileConfig(config.config_file_name)

# Set the target metadata for 'autogenerate' support
target_metadata = Base.metadata


# Other values from the config
def get_url():
"""Get database URL from environment or config."""
url = config.get_main_option("sqlalchemy.url")
if url:
return url

# Fallback to a default SQLite URL for offline mode
return "sqlite:///promptlab.db"


def run_migrations_offline() -> None:
"""Run migrations in 'offline' mode.

This configures the context with just a URL
and not an Engine, though an Engine is acceptable
here as well. By skipping the Engine creation
we don't even need a DBAPI to be available.

Calls to context.execute() here emit the given string to the
script output.
"""
url = get_url()
context.configure(
url=url,
target_metadata=target_metadata,
literal_binds=True,
dialect_opts={"paramstyle": "named"},
)

with context.begin_transaction():
context.run_migrations()


def run_migrations_online() -> None:
"""Run migrations in 'online' mode.

In this scenario we need to create an Engine
and associate a connection with the context.
"""
configuration = config.get_section(config.config_ini_section)
configuration["sqlalchemy.url"] = get_url()

connectable = engine_from_config(
configuration,
prefix="sqlalchemy.",
poolclass=pool.NullPool,
)

with connectable.connect() as connection:
context.configure(connection=connection, target_metadata=target_metadata)

with context.begin_transaction():
context.run_migrations()


if context.is_offline_mode():
run_migrations_offline()
else:
run_migrations_online()
Loading
Loading