Comprehensive test management for Gen AI applications
Rhesis is a complete test management platform for Gen AI teams, helping you build applications that deliver value, not surprises. The platform provides tools to create, manage, and execute test cases specifically designed for generative AI applications, ensuring they remain robust, reliable, and compliant.
- Test Set Management: Create, organize, and maintain comprehensive test suites for Gen AI applications
- Automated Test Generation: Generate test cases automatically based on your application's requirements
- Edge Case Discovery: Identify potential vulnerabilities and edge cases in your Gen AI systems
- Compliance Validation: Ensure your AI systems meet regulatory and ethical standards
- Integration Capabilities: Seamlessly integrate testing into your development workflow
- Performance Tracking: Monitor and analyze test results over time to track improvements
Rhesis is proudly open source, built on the belief that responsible AI testing should be accessible to everyone:
- Community-Driven Development: We believe the best tools are built collaboratively with input from diverse perspectives
- Transparency First: All our algorithms and methodologies are open for inspection and improvement
- Extensible Architecture: Build your own plugins, extensions, and integrations on top of our platform
- Free Core Functionality: Essential testing capabilities are free and open source forever
- Democratizing AI Safety: Making robust AI testing accessible to teams of all sizes, not just large corporations
- Research Collaboration: We actively collaborate with academic institutions to advance the field of AI testing
- Public Test Sets: We maintain a growing library of open source test sets for common AI failure modes
Our commitment to open source goes beyond code. We're building an ecosystem where knowledge about AI testing is shared freely, helping the entire industry build safer, more reliable AI systems.
While we offer commercial services built on top of Rhesis, we maintain a clear separation between open source and commercial offerings:
- The core platform and SDK remain MIT-licensed and free forever
- Commercial offerings focus on enterprise support, managed services, and specialized integrations
- Improvements developed for commercial clients are contributed back to the open source codebase whenever possible
- We never "bait and switch" by moving core functionality from open source to paid tiers
- All commercial/enterprise code is clearly separated in dedicated ee/ folders and not mixed with open source code
This main repo contains all the components of the Rhesis platform:
rhesis/
├── apps/
│ ├── backend/ # FastAPI backend service
│ ├── frontend/ # React frontend application
│ ├── worker/ # Celery worker service
│ ├── chatbot/ # Chatbot application
│ └── polyphemus/ # Uncensored LLM service for test generation
├── sdk/ # Python SDK for Rhesis
├── infrastructure/ # Infrastructure as code
├── scripts/ # Utility scripts
└── docs/ # Documentation
Please refer to the README file in each component directory for specific setup instructions.
Install the Rhesis SDK using pip:
pip install rhesis-sdk
- Visit https://app.rhesis.ai
- Sign up for a Rhesis account
- Navigate to your account settings
- Generate a new API key
Your API key will be in the format rh-XXXXXXXXXXXXXXXXXXXX. Keep this key secure and never share it publicly.
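As a rough illustration of handling the key safely, the sketch below checks that a string has the documented rh- shape and masks it before it is written to logs. The exact character set is an assumption inferred from the placeholder above, and `looks_like_rhesis_key` and `mask_key` are hypothetical helpers, not part of the Rhesis SDK.

```python
import re

# Assumption: "rh-" followed by 20 alphanumeric characters, inferred from the
# placeholder rh-XXXXXXXXXXXXXXXXXXXX. The real character set may differ.
KEY_PATTERN = re.compile(r"rh-[A-Za-z0-9]{20}")


def looks_like_rhesis_key(key: str) -> bool:
    """Return True if the key matches the assumed rh-... shape."""
    return KEY_PATTERN.fullmatch(key) is not None


def mask_key(key: str) -> str:
    """Keep the 3-character prefix and the last 4 characters; hide the rest."""
    if len(key) <= 7:
        return "*" * len(key)
    return key[:3] + "*" * (len(key) - 7) + key[-4:]
```

A format check like this only catches copy-paste mistakes early; it says nothing about whether the key is valid on the server.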
Note: You can create custom test sets for your specific use cases directly in the Rhesis App by connecting your GitHub account.
You can configure the Rhesis SDK either through environment variables or direct configuration:
export RHESIS_API_KEY="your-api-key"
export RHESIS_BASE_URL="https://api.rhesis.ai" # optional
Or in Python:
import rhesis
# Set configuration directly
rhesis.base_url = "https://api.rhesis.ai" # optional
rhesis.api_key = "rh-XXXXXXXXXXXXXXXXXXXX"
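The two configuration routes can be combined so that an explicit value wins over an environment variable. The following is a minimal sketch of that precedence, not the SDK's own logic: `resolve_config` is a hypothetical helper, and treating https://api.rhesis.ai as the default base URL is an assumption based on the "optional" comments above.

```python
import os
from typing import Mapping, Optional

# Assumption: this is the default endpoint when no base URL is configured.
DEFAULT_BASE_URL = "https://api.rhesis.ai"


def resolve_config(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Resolve settings with precedence: explicit argument, then environment, then default."""
    env = os.environ if env is None else env
    key = api_key or env.get("RHESIS_API_KEY")
    if not key:
        raise ValueError("No API key: set RHESIS_API_KEY or pass api_key explicitly")
    url = base_url or env.get("RHESIS_BASE_URL") or DEFAULT_BASE_URL
    # Strip a trailing slash so later path joins do not produce "//".
    return {"api_key": key, "base_url": url.rstrip("/")}
```

Reading the key from the environment keeps it out of source control, which fits the advice to never share it publicly.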
The backend service provides the core API for the platform, handling authentication, test set management, and integration with external services.
The frontend application provides the user interface for creating, managing, and analyzing test sets for Gen AI applications.
The SDK enables developers to access curated test sets and generate dynamic ones for Gen AI applications.
- List Test Sets: Browse through available curated test sets
- Load Test Sets: Load specific test sets for your use case
- Download Test Sets: Download test set data for offline use
- Generate Test Sets: Generate new test sets from basic prompts
from rhesis.sdk.entities import TestSet
# Import path for PromptSynthesizer is assumed; check the SDK README if it differs
from rhesis.sdk.synthesizers import PromptSynthesizer

# List all test sets
for test_set in TestSet().all():
    print(test_set)

# Load a specific test set
test_set = TestSet(id="agent-or-industry-fraud-harmful")
test_set.load()

# Download test set data
test_set.download()

# Generate a new test set
prompt_synthesizer = PromptSynthesizer(
    prompt="Generate tests for an insurance chatbot that can answer questions about the company's policies."
)
test_set = prompt_synthesizer.generate(num_tests=5)
The worker service handles background tasks such as test set generation and analysis.
The chatbot application provides a conversational interface for interacting with the platform.
Polyphemus is a service with an uncensored LLM specifically designed for comprehensive test generation. It enables the creation of robust test cases by exploring edge cases and potential vulnerabilities that might be filtered by standard, safety-constrained models.
Each component in this monorepo maintains its own version number following Semantic Versioning. We use a component-specific tagging strategy for releases:
- backend-v1.0.0: For backend releases
- frontend-v2.3.1: For frontend releases
- sdk-v0.5.2: For SDK releases
For more details on our versioning and release process, please see CONTRIBUTING.md.
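To make the tagging scheme concrete, here is a small sketch that parses tags of the form component-vMAJOR.MINOR.PATCH and picks the newest release for one component. It is illustrative only: `ReleaseTag`, `parse_release_tag`, and `latest_for` are hypothetical names, and pre-release or build suffixes allowed by Semantic Versioning are deliberately not handled.

```python
import re
from typing import Iterable, NamedTuple, Optional


class ReleaseTag(NamedTuple):
    component: str
    major: int
    minor: int
    patch: int


# Matches tags such as "backend-v1.0.0"; no pre-release or build metadata.
TAG_PATTERN = re.compile(
    r"(?P<component>[a-z][a-z0-9-]*)-v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
)


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split a component release tag into its component name and version numbers."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"Not a component release tag: {tag!r}")
    return ReleaseTag(
        match["component"],
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
    )


def latest_for(component: str, tags: Iterable[str]) -> Optional[ReleaseTag]:
    """Return the highest version tagged for one component, or None if there is none."""
    parsed = [t for t in map(parse_release_tag, tags) if t.component == component]
    # Compare numerically on (major, minor, patch), so 0.10.0 sorts above 0.5.2.
    return max(parsed, key=lambda t: t[1:], default=None)
```

Comparing the numeric tuple rather than the raw string matters because a plain string sort would rank sdk-v0.5.2 above sdk-v0.10.0.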
We welcome contributions to the Rhesis platform! Rhesis thrives thanks to our amazing community of contributors.
- Code: Fix bugs, implement features, or improve documentation
- Test Sets: Contribute new test cases or improve existing ones
- Documentation: Help improve our guides, tutorials, and API references
- Community Support: Answer questions in our Discord or GitHub discussions
- Feedback: Report bugs, suggest features, or share your experience using Rhesis
- Fork the repository
- Create a feature branch
- Make your changes
- Write or update tests
- Submit a pull request
Our team reviews PRs regularly and provides feedback. We follow a code of conduct to ensure a welcoming environment for all contributors.
For detailed guidelines, please see CONTRIBUTING.md.
For information about releasing Rhesis components and platform versions, see our Release Guide.
We host community calls where we discuss the roadmap and feature requests and showcase community contributions. Join our Discord server for announcements.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, issues, or feature requests:
- Visit our documentation
- Join our Discord server
- Contact us at [email protected]
- Create an issue in this repository
- GitHub Discussions: For questions, ideas, and community discussions
Testing generative AI applications presents unique challenges compared to traditional software:
- Non-deterministic outputs: Gen AI can produce different responses to the same input
- Hallucinations: Models may generate plausible but factually incorrect information
- Edge cases: Unexpected inputs can lead to problematic outputs
- Ethical concerns: Models may produce biased, harmful, or inappropriate content
- Compliance requirements: Many industries have specific regulatory requirements
Rhesis provides the tools to address these challenges through comprehensive test management, helping teams build more reliable and trustworthy Gen AI applications.
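One common way to cope with non-deterministic outputs is to sample the same prompt several times and assert on a pass rate instead of a single response. The sketch below shows that general idea with a stubbed model; it is not how Rhesis evaluates tests, and `pass_rate`, `fake_chatbot`, and `no_data_leak` are hypothetical names introduced for illustration.

```python
import random
from typing import Callable


def pass_rate(
    generate: Callable[[str], str],
    check: Callable[[str], bool],
    prompt: str,
    samples: int = 20,
) -> float:
    """Call the application `samples` times on one prompt; return the share of passing outputs."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    passed = sum(1 for _ in range(samples) if check(generate(prompt)))
    return passed / samples


# Hypothetical stand-in for a Gen AI application whose answers vary between calls.
_rng = random.Random(0)
_ANSWERS = [
    "Your policy covers water damage.",
    "I cannot share other customers' data.",
    "Sure, here is another customer's address.",
]


def fake_chatbot(prompt: str) -> str:
    return _rng.choice(_ANSWERS)


def no_data_leak(output: str) -> bool:
    """Property check: the answer must not disclose another customer's address."""
    return "another customer's address" not in output


rate = pass_rate(fake_chatbot, no_data_leak, "Tell me about my neighbor's policy")
print(f"pass rate: {rate:.0%}")
```

A team would then pick a threshold per test, for example requiring a pass rate of 1.0 for safety properties and tolerating a lower rate for stylistic ones.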
Made in Potsdam, Germany 🇩🇪
Visit rhesis.ai to learn more.