Transcriptr is a modern web application that converts audio files to text using artificial intelligence. It provides a clean, intuitive interface for uploading audio files and receiving high-quality transcriptions powered by Replicate's Incredibly Fast Whisper model.
Visit the live demo at Transcriptr Demo.
- Audio Transcription: Convert audio to text with high accuracy
- Multiple Format Support: Download transcriptions in TXT, MD, PDF, and DOCX formats
- Language Selection: Choose from multiple languages for better accuracy
- Speaker Diarization: Optionally identify different speakers in the transcription
- Batch Processing: Handle large files efficiently with optimized processing
- Export Options: Download individual formats or all formats as a ZIP
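As an illustration of the TXT and MD export formats, here is a minimal sketch of how transcript segments could be rendered to text. The `Segment` shape and the helper names (`formatTime`, `toTxt`, `toMarkdown`) are assumptions made for this example, not Transcriptr's actual types or output layout.

```typescript
// Hypothetical shape of one transcribed segment; the real app's types may differ.
interface Segment {
  speaker?: string; // present only when speaker diarization is enabled
  start: number; // start time in seconds
  text: string;
}

// Format a time offset in seconds as mm:ss.
function formatTime(seconds: number): string {
  const total = Math.floor(seconds);
  const m = Math.floor(total / 60);
  const s = total % 60;
  return `${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`;
}

// Plain-text export: one line per segment, with an optional speaker label.
function toTxt(segments: Segment[]): string {
  return segments
    .map((seg) => {
      const who = seg.speaker ? `${seg.speaker}: ` : "";
      return `[${formatTime(seg.start)}] ${who}${seg.text.trim()}`;
    })
    .join("\n");
}

// Markdown export: a title heading followed by one list item per segment.
function toMarkdown(title: string, segments: Segment[]): string {
  const lines = segments.map((seg) => {
    const who = seg.speaker ? `**${seg.speaker}:** ` : "";
    return `- \`${formatTime(seg.start)}\` ${who}${seg.text.trim()}`;
  });
  return [`# ${title}`, "", ...lines].join("\n");
}
```

Keeping each format behind its own pure function makes it straightforward to generate every format from the same segments before bundling them into a ZIP.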
- Frontend: React with TypeScript, powered by Next.js for server-side rendering and static site generation
- UI: Tailwind CSS with shadcn/ui components for a modern interface
- Backend: Next.js API Routes for handling API requests
- AI Integration: Replicate API for accessing the Incredibly Fast Whisper model
- Document Handling:
  - Printerz for high-quality PDF template rendering
  - Libraries for generating DOCX and ZIP files
- Storage: Firebase Storage for saving generated documents
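To make the AI integration more concrete, the sketch below builds (but does not send) a request for Replicate's HTTP "create prediction" endpoint. The function name, the `TranscribeOptions` shape, and the model input field names (`audio`, `task`, `language`, `diarise_audio`) are assumptions for illustration; check the model's schema on Replicate before relying on them. The model version ID is passed in rather than hard-coded.

```typescript
// Options a user might pick in the UI (names are illustrative).
interface TranscribeOptions {
  audioUrl: string;
  language?: string;
  diarize?: boolean;
}

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a request description for Replicate's "create prediction" endpoint.
// `modelVersion` is the version ID listed on the model's Replicate page.
function buildPredictionRequest(
  token: string,
  modelVersion: string,
  opts: TranscribeOptions
): HttpRequestSpec {
  // Assumed input field names; verify them against the model's input schema.
  const input: Record<string, unknown> = {
    audio: opts.audioUrl,
    task: "transcribe",
  };
  if (opts.language) input.language = opts.language;
  if (opts.diarize) input.diarise_audio = true;

  return {
    url: "https://api.replicate.com/v1/predictions",
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ version: modelVersion, input }),
  };
}
```

The returned object can be passed to `fetch` as `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`, ideally from server-side code such as an API route.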
- Node.js (v18 or later)
- bun
- Replicate API token (for AI transcription)
- Clone the repository:
  git clone https://github.com/aramb-dev/transcriptr.git
  cd transcriptr
- Install dependencies:
  bun install
- Create a .env.local file in the root directory with your Replicate API token:
  NEXT_PUBLIC_REPLICATE_API_TOKEN=your_replicate_api_token_here
- Start the development server:
  bun run dev
- Open your browser to http://localhost:3000 to see the application.
Transcriptr requires several environment variables to function properly. Create a .env.local file in the project root with the following variables:
Variable | Description |
---|---|
NEXT_PUBLIC_REPLICATE_API_TOKEN | Your Replicate API token for accessing the Incredibly Fast Whisper model |
NEXT_PUBLIC_FIREBASE_API_KEY | Firebase API key for storage services |
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN | Firebase auth domain |
NEXT_PUBLIC_FIREBASE_PROJECT_ID | Firebase project ID |
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET | Firebase storage bucket for storing transcriptions and PDFs |
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID | Firebase messaging sender ID |
NEXT_PUBLIC_FIREBASE_APP_ID | Firebase application ID |
NEXT_PUBLIC_PRINTERZ_API_KEY | API key for Printerz PDF generation services |
NEXT_PUBLIC_LARGE_FILE_THRESHOLD | Threshold in MB for large file warnings |
Variable | Description | Default |
---|---|---|
NEXT_PUBLIC_CLOUDCONVERT_API_KEY | CloudConvert API key for automatic audio format conversion (M4A, AAC, WMA → MP3) | None |
PORT | Port for the server to listen on | 3000 |
NODE_ENV | Environment mode (development or production) | development |
NEXT_PUBLIC_REPLICATE_API_TOKEN=your_replicate_token_here
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your-project-id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your-project.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=123456789012
NEXT_PUBLIC_FIREBASE_APP_ID=1:123456789012:web:abcdef1234567890
NEXT_PUBLIC_PRINTERZ_API_KEY=your_printerz_api_key
NEXT_PUBLIC_LARGE_FILE_THRESHOLD=1
NEXT_PUBLIC_CLOUDCONVERT_API_KEY=your_cloudconvert_api_key
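A startup check can catch a missing variable before the first upload fails. The sketch below is one possible way to validate the variables listed above. `loadConfig`, `missingVars`, `isLargeFile`, and the `AppConfig` shape are hypothetical names, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```typescript
// Required variables, as listed in the table above.
const REQUIRED_VARS = [
  "NEXT_PUBLIC_REPLICATE_API_TOKEN",
  "NEXT_PUBLIC_FIREBASE_API_KEY",
  "NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN",
  "NEXT_PUBLIC_FIREBASE_PROJECT_ID",
  "NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET",
  "NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID",
  "NEXT_PUBLIC_FIREBASE_APP_ID",
  "NEXT_PUBLIC_PRINTERZ_API_KEY",
  "NEXT_PUBLIC_LARGE_FILE_THRESHOLD",
];

type Env = Record<string, string | undefined>;

// Hypothetical config shape used only in this example.
interface AppConfig {
  largeFileThresholdMb: number;
  cloudConvertEnabled: boolean;
}

// Return the names of required variables that are unset or blank.
function missingVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => (env[name] ?? "").trim() === "");
}

// Validate the environment and derive a small typed config from it.
function loadConfig(env: Env): AppConfig {
  const missing = missingVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const threshold = Number(env.NEXT_PUBLIC_LARGE_FILE_THRESHOLD);
  if (!Number.isFinite(threshold) || threshold <= 0) {
    throw new Error("NEXT_PUBLIC_LARGE_FILE_THRESHOLD must be a positive number of MB");
  }
  return {
    largeFileThresholdMb: threshold,
    // The CloudConvert key is optional; conversion is off when it is absent.
    cloudConvertEnabled: Boolean(env.NEXT_PUBLIC_CLOUDCONVERT_API_KEY),
  };
}

// Assumption: 1 MB is counted as 1024 * 1024 bytes.
function isLargeFile(sizeBytes: number, config: AppConfig): boolean {
  return sizeBytes > config.largeFileThresholdMb * 1024 * 1024;
}
```

In server-side code this could be called as `loadConfig(process.env)`. Passing the environment in as a parameter keeps the function easy to test.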
- Replicate API Token: Sign up at Replicate and create an API token
- Firebase: Set up a project in Firebase Console and get your credentials
- Printerz: Create an account at Printerz and get your API key
- CloudConvert (optional): Register at CloudConvert to enable automatic conversion of M4A, AAC, WMA, and other formats to MP3
To build the application for production:
bun run build
This command creates an optimized production build in the .next directory.
- Build the application as described above
- Set the NODE_ENV environment variable to production
- Start the server:
  bun run start
The server will run on port 3000 by default, but you can override this by setting the PORT environment variable.
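Create a Dockerfile in the root directory:
FROM node:18-alpine
# node:18-alpine does not include bun, so copy the binary from the official Bun image
COPY --from=oven/bun:alpine /usr/local/bin/bun /usr/local/bin/bun
WORKDIR /app
# Copy the manifest and lockfile first so the dependency layer can be cached
COPY package.json bun.lock* ./
RUN bun install
COPY . .
RUN bun run build
ENV NODE_ENV=production
ENV PORT=3000
EXPOSE 3000
CMD ["bun", "run", "start"]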
Build and run the Docker container:
docker build -t transcriptr .
docker run -p 3000:3000 -e NEXT_PUBLIC_REPLICATE_API_TOKEN=your_token_here transcriptr
transcriptr/
├── public/ # Static assets
├── src/ # Source code
│ ├── app/ # Next.js App Router
│ │ ├── layout.tsx # Main layout
│ │ └── page.tsx # Main page
│ ├── components/ # React components
│ │ ├── ui/ # UI components based on shadcn/ui
│ │ └── ...
│ ├── hooks/ # Custom React hooks
│ ├── lib/ # Utility functions
│ └── server/ # Server-side logic
├── next.config.mjs # Next.js configuration
├── tailwind.config.js # Tailwind CSS configuration
├── tsconfig.json # TypeScript configuration
└── package.json # Dependencies and scripts
Next.js API Routes are used for the backend. The API endpoints are located in the src/app/api directory.
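For readers unfamiliar with App Router route handlers, here is a minimal sketch of what an endpoint under src/app/api could look like. The file path, the request fields (`audioUrl`, `language`, `diarize`), and the response shape are hypothetical; the project's real endpoints may be organized differently.

```typescript
// Hypothetical file: src/app/api/transcribe/route.ts

interface TranscribeBody {
  audioUrl: string;
  language?: string;
  diarize: boolean;
}

// Validate untrusted JSON and return either a typed body or an error message.
function parseBody(data: unknown): TranscribeBody | { error: string } {
  if (typeof data !== "object" || data === null) {
    return { error: "Body must be a JSON object" };
  }
  const { audioUrl, language, diarize } = data as Record<string, unknown>;
  if (typeof audioUrl !== "string" || !/^https?:\/\//.test(audioUrl)) {
    return { error: "audioUrl must be an http(s) URL" };
  }
  if (language !== undefined && typeof language !== "string") {
    return { error: "language must be a string" };
  }
  return {
    audioUrl,
    language: typeof language === "string" ? language : undefined,
    diarize: diarize === true,
  };
}

// Small helper for JSON responses with an explicit status code.
function json(payload: unknown, status: number): Response {
  return new Response(JSON.stringify(payload), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Next.js calls the exported POST function for POST requests to this route.
export async function POST(request: Request): Promise<Response> {
  let data: unknown;
  try {
    data = await request.json();
  } catch {
    return json({ error: "Invalid JSON" }, 400);
  }
  const parsed = parseBody(data);
  if ("error" in parsed) return json(parsed, 400);
  // A real handler would start the transcription here; this sketch only echoes the input.
  return json({ status: "accepted", ...parsed }, 202);
}
```

Separating validation (`parseBody`) from the handler keeps the request checks testable without a running server.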
Transcriptr supports a wide range of audio formats with automatic conversion:
- MP3 (.mp3) - Most common format
- WAV (.wav) - Uncompressed audio
- FLAC (.flac) - Lossless compression
- OGG (.ogg) - Open-source format
- M4A (.m4a) - iPhone/macOS recordings
- AAC (.aac) - Advanced Audio Coding
- MP4 (.mp4) - Video files with audio
- WMA (.wma) - Windows Media Audio
- AIFF (.aiff) - Apple format
- CAF (.caf) - Core Audio Format
- Upload any supported format - No manual conversion needed!
- Automatic detection - System identifies if conversion is required
- Seamless processing - Unsupported formats are converted to MP3 automatically
- Transparent progress - View conversion status in real-time
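The detection step described above can be sketched as a lookup on the file extension. Which formats are sent directly and which are converted first is an assumption here (the documentation only names M4A, AAC, and WMA explicitly as conversion targets), and `planUpload` is a hypothetical helper name.

```typescript
// Assumption: these formats are accepted without conversion.
const DIRECT_FORMATS = new Set(["mp3", "wav", "flac", "ogg"]);
// Assumption: these formats are converted to MP3 before transcription.
const CONVERTED_FORMATS = new Set(["m4a", "aac", "mp4", "wma", "aiff", "caf"]);

type UploadPlan = "direct" | "convert" | "unsupported";

// Extract the lowercase extension, or "" when the name has no dot.
function getExtension(filename: string): string {
  const idx = filename.lastIndexOf(".");
  return idx === -1 ? "" : filename.slice(idx + 1).toLowerCase();
}

// Decide how an uploaded file should be handled based on its extension.
function planUpload(filename: string): UploadPlan {
  const ext = getExtension(filename);
  if (DIRECT_FORMATS.has(ext)) return "direct";
  if (CONVERTED_FORMATS.has(ext)) return "convert";
  return "unsupported";
}
```

Extension checks are only a first pass; a production implementation might also inspect the MIME type reported by the browser.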
Note: To enable automatic conversion, set the NEXT_PUBLIC_CLOUDCONVERT_API_KEY environment variable. See the Environment Variables section for details.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Replicate for providing the Incredibly Fast Whisper model
- shadcn/ui for the component library
- Tailwind CSS for styling
- React for the UI framework
- Next.js for the application framework
- Printerz for PDF template rendering and generation
Developed by Abdur-Rahman Bilal (aramb-dev)