PromptKart

A prompt engineering and evaluation suite with tracks, heats, and leaderboards for systematic LLM testing.

Note: This project is not yet available on GitHub.

Overview

PromptKart is a structured environment for prompt engineering that replaces ad hoc testing with repeatable evaluations. It organizes evaluations into tracks and heats, runs the same prompts across multiple providers, and tracks performance over time.

Key Features

Tracks & Heats: Organize experiments into themed tracks, each run as multiple evaluation rounds (heats).
Multi-Provider Testing: Run the same prompts against OpenAI, Anthropic, Google, and local models simultaneously.
Leaderboard System: Track which prompts and models perform best across different tasks.
Chips System: Modular prompt components that can be mixed and matched.
Karts: Configurable prompt templates with variable substitution.
Docker Support: Full containerized deployment with docker-compose.
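
The chips and karts described above can be pictured with a small sketch. This is not PromptKart's actual API: the CHIPS dictionary, build_kart, and render are hypothetical names, and the standard library's string.Template is assumed as the substitution mechanism purely for illustration.

```python
from string import Template

# Hypothetical chips: small reusable prompt fragments (names are illustrative,
# not taken from PromptKart itself).
CHIPS = {
    "role_expert": "You are an expert $domain assistant.",
    "be_concise": "Answer in at most $max_sentences sentences.",
}


def build_kart(chip_names, task):
    """Join the selected chips and a task line into one template (a 'kart')."""
    parts = [CHIPS[name] for name in chip_names]
    parts.append(task)
    return Template("\n".join(parts))


def render(kart, **variables):
    """Fill in the kart's variables; raises KeyError if one is missing."""
    return kart.substitute(**variables)


kart = build_kart(["role_expert", "be_concise"], "Question: $question")
prompt = render(kart, domain="networking", max_sentences=2, question="What is NAT?")
print(prompt)
```

Keeping chips as plain strings means the same fragment can be reused in several karts, and a missing variable fails loudly at render time instead of reaching a provider half-filled.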

Technical Architecture

PromptKart uses a FastAPI backend for prompt execution and evaluation, with a React/Vite frontend for the interactive dashboard. SQLite stores evaluation history and leaderboard data.

Core components:

Evaluation Engine: Executes prompts across providers with consistent metrics.
Track Manager: Organizes experiments and manages heat progression.
Metrics Collector: Gathers response quality, latency, and cost data.
Leaderboard Service: Aggregates results and ranks performance.
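
As a rough illustration of how collected metrics might feed a leaderboard, the sketch below aggregates per-heat records and ranks them. The HeatResult fields, the 0 to 1 quality scale, and the ranking rule (higher mean quality first, lower mean latency as tie-breaker) are all assumptions made for this example, not PromptKart's documented behavior.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class HeatResult:
    """One prompt run in one heat (hypothetical record shape)."""
    track: str
    prompt_id: str
    provider: str
    quality: float      # assumed score in the range 0..1
    latency_ms: float
    cost_usd: float


def leaderboard(results, track):
    """Aggregate results for one track and rank (prompt, provider) pairs."""
    groups = defaultdict(list)
    for r in results:
        if r.track == track:
            groups[(r.prompt_id, r.provider)].append(r)

    rows = []
    for (prompt_id, provider), runs in groups.items():
        rows.append({
            "prompt_id": prompt_id,
            "provider": provider,
            "heats": len(runs),
            "quality": mean(r.quality for r in runs),
            "latency_ms": mean(r.latency_ms for r in runs),
            "cost_usd": sum(r.cost_usd for r in runs),
        })

    # Assumed ranking rule: best quality first, then fastest.
    rows.sort(key=lambda row: (-row["quality"], row["latency_ms"]))
    return rows


results = [
    HeatResult("summarize", "p1", "provider_a", 0.5, 100.0, 0.25),
    HeatResult("summarize", "p1", "provider_a", 1.0, 300.0, 0.25),
    HeatResult("summarize", "p1", "provider_b", 0.75, 150.0, 0.5),
    HeatResult("translate", "p2", "provider_a", 1.0, 50.0, 0.125),
]
board = leaderboard(results, "summarize")
for row in board:
    print(row)
```

In this sample both entries average the same quality, so the tie-breaker on latency decides the order.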

Technology Stack

Backend: Python, FastAPI, SQLAlchemy
Frontend: React, TypeScript, Vite
Database: SQLite for persistence
Containerization: Docker and docker-compose
LLM Integration: Multi-provider support via unified interface
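
One way to picture a unified provider interface is a small protocol that every backend implements, so the same prompt can be sent to all of them at once. The Provider protocol, StubProvider, and run_heat below are hypothetical names for this sketch; the stubs stand in for real vendor clients so the example runs offline, and a thread pool is assumed as the concurrency mechanism.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Protocol


class Provider(Protocol):
    """Assumed minimal interface: a name and a single completion call."""
    name: str

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Offline stand-in for a real vendor client (illustration only)."""

    def __init__(self, name: str, prefix: str) -> None:
        self.name = name
        self.prefix = prefix

    def complete(self, prompt: str) -> str:
        return f"{self.prefix}{prompt}"


def _timed_call(provider: Provider, prompt: str) -> dict:
    """Call one provider and record wall-clock latency in milliseconds."""
    start = time.perf_counter()
    response = provider.complete(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    return {"provider": provider.name, "response": response, "latency_ms": latency_ms}


def run_heat(prompt: str, providers: list) -> list:
    """Send the same prompt to every provider concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        return list(pool.map(lambda p: _timed_call(p, prompt), providers))


providers = [StubProvider("stub_a", "A: "), StubProvider("stub_b", "B: ")]
results = run_heat("Say hi", providers)
for row in results:
    print(row["provider"], row["response"])
```

Because the evaluation code depends only on the protocol, adding a provider would mean writing one adapter class rather than touching the heat runner.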

Current Status

PromptKart is in active development. The core evaluation and leaderboard features are complete, and the chips/karts system enables modular prompt construction. Current work focuses on expanding provider support and improving the analytics dashboard.

