Introduction

flaky-tests is a lightweight test telemetry tool. It hooks into your test runner, records every failure to a database, and gives you a CLI to detect when tests have newly started failing intermittently.

The philosophy is simple: capture everything passively, surface problems on demand, and remove friction from investigation.

Most flaky test tools are either:

  • SaaS platforms — expensive, require sending your data to a third party, and need significant setup before they’re useful
  • Retry libraries — they paper over the problem instead of surfacing it

flaky-tests is the alternative: a local-first, zero-account tool that just stores test failures in a database. You own the data. You query it when something feels wrong.

  1. Capture: A preload or reporter hooks into your test runner and writes every failure to your chosen store (SQLite, Turso, Supabase, or Postgres).
  2. Detect: The flaky-tests check CLI compares failure counts across two equal time windows. If a test crossed the threshold in the current window but had zero failures in the prior one, it’s flagged as a new pattern.
  3. Investigate: The CLI generates a structured prompt ready to paste into Claude, Cursor, or Copilot. The prompt frames the failure as either a test issue (bad setup, timing, wrong assertion) or a code issue (regression, race condition).
  4. Notify: The CLI can open a GitHub issue with the investigation prompt embedded, or you can schedule a GitHub Action to do it automatically.
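
The Detect step can be illustrated with a small sketch. This is not the actual flaky-tests implementation: the `FailureRecord` shape, the `findNewFlakyPatterns` function, and its parameters are hypothetical names chosen for illustration, and "crossed the threshold" is assumed here to mean "at least `threshold` failures".

```typescript
// Hypothetical shape of a stored failure; the real schema may differ.
interface FailureRecord {
  testName: string;
  failedAt: number; // Unix epoch milliseconds
}

// Returns the names of tests with at least `threshold` failures in the
// current window and zero failures in the equally sized prior window.
function findNewFlakyPatterns(
  failures: FailureRecord[],
  now: number,
  windowMs: number,
  threshold: number,
): string[] {
  const currentStart = now - windowMs;
  const priorStart = now - 2 * windowMs;
  const current = new Map<string, number>();
  const prior = new Map<string, number>();

  for (const f of failures) {
    if (f.failedAt > currentStart && f.failedAt <= now) {
      current.set(f.testName, (current.get(f.testName) ?? 0) + 1);
    } else if (f.failedAt > priorStart && f.failedAt <= currentStart) {
      prior.set(f.testName, (prior.get(f.testName) ?? 0) + 1);
    }
    // Failures older than both windows are ignored.
  }

  const flagged: string[] = [];
  current.forEach((count, testName) => {
    if (count >= threshold && !prior.has(testName)) {
      flagged.push(testName);
    }
  });
  return flagged.sort();
}

// Example: two 7-day windows and a threshold of 2 failures.
const DAY = 24 * 60 * 60 * 1000;
const now = 14 * DAY;
const sample: FailureRecord[] = [
  { testName: "login redirects", failedAt: now - 1 * DAY },
  { testName: "login redirects", failedAt: now - 2 * DAY },
  { testName: "cart total", failedAt: now - 3 * DAY },
  { testName: "cart total", failedAt: now - 4 * DAY },
  { testName: "cart total", failedAt: now - 10 * DAY }, // prior window
];
const flagged = findNewFlakyPatterns(sample, now, 7 * DAY, 2);
console.log(flagged); // ["login redirects"]
```

In this sample, "cart total" reaches the threshold in the current window but also failed in the prior window, so it is treated as an existing problem rather than a new pattern. Only "login redirects" is flagged.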

flaky-tests is a monorepo of focused packages:

| Package | Role |
| --- | --- |
| @flaky-tests/core | Shared types and IStore interface |
| @flaky-tests/plugin-bun | Bun test preload |
| @flaky-tests/plugin-vitest | Vitest reporter |
| @flaky-tests/store-sqlite | Local SQLite (Bun built-in) |
| @flaky-tests/store-turso | Turso (remote SQLite, free tier) |
| @flaky-tests/store-supabase | Supabase |
| @flaky-tests/store-postgres | PostgreSQL / Neon |
| @flaky-tests/core | Pattern detection and issue creation |

You only install what you need.