Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. https://promptfoo.dev
  • TypeScript 96.8%
  • CSS 1.7%
  • JavaScript 1.1%
  • Python 0.2%
  • HTML 0.1%
Yufeng He 644e71b06b
fix(testCaseReader): preserve JSONL row description instead of overwriting (#9840)
Co-authored-by: mldangelo <michael.l.dangelo@gmail.com>
2026-06-22 14:04:17 -04:00
.agents docs(agents): improve Codex skill routing (#9130) 2026-05-06 14:16:48 -04:00
.claude chore: use local block-no-verify install instead of npx in Claude Code hook (#9675) 2026-06-09 17:10:51 -07:00
.claude-plugin feat(redteam): publish all four promptfoo skills to the Claude Code marketplace (#9665) 2026-06-09 08:37:43 -07:00
.cursor chore(deps): remove unused ts-node dependency (#6731) 2025-12-17 00:06:12 -08:00
.devcontainer chore(deps): update mcr.microsoft.com/vscode/devcontainers/typescript-node docker tag to v24 (#7059) 2026-01-14 10:26:05 -08:00
.github fix: restore Docker CLI links and validate rate-limit reset metadata (#9790) 2026-06-17 13:32:19 -04:00
.husky chore: tighten knip dead-file checks (#9074) 2026-05-03 13:55:33 -04:00
.vscode chore: update CODEOWNERS handles and VS Code association (#8299) 2026-03-24 09:28:00 -07:00
architecture feat(providers): add Moonshot (Kimi) provider (#9672) 2026-06-18 00:13:49 -04:00
code-scan-action fix(deps): update type definitions (#9832) 2026-06-21 20:51:02 -04:00
docs refactor(eval): add evaluation store port (#9601) 2026-06-03 14:15:56 -04:00
drizzle docs(agents): add subsystem AGENTS.md context files (#9579) 2026-06-02 00:48:44 -04:00
examples fix(deps): update opentelemetry (#9827) 2026-06-21 20:03:45 -04:00
helm/chart/promptfoo fix(helm): correct Docker registry domain from fghcr.io to ghcr.io (#7056) 2026-01-14 09:07:06 -08:00
plugins fix(redteam): publish marketplace skill fixes (#9676) 2026-06-09 16:55:12 -07:00
scripts fix(deps): constrain undici to <7.27.1 to fix Node 26 "terminated" error (#9668) 2026-06-09 15:13:25 -07:00
site fix(csv): preserve quoted commas in contains-any/all assertion values (#9761) 2026-06-21 21:48:19 -07:00
src fix(testCaseReader): preserve JSONL row description instead of overwriting (#9840) 2026-06-22 14:04:17 -04:00
test fix(testCaseReader): preserve JSONL row description instead of overwriting (#9840) 2026-06-22 14:04:17 -04:00
tools/biome test: isolate env mutations in root tests (#8789) 2026-04-18 00:41:57 -07:00
.biomeignore chore(providers): remove adaline gateway provider (#6999) 2026-01-10 03:51:02 -05:00
.dockerignore feat: Migrate NextUI to a React App (#1637) 2024-09-16 21:38:27 -06:00
.git-blame-ignore-revs chore(biome): run linter (#6761) 2025-12-18 09:29:32 -08:00
.gitignore feat(providers): support Agents SDK 0.9 workflows (#9128) 2026-05-08 14:38:34 -04:00
.mailmap chore: add mailmap aliases for public handles (#8458) 2026-04-02 14:46:58 -07:00
.npmignore docs: Merge docs into main repo (#317) 2023-11-30 11:23:35 -08:00
.npmrc fix(deps): avoid incompatible npm release-age config (#9244) 2026-05-16 09:12:15 -07:00
.nvmrc chore(deps): update node.js (#9802) 2026-06-18 10:37:54 -04:00
.prettierignore test: add vitest coverage configuration for all test suites (#7154) 2026-01-26 12:07:07 -08:00
.prettierrc.yaml chore: migrate from ESLint + Prettier to Biome (#4903) 2025-07-13 00:11:30 -04:00
.release-please-manifest.json chore(main): release code-scan-action 0.1.8 (#9602) 2026-06-16 13:45:57 -04:00
.rubocop.yml feat: Add ruby provider (#5902) 2025-10-13 09:21:41 -07:00
.ruff.toml feat: Claude Agent SDK provider support (#5509) 2025-10-13 10:19:57 -07:00
AGENTS.md docs(agents): watch main CI after landing a PR and fix flakes at the source (#9678) 2026-06-09 22:02:09 -07:00
biome.jsonc chore(deps): update biome to v2.4.16 (#9648) 2026-06-08 10:31:34 -07:00
CHANGELOG.md chore(main): release 0.121.17 (#9770) 2026-06-16 12:58:50 -04:00
CITATION.cff docs: add faizan as a contributor in citation file (#6879) 2025-12-29 19:29:27 -05:00
CLAUDE.md chore: consolidate agent instruction files using AGENTS.md standard (#6398) 2025-11-28 19:23:19 -05:00
CODE_OF_CONDUCT.md docs: add Contributor Covenant 3.0 Code of Conduct (#7022) 2026-01-12 15:43:01 -08:00
codecov.yml ci(codecov): make project coverage status informational (#9755) 2026-06-16 09:53:47 -04:00
CONTRIBUTING.md chore: add minimumReleaseAge policy for npm dependencies (#6383) 2025-11-27 14:09:25 -05:00
Dockerfile fix: restore Docker CLI links and validate rate-limit reset metadata (#9790) 2026-06-17 13:32:19 -04:00
drizzle.config.ts chore: migrate drizzle (#1922) 2024-10-17 14:22:42 -07:00
knip.json chore: tighten knip dead-file checks (#9074) 2026-05-03 13:55:33 -04:00
LICENSE chore: update year 2025-01-16 15:07:58 -08:00
package-lock.json chore(deps): update ibm packages to ^1.7.14 (#9839) 2026-06-22 08:32:12 -04:00
package.json chore(deps): update ibm packages to ^1.7.14 (#9839) 2026-06-22 08:32:12 -04:00
pnpm-workspace.yaml chore(build): add pnpm support (#3307) 2025-03-06 11:19:37 -08:00
README.md chore(build): add Node 26 support (#9222) 2026-05-13 17:47:37 -07:00
release-please-config.json chore(release-please): bump last-release-sha to clear drift guard (#9844) 2026-06-22 12:04:53 -04:00
renovate.json chore(deps): hold tsdown on 0.21.x while Node 20 is supported (#9731) 2026-06-15 11:53:52 -04:00
SECURITY.md docs: clarify runtime feedback-loop scope (#9314) 2026-05-20 22:55:21 -04:00
tsconfig.json feat(build): publish a lightweight promptfoo/contracts subpath (#9535) 2026-05-31 21:09:33 -04:00
tsdown.config.ts feat(build): publish a lightweight promptfoo/contracts subpath (#9535) 2026-05-31 21:09:33 -04:00
vitest.config.ts test: stop intermittent forks-worker crash from failing green CI shards (#9681) 2026-06-09 22:01:59 -07:00
vitest.integration.config.ts test: enforce root TypeScript coverage (#9101) 2026-05-04 14:24:15 -04:00
vitest.setup.ts fix(db): isolate libsql test databases (#9504) 2026-05-28 16:29:21 -04:00
vitest.smoke.config.ts test: add CLI and library smoke tests (#6669) 2025-12-29 20:45:53 -05:00

Promptfoo: LLM evals & red teaming

promptfoo is a CLI and library for evaluating and red-teaming LLM apps. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Website · Getting Started · Red Teaming · Documentation · Discord

Promptfoo is now part of OpenAI. Promptfoo remains open source and MIT licensed. Read the company update.

Quick Start

Requires Node.js ^20.20.0 or >=22.22.0 for npm and npx usage.
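
The version range above can be read as "any 20.x release from 20.20.0 onward, or anything from 22.22.0 up". The following is a minimal sketch of that check, not part of promptfoo: the helper names `parseVersion`, `atLeast`, and `isSupportedNode` are hypothetical, and prerelease tags are ignored for simplicity.

```typescript
// Hypothetical helpers for illustration; promptfoo does not export these.
type Version = { major: number; minor: number; patch: number };

function parseVersion(raw: string): Version {
  // Accept both "v22.22.0" (process.version style) and "22.22.0".
  // Anything after the patch number (e.g. "-rc.1") is ignored.
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(raw.trim());
  if (!match) {
    throw new Error(`Unrecognized version string: ${raw}`);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

// True when v is greater than or equal to min, compared field by field.
function atLeast(v: Version, min: Version): boolean {
  if (v.major !== min.major) return v.major > min.major;
  if (v.minor !== min.minor) return v.minor > min.minor;
  return v.patch >= min.patch;
}

function isSupportedNode(raw: string): boolean {
  const v = parseVersion(raw);
  // ^20.20.0 means >=20.20.0 and <21.0.0.
  const inNode20Range =
    v.major === 20 && atLeast(v, { major: 20, minor: 20, patch: 0 });
  // >=22.22.0 has no upper bound.
  const inNode22PlusRange = atLeast(v, { major: 22, minor: 22, patch: 0 });
  return inNode20Range || inNode22PlusRange;
}
```

In a Node.js script, passing `process.version` to `isSupportedNode` would tell you whether the current runtime falls inside the stated range.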

npm install -g promptfoo
promptfoo init --example getting-started

Also available via brew install promptfoo and pip install promptfoo. You can also use npx promptfoo@latest to run any command without installing.

Most LLM providers require an API key. Set yours as an environment variable:

export OPENAI_API_KEY=sk-abc123

Once you're in the example directory, run an eval and view results:

cd getting-started
promptfoo eval
promptfoo view

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Review pull requests for LLM-related security and compliance issues with code scanning
  • Share results with your team
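
To make the first, third, and fourth items above concrete, here is a conceptual sketch of an evaluation matrix: every prompt is run against every provider and test case, and each output is checked by a simple assertion. This is not promptfoo's implementation or API. The names `Provider`, `TestCase`, `runMatrix`, and `passRate` are hypothetical, and the stub providers stand in for real model calls so the sketch runs offline.

```typescript
// Conceptual sketch only; none of these names come from promptfoo.
type Provider = { id: string; call: (prompt: string) => string };
type TestCase = { vars: Record<string, string>; mustContain: string };
type Cell = { provider: string; prompt: string; output: string; pass: boolean };

// Fill {{name}} placeholders in a prompt template with test variables.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(
    /\{\{(\w+)\}\}/g,
    (_match: string, key: string) => vars[key] ?? "",
  );
}

// Run every provider x prompt x test combination and record one cell each.
function runMatrix(
  prompts: string[],
  providers: Provider[],
  tests: TestCase[],
): Cell[] {
  const cells: Cell[] = [];
  for (const provider of providers) {
    for (const template of prompts) {
      for (const test of tests) {
        const prompt = render(template, test.vars);
        const output = provider.call(prompt);
        // Case-insensitive substring assertion.
        const pass = output
          .toLowerCase()
          .includes(test.mustContain.toLowerCase());
        cells.push({ provider: provider.id, prompt, output, pass });
      }
    }
  }
  return cells;
}

// Fraction of cells whose assertion passed (0 for an empty matrix).
function passRate(cells: Cell[]): number {
  if (cells.length === 0) return 0;
  return cells.filter((c) => c.pass).length / cells.length;
}

// Stub providers: one echoes the prompt, one upper-cases it, one is silent.
const echo: Provider = { id: "stub:echo", call: (p) => p };
const upper: Provider = { id: "stub:upper", call: (p) => p.toUpperCase() };
const silent: Provider = { id: "stub:silent", call: () => "" };

const cells = runMatrix(
  ["Translate to French: {{text}}"],
  [echo, upper, silent],
  [{ vars: { text: "hello" }, mustContain: "hello" }],
);
```

A CI job could apply the same idea by failing the build when `passRate(cells)` drops below a threshold the team chooses.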

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: promptfoo on the command line]

It can also generate security vulnerability reports:

[Screenshot: gen AI red team report]

Why Promptfoo?

  • Developer-first: Fast, with features like live reload and caching
  • Private: LLM evals run 100% locally - your prompts never leave your machine
  • Flexible: Works with any LLM API or programming language
  • Battle-tested: Powers LLM apps serving 10M+ users in production
  • Data-driven: Make decisions based on metrics, not gut feel
  • Open source: MIT licensed, with an active community

Learn More

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.