Low-latency prompt compression

Save on LLM input tokens before every request.

Plan Ferret shortens prompts before they reach your model, helping teams cut input token costs with a simple API and an evolving compression algorithm.
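Plan Ferret's actual API isn't documented here, so as a rough sketch of the flow it describes, here is how a client-side wrapper might compress a prompt before sending it to a model. The function names are hypothetical, and a trivial whitespace-collapsing function stands in for the real compression service:

```python
import re

def compress_prompt(prompt: str) -> str:
    """Stand-in for a Plan Ferret compression call (hypothetical).

    Here we only collapse runs of whitespace; the real service would
    apply its own compression algorithm server-side.
    """
    return re.sub(r"\s+", " ", prompt).strip()

def savings_pct(original: str, compressed: str) -> float:
    """Rough input-size reduction, using characters as a token proxy."""
    return 100.0 * (1 - len(compressed) / len(original))

# Compress before the prompt ever reaches the model.
prompt = "Summarize   the following   report:\n\n  quarterly revenue grew."
short = compress_prompt(prompt)
print(short)
print(f"{savings_pct(prompt, short):.0f}% smaller input")
```

In a real pipeline the compressed string, not the original, would be passed as the model's input, so the token savings apply to every request.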

How do you save on input token costs? You Plan Ferret!
Free plan: 20/day
Paid usage: $0.01/query
Potential savings: up to 50%