Midjourney Prompt Counter
See how many tokens your image prompt actually uses. Many image generators, including Stable Diffusion and (with variations) Midjourney and Flux, rely on a CLIP text encoder with a hard 77-token cap. Anything past that gets dropped or diluted.
The 77-token rule
CLIP, the text encoder used by Stable Diffusion and (with variations) by Midjourney and Flux, has a fixed input size of 77 tokens, two of which are reserved for start and end markers. Your prompt enters CLIP, which produces one embedding vector for each of those 77 positions. Anything past the limit is either truncated (in older pipelines) or split into multiple chunks that are encoded separately and then combined (in newer ones). Either way, the second half of a long prompt rarely lands the way you intended.
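The two behaviours can be sketched on a plain list of token ids. This is a minimal illustration, not any pipeline's actual code: the function names are hypothetical, and it assumes two of the 77 positions go to start and end markers, as in the standard CLIP tokenizer, leaving 75 for the prompt itself.

```python
from typing import List

CONTEXT_LEN = 77      # CLIP context length, from the text above
SPECIAL_TOKENS = 2    # assumption: one start and one end marker per chunk
USABLE = CONTEXT_LEN - SPECIAL_TOKENS  # 75 prompt tokens per chunk


def truncate(token_ids: List[int]) -> List[int]:
    """Older-pipeline behaviour: keep the first chunk, drop everything after it."""
    return token_ids[:USABLE]


def split_into_chunks(token_ids: List[int]) -> List[List[int]]:
    """Newer-pipeline behaviour: consecutive chunks of at most USABLE tokens."""
    return [token_ids[i:i + USABLE] for i in range(0, len(token_ids), USABLE)]


prompt_ids = list(range(100))  # stand-in for a 100-token prompt
print(len(truncate(prompt_ids)))                         # 75
print([len(c) for c in split_into_chunks(prompt_ids)])   # [75, 25]
```

In the chunked case, the last 25 tokens are encoded without the context of the first 75, which is one reason the tail of a long prompt tends to carry less weight.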
In practice, 77 tokens is about 55–60 English words — fewer if your prompt is heavy with adjectives or hyphenated terms.
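That ratio gives a quick word-based estimate when no tokenizer is at hand. The sketch below is only a rough guide derived from the 55–60 words figure above; `estimate_tokens` is a hypothetical helper, and a real tokenizer can land outside this range.

```python
from typing import Tuple

CAP = 77              # token cap from the text above
WORDS_AT_CAP_HIGH = 60  # 77 tokens is about 60 words for plain prose
WORDS_AT_CAP_LOW = 55   # and about 55 words for denser prompts


def estimate_tokens(prompt: str) -> Tuple[int, int]:
    """Return a (low, high) token estimate from a whitespace word count."""
    words = len(prompt.split())
    low = (words * CAP) // WORDS_AT_CAP_HIGH          # rounded down
    high = -(-(words * CAP) // WORDS_AT_CAP_LOW)      # rounded up
    return low, high


low, high = estimate_tokens("a futuristic fashion model in a neon-lit Tokyo street")
print(f"roughly {low} to {high} tokens")
```

Hyphenated terms such as "neon-lit" count as one word here but usually split into several tokens, so treat the upper bound as the safer number.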
The tool above shows your token budget against the 77-token cap when you select the Stable Diffusion model preset. The indicator is green below 50% of the budget, amber below 85%, and red beyond that.
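The colour thresholds amount to a small lookup. This is a sketch of the rule as described, not the tool's actual source; `budget_colour` is a hypothetical name, and treating exactly 50% and exactly 85% as the start of the next band is an assumption.

```python
CAP = 77  # token cap from the text above


def budget_colour(token_count: int, cap: int = CAP) -> str:
    """Map a token count to green / amber / red by share of the cap used."""
    used = token_count / cap
    if used < 0.50:
        return "green"
    if used < 0.85:
        return "amber"
    return "red"


for count in (30, 50, 70):
    print(count, budget_colour(count))
```

With a 77-token cap, that puts the green band at up to 38 tokens and the amber band at 39 to 65 tokens.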
A reusable structure for image prompts
Five slots that fit comfortably under 77 tokens:
- Subject — 5–10 tokens (“a futuristic fashion model”)
- Setting — 5–10 tokens (“in a neon-lit Tokyo street”)
- Lighting / camera — 10–15 tokens (“85mm lens, soft rim lighting, shallow depth of field”)
- Style — 5–10 tokens (“editorial photography style”)
- Composition / aspect — 3–6 tokens (“9:16 vertical”)
Total: roughly 28–51 tokens, which leaves room under the cap for one or two specifics.
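Adding up the five ranges exactly as listed gives the bounds below. The dictionary keys are just labels for this sketch, and real prompts can run somewhat higher once connecting words and punctuation are tokenized.

```python
CAP = 77  # token cap from the text above

# (low, high) token ranges for each slot, copied from the list above
SLOTS = {
    "subject": (5, 10),
    "setting": (5, 10),
    "lighting_camera": (10, 15),
    "style": (5, 10),
    "composition_aspect": (3, 6),
}

low_total = sum(low for low, _ in SLOTS.values())
high_total = sum(high for _, high in SLOTS.values())
headroom = CAP - high_total  # tokens left even when every slot is at its maximum

print(low_total, high_total, headroom)  # 28 51 26
```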
What to cut when you’re over budget
Three patterns push image prompts past 77 tokens. All three are easy to trim:
- Stacked style adjectives: “moody, atmospheric, dramatic, cinematic, professional, photorealistic, ultra-detailed, hyperrealistic, masterpiece...” Pick one or two that actually point at different things.
- Multiple influences in long form: “in the style of Petra Collins meets Wong Kar-wai meets Ridley Scott...” Pick one. The model can only blend so many references in 77 tokens.
- Generic quality words: “masterpiece, best quality, ultra-detailed, professional photography, award-winning”. These mostly burn tokens without carrying visual information on modern models.
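The third pattern is the easiest to trim mechanically. The sketch below drops generic quality phrases from a comma-separated prompt; the phrase list is taken from the example above, `strip_quality_words` is a hypothetical helper, and it only matches whole comma-separated segments.

```python
# Phrases from the "generic quality words" example above; extend as needed.
QUALITY_WORDS = {
    "masterpiece",
    "best quality",
    "ultra-detailed",
    "professional photography",
    "award-winning",
}


def strip_quality_words(prompt: str) -> str:
    """Remove comma-separated segments that are only a generic quality phrase."""
    parts = [part.strip() for part in prompt.split(",")]
    kept = [part for part in parts if part and part.lower() not in QUALITY_WORDS]
    return ", ".join(kept)


print(strip_quality_words("a red fox, masterpiece, Best Quality, soft rim lighting"))
# a red fox, soft rim lighting
```

Stacked adjectives and multiple style references need a human choice about which one to keep, so they are left out of this sketch.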
Negative prompts count separately
Most image tools handle negative prompts (the things you don’t want) in a separate CLIP pass, so each prompt gets its own 77-token budget. A 50-token positive prompt and a 20-token negative prompt therefore fit without conflict. Use a short, aimed negative prompt to remove common failure modes rather than stuffing everything into the positive prompt.
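Because the two prompts are encoded separately, the budget check applies to each one on its own rather than to their sum. A minimal sketch, assuming the tool in question does use separate passes; `fits_separately` is a hypothetical name.

```python
CAP = 77  # token cap from the text above


def fits_separately(positive_tokens: int, negative_tokens: int, cap: int = CAP) -> bool:
    """Each prompt is checked against the cap on its own, not their combined length."""
    return positive_tokens <= cap and negative_tokens <= cap


print(fits_separately(50, 20))  # True
print(fits_separately(60, 30))  # True, even though 60 + 30 exceeds 77
print(fits_separately(90, 10))  # False, the positive prompt alone is over the cap
```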