Tokens

The basic units of text that AI models process — roughly word-sized chunks that can be words, parts of words, or punctuation.

Tokens are the units a language model actually processes: text is split into chunks that may be whole words, parts of words, or punctuation marks. As a rough rule for English, one token is about four characters, or three-quarters of a word: "hello" is typically one token, while "unbelievable" may be two or three, depending on the tokenizer.
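The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function names and constants are illustrative assumptions, and actual counts depend on the model's tokenizer, so short words in particular can be off by a token or more.

```python
import math

# Rule-of-thumb ratios from the text above (English only, approximate).
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_by_chars(text: str) -> int:
    """Estimate token count as one token per four characters, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_by_words(text: str) -> int:
    """Estimate token count as one token per three-quarters of a word, rounded up."""
    word_count = len(text.split())  # whitespace-separated words
    return math.ceil(word_count / WORDS_PER_TOKEN)
```

For example, `estimate_tokens_by_words("one two three")` divides 3 words by 0.75 and returns 4. The two estimates will usually disagree slightly; either is only good enough for rough budgeting, and an exact count requires the tokenizer of the specific model.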

Tokens explain two practical constraints. First, the context window (quoted as "128K" or "1 million tokens") is the model's working memory: conversations, documents, and instructions all consume it, and once the limit is reached, older content is truncated or the request is refused. Second, most AI APIs charge per token, with input and output priced separately, so token counts drive both cost and speed.
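Both constraints reduce to simple arithmetic on token counts. The sketch below is illustrative only: the `Pricing` class, the function names, and the prices used in the example are hypothetical, and it assumes that input and output tokens share the same context window, which is common but not universal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-token prices, in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Check whether the prompt plus the reserved output fits in the window.

    Assumes input and output count against the same limit.
    """
    return input_tokens + max_output_tokens <= context_window


def estimate_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Return the estimated request cost in USD, pricing input and output separately."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000
```

With made-up prices of 2.00 USD per million input tokens and 8.00 USD per million output tokens, a request with 1,000,000 input tokens and 500,000 output tokens would cost 2.00 + 4.00 = 6.00 USD. Likewise, a 127,000-token prompt with 4,000 tokens reserved for output would not fit in a window of 128,000 tokens.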

For teams building AI applications, token usage is a key performance and budget metric: shorter, clearer prompts use fewer tokens and get faster responses.