The ChatGPT Token Counter estimates the number of tokens in a given text and calculates the associated API cost for OpenAI models including GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, and other LLMs. Tokens are the fundamental unit of text processing in large language models — they are not characters or words, but subword units typically averaging about 4 characters or 0.75 words per token in English. Understanding token counts is essential for API cost management, prompt engineering, and staying within model context window limits. GPT-4o supports a 128K token context window, meaning a single conversation can include roughly 96,000 words of combined input and output. Since API billing is based on token count with separate rates for input and output tokens, accurate token estimation directly impacts development budgets and application costs.
Token Count (approximate) = Character Count / 4, or Word Count / 0.75. API Cost = (Input Tokens / 1,000,000) x Input Price per Million + (Output Tokens / 1,000,000) x Output Price per Million.
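For readers who want to script these formulas, here is a minimal Python sketch. The function names are illustrative, and the rates in the usage line come from the GPT-4o row of the pricing table below.

```python
# A minimal sketch of the two formulas above; the chars/4 heuristic is the
# English-prose approximation, not an exact tokenizer.

def estimate_tokens(text: str) -> int:
    """Approximate token count: ~4 characters per token in English."""
    return max(1, round(len(text) / 4))

def api_cost(input_tokens: int, output_tokens: int,
             input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars, given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# GPT-4o rates from the pricing table below: $2.50 in / $10.00 out per 1M.
print(api_cost(267, 267, 2.50, 10.00))  # ~0.0033, matching the chatbot example later in this guide
```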
1. Paste or type your text into the token counter input field; the tool processes it in real time.
2. The counter estimates the token count using the ~4-characters-per-token approximation (or the actual cl100k_base tokenizer for precise counts; see the sketch after this list).
3. Select the AI model you plan to use (GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, Claude 3, etc.), as pricing differs significantly between models.
4. The calculator applies the model's per-million-token pricing to your input token count and your estimated output token count.
5. Total API cost is computed: (Input Tokens x Input Rate) + (Output Tokens x Output Rate).
6. The tool also shows how much of the model's context window your prompt consumes, helping you stay within limits.
7. For batch operations, multiply the per-request cost by the number of API calls to estimate total project cost.
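Step 2 mentions the cl100k_base tokenizer. If you want exact counts rather than the chars/4 estimate, OpenAI's open-source tiktoken library (pip install tiktoken) exposes the real encodings; a minimal sketch:

```python
# Exact token counting with tiktoken. cl100k_base is the encoding named
# above; tiktoken.encoding_for_model() can pick the right one per model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Understanding token counts is essential for API cost management."
tokens = enc.encode(text)
print(len(tokens))             # exact token count for this encoding
print(enc.decode(tokens[:5]))  # token IDs round-trip back to text
```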
Example 1 (chatbot reply): 200 words / 0.75 = ~267 input tokens. At GPT-4o rates ($2.50/1M input, $10.00/1M output) and assuming a similar-length reply: input cost = 267/1M x $2.50 = $0.00067; output cost = 267/1M x $10.00 = $0.00267. Total per request = $0.00334. At 10,000 customer interactions per month, the total cost is about $33.40.
Example 2 (document summarization): 10,000 words / 0.75 = ~13,333 input tokens; a 500-word summary / 0.75 = ~667 output tokens. Input cost: 13,333/1M x $2.50 = $0.0333. Output cost: 667/1M x $10.00 = $0.00667. Total = ~$0.04 per document, so summarizing 1,000 documents would cost about $40.
Example 3 (model comparison, 1,000 input and 500 output tokens): GPT-3.5 Turbo: (1,000/1M x $0.50) + (500/1M x $1.50) = $0.0005 + $0.00075 = $0.00125. GPT-4o: (1,000/1M x $2.50) + (500/1M x $10.00) = $0.0025 + $0.005 = $0.0075. GPT-4o is 6x more expensive per request but delivers substantially better reasoning and instruction-following quality.
Example 4 (GPT-4o mini at scale, 100,000 requests/day with 500 input and 300 output tokens each): daily input tokens: 500 x 100,000 = 50M; daily output tokens: 300 x 100,000 = 30M. Input cost: 50M/1M x $0.15 = $7.50. Output cost: 30M/1M x $0.60 = $18.00. Daily total = $25.50, monthly = ~$765. GPT-4o mini is designed for high-volume applications where cost efficiency matters more than peak reasoning capability.
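The same arithmetic is easy to script. This sketch reproduces Example 4; the request volume and per-request token sizes are that example's assumptions.

```python
# GPT-4o mini at scale, as a script.
REQUESTS_PER_DAY = 100_000
INPUT_TOK, OUTPUT_TOK = 500, 300    # tokens per request (assumed)
IN_RATE, OUT_RATE = 0.15, 0.60      # GPT-4o mini, $ per 1M tokens

daily_in_m = REQUESTS_PER_DAY * INPUT_TOK / 1_000_000    # 50M tokens
daily_out_m = REQUESTS_PER_DAY * OUTPUT_TOK / 1_000_000  # 30M tokens
daily_cost = daily_in_m * IN_RATE + daily_out_m * OUT_RATE
print(f"daily ${daily_cost:.2f}, monthly ~${daily_cost * 30:.0f}")
# -> daily $25.50, monthly ~$765
```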
- API budget forecasting: Developers estimate monthly OpenAI API costs based on expected request volumes, average prompt lengths, and model selection to set engineering budgets.
- Prompt optimization: Engineers measure token counts of different prompt designs to find the most cost-efficient formulations that maintain output quality.
- Chatbot cost modeling: Product teams calculate per-conversation costs for AI chatbots to determine pricing, margins, and whether to use cheaper models for simpler queries.
- Document processing pipelines: Companies estimate the cost of processing large document corpora (contracts, medical records, legal filings) through GPT-4 for summarization, extraction, or analysis.
- Model selection: Technical leads compare the cost-per-task across different models (GPT-4o vs. Claude vs. Gemini) to choose the optimal model for each use case in their application.
Non-English Text and Multilingual Tokenization
Non-Latin scripts (Chinese, Japanese, Korean, Arabic, Hindi) typically consume 2-3x more tokens per word than English because the tokenizer was primarily trained on English text. A 1,000-character Chinese text might use 700-1,000 tokens, while the same length in English would use only 250 tokens. This significantly impacts cost for multilingual applications.
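You can measure the ratio for your own text with tiktoken; a small sketch (the sample strings are arbitrary, and exact counts vary by string):

```python
# Comparing token density across scripts with tiktoken (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
samples = {
    "English": "The quick brown fox jumps over the lazy dog.",
    "Chinese": "敏捷的棕色狐狸跳过了懒惰的狗。",
}
for lang, text in samples.items():
    n = len(enc.encode(text))
    print(f"{lang}: {len(text)} chars -> {n} tokens "
          f"({len(text) / n:.1f} chars/token)")
```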
Code Tokenization
Source code often tokenizes differently than prose. Common programming keywords and syntax are usually single tokens, but variable names, strings, and comments vary widely. Indentation and whitespace consume tokens. Minified code uses fewer tokens than formatted code. Python typically tokenizes more efficiently than verbose languages like Java.
Cached Input Tokens (Prompt Caching)
OpenAI offers a 50% discount on cached input tokens for GPT-4o and GPT-4o mini when the same prompt prefix is reused across API calls. If your system prompt (say, 2,000 tokens) is identical across requests, you pay full price on the first call and half price on subsequent calls for those cached tokens. This can reduce costs by 20-40% for applications with long system prompts.
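A sketch of the caching arithmetic, using the 2,000-token system prompt from the example above; the 500-token user message and the call volume are illustrative assumptions:

```python
# Cached vs. uncached input cost for a reused system prompt.
SYSTEM_PROMPT_TOK = 2_000      # reused prefix, from the example above
USER_TOK = 500                 # assumed per-request user message
IN_RATE = 2.50                 # GPT-4o, $ per 1M input tokens
CALLS = 10_000                 # assumed call volume

full = CALLS * (SYSTEM_PROMPT_TOK + USER_TOK) / 1e6 * IN_RATE
cached = (SYSTEM_PROMPT_TOK / 1e6 * IN_RATE                      # first call: full price
          + (CALLS - 1) * SYSTEM_PROMPT_TOK / 1e6 * IN_RATE * 0.5  # later calls: 50% off prefix
          + CALLS * USER_TOK / 1e6 * IN_RATE)                    # user text is never cached
print(f"without caching ${full:.2f}, with caching ${cached:.2f}")
# -> $62.50 vs. $37.50, a 40% reduction in this scenario
```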
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | 128K tokens |
| GPT-4o mini | $0.15 | $0.60 | 128K tokens |
| GPT-4 Turbo | $10.00 | $30.00 | 128K tokens |
| GPT-3.5 Turbo | $0.50 | $1.50 | 16K tokens |
| o1 (reasoning) | $15.00 | $60.00 | 200K tokens |
| o1-mini | $3.00 | $12.00 | 128K tokens |
| Claude 3.5 Sonnet (Anthropic) | $3.00 | $15.00 | 200K tokens |
| Gemini 1.5 Pro (Google) | $1.25 | $5.00 | 1M tokens |
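For scripting, the table converts naturally into a lookup. A minimal sketch using the OpenAI rows; the rates are the 2025 figures quoted here and change often, so verify before relying on them.

```python
# The pricing table as a lookup, so per-request cost is one function call.
PRICES = {  # model: ($/1M input, $/1M output)
    "gpt-4o":        (2.50, 10.00),
    "gpt-4o-mini":   (0.15, 0.60),
    "gpt-4-turbo":   (10.00, 30.00),
    "gpt-3.5-turbo": (0.50, 1.50),
    "o1":            (15.00, 60.00),
    "o1-mini":       (3.00, 12.00),
}

def request_cost(model: str, input_tok: int, output_tok: int) -> float:
    in_rate, out_rate = PRICES[model]
    return input_tok / 1e6 * in_rate + output_tok / 1e6 * out_rate

print(request_cost("gpt-4o", 1_000, 500))        # 0.0075, matches Example 3
print(request_cost("gpt-3.5-turbo", 1_000, 500)) # 0.00125
```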
What exactly is a token?
A token is a subword unit used by LLMs to process text. Common words like 'the' or 'and' are single tokens, while longer or unusual words may be split into multiple tokens (e.g., 'cryptocurrency' might be 'crypto' + 'currency' = 2 tokens). In English, 1 token averages ~4 characters or ~0.75 words. Non-English languages and code typically have different token-per-word ratios.
Why are output tokens more expensive than input tokens?
Output (completion) tokens require more computation than input (prompt) tokens because the model must generate each output token sequentially, performing a full forward pass through the neural network for each one, while input tokens can be processed in parallel. This asymmetry in compute cost is reflected in the 3-5x price premium for output tokens across the models in the table above.
How accurate is the ~4 characters per token estimate?
The 4-character approximation is reasonably accurate for standard English prose (within 10-15%). However, it becomes less accurate for code (which may use 2-3 characters per token due to syntax symbols), non-Latin scripts (Chinese/Japanese may use 1-2 characters per token), and specialized terminology. For precise counts, use OpenAI's tiktoken library or the actual tokenizer.
What is the context window and why does it matter?
The context window is the maximum number of tokens (input + output combined) a model can process in a single request. GPT-4o has a 128K token window (~96,000 words). If your input exceeds the context window, you must truncate, summarize, or use retrieval-augmented generation (RAG). Exceeding the limit returns an API error.
Do system prompts and conversation history count toward token usage?
Yes. Every API call includes the full system prompt, all conversation history, and the new user message as input tokens. In a chatbot, conversation history grows with each turn, so costs increase as conversations get longer. Implement conversation truncation or summarization strategies to control costs in multi-turn applications.
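A minimal sketch of the truncation idea: keep the system prompt and drop the oldest turns until the history fits a token budget. The message format mirrors the Chat Completions API, and the chars/4 heuristic stands in for a real tokenizer such as tiktoken.

```python
# Keep the system prompt; drop oldest turns until the history fits.
def truncate_history(messages: list[dict], budget: int) -> list[dict]:
    """messages[0] is the system prompt; the rest are conversation turns."""
    def tok(m: dict) -> int:
        return len(m["content"]) // 4  # chars/4 heuristic
    system, turns = messages[0], messages[1:]
    while turns and tok(system) + sum(tok(m) for m in turns) > budget:
        turns.pop(0)  # oldest turn goes first
    return [system] + turns
```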
How does GPT-4o pricing compare to Claude and other models?
As of 2025: GPT-4o is $2.50/$10.00 per 1M tokens (input/output). Anthropic Claude 3.5 Sonnet is $3.00/$15.00 per 1M. Google Gemini 1.5 Pro is $1.25/$5.00 per 1M. GPT-4o mini at $0.15/$0.60 is the cheapest frontier model option. Prices change frequently as providers compete on cost-performance.
Pro Tips
To reduce API costs without sacrificing quality: (1) Use GPT-4o mini for simple tasks like classification and extraction, reserving GPT-4o for complex reasoning. (2) Implement prompt caching to get 50% off repeated system prompts. (3) Set max_tokens to prevent unexpectedly long outputs. (4) Summarize conversation history instead of sending the full transcript. A well-optimized application can cut costs by 60-80% compared to naive API usage.
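Tips 1 and 3 combine naturally in code. A sketch assuming the official openai Python SDK (pip install openai) and an OPENAI_API_KEY in the environment; the routing rule is an illustrative placeholder, not a prescribed policy.

```python
# Route simple tasks to the cheaper model and cap billable output length.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str, complex_task: bool = False) -> str:
    response = client.chat.completions.create(
        model="gpt-4o" if complex_task else "gpt-4o-mini",  # tip 1: model routing
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300,  # tip 3: hard cap on output tokens
    )
    return response.choices[0].message.content
```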
Did You Know?
The word 'tokenization' in AI has a curious dual life — in natural language processing it means splitting text into subword units, while in cybersecurity it means replacing sensitive data with non-sensitive placeholders. Both meanings involve transforming information into smaller units, but for completely different purposes. OpenAI's cl100k_base tokenizer has a vocabulary of exactly 100,256 unique tokens.