Does your PDF fit in ChatGPT's context window?
Context windows are measured in tokens, not pages. Here is how to count them before pasting a PDF into any AI tool, and what to do when the number is too high.
You paste a long report into ChatGPT and ask it to summarize the main risks. The answer looks confident. It covers the executive summary and the first two sections in good detail. The back half of the document, the section you actually needed, got cut off and the model never told you.
That is the context window problem. It happens silently, without any error message.
What a context window means in practice
Every language model has a limit on how much text it can process in a single conversation. Once you exceed that limit, the model either refuses the request, truncates your input without saying so, or works from a compressed version of the content. Only the refusal tells you that something went wrong.
The limit is measured in tokens, not words or pages. A token is roughly three to four characters of English. A typical page of prose lands somewhere between 300 and 500 tokens depending on how dense the text is.
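As a rough illustration of that rule of thumb, the sketch below turns a character count into a low and high token estimate. The function name and the 1 500-character sample page are assumptions made for the example, and the result is only a heuristic, not a tokenizer count.

```python
import math


def estimate_token_range(text: str) -> tuple[int, int]:
    """Return a (low, high) token estimate based on character count alone."""
    chars = len(text)
    low = math.ceil(chars / 4)   # assumes about 4 characters per token
    high = math.ceil(chars / 3)  # assumes about 3 characters per token
    return low, high


# Hypothetical stand-in for one page of English prose (1 500 characters).
page = "x" * 1500
low, high = estimate_token_range(page)
print(f"Estimated tokens for this page: {low} to {high}")
```

An estimate like this is good enough to tell a 5-page memo from a 500-page manual, but not to decide whether a borderline document fits.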
Context windows vary quite a bit across models:
- GPT-3.5-turbo: 16 385 tokens (roughly 40 to 50 pages of text)
- GPT-4o: 128 000 tokens (roughly 300 to 400 pages)
- Claude 3.5 Sonnet: 200 000 tokens
- Gemini 1.5 Pro: 1 000 000 tokens
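Once you have a token count, checking it against these limits is simple arithmetic. The sketch below uses the figures from the list above; the `fits` helper and its `reserve_for_reply` parameter are assumptions for illustration, reflecting the fact that the window has to hold the model's answer as well as your input.

```python
# Context window sizes taken from the list above, in tokens.
CONTEXT_WINDOWS = {
    "GPT-3.5-turbo": 16_385,
    "GPT-4o": 128_000,
    "Claude 3.5 Sonnet": 200_000,
    "Gemini 1.5 Pro": 1_000_000,
}


def fits(token_count: int, reserve_for_reply: int = 1_000) -> dict[str, bool]:
    """Report, per model, whether the document plus a reply budget fits.

    reserve_for_reply is an illustrative default, not a documented value.
    """
    needed = token_count + reserve_for_reply
    return {model: needed <= window for model, window in CONTEXT_WINDOWS.items()}


# Example: a document counted at 20 000 tokens.
for model, ok in fits(20_000).items():
    print(f"{model}: {'fits' if ok else 'too long'}")
```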
Tokens are not words, and that matters
Technical documents, legal contracts, and PDFs with tables or structured data often produce more tokens than their page count would suggest. A 30-page policy document full of defined terms and cross-references can easily run to 18 000 or 20 000 tokens. A scanned document with minimal extractable text might come in under 2 000.
Code-heavy PDFs are especially unpredictable. Code tokenizes very differently from prose. Non-English text also tokenizes at a higher rate per word in most models, since vocabularies are optimized for English. Guessing does not work reliably here.
Why silent truncation is the real problem
Most chat interfaces do not throw an error when you exceed the context window. They either quietly crop the input or produce an answer that looks complete but only covers part of the document. You have no way to know truncation happened unless you checked the token count beforehand.
This is not a small issue when the end of a document matters: investment memos where risk factors appear in the last third, contracts where the clauses you care about are buried deep, or research papers where the conclusion contradicts the abstract.
Checking before you paste
PDFShore's token counter extracts the text from your PDF in the browser and runs cl100k_base, the same tokenizer GPT-4 uses. GPT-4o uses a newer encoding, o200k_base, so its count can differ somewhat. You get the actual count before anything goes to any model.
The results include total tokens by model family, a per-page breakdown so you can see which sections are heavy, and a comparison against common context windows so you know at a glance whether the document fits. The extraction and counting both run locally. The content of your PDF does not leave your machine.
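To show why a per-page breakdown is useful, the sketch below finds the first page at which a running token total would exceed a given window. It is not PDFShore's implementation. The helper names are hypothetical, and `rough_count` is a four-characters-per-token stand-in; for real counts you could pass a tokenizer-backed function instead, for example one built on the `tiktoken` package's `get_encoding("cl100k_base")`.

```python
from itertools import accumulate
from typing import Callable, Optional


def rough_count(text: str) -> int:
    """Heuristic stand-in for a tokenizer: about 4 characters per token."""
    return -(-len(text) // 4)  # ceiling division


def first_page_cut(
    pages: list[str],
    window: int,
    count: Callable[[str], int] = rough_count,
) -> Optional[int]:
    """Return the 1-based page where the running total first exceeds window.

    Returns None when the whole document fits.
    """
    per_page = [count(page) for page in pages]
    for page_number, running_total in enumerate(accumulate(per_page), start=1):
        if running_total > window:
            return page_number
    return None


# Five hypothetical pages of 400 characters each (about 100 tokens per page).
pages = ["a" * 400] * 5
print(first_page_cut(pages, window=250))
print(first_page_cut(pages, window=500))
```

If the returned page sits before the section you care about, that section is the part most likely to be dropped.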
What to do when the count is too high
Splitting by chapter or natural section is the most reliable fix. Most long documents are structured for sequential reading anyway. Pass one section at a time, starting with the part that answers your question.
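One way to turn sections into paste-sized batches is to pack them greedily, in reading order, under a token budget. The sketch below assumes you already have the section texts as strings. `pack_sections` and `rough_count` are hypothetical helpers, and the four-characters-per-token count is a heuristic you would swap for a real tokenizer.

```python
from typing import Callable


def rough_count(text: str) -> int:
    """Heuristic stand-in for a tokenizer: about 4 characters per token."""
    return -(-len(text) // 4)  # ceiling division


def pack_sections(
    sections: list[str],
    budget: int,
    count: Callable[[str], int] = rough_count,
) -> list[list[str]]:
    """Group consecutive sections so each group stays within budget.

    A single section larger than the budget still gets its own group,
    so it must be split further by hand.
    """
    groups: list[list[str]] = []
    current: list[str] = []
    used = 0
    for section in sections:
        cost = count(section)
        # Start a new group when adding this section would overflow.
        if current and used + cost > budget:
            groups.append(current)
            current, used = [], 0
        current.append(section)
        used += cost
    if current:
        groups.append(current)
    return groups


# Three hypothetical sections of about 100 tokens each, budget of 250.
sections = ["a" * 400, "b" * 400, "c" * 400]
for i, group in enumerate(pack_sections(sections, budget=250), start=1):
    print(f"Batch {i}: {len(group)} section(s)")
```

Keeping sections whole and in order preserves the document's own structure, which usually matters more than filling every batch to the limit.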
If you only need a specific part of a long manual, extract those pages first. Feeding in a 200-page document when you need the troubleshooting section on page 140 wastes most of the context window and makes the answer less reliable. Pull the relevant pages and paste those instead.
For recurring workflows, keep a cleaned Markdown version of documents you use often. Markdown tokenizes more predictably than raw PDF extraction, and you can trim header boilerplate that inflates the count without adding information.
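Header and footer boilerplate is often the same line repeated on every page, which makes it easy to detect. The sketch below drops any line that appears on most pages of already-extracted text. The function name, the 0.8 threshold, and the sample pages are assumptions for illustration; review the output before relying on it, since a legitimately repeated line would be removed too.

```python
from collections import Counter


def strip_repeated_lines(pages: list[str], min_share: float = 0.8) -> list[str]:
    """Remove lines that appear on at least min_share of all pages."""
    if len(pages) < 2:
        return list(pages)  # nothing to compare against

    # Count on how many pages each distinct non-empty line appears.
    pages_with_line: Counter = Counter()
    for page in pages:
        distinct = {line.strip() for line in page.splitlines() if line.strip()}
        pages_with_line.update(distinct)

    threshold = min_share * len(pages)
    boilerplate = {line for line, n in pages_with_line.items() if n >= threshold}

    return [
        "\n".join(
            line for line in page.splitlines() if line.strip() not in boilerplate
        )
        for page in pages
    ]


# Hypothetical extracted pages sharing a running header.
pages = [
    "ACME Corp Confidential\nIntro text",
    "ACME Corp Confidential\nRisk factors",
    "ACME Corp Confidential\nConclusion",
]
print(strip_repeated_lines(pages))
```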