Extract all text content from your PDF file.
Extraction works on any PDF where the text was generated from a real font — Word exports, browser print-to-PDF, most invoices and reports. It will not work on scans or phone photos saved as PDF, because those are images without a text layer. For scanned documents, use the OCR tool — it runs Tesseract in your browser to recognise text from page images.
Empty or garbled output usually has one of two causes. Either the PDF is a scan (no text layer, so use OCR), or the PDF uses a custom font encoding that maps characters in an unusual way. In the second case, OCR also works as a fallback.
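As a rough illustration of telling these two cases apart, the sketch below classifies a string of extracted text. It is not this tool's code: the function name `classifyExtraction`, the verdict labels, and the 0.3 threshold are assumptions made for the example. It relies on the heuristic that fonts without a usable Unicode mapping often produce private-use or replacement characters.

```typescript
// Hypothetical helper, not part of PDF.js or of this tool.
type ExtractionVerdict = "no-text-layer" | "suspect-encoding" | "ok";

function classifyExtraction(text: string, suspectRatio = 0.3): ExtractionVerdict {
  // Ignore whitespace so output that is only blank lines counts as empty.
  const chars = Array.from(text).filter((ch) => !/\s/.test(ch));
  if (chars.length === 0) {
    return "no-text-layer"; // likely a scan: suggest OCR
  }

  let suspect = 0;
  for (const ch of chars) {
    const cp = ch.codePointAt(0) ?? 0;
    const isPrivateUse = cp >= 0xe000 && cp <= 0xf8ff; // Unicode Private Use Area
    const isReplacement = cp === 0xfffd;               // U+FFFD replacement character
    const isControl = cp < 0x20;                       // non-whitespace control codes
    if (isPrivateUse || isReplacement || isControl) {
      suspect++;
    }
  }

  // Assumed threshold: flag the text when a large share of it looks unmapped.
  return suspect / chars.length > suspectRatio ? "suspect-encoding" : "ok";
}
```

A caller could show the OCR suggestion for either of the first two verdicts. The heuristic will miss encodings that map to valid but wrong letters, so treat it as a hint only.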
Text comes out in reading order, but column alignment from tables is lost — PDFs do not store tables as tables, only as floating text boxes. For table-shaped data, try the PDF to Excel tool instead.
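To see why columns disappear, here is a minimal sketch of how positioned text fragments might be turned into lines. The `PositionedText` shape, the function `toLines`, and the 2-unit tolerance are simplifications assumed for the example. In PDF.js the position is usually read from each text item's `transform` array, and this tool's actual ordering logic may differ.

```typescript
// Assumed, simplified shape for one fragment of page text.
interface PositionedText {
  str: string;
  x: number; // horizontal position in page units
  y: number; // vertical position; PDF coordinates grow upward
}

function toLines(items: PositionedText[], yTolerance = 2): string[] {
  // Visit fragments from the top of the page to the bottom.
  const sorted = [...items].sort((a, b) => b.y - a.y);

  const lines: { y: number; items: PositionedText[] }[] = [];
  for (const item of sorted) {
    const last = lines.length > 0 ? lines[lines.length - 1] : undefined;
    if (last !== undefined && Math.abs(last.y - item.y) <= yTolerance) {
      last.items.push(item); // same baseline: same line
    } else {
      lines.push({ y: item.y, items: [item] });
    }
  }

  // Within a line, read left to right and join with a single space.
  // The x positions are discarded here, which is where column alignment is lost.
  return lines.map((line) =>
    line.items
      .sort((a, b) => a.x - b.x)
      .map((item) => item.str)
      .join(" ")
  );
}

const cells: PositionedText[] = [
  { str: "9.99", x: 300, y: 680 },
  { str: "Item", x: 50, y: 700 },
  { str: "Widget", x: 50, y: 680 },
  { str: "Price", x: 300, y: 700 },
];
const tableLines = toLines(cells);
// tableLines is ["Item Price", "Widget 9.99"]
```

The words arrive in the right order, but nothing in the output records that "Price" and "9.99" shared a column.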
Extract everything first, then keep the section you need. To work with specific pages only, run the Split PDF tool first and extract text from the resulting file.
UTF-8 with no BOM. That is the modern standard and opens correctly in most editors and spreadsheets. A few older import dialogs may ask you to select UTF-8 explicitly.
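For readers who want to check the encoding themselves, the sketch below shows one way a browser can produce such bytes. The standard `TextEncoder` always emits UTF-8 and does not prepend a byte order mark. The helper names `encodeForDownload` and `hasUtf8Bom` are made up for this example and are not taken from this tool.

```typescript
// Hypothetical helpers for illustration only.
function encodeForDownload(text: string): Uint8Array {
  // TextEncoder always produces UTF-8 and never adds a BOM.
  return new TextEncoder().encode(text);
}

function hasUtf8Bom(bytes: Uint8Array): boolean {
  // The UTF-8 BOM is the three-byte sequence EF BB BF.
  return (
    bytes.length >= 3 &&
    bytes[0] === 0xef &&
    bytes[1] === 0xbb &&
    bytes[2] === 0xbf
  );
}

const encoded = encodeForDownload("é");
const startsWithBom = hasUtf8Bom(encoded);
```

In a browser, the resulting bytes could be wrapped in a `Blob` with the type `text/plain;charset=utf-8` and offered as a download.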
Privacy: The PDF is parsed entirely in your browser with PDF.js. Nothing is sent over the network — the extracted text exists only in this tab until you download or copy it.