ractogateway.rag.readers.pdf_reader

PDF reader built on pypdf, which is imported lazily.

Install with: pip install "ractogateway[rag-pdf]"
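
The lazy-import pattern mentioned above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name lazy_import and the wording of the error message are assumptions made for this example.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str, extra: str) -> ModuleType:
    """Import a module only when it is first needed.

    Hypothetical helper for illustration; not part of ractogateway.
    """
    try:
        # The import happens at call time, so merely importing the reader
        # module does not require the optional dependency to be installed.
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Point the user at the optional extra that provides the dependency.
        raise ImportError(
            f"{module_name} is required for this reader. "
            f"Install with: pip install ractogateway[{extra}]"
        ) from exc
```

A reader written this way would call lazy_import("pypdf", "rag-pdf") inside the method that parses the file, so users who never read PDFs do not need pypdf installed.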

class ractogateway.rag.readers.pdf_reader.PdfReader(extract_images=False)

Bases: BaseReader

Extract text from PDF files using pypdf.

Accepts a file path (str / Path), raw bytes, or any binary file-like object with a .read() method.
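
One way the three accepted input kinds could be normalized before parsing is sketched below. The function name normalize_pdf_source is hypothetical and this is not the reader's actual implementation; it relies only on the fact that pypdf's own PdfReader accepts a path or a binary stream.

```python
import io
from pathlib import Path
from typing import BinaryIO, Union

# The input kinds described above: path, raw bytes, or binary file-like object.
Source = Union[str, Path, bytes, BinaryIO]


def normalize_pdf_source(source: Source) -> Union[Path, BinaryIO]:
    """Return a Path or a binary stream (hypothetical helper, for illustration)."""
    if isinstance(source, (bytes, bytearray)):
        # Raw bytes are wrapped in an in-memory stream.
        return io.BytesIO(bytes(source))
    if isinstance(source, (str, Path)):
        # Path-like input is converted to a Path object.
        return Path(source)
    if hasattr(source, "read"):
        # Anything with a .read() method is passed through unchanged.
        return source
    raise TypeError(f"Unsupported PDF source type: {type(source).__name__}")
```

Checking for bytes first keeps a bytes payload from being mistaken for a path, and the final TypeError makes unsupported input fail early with a clear message.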

Parameters:

extract_images (bool) – Reserved for future use — image extraction is not yet supported.

property supported_extensions: frozenset[str]

Lower-case extensions (with dot) this reader handles, e.g. {".pdf"}.
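
A set of lower-case, dot-prefixed extensions like this is typically matched against a file's suffix. The sketch below shows that check in isolation; the constant PDF_EXTENSIONS and the function handles are names invented for this example, and how ractogateway itself dispatches on supported_extensions is not shown in this page.

```python
from pathlib import Path
from typing import FrozenSet, Union

# Stand-in for the value a PDF reader's supported_extensions would return.
PDF_EXTENSIONS: FrozenSet[str] = frozenset({".pdf"})


def handles(path: Union[str, Path], extensions: FrozenSet[str] = PDF_EXTENSIONS) -> bool:
    """Return True if the file's final suffix, lower-cased, is in the set."""
    return Path(path).suffix.lower() in extensions


print(handles("Report.PDF"))      # upper-case suffix is lower-cased before lookup
print(handles("archive.pdf.gz"))  # only the final suffix (".gz") is considered
```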