ractogateway.rag.processors.lemmatizer
Lemmatization processor — uses NLTK WordNetLemmatizer (lazy import).
Install with: pip install ractogateway[rag-nlp]
Note: Lemmatization changes the surface form of text and can degrade embedding quality for neural models (which were trained on unmodified text). Use this processor only when building keyword-index pipelines or when explicitly required for your retrieval strategy.
- class ractogateway.rag.processors.lemmatizer.Lemmatizer(use_pos_tagging=True)
Bases: BaseProcessor
Reduce words to their base (lemma) form using NLTK WordNet.
- Parameters:
use_pos_tagging (bool) – If True, use POS tagging to improve lemmatization accuracy. Slightly slower but produces better results.
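To illustrate why POS tagging can improve lemmatization, here is a minimal sketch of the common pattern of mapping Penn Treebank tags to WordNet part-of-speech letters before lemmatizing. The helper names `penn_to_wordnet` and `lemmatize_tagged` are hypothetical and are not part of ractogateway; how `Lemmatizer` implements `use_pos_tagging` internally is not shown in this reference, so treat this as an assumption about the general technique, not the library's code.

```python
from typing import Callable, Iterable, List, Tuple


def penn_to_wordnet(tag: str) -> str:
    """Map a Penn Treebank POS tag to a WordNet POS letter (hypothetical helper)."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"  # fall back to noun for every other tag


def lemmatize_tagged(
    tagged_tokens: Iterable[Tuple[str, str]],
    lemmatize: Callable[[str, str], str],
) -> List[str]:
    """Lemmatize (word, penn_tag) pairs using a caller-supplied lemmatize(word, pos)."""
    return [lemmatize(word, penn_to_wordnet(tag)) for word, tag in tagged_tokens]
```

With NLTK and its data installed, `lemmatize` could be `WordNetLemmatizer().lemmatize`, and the tagged tokens could come from `nltk.pos_tag`. Without a POS hint, WordNet lemmatization treats every word as a noun, so verb forms such as "running" are typically left unchanged; that is the accuracy gap that tagging is meant to close, at the cost of the extra tagging pass.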