The rise of Large Language Models (LLMs) has transformed the field of natural language processing (NLP), offering a wide range of proprietary and open-source models that vary significantly in size and complexity, often measured in billions of parameters. While larger models excel at complex tasks such as summarization and creative text generation, smaller models are better suited to simpler tasks such as document classification and information extraction from unstructured data. This study evaluates open-source LLMs, specifically those with 7 to 14 billion parameters, on the task of extracting information from the OCR output of digitized documents. OCR quality can be degraded by factors such as skewed or blurred images, producing noisy, unstructured text. These models are particularly useful in Intelligent Process Automation (IPA), where software robots partially replace humans in validating and extracting information, improving efficiency and accuracy. The documents used in this research, provided by a state treasury department in Brazil, comprise personal verification documents. Results show that the open-source entry-level models perform only 18% below a cutting-edge proprietary model with trillions of parameters, making them viable free alternatives.