Extract Data from Bank Statements (PDF) into JSON files with the help of GPT4All / Llama3 LLM
List the PDFs on your drive that share a roughly similar layout, so that you can expect an LLM to extract their data in a systematic way
Formulate a concise prompt (and instruction) that pushes the LLM to return JSON with the same structure every time (Mistral seems to be very good at that)
Use the GPT4All wrapper to pass the document and the query to the LLM
Collect the responses
Extract the data from the JSON files, either with the help of Regex or by simply converting the JSON with KNIME nodes
Make sure all the JSON files have the same structure
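Outside of KNIME, the prompt, Regex extraction and structure check from the steps above could look roughly like the following Python sketch. It is only an illustration: the field names in EXPECTED_KEYS and the three helper functions are hypothetical and not part of the workflow, so adapt them to the fields your statements actually contain.

```python
import json
import re

# Hypothetical field names, chosen only for illustration.
EXPECTED_KEYS = ("account_holder", "iban", "statement_date", "closing_balance")


def build_prompt(document_text: str) -> str:
    """Combine a fixed instruction, a JSON template and one statement's text."""
    template = {key: "" for key in EXPECTED_KEYS}
    return (
        "Extract the following fields from the bank statement below. "
        "Answer with a single JSON object that has exactly these keys "
        "and nothing else:\n"
        + json.dumps(template, indent=2)
        + "\n\nBank statement:\n"
        + document_text
    )


def extract_json(response: str) -> dict:
    """Parse the span from the first '{' to the last '}' in the response."""
    # LLMs often wrap the JSON in extra sentences, so cut it out first.
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    return json.loads(match.group(0))


def has_expected_structure(record: dict) -> bool:
    """True if the record has exactly the expected top-level keys."""
    return set(record) == set(EXPECTED_KEYS)
```

Responses that fail has_expected_structure can be set aside and sent to the model again, so that only uniformly structured JSON reaches the later steps.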
=> You need to have GPT4All installed and a suitable model downloaded to your "gpt4all_models" folder. You can then choose the model in the component
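For readers who want to try the same setup outside the component, here is a minimal sketch using the gpt4all Python package. It assumes the package is installed and that the models in the folder are in .gguf format; list_models and ask_model are hypothetical helper names, and the KNIME component itself may work differently.

```python
from pathlib import Path


def list_models(models_dir: str) -> list[str]:
    """Return the file names of all .gguf models in the folder, sorted."""
    return sorted(p.name for p in Path(models_dir).glob("*.gguf"))


def ask_model(model_file: str, models_dir: str, prompt: str,
              max_tokens: int = 1024) -> str:
    """Load a local model and return its answer to one prompt."""
    # Imported here so that list_models() also works without the package.
    from gpt4all import GPT4All

    # allow_download=False: only use a model that is already in the folder.
    model = GPT4All(model_file, model_path=models_dir, allow_download=False)
    # A low temperature tends to give more repeatable output.
    return model.generate(prompt, max_tokens=max_tokens, temp=0.1)
```

A call such as ask_model(list_models("gpt4all_models")[0], "gpt4all_models", prompt) would then use the first model found in the folder.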