Use an LLM (Ollama / Llama 3.2) to process text data in chunks and collect structured output on your local machine
You need Ollama installed and ready to run on your machine; you should also have GPT4All installed.
This has been tested with Llama 3.2.
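A quick way to verify the setup is a minimal sketch like the one below, assuming the `ollama` Python client package (pip install ollama) and a local Ollama server running on its default port:

```python
# Minimal setup check: list the local models and pull Llama 3.2 if it is missing.
# Assumes the "ollama" Python client package and a running local Ollama server.
import ollama

print(ollama.list())      # models already available locally
ollama.pull("llama3.2")   # download the model if it is not present yet
```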
The text you want to clean up / process is stored in chunks as JSON structures so the LLM will understand the content and you can refer to the columns in the prompt.
Only 10 rows are sent at a time so as not to overflow the memory (context) of the model.
The question sent to the model consists of the prompt followed by the JSON chunk itself.
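A minimal sketch of these steps in Python, assuming a pandas DataFrame with a row_key column and the `ollama` client package; the prompt wording, chunk size constant, and the helper name process_in_chunks are illustrative only:

```python
import json

import ollama
import pandas as pd

CHUNK_SIZE = 10  # 10 rows per request so the model's context is not overflowed

# Illustrative prompt -- adapt the instructions and column names to your data.
PROMPT = (
    "You will receive rows of data as a JSON array. For each row, extract the "
    "requested fields and answer with a CSV structure including a header line. "
    "Always preserve the original row_key so the rows can be joined back later."
)


def process_in_chunks(df: pd.DataFrame, model: str = "llama3.2") -> list[str]:
    """Send the DataFrame to the local LLM in chunks of CHUNK_SIZE rows."""
    answers = []
    for start in range(0, len(df), CHUNK_SIZE):
        chunk = df.iloc[start:start + CHUNK_SIZE]
        # The question = the prompt followed by the JSON chunk itself.
        question = PROMPT + "\n\n" + json.dumps(
            chunk.to_dict(orient="records"), default=str
        )
        response = ollama.chat(
            model=model,
            messages=[{"role": "user", "content": question}],
        )
        answers.append(response["message"]["content"])
    return answers
```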
The result will either be a plain CSV structure or sometimes a CSV structure enclosed in backticks, so there is a fork in the workflow to account for both; you might have to adapt this for your own data.
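In plain Python that fork could look like the sketch below, which strips a backtick fence if one is present and otherwise parses the answer as CSV directly; the helper name is illustrative:

```python
import io
import re

import pandas as pd


def answer_to_dataframe(answer: str) -> pd.DataFrame:
    """Parse an answer that is either plain CSV or CSV enclosed in backticks."""
    match = re.search(r"```(?:csv)?\s*(.*?)```", answer, flags=re.DOTALL)
    csv_text = match.group(1) if match else answer
    return pd.read_csv(io.StringIO(csv_text.strip()))
```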
The columns from the extracted CSV are then written to actual CSV files and imported back.
The key is to tell the LLM to preserve the original row_key so the results can be joined back together.
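A small sketch of these last two steps, assuming the extracted results were collected into a pandas DataFrame that kept the row_key column; the file name and helper are illustrative:

```python
import pandas as pd


def write_and_join(original: pd.DataFrame, extracted: pd.DataFrame,
                   path: str = "extracted_results.csv") -> pd.DataFrame:
    """Write the extracted columns to a CSV file, read it back, and join via row_key."""
    extracted.to_csv(path, index=False)   # write the extracted columns to an actual CSV
    reimported = pd.read_csv(path)        # import the file back
    return original.merge(reimported, on="row_key", how="left")
```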
Obviously, some more work on the prompt or the data preprocessing might be necessary; you could also try to employ an LLM for that as well.
An Apple Silicon machine (M1) is fast enough to run this locally at a reasonable speed.