A meta collection about KNIME performance, performance tuning, and some common problems
Please be aware: there may not be one perfect setting for every task and system. Read the entries and pick the ones that fit your needs. Also be aware that clever settings often cannot substitute for a good setup and strong hardware. Having said that, there are some things you *could* check:
- overall performance of your system (allocated RAM, CPU, Disk). Give KNIME what you can but leave some for your system (cf. links)
- permissions: do you (and KNIME) have full access to all files? (There was an issue with macOS.)
- virus scanner: KNIME creates a lot of small files, so make sure no virus scanner is aggressively blocking them
- settings about data storage and Java heap space (and temporary folders)
- restart KNIME once with the -clean option in the knime.ini (cf. links)
- maybe do a clean new installation in a new folder (on Windows, maybe just unzip the ZIP version into a clean folder)
- check your storage/disk - be aware of possible issues with OneDrive [special characters like # and "(" ")" must be allowed/supported]
- think about a suitable backup strategy for your work. Things can go wrong. You might want to revert to an earlier version of your work
- research this collection or on the KNIME forum (forum.knime.com)
Start with the official blog about KNIME performance
https://www.knime.com/blog/optimizing-knime-workflows-for-performance
(yes, in theory such a blog might cover it all, but there is more to a large system like KNIME)
Check your installation (on a Windows machine use the .ZIP file if in doubt) - share debug log with the community or support
https://forum.knime.com/t/multiple-problems-have-occurred-message-pop-up-on-knime-startup/38655/2?u=mlauber71
Long entry and thread about performance and KNIME "Process 900+ CSV files"
https://forum.knime.com/t/processing-hundreds-of-millions-of-records/13593/5?u=mlauber71
Additional thread about performance and collection of links
https://forum.knime.com/t/large-data-tables-missing/13108/2?u=mlauber71
If you are stuck and KNIME is not responding or behaving strangely try a clean restart *once*
https://forum.knime.com/t/knime-update-next-button-is-not-moving/15539/4?u=mlauber71
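The "clean restart once" step can be sketched as a small text edit on the knime.ini. This is a minimal sketch, not an official procedure: it assumes the flag sits on its own line before -vmargs (the first line is used here), and the helper names are made up for illustration. Close KNIME and keep a copy of the original file before changing it.

```python
def add_clean_flag(ini_text):
    """Return knime.ini text with '-clean' as the first line (added only once)."""
    lines = ini_text.splitlines()
    # Assumption: launcher options such as -clean must come before -vmargs,
    # so putting the flag on the first line is a safe position.
    if "-clean" not in (line.strip() for line in lines):
        lines.insert(0, "-clean")
    return "\n".join(lines) + "\n"


def remove_clean_flag(ini_text):
    """Return knime.ini text without any '-clean' line (for after the one restart)."""
    lines = [line for line in ini_text.splitlines() if line.strip() != "-clean"]
    return "\n".join(lines) + "\n"
```

Apply add_clean_flag to the file content, start KNIME once, then write back the result of remove_clean_flag so the flag does not stay in place permanently.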
Write table to disk and check Java heap space as options to deal with performance issues
https://forum.knime.com/t/knime-3-6-crash-when-dealing-with-massive-data/12145/2?u=mlauber71
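To check the Java heap space, you can look for the -Xmx line in the knime.ini. The following is a minimal sketch that reads and changes that value in the file's text; the helper names and the sample content in the usage note are illustrative assumptions, not taken from a real installation.

```python
import re


def find_xmx(ini_text):
    """Return the maximum heap value (e.g. '8g') from knime.ini text, or None."""
    for line in ini_text.splitlines():
        match = re.fullmatch(r"-Xmx(\d+[kKmMgG]?)", line.strip())
        if match:
            return match.group(1)
    return None


def set_xmx(ini_text, new_value):
    """Return knime.ini text with the existing -Xmx line set to new_value."""
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("-Xmx"):
            lines[i] = f"-Xmx{new_value}"
            break
    else:
        # Assumption: JVM options follow -vmargs until the end of the file,
        # so appending at the end keeps the option in the right section.
        lines.append(f"-Xmx{new_value}")
    return "\n".join(lines) + "\n"
```

For example, set_xmx("-vmargs\n-Xmx4g\n", "12g") returns the same text with -Xmx12g. How much to allocate depends on your machine; leave enough RAM for the operating system.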
Check your virus scanner if it hinders performance
https://forum.knime.com/t/knime-takes-long-to-load/12669/4?u=mlauber71
Think about using the Cache node at certain points; you might also run the Java garbage collector to free memory
https://forum.knime.com/t/execute-failed-cannot-read-file-knime-container-20200301-1905559478659870280-bin-snappy/21461/19?u=mlauber71
https://hub.knime.com/search?q=garbage&type=Node
Also sometimes there are problems with these 'snappy' files. Switching the internal storage option might help
https://forum.knime.com/t/blob-and-buffer-errors/25915/2?u=mlauber71
Tweak the internal storage (format) of KNIME workflows
https://forum.knime.com/t/knime-is-slowing-down/12619/2?u=mlauber71
You can also try out the new "Columnar Table Backend" (based on Apache Arrow) for internal storage
https://www.knime.com/blog/improved-performance-with-new-table-backend?u=mlauber71
To speed up large data processing think about 'streaming' or using Parquet or ORC files (partitioned files)
https://forum.knime.com/t/problem-with-disk-space-during-workflow-execution/48435/6?u=mlauber71
Find the KNIME temporary files and folders
https://forum.knime.com/t/question-about-memory-policy-in-node-options/36839/2?u=mlauber71
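To look for leftover KNIME temporary files, a small script can list matching entries in the temp folder. This is a sketch under assumptions: the "knime_" prefix is inferred from file names such as the knime_container_... file mentioned in a thread title in this collection, and Python's temp folder may differ from the one the Java runtime uses (especially if you configured a custom temp directory for KNIME).

```python
import os
import tempfile


def list_knime_temp_entries(temp_dir=None, prefix="knime_"):
    """List files and folders in temp_dir whose names start with prefix."""
    # Assumption: KNIME temp entries start with 'knime_'; adjust if yours differ.
    temp_dir = temp_dir or tempfile.gettempdir()
    entries = []
    for entry in os.scandir(temp_dir):
        if entry.name.startswith(prefix):
            entries.append(entry.path)
    return sorted(entries)
```

Calling list_knime_temp_entries() without arguments inspects the default temp folder; pass temp_dir explicitly if KNIME writes somewhere else.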
Be aware that some configurations of MS OneDrive might not support all the special characters, like hash (#), that KNIME workflows need
https://forum.knime.com/t/can-not-import-workflow/11115/4?u=mlauber71
https://forum.knime.com/t/can-not-open-knime-workflows-nor-create-new-workflows-after-migration-to-a-new-drive/19058/4?u=mlauber71
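To see which files and folders of a workflow contain such characters, a short scan can help before moving a workspace to a synced drive. This is a minimal sketch; the function name and the default character set "#()" are assumptions based on the characters named in this collection.

```python
import os


def find_risky_paths(root, risky_chars="#()"):
    """Return paths below root whose file or folder name contains a risky character."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if any(ch in name for ch in risky_chars):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Node folders inside a workflow typically look like "CSV Reader (#1)", so expect hits in a normal workflow. The point is to know which names the storage or sync service must be able to handle, not to rename them.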
On MacOS make sure that KNIME has all the necessary access rights
https://forum.knime.com/t/read-permission-error-in-csv-reader/20879/24?u=mlauber71
https://forum.knime.com/t/macos-catalina-cannot-read-from-file-system/24472/3?u=mlauber71
=> that is also true for other systems of course
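A quick way to test basic access to a file or folder is sketched below. It only checks classic file permissions for the user running the script; operating-system privacy settings (such as those on macOS) are a separate layer and are assumed to be checked in the system settings. The function name is made up for illustration.

```python
import os


def check_access(path):
    """Report whether path exists and is readable/writable for the current user."""
    return {
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }
```

Run it on the workspace folder and on the input files a reader node complains about, ideally as the same user that starts KNIME.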
The Column Rename node is not a good way to change the column type (sorry)
https://forum.knime.com/t/knime-nodes-corrupting-on-multiple-computers/13277/7?u=mlauber71
KNIME and Spark 1
https://forum.knime.com/t/how-to-speed-up-the-spark-to-datebase-node/13288/2?u=mlauber71
KNIME and Spark 2
https://forum.knime.com/t/spark-context-how-to-cache-the-intermediate-data-and-not-write-it-back-while-doing-a-series-of-transformations-on-the-data/20881/4?u=mlauber71
Streaming data in KNIME
https://www.knime.com/blog/streaming-data-in-knime
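The idea behind streaming (rows are passed on to the next step right away instead of writing a full intermediate table first) can be illustrated outside KNIME with Python generators. This is only an analogy under stated assumptions, not how KNIME implements streaming; the column name "amount" and the function names are hypothetical.

```python
import csv
import io


def read_rows(handle):
    """Yield one CSV row at a time instead of loading the whole table."""
    for row in csv.DictReader(handle):
        yield row


def keep_large(rows, threshold):
    """Pass on only rows whose (hypothetical) 'amount' column reaches the threshold."""
    for row in rows:
        if float(row["amount"]) >= threshold:
            yield row


def total(rows):
    """Consume the stream and sum the 'amount' column."""
    return sum(float(row["amount"]) for row in rows)


if __name__ == "__main__":
    data = io.StringIO("id,amount\n1,5\n2,50\n3,500\n")
    # Each row flows through all three steps; no intermediate table is stored.
    print(total(keep_large(read_rows(data), 10)))
```

Because no step keeps the whole table, memory use stays small even for large inputs. The trade-off is that you cannot inspect intermediate results.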
Save and don't save data in a workflow
https://forum.knime.com/t/save-workflow-without-data/27843/2?u=mlauber71
Create your own backup system with KNIME and ZIP
https://forum.knime.com/t/copy-move-files-node/29412/15?u=mlauber71
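Outside of KNIME, the same idea (a time-stamped ZIP copy of a workflow folder) can be sketched in a few lines. This is a minimal sketch, not the approach from the linked thread; the function name, the naming scheme, and the folder layout are assumptions.

```python
import os
import shutil
from datetime import datetime


def backup_workflow(workflow_dir, backup_dir, now=None):
    """Zip workflow_dir into backup_dir as <name>_<YYYYmmdd_HHMMSS>.zip and return the path."""
    now = now or datetime.now()
    workflow_dir = os.path.abspath(workflow_dir)
    backup_dir = os.path.abspath(backup_dir)
    # Assumption: backup_dir is located outside of workflow_dir.
    os.makedirs(backup_dir, exist_ok=True)
    name = os.path.basename(workflow_dir)
    base = os.path.join(backup_dir, f"{name}_{now:%Y%m%d_%H%M%S}")
    # root_dir/base_dir keep the workflow folder name as the top level inside the ZIP.
    return shutil.make_archive(
        base, "zip", root_dir=os.path.dirname(workflow_dir), base_dir=name
    )
```

Close the workflow in KNIME before backing it up, so that the files on disk are in a consistent state.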
KNIME and Backup - think about protecting your work(flows)
https://forum.knime.com/t/restore-knime-workflows/36812/2?u=mlauber71
Create path variables in KNIME
https://forum.knime.com/t/write-multiple-csv-file-to-a-loop/44989/2?u=mlauber71
External resources
- Start with the official blog about KNIME performance
- Additional thread about performance and collection of links
- If you are stuck and KNIME is not responding or behaving strangely try a clean restart *once*
- Write table to disk and check Java heap space as options to deal with performance issues
- Check your virus scanner if it hinders performance
- Think about using the Cache node at certain points and you might run the Java Garbage Collector to free memory
- Tweak the internal storage (format) of KNIME workflows
- Be aware that some configurations of MS OneDrive might not support all the special characters, like hash (#), that KNIME workflows need
- Also sometimes there are problems with these 'snappy' files. Switching the internal storage option might help
- You can also try out the new "Columnar Table Backend" (based on Apache Arrow) for internal storage
- On MacOS make sure that KNIME has all the necessary access rights
- KNIME and Spark 1
- KNIME and Spark 2
- Streaming data in KNIME
- Save and don't save data in a workflow
- Create your own backup system with KNIME and ZIP
- Column rename node is not good to change the column type (sorry)
- KNIME and Backup - think about protecting your work(flows)
- Long entry and thread about performance and KNIME "Process 900+ CSV files"
- Check your installation (on a Windows machine use the .ZIP file if in doubt) - share debug log with the community or support
- To speed up large data processing think about 'streaming' or using Parquet or ORC files (partitioned files)
- Find the KNIME temporary files and folders
- Create path variables in KNIME
- BLOG: Mastering KNIME: Unlocking Peak Performance with Expert Tips and Smart Setting
Used extensions & nodes
All required extensions are part of the default installation of KNIME Analytics Platform version 4.7.2