A meta collection about KNIME performance, performance tuning and some related problems
Please be aware: there might not be one perfect setting for every task and system. Read the entries and pick the ones that fit your needs. Also be aware that clever settings are often no substitute for a good setup and strong hardware. Having said that, there are some things you *could* check out:
- overall performance of your system (allocated RAM, CPU, Disk). Give KNIME what you can but leave some for your system (cf. links)
- permissions: do you (and KNIME) have full access to all files? (there was an issue with macOS)
- virus scanner - KNIME has a lot of small files, make sure no virus scanner is aggressively blocking them
- settings about data storage and Java heap space (and temporary folders)
- restart KNIME once with the -clean option in the knime.ini (cf. links)
- maybe do a clean new installation in a new folder (on Windows, maybe just unzip the ZIP version into a clean folder)
- check your storage/disk - be aware of possible issues with OneDrive [special characters like # and "(" ")" must be allowed/supported]
- think about a suitable backup strategy for your work. Things can go wrong. You might want to revert to an earlier version of your work
- search this collection or the KNIME forum (forum.knime.com)
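To check the heap space and -clean items from the list above, it can help to look at what the knime.ini currently contains. The following is a minimal sketch, assuming the usual Eclipse launcher layout where launcher options such as -clean come before the -vmargs line and JVM options such as -Xmx come after it. The function name `inspect_knime_ini` and the sample content are made up for illustration; your real knime.ini will contain more lines.

```python
import re


def inspect_knime_ini(text):
    """Report the -Xmx value and whether -clean is set in knime.ini content."""
    launcher_args = []  # lines before -vmargs (launcher options)
    jvm_args = []       # lines after -vmargs (passed on to the Java VM)
    in_vmargs = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "-vmargs":
            in_vmargs = True
            continue
        (jvm_args if in_vmargs else launcher_args).append(line)

    xmx = None
    for arg in jvm_args:
        match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", arg)
        if match:
            # If -Xmx appears more than once, keep the last one found.
            xmx = match.group(1) + match.group(2).lower()
    return {"xmx": xmx, "clean": "-clean" in launcher_args}


# Hypothetical, shortened knime.ini content for illustration only.
SAMPLE = """-clean
-vmargs
-Xmx8g
-Dfile.encoding=UTF-8
"""
```

Calling `inspect_knime_ini(SAMPLE)` returns a small dictionary with the configured maximum heap and a flag for -clean. Remember to remove -clean again after one restart, since it is only meant to be used once.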
Start with the official blog about KNIME performance
https://www.knime.com/blog/optimizing-knime-workflows-for-performance
(yes, in theory such a blog might cover it all, but there is more to a large system like KNIME)
Check your installation (on a Windows machine use the .ZIP file if in doubt) - share debug log with the community or support
https://forum.knime.com/t/multiple-problems-have-occurred-message-pop-up-on-knime-startup/38655/2?u=mlauber71
Long entry and thread about performance and KNIME "Process 900+ CSV files"
https://forum.knime.com/t/processing-hundreds-of-millions-of-records/13593/5?u=mlauber71
Additional thread about performance and collection of links
https://forum.knime.com/t/large-data-tables-missing/13108/2?u=mlauber71
If you are stuck and KNIME is not responding or behaving strangely try a clean restart *once*
https://forum.knime.com/t/knime-update-next-button-is-not-moving/15539/4?u=mlauber71
Write table to disk and check Java heap space as options to deal with performance issues
https://forum.knime.com/t/knime-3-6-crash-when-dealing-with-massive-data/12145/2?u=mlauber71
Check your virus scanner if it hinders performance
https://forum.knime.com/t/knime-takes-long-to-load/12669/4?u=mlauber71
Think about using the Cache node at certain points and you might run the Java Garbage Collector to free memory
https://forum.knime.com/t/execute-failed-cannot-read-file-knime-container-20200301-1905559478659870280-bin-snappy/21461/19?u=mlauber71
https://hub.knime.com/search?q=garbage&type=Node
Also sometimes there are problems with these 'snappy' files. Switching the internal storage option might help
https://forum.knime.com/t/blob-and-buffer-errors/25915/2?u=mlauber71
Tweak the internal storage (format) of KNIME workflows
https://forum.knime.com/t/knime-is-slowing-down/12619/2?u=mlauber71
You can also try out the new "Columnar Table Backend" (based on Apache Arrow) for internal storage
https://www.knime.com/blog/improved-performance-with-new-table-backend?u=mlauber71
To speed up large data processing think about 'streaming' or using Parquet or ORC files (partitioned files)
https://forum.knime.com/t/problem-with-disk-space-during-workflow-execution/48435/6?u=mlauber71
Find the KNIME temporary files and folders
https://forum.knime.com/t/question-about-memory-policy-in-node-options/36839/2?u=mlauber71
Be aware that some configurations of MS OneDrive might not support all the special characters like hash (#) that KNIME workflows need
https://forum.knime.com/t/can-not-import-workflow/11115/4?u=mlauber71
https://forum.knime.com/t/can-not-open-knime-workflows-nor-create-new-workflows-after-migration-to-a-new-drive/19058/4?u=mlauber71
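If you suspect that a synced drive is the problem, one way to get an overview is to list the folders and files whose names contain the characters mentioned above. This is a minimal sketch done outside of KNIME, not something taken from the linked threads. The function name `find_risky_paths` and the character set `#()` are assumptions for illustration; which characters actually cause trouble depends on your OneDrive configuration.

```python
import os

# Characters named in this collection; adjust to what your storage rejects.
RISKY_CHARS = "#()"


def find_risky_paths(root, risky=RISKY_CHARS):
    """Return paths (relative to root) whose last component contains a risky character."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if any(ch in name for ch in risky):
                full_path = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full_path, root))
    return sorted(hits)
```

Pointing `find_risky_paths` at a workspace folder will typically return many entries, since node folders inside a workflow usually carry names like `CSV Reader (#1)`. The list only shows where such names occur; it does not tell you whether your drive handles them correctly.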
On macOS, make sure that KNIME has all the necessary access rights
https://forum.knime.com/t/read-permission-error-in-csv-reader/20879/24?u=mlauber71
https://forum.knime.com/t/macos-catalina-cannot-read-from-file-system/24472/3?u=mlauber71
=> that is also true for other systems of course
The Column Rename node is not a good way to change the column type (sorry)
https://forum.knime.com/t/knime-nodes-corrupting-on-multiple-computers/13277/7?u=mlauber71
KNIME and Spark 1
https://forum.knime.com/t/how-to-speed-up-the-spark-to-datebase-node/13288/2?u=mlauber71
KNIME and Spark 2
https://forum.knime.com/t/spark-context-how-to-cache-the-intermediate-data-and-not-write-it-back-while-doing-a-series-of-transformations-on-the-data/20881/4?u=mlauber71
Streaming data in KNIME
https://www.knime.com/blog/streaming-data-in-knime
Save and don't save data in a workflow
https://forum.knime.com/t/save-workflow-without-data/27843/2?u=mlauber71
Create your own backup system with KNIME and ZIP
https://forum.knime.com/t/copy-move-files-node/29412/15?u=mlauber71
KNIME and Backup - think about protecting your work(flows)
https://forum.knime.com/t/restore-knime-workflows/36812/2?u=mlauber71
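The linked threads describe backups built with KNIME itself. As an alternative outside of KNIME, a timestamped ZIP copy of a workflow or workspace folder can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `backup_folder` and the archive naming scheme are made up for illustration, and the backup folder should not sit inside the folder being backed up.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_folder(source, backup_dir, now=None):
    """Zip the folder `source` into `backup_dir` using a timestamped archive name."""
    source = Path(source).resolve()
    backup_dir = Path(backup_dir).resolve()
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    base = backup_dir / f"{source.name}_{stamp}"

    # The archive contains the folder itself as its top-level entry.
    shutil.make_archive(str(base), "zip", root_dir=source.parent, base_dir=source.name)
    return Path(str(base) + ".zip")
```

Each call to `backup_folder` creates a new archive, so earlier versions stay available until you delete them. It is probably safer to close KNIME (or at least the workflow) before copying, so that no files change while the archive is written.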
Create path variables in KNIME
https://forum.knime.com/t/write-multiple-csv-file-to-a-loop/44989/2?u=mlauber71
External resources
- Start with the official blog about KNIME performance
- Additional thread about performance and collection of links
- If you are stuck and KNIME is not responding or behaving strangely try a clean restart *once*
- Write table to disk and check Java heap space as options to deal with performance issues
- Check your virus scanner if it hinders performance
- Think about using the Cache node at certain points and you might run the Java Garbage Collector to free memory
- Tweak the internal storage (format) of KNIME workflows
- Be aware that some configurations of MS OneDrive might not support all the special characters like hash (#) that KNIME workflows need
- Also sometimes there are problems with these 'snappy' files. Switching the internal storage option might help
- You can also try out the new "Columnar Table Backend" (based on Apache Arrow) for internal storage
- On macOS, make sure that KNIME has all the necessary access rights
- KNIME and Spark 1
- KNIME and Spark 2
- Streaming data in KNIME
- Save and don't save data in a workflow
- Create your own backup system with KNIME and ZIP
- The Column Rename node is not a good way to change the column type (sorry)
- KNIME and Backup - think about protecting your work(flows)
- Long entry and thread about performance and KNIME "Process 900+ CSV files"
- Check your installation (on a Windows machine use the .ZIP file if in doubt) - share debug log with the community or support
- To speed up large data processing think about 'streaming' or using Parquet or ORC files (partitioned files)
- Find the KNIME temporary files and folders
- Create path variables in KNIME
- BLOG: Mastering KNIME: Unlocking Peak Performance with Expert Tips and Smart Setting
Used extensions & nodes
All required extensions are part of the default installation of KNIME Analytics Platform version 4.7.2