This workflow demonstrates how to perform sentiment analysis by fine-tuning Google's BERT network. The idea is straightforward: a small classification MLP is applied on top of BERT, which is downloaded from TensorFlow Hub. The full network is then trained end-to-end on the task at hand. After 1 epoch of training, the network should already reach more than 85% accuracy on the test set.

Once training is completed, the "Visualize Before vs After" component shows the difference between the BERT embeddings before and after training. You should see that training introduced a much clearer separation between the classes. The view also lets you experiment interactively with different classification thresholds.

The dataset used here consists of the first 10000 reviews in the IMDB Movie Reviews dataset (http://ai.stanford.edu/~amaas/data/sentiment/) from "Learning Word Vectors for Sentiment Analysis" by Maas et al. If you want to train a better model, we recommend downloading the full dataset and training on it instead of the subset that comes with the workflow.

Additional Notes:

The red flow variable connections enforce sequential execution of the nodes that use TensorFlow, which prevents memory issues (especially if you are using a GPU).

If you wish to track your training progress, go to File->Preferences->KNIME->KNIME GUI and set the console log level to Info. You can then monitor the status of the training in the console view (typically at the bottom right of the KNIME workbench).

Required KNIME extensions:
- KNIME Python Integration
- KNIME Deep Learning - Keras Integration
- KNIME Deep Learning - TensorFlow 2 Integration
- KNIME Statistics Nodes (Labs)
- KNIME Machine Learning Interpretability Extension

Required Python packages (need to be available in your TensorFlow 2 Python environment):
- tensorflow_hub
- bert-for-tf2
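To illustrate what a classification threshold controls in a binary sentiment classifier like the one described above, here is a minimal plain-Python sketch. It is not taken from the workflow: the scores and labels are made-up values, the helper names predict_labels and accuracy are hypothetical, and it assumes the classifier emits one score between 0 and 1 per review, with higher values meaning "positive".

```python
def predict_labels(scores, threshold):
    """Label a review positive (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up classifier scores and true labels (1 = positive, 0 = negative).
# These are illustrative values, not output from the workflow.
scores = [0.95, 0.80, 0.62, 0.45, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

# Sweep a few thresholds to see how the predictions and accuracy change.
for threshold in (0.3, 0.5, 0.7):
    predicted = predict_labels(scores, threshold)
    print(f"threshold={threshold:.1f} "
          f"predicted={predicted} "
          f"accuracy={accuracy(predicted, labels):.2f}")
```

Raising the threshold makes the classifier more conservative about predicting "positive", and lowering it does the opposite. Which value works best depends on the score distribution of the trained model.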
Created with KNIME Analytics Platform version 4.2.0.