
Challenge 21 - Summarize KNIME Forum Topics

Tags: Justknimeit, RESTful APIs, Text Processing, GenAI, JKISeason4-21
Just KNIME It
Draft. Latest edits on Oct 7, 2025 10:42 AM

Challenge 21: Summarize KNIME Forum Topics

Level: Hard

Description: Dive into the world of data science with our advanced KNIME challenge, where you'll explore the intricacies of text processing and visualization using the KNIME Analytics Platform. This challenge is designed for those who are ready to tackle complex workflows that combine multiple data science techniques, including web scraping, text processing, and visualization. Participants will have the opportunity to work with real-world data from the KNIME Forum, extracting and analyzing the latest topics to generate insightful visualizations. This challenge is perfect for those looking to enhance their skills in handling JSON data, working with APIs, and leveraging advanced text processing techniques.

Beginner-friendly objective(s): 1. Set up the initial data retrieval process by configuring the GET Request node to fetch the latest topics from the KNIME Forum. 2. Parse the JSON response to extract topic IDs. 3. Using the topic IDs extracted in the previous step, send a second request per topic to retrieve all posts in that topic, along with relevant topic details such as title and author.
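Outside of KNIME, the two-step retrieval in objectives 1 to 3 can be sketched in Python. This is only an illustration, not the workflow's implementation. It assumes the response shapes described in the Discourse API documentation: `topic_list.topics` in `latest.json`, and `post_stream.posts` with `cooked` HTML in the per-topic response at `/t/{id}.json`. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://forum.knime.com"


def fetch_json(url):
    """Fetch a URL and parse the body as JSON (plays the role of GET Request)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def extract_topic_ids(latest):
    """Return the topic IDs from a parsed latest.json response.

    Assumes the Discourse layout {"topic_list": {"topics": [{"id": ...}, ...]}}.
    """
    return [t["id"] for t in latest.get("topic_list", {}).get("topics", [])]


def topic_url(topic_id, base_url=BASE_URL):
    """Build the per-topic request URL (assumed Discourse route /t/{id}.json)."""
    return f"{base_url}/t/{topic_id}.json"


def extract_posts(topic):
    """Flatten one parsed topic response into one row per post."""
    rows = []
    for post in topic.get("post_stream", {}).get("posts", []):
        rows.append({
            "topic_id": topic.get("id"),
            "title": topic.get("title"),
            "post_number": post.get("post_number"),
            "username": post.get("username"),
            "cooked": post.get("cooked", ""),  # post body as HTML
        })
    return rows
```

A run against the live forum would call `fetch_json(BASE_URL + "/latest.json")`, pass the result to `extract_topic_ids`, and then call `fetch_json(topic_url(i))` and `extract_posts` for each ID.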


Intermediate-friendly objective(s): 4. Implement text processing techniques to clean and prepare the extracted data, including removing HTML tags, punctuation, and stop words. 5. Create a visualization of the most frequent bigrams using the NGram Creator and Tag Cloud nodes.
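As a rough illustration of objectives 4 and 5, the cleaning and bigram counting can be sketched in plain Python. This is not how the KNIME text processing nodes are implemented. The regular expressions and the tiny `STOP_WORDS` set are assumptions made for the example, and a real stop word list would be much longer.

```python
import re
from collections import Counter
from html import unescape

# Hypothetical, deliberately small stop word list for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to", "in", "of", "and", "i", "it", "my", "with"}


def clean_tokens(html_text):
    """Strip HTML tags, punctuation, and stop words; return lowercase tokens."""
    text = re.sub(r"<[^>]+>", " ", html_text)      # remove markup tags
    text = unescape(text).lower()                  # decode entities such as &amp;
    text = re.sub(r"[^\w\s]", " ", text)           # replace punctuation with spaces
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def bigram_counts(tokens):
    """Count adjacent token pairs (bigrams)."""
    return Counter(zip(tokens, tokens[1:]))
```

The resulting counts correspond to the weights a tag cloud would use to size each bigram. Because stop words are removed before pairing, a bigram can span a position where a stop word used to be.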

Advanced objective(s): 6. Integrate LLMs to summarize conversations from the forum topics, showcasing the power of LLMs in text analysis. 7. Develop a comprehensive visualization using the KNIME View nodes to display the summarized topics alongside their associated tag cloud and metadata.
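For objective 6, one possible shape of the summarization step is sketched below. The prompt wording, the character limit, and the function names are assumptions, not taken from the workflow. The model call is passed in as a plain callable `llm` so the sketch does not depend on any particular provider's client library.

```python
def build_summary_prompt(title, posts, max_chars=4000):
    """Build one prompt per topic from its posts.

    `posts` is a list of dicts with "username" and "text" keys (hypothetical
    layout); the conversation is truncated to `max_chars` characters.
    """
    lines = [f"{p['username']}: {p['text']}" for p in posts]
    conversation = "\n".join(lines)[:max_chars]
    return (
        "Summarize the following KNIME Forum conversation in 2-3 sentences.\n"
        f"Topic: {title}\n\n{conversation}"
    )


def summarize_topics(topics, llm):
    """Map each topic ID to the summary returned by `llm(prompt)`."""
    return {
        t["topic_id"]: llm(build_summary_prompt(t["title"], t["posts"]))
        for t in topics
    }
```

Keying the summaries by topic ID makes it straightforward to match them back to the topic metadata later.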

Hints:

Discourse API documentation: https://docs.discourse.org/

Initial request URL (Objective 1): https://forum.knime.com/latest.json (as documented at https://docs.discourse.org/#tag/Topics/operation/listLatestTopics)

Solution Summary: The solution involves a comprehensive workflow that begins with fetching the latest topics from the KNIME Forum using a GET Request node. The JSON response is parsed to extract topic details, which are then processed to remove unnecessary elements like HTML tags and stop words. The workflow leverages OpenAI's language model to summarize conversations, and the results are visualized using a Tag Cloud and Tile View to provide an interactive and insightful representation of the data. This solution showcases the integration of web scraping, text processing, and advanced visualization techniques within KNIME.

Solution Details: The workflow starts with a GET Request node configured to fetch the latest topics from the KNIME Forum. The JSON response is processed using a JSON Path node to extract topic IDs, titles, and authors. A Group Loop Start node is used to iterate over the extracted data, grouping it by topic ID, title, and author. The JSON Path node is employed again to parse additional details from the topic URLs, such as post numbers and usernames. Text processing nodes, including the Markup Tag Filter, Punctuation Erasure, and Stop Word Filter, are used to clean the text data by removing HTML tags, punctuation, and stop words. The NGram Creator node generates bigrams, which are visualized using the Tag Cloud node. The OpenAI Authenticator and LLM Selector nodes are configured to authenticate and select a language model for summarizing conversations. The summarized text is then joined with the original data using a Joiner node, and the final visualization is created using the Tile View node, displaying the summarized topics alongside their images and metadata. This detailed workflow demonstrates the integration of multiple data science techniques to achieve a comprehensive analysis and visualization of forum topics.
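The join of summaries back onto the topic data can be pictured with a small Python sketch. It assumes an inner join on a `topic_id` key, and the field names are hypothetical. In the workflow itself this step is done by the Joiner node.

```python
def join_on_topic_id(topics, summaries):
    """Inner-join topic rows with summary rows on "topic_id"."""
    summary_by_id = {s["topic_id"]: s["summary"] for s in summaries}
    return [
        {**t, "summary": summary_by_id[t["topic_id"]]}
        for t in topics
        if t["topic_id"] in summary_by_id  # topics without a summary are dropped
    ]
```

Each joined row then carries everything a tile needs: title, author, and summary.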

External resources

  • Discourse API Documentation

Used extensions & nodes

Created with KNIME Analytics Platform version 5.5.1
  • KNIME AI Extension (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.1
  • KNIME Base nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.1
  • KNIME Expressions (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.1
  • KNIME JavaScript Views (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.0
  • KNIME JSON-Processing (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.0
  • KNIME Quick Forms (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.1
  • KNIME REST Client Extension (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.0
  • KNIME Textprocessing (trusted extension), KNIME AG, Zurich, Switzerland, version 5.5.0

