Databricks Unity File System

Databricks, Spark, File handling, Release 5.3
Version v1.0 (KNIME AP 5.3 Release, latest), created on Jul 10, 2024 11:11 AM

This workflow demonstrates KNIME's capability to connect with Databricks Unity Volumes, part of the Unity Catalog framework. It enables users to read and write files from and to Databricks Unity Volumes.
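
Files in a Unity Catalog volume are addressed with paths of the form /Volumes/<catalog>/<schema>/<volume>/<path>. The following is a minimal sketch of how such paths could be built for the files a workflow like this one writes. The helper `volume_path` and the catalog, schema, volume, and file names are hypothetical and do not come from the workflow itself.

```python
import posixpath


def volume_path(catalog, schema, volume, *parts):
    """Build a Unity Catalog volume path: /Volumes/<catalog>/<schema>/<volume>/<parts...>."""
    for name in (catalog, schema, volume):
        if not name or "/" in name:
            raise ValueError(f"invalid name component: {name!r}")
    # Strip surrounding slashes so a leading "/" cannot reset the joined path.
    cleaned = [p.strip("/") for p in parts if p.strip("/")]
    return posixpath.join("/Volumes", catalog, schema, volume, *cleaned)


# Hypothetical target paths for thirty Parquet files (names are assumptions).
targets = [
    volume_path("main", "weather", "landing", f"weather_{i:02d}.parquet")
    for i in range(1, 31)
]
print(targets[0])
```

In KNIME, the Databricks Unity File System Connector node handles this addressing; the sketch only illustrates the path layout.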

The use case presented here involves writing monthly Excel files containing daily weather information for different locations into the Databricks Unity Volume as Parquet files. The data is then read back, and a simple linear regression model is trained in Spark.

For more information about Databricks Unity Catalog and Databricks Unity Volumes, please refer to the "External resources" links.

You can download the workflow and run it on your local machine. For best results, use the latest version of KNIME Analytics Platform (KNIME AP).

Workflow Requirements

To run the workflow locally, you will need:

  1. A Databricks account

  2. An existing Databricks cluster

Workflow Details

  1. Connecting to Databricks Unity Volume

    • First, we use the Databricks Unity File System Connector node to connect to the Databricks Unity Volume where we want to read and write files.

  2. Writing Data to Unity Volume

    • The use case involves taking thirty generated Excel files with synthetic weather information from 1000 locations and writing them into the Databricks Unity Volume as Parquet files.

  3. Creating a Spark Context

    • We create a Spark context using the Create Databricks Environment node and read the previously generated Parquet files with the Parquet to Spark node, creating a DataFrame in Spark.

  4. Data Manipulation and Modeling

    • We manipulate the data in the Spark context using the KNIME Extension for Apache Spark nodes. This includes filtering out rows with missing values, splitting and normalizing the DataFrame, and training a linear regression model.

  5. Model Evaluation

    • Finally, we use the Spark Numeric Scorer node to evaluate how well the linear regression model predicts rainfall from the selected features, and then shut down the Spark context.
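
The data manipulation, modeling, and evaluation steps above can be sketched in plain Python. This is not what the KNIME Spark nodes execute; it is a small illustration of the same sequence (filter missing values, split, normalize, fit a linear regression, score) under stated assumptions. The single `humidity` feature, the 80/20 split, min-max normalization, and the noise-free data are all made up for the example.

```python
import math
import random

# Made-up (humidity, rainfall) rows: rainfall = 0.5 * humidity - 10,
# plus two rows with missing values.
rows = [(float(h), 0.5 * h - 10.0) for h in range(20, 100, 5)]
rows += [(None, 3.0), (55.0, None)]


def drop_missing(data):
    """Keep only rows in which no value is missing (None)."""
    return [r for r in data if all(v is not None for v in r)]


def split(data, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into train and test partitions."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def min_max(values):
    """Return the minimum and the span used for min-max normalization."""
    lo, hi = min(values), max(values)
    return lo, (hi - lo) or 1.0


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def score(actual, predicted):
    """Compute R^2, mean absolute error, and root mean squared error."""
    n = len(actual)
    mean_a = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    return {"r2": 1.0 - ss_res / ss_tot, "mae": mae, "rmse": math.sqrt(ss_res / n)}


clean = drop_missing(rows)
train, test = split(clean)

# Normalization parameters come from the training partition only.
lo, span = min_max([x for x, _ in train])


def scale(x):
    return (x - lo) / span


slope, intercept = fit_line([scale(x) for x, _ in train], [y for _, y in train])
predicted = [slope * scale(x) + intercept for x, _ in test]
metrics = score([y for _, y in test], predicted)
print(metrics)
```

Deriving the normalization parameters from the training partition and reusing them on the test partition keeps information from the test rows out of the model fit.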

External resources

  • KNIME Docs: Read and write from or to a connected file system
  • KNIME Docs: File Folder Utility nodes
  • KNIME Docs: KNIME Analytics Platform and file systems
  • Databricks Docs: What are Unity Catalog volumes?
  • Databricks Docs: What is Unity Catalog?

Used extensions & nodes

Created with KNIME Analytics Platform version 5.3.0
  • KNIME Base nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0
  • KNIME Databricks Integration (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0
  • KNIME Data Generation (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0
  • KNIME Excel Support (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0
  • KNIME Extension for Apache Spark (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0
  • KNIME Extension for Big Data File Formats (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0
  • KNIME Javasnippet (trusted extension), KNIME AG, Zurich, Switzerland, version 5.3.0

Legal

By using or downloading the workflow, you agree to our terms and conditions.
