
FINAL - Movie Recommendation Model (CF)

Tags: Movie Recommendation System, Collaborative Filtering, Spark
Draft. Latest edits on May 23, 2025 3:35 AM

Movie Recommendation System

Author(s)

  • Tylah Jenkins (14248037)

  • Dayhe Kwon (25531308)

  • Aabhii Taneja (13568821)

  • Harshit Setia (25206804)

Date: 01/08/2025

Overview & Purpose

This workflow builds a simple movie recommendation system using the MovieLens 100k dataset to predict user preferences and movie ratings.

Data Used

The workflow uses the following MovieLens 100k dataset files: u.data (for the final working model), the uX.base/uX.test files (only if running the hyperparameter tuning), and u.user/u.item (only if running the hybrid model).

  • File location: the user's Desktop is the most convenient place to store the data files.
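
As a rough illustration of the expected input, the sketch below parses rows in the u.data layout (tab-separated user id, item id, rating, timestamp) with the Python standard library. The `Rating` field names and the `read_ratings` helper are assumptions made for this example, not the column names used inside the workflow, and the two sample rows are made-up values.

```python
import csv
from io import StringIO
from typing import NamedTuple, TextIO


class Rating(NamedTuple):
    # Field names are illustrative; the workflow's renamed columns may differ.
    user_id: int
    item_id: int
    rating: int
    timestamp: int


def read_ratings(handle: TextIO) -> list[Rating]:
    """Parse tab-separated rows of user id, item id, rating, timestamp."""
    reader = csv.reader(handle, delimiter="\t")
    # Skip empty rows so a trailing newline does not break the parse.
    return [Rating(*(int(field) for field in row)) for row in reader if row]


# Two made-up rows in the u.data layout (not real MovieLens records).
sample = StringIO("1\t10\t4\t880000000\n2\t20\t5\t880000001\n")
ratings = read_ratings(sample)
```

To read a real file, the same function could be given a handle from `open(path, newline="")`, where `path` points at wherever u.data was saved.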

Methodology

Workflow steps include:

  • Data loading (ingestion) and initial inspection

  • Cleaning & preprocessing (column renaming only; the data was cleaned at source and that cleaning is trusted)

  • Data splitting (80/20 train/test partition) and test/validation set checks

  • Spark environment initialisation and required transformations

  • Model training/prediction (Collaborative Filtering)

  • Output generation (predicted vs actual)

  • Evaluation & Analysis
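
The splitting and training steps above can be sketched outside KNIME as follows. This is a minimal illustration under stated assumptions, not the workflow's implementation: the workflow trains its model with the KNIME Spark nodes, whereas this sketch swaps in a plain stochastic gradient descent matrix factorisation so that it runs without Spark. All function names, hyperparameter values (`rank`, `epochs`, `lr`, `reg`), the fallback prediction of 3.0, and the toy ratings are hypothetical.

```python
import random


def split_ratings(rows, train_fraction=0.8, seed=42):
    """Shuffle with a fixed seed, then cut into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def predict(user_factors, item_factors, user, item, default=3.0):
    """Dot product of the factor vectors; fall back to a default for unseen ids."""
    if user not in user_factors or item not in item_factors:
        return default
    return sum(p * q for p, q in zip(user_factors[user], item_factors[item]))


def train_factors(train, rank=2, epochs=200, lr=0.01, reg=0.02, seed=0):
    """Learn user and item factor vectors with SGD (a stand-in, not Spark)."""
    rng = random.Random(seed)
    user_factors, item_factors = {}, {}
    for user, item, _ in train:
        user_factors.setdefault(user, [rng.uniform(0.1, 0.5) for _ in range(rank)])
        item_factors.setdefault(item, [rng.uniform(0.1, 0.5) for _ in range(rank)])
    for _ in range(epochs):
        for user, item, rating in train:
            p, q = user_factors[user], item_factors[item]
            err = rating - sum(a * b for a, b in zip(p, q))
            for f in range(rank):
                p_f, q_f = p[f], q[f]
                # Gradient step with L2 regularisation on both vectors.
                p[f] += lr * (err * q_f - reg * p_f)
                q[f] += lr * (err * p_f - reg * q_f)
    return user_factors, item_factors


# Toy (user_id, item_id, rating) triples, made up for illustration.
rows = [
    (1, 10, 5.0), (1, 20, 3.0), (2, 10, 4.0), (2, 20, 2.0), (3, 10, 5.0),
    (3, 30, 1.0), (4, 20, 3.0), (4, 30, 2.0), (5, 10, 4.0), (5, 30, 1.0),
]
train, test = split_ratings(rows)
user_factors, item_factors = train_factors(train)
predictions = [(u, i, r, predict(user_factors, item_factors, u, i)) for u, i, r in test]
```

Fixing the shuffle seed keeps the partition reproducible between runs, which makes predicted vs. actual comparisons easier to repeat.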

How to run the Workflow

  • Place data files in the specified location.

  • Open workflow in KNIME.

  • Drag-select all nodes in the 'Final Working Model' section and execute them.

Outputs

  • Actual vs. Predicted ratings.

  • Table of the top 10 recommendations for each user (unrated movies only).

  • Evaluation metrics table: RMSE and Recall.
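
The top 10 table can be thought of as a per-user ranking of predicted ratings restricted to movies the user has not rated. The sketch below shows one way to do that ranking; the function name, the tie-breaking rule (lower item id first), and the sample scores are assumptions for illustration, not details taken from the workflow.

```python
def top_n_unrated(predicted, rated, n=10):
    """Return the n highest-scoring item ids that the user has not rated.

    predicted: dict mapping item id -> predicted rating for one user
    rated:     set of item ids the user has already rated
    """
    candidates = [(item, score) for item, score in predicted.items() if item not in rated]
    # Highest score first; ties broken by item id for a stable order.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in candidates[:n]]


# Made-up predicted scores for a single user.
predicted_for_user = {10: 4.6, 20: 3.1, 30: 4.9, 40: 4.6, 50: 2.0}
already_rated = {30}
top = top_n_unrated(predicted_for_user, already_rated, n=3)
print(top)  # [10, 40, 20]
```

Item 30 has the highest score but is excluded because the user has already rated it.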

Evaluation Metrics

The system is evaluated using the RMSE and Recall metrics on the u.data set (training/test split).
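
For reference, both metrics can be computed from paired actual and predicted ratings as sketched below. The page does not say how Recall is defined in the workflow, so this example assumes one common convention: a movie counts as relevant when its actual rating is at least a threshold (4.0 here, a hypothetical value), and as recommended when its predicted rating reaches the same threshold. The sample values are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between paired actual and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def recall(actual, predicted, threshold=4.0):
    """Share of relevant items (actual >= threshold) that are also predicted >= threshold.

    The threshold-based definition is an assumption, not the workflow's stated method.
    """
    relevant = [p for a, p in zip(actual, predicted) if a >= threshold]
    if not relevant:
        return 0.0
    hits = sum(1 for p in relevant if p >= threshold)
    return hits / len(relevant)


# Made-up ratings for illustration.
actual = [5, 3, 4, 1]
predicted = [4.0, 3.0, 2.0, 1.0]
error = rmse(actual, predicted)
hit_rate = recall(actual, predicted)
print(hit_rate)  # 0.5
```

Lower RMSE means predicted ratings sit closer to the actual ones, while higher Recall means more of the movies a user actually liked were picked up by the model.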

Assumptions

  • Data files are complete, unaltered, and tab-separated.

  • Input data is already pre-cleaned at source (every user has at least 20 ratings and complete demographic information).

  • Rating scale is 1-5.

  • Data provided is reliable and credible.
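
Some of these assumptions can be checked before running the workflow. The sketch below is a hypothetical pre-flight check, not part of the workflow: it flags ratings outside the 1-5 scale and users with fewer than a minimum number of ratings. The function name, message wording, and sample rows are invented for illustration.

```python
from collections import Counter


def check_assumptions(ratings, min_ratings_per_user=20, scale=(1, 5)):
    """Return a list of human-readable problems found in (user, item, rating) rows."""
    problems = []
    low, high = scale
    counts = Counter()
    for user, item, rating in ratings:
        counts[user] += 1
        if not low <= rating <= high:
            problems.append(
                f"rating {rating} for user {user}, item {item} is outside {low}-{high}"
            )
    for user, count in sorted(counts.items()):
        if count < min_ratings_per_user:
            problems.append(
                f"user {user} has {count} rating(s), fewer than {min_ratings_per_user}"
            )
    return problems


# Made-up rows; the minimum is lowered to 2 so the toy data stays small.
rows = [(1, 10, 4), (1, 20, 6), (2, 10, 3)]
problems = check_assumptions(rows, min_ratings_per_user=2)
```

An empty result means the rows passed both checks; on the full u.data file the default minimum of 20 matches the assumption listed above.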

External resources

  • Social Information Network Analysis - Assignment 2 Brief

Used extensions & nodes

Created with KNIME Analytics Platform version 5.4.3
  • KNIME Base nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.1

  • KNIME Excel Support (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.0

  • KNIME Expressions (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.1

  • KNIME Extension for Apache Spark (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.1

  • KNIME Extension for Local Big Data Environments (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.1

  • KNIME Javasnippet (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.3

  • KNIME Math Expression (JEP) (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.0

  • KNIME Views (trusted extension), KNIME AG, Zurich, Switzerland, version 5.4.2
