Train, Score and Explain a Machine Learning Model
Challenge description:
You work as a data scientist for a recruiting agency that specializes in matching job seekers in the AI, data science, and IT space with vacancies at its client companies. Unfortunately, companies are often reluctant to disclose salaries in job offers. To attract the best candidates, your boss has therefore tasked you with building a machine learning pipeline to predict data science salaries.
Use the provided dataset on data science jobs to train and score a machine learning model of your choice that predicts data science salaries. Perform the pre-processing operations you deem necessary and select meaningful features to train the model.
Clearly, your boss would like to obtain predictions that are as accurate as possible. Additionally, she expects you to be able to explain the model's decision-making process.
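Since salary prediction is a regression task, "as accurate as possible" is usually judged on held-out data with error metrics such as MAE, RMSE, and R². The following is a minimal Python sketch of these three metrics, not part of the challenge material. The function name `regression_scores` and the example salary values are hypothetical and made up for illustration. In a KNIME workflow, a scoring node would normally compute these values for you.

```python
from math import sqrt


def regression_scores(y_true, y_pred):
    """Return MAE, RMSE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average size of the miss, in salary units.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalizes large misses more strongly.
    rmse = sqrt(sum(e * e for e in errors) / n)

    # R^2: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up salaries (USD) for illustration only, not taken from the dataset.
scores = regression_scores(
    [100_000, 200_000, 300_000],
    [110_000, 190_000, 330_000],
)
print(scores)
```

Reporting more than one metric is a deliberate choice: MAE is easy to explain to a non-technical audience, while RMSE and R² make large individual errors and overall fit visible.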
Key requirement: you must use an explainable AI (XAI) technique of your choice to explain the model's predictions and provide a short written description (max. 100 words) in an annotation. For example, you could use one of the KNIME Verified Components for Model Interpretability: https://hub.knime.com/knime/spaces/Examples/00_Components/Model%20Interpretability~WMtQn1U91a-xzZY3/
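To make the idea of an XAI technique concrete, here is a minimal Python sketch of one model-agnostic option, permutation feature importance: shuffle one feature at a time and measure how much the prediction error grows. This is an illustration under stated assumptions, not the implementation behind any KNIME component. The helper names, the toy model `toy_predict`, the column names `experience_level` and `remote_ratio`, and all data values are hypothetical and made up for the example.

```python
import random
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def permutation_importance(predict, rows, y_true, n_repeats=10, seed=0):
    """Average increase in MAE when each feature is shuffled.

    rows    -- list of dicts, one dict of feature values per record
    predict -- callable taking such a list and returning predictions
    """
    rng = random.Random(seed)
    baseline = mean_absolute_error(y_true, predict(rows))
    importances = {}
    for feature in rows[0]:
        increases = []
        for _ in range(n_repeats):
            # Shuffle one column to break its link with the target.
            shuffled = [row[feature] for row in rows]
            rng.shuffle(shuffled)
            permuted = [
                {**row, feature: value} for row, value in zip(rows, shuffled)
            ]
            error = mean_absolute_error(y_true, predict(permuted))
            increases.append(error - baseline)
        importances[feature] = mean(increases)
    return importances


# Hypothetical stand-in for a trained model: it looks only at the
# experience level and ignores the remote ratio entirely.
LEVEL_SALARY = {"EN": 60_000, "MI": 100_000, "SE": 150_000}


def toy_predict(rows):
    return [LEVEL_SALARY[row["experience_level"]] for row in rows]


# Made-up records and salaries (USD), for illustration only.
rows = [
    {"experience_level": "EN", "remote_ratio": 0},
    {"experience_level": "EN", "remote_ratio": 100},
    {"experience_level": "MI", "remote_ratio": 50},
    {"experience_level": "MI", "remote_ratio": 0},
    {"experience_level": "SE", "remote_ratio": 100},
    {"experience_level": "SE", "remote_ratio": 50},
]
y_true = [58_000, 63_000, 97_000, 104_000, 148_000, 155_000]

importances = permutation_importance(toy_predict, rows, y_true)
print(importances)
```

A feature the model never uses, such as `remote_ratio` in this toy setup, gets an importance of zero, while a feature the model relies on produces a clear increase in error. A ranking of this kind is the sort of result the 100-word annotation could summarize.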
Outcome:
A machine learning pipeline for data pre-processing, model training, scoring, and explanation via explainable AI (XAI) techniques.
Deliver your solution as a separate workflow and name it: Solution_Round_16_<your_team_name>. Place your solution workflow in the same folder as this challenge workflow.
Teams are strongly encouraged to submit high-quality work in order to improve their chances of getting maximum points. Don't be afraid to go the extra mile! :)
Dataset:
Data Science Salaries 2023 dataset from Kaggle: https://www.kaggle.com/datasets/arnabchaki/data-science-salaries-2023
Deadline:
March 10, 2024 (submission by 11:59 PM CET)**. Check the tournament calendar: https://info.knime.com/game-of-nodes
** We will verify the date and time of the latest edits.
KNIME Game of Nodes:
Rules, Assessment Criteria & FAQs: https://info.knime.com/game-of-nodes
Used extensions & nodes
All required extensions are part of the default installation of KNIME Analytics Platform version 5.2.1