sree_23048/Price prediction using Random Forest model

Price prediction using Random Forest model

Type	Name
	AB_NYC_2019.csv
	Random Forest SOL_project

This dataset is useful for analyzing Airbnb listings in terms of pricing, location, host activity, and availability.

The dataset contains information about Airbnb listings with the following attributes:

1. Row ID: Unique identifier for each row.
2. ID: Unique identifier for each listing.
3. Name: Name of the listing.
4. Host ID: Unique identifier for the host.
5. Host Name: Name of the host.
6. Neighbourhood Group: Broad area or borough (e.g., Manhattan, Brooklyn).
7. Neighbourhood: Specific neighborhood within the borough.
8. Latitude: Latitude coordinate of the listing.
9. Longitude: Longitude coordinate of the listing.
10. Room Type: Type of room (e.g., Private room, Entire home/apt).
11. Price: Price per night.
12. Minimum Nights: Minimum number of nights required for booking.
13. Number of Reviews: Total number of reviews received.
14. Last Review: Date of the most recent review.
15. Reviews per Month: Average number of reviews per month.
16. Calculated Host Listings Count: Total number of listings the host has.
17. Availability 365: Number of days the listing is available in a year.

1. CSV Reader

- Description: This node reads data from a CSV file.

- Purpose: To import the dataset into the workflow for further processing.
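
Outside KNIME, the same import step can be sketched in Python with the standard-library `csv` module. This is only an illustration: the helper name `read_listings` is hypothetical, the lowercase column names are assumptions derived from the attribute list and should be checked against the header row of `AB_NYC_2019.csv`, and treating empty cells as 0.0 is a simplifying assumption.

```python
import csv

# Assumed header names; check them against the first row of AB_NYC_2019.csv.
NUMERIC_COLUMNS = [
    "latitude", "longitude", "minimum_nights", "number_of_reviews",
    "reviews_per_month", "calculated_host_listings_count", "availability_365",
]
TARGET_COLUMN = "price"


def read_listings(file_obj):
    """Read listings from an open CSV file into feature rows and prices."""
    features, prices = [], []
    for row in csv.DictReader(file_obj):
        # Empty or missing cells are treated as 0.0 (a simplifying assumption).
        features.append([float(row[col] or 0.0) for col in NUMERIC_COLUMNS])
        prices.append(float(row[TARGET_COLUMN]))
    return features, prices


# Usage (assumes the file sits in the working directory):
# with open("AB_NYC_2019.csv", newline="", encoding="utf-8") as f:
#     X, y = read_listings(f)
```

Passing `newline=""` when opening the file lets the `csv` module handle quoted fields that contain line breaks, which can occur in free-text columns such as the listing name.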

2. Partitioning

- Description: This node splits the dataset into training and test sets.

- Purpose: To create separate datasets for training the model and evaluating its performance.
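
As a rough Python analogue of a random partition, the sketch below shuffles the rows with a fixed seed and cuts them into two lists. The 80/20 ratio and the seed are assumptions; this page does not state the settings used in the Partitioning node, and the function name `partition` is hypothetical.

```python
import random


def partition(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly and split them into training and test lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is not reordered
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = partition(range(10), train_fraction=0.8, seed=1)
print(len(train), len(test))  # 8 2
```

Fixing the seed makes the split repeatable, so scores from different runs can be compared on the same held-out rows.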

3. Random Forest Learner (Regression)

- Description: This node trains a Random Forest regression model using the training dataset.

- Purpose: To learn a model that predicts the listing price from the other attributes in the training data.

4. Random Forest Predictor (Regression)

- Description: This node applies the trained Random Forest model to the test dataset to make predictions.

- Purpose: To generate predictions on the test data using the trained model.
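
The learner and predictor steps correspond to fitting a model and then calling it on held-out rows. A minimal sketch with scikit-learn, which stands in here for the KNIME nodes, might look like the following. The number of trees, the 80/20 split, and the seed are assumptions rather than settings taken from the workflow, and the data is synthetic, not the Airbnb dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def fit_and_predict(X, y, seed=0):
    """Train a random forest on a training split and predict the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)     # learner step: fit on training rows only
    y_pred = model.predict(X_test)  # predictor step: apply to unseen rows
    return model, y_test, y_pred


# Synthetic stand-in data (not Airbnb data): the target depends on two features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 2))
y = 100 + 50 * X[:, 0] - 30 * X[:, 1] + rng.normal(0, 1, size=500)
model, y_test, y_pred = fit_and_predict(X, y)
print(len(y_pred))  # 100 held-out rows
```

Keeping the test rows out of `fit` is what makes the later error statistics an estimate of performance on unseen listings.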

5. Numeric Scorer

- Description: This node evaluates the performance of the regression model by comparing the predicted values to the actual values.

- Purpose: To quantify the prediction error on the test set with statistics such as R², mean absolute error, and root mean squared error.
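
To make the comparison of predicted and actual values concrete, the sketch below computes a few common regression statistics in plain Python. The function name `score_regression` is hypothetical, the example numbers are made up for illustration, and the sign convention for the mean signed difference (predicted minus actual) is an assumption.

```python
import math


def score_regression(actual, predicted):
    """Return common regression error statistics for paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    mse = ss_res / n
    return {
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mean_signed_difference": sum(errors) / n,
    }


# Made-up prices for illustration only.
scores = score_regression([100.0, 150.0, 200.0], [110.0, 140.0, 210.0])
print(scores["mae"], scores["mse"])  # 10.0 100.0
```

Mean absolute error and root mean squared error are in the same unit as the target (here, price per night), which makes them easier to interpret than R² alone.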

6. ROC Curve (legacy)

- Description: This node generates a Receiver Operating Characteristic (ROC) curve to visualize the performance of the model.

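- Purpose: To provide a graphical representation of the model's performance in terms of true positive rate and false positive rate. An ROC curve is defined for a binary outcome, so it applies to a regression workflow only if the price is first converted to a two-class label (for example, above or below a chosen threshold).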

This workflow performs regression analysis with a Random Forest model, evaluates its performance, and visualizes the results. Contact me at guharaysree@gmail.com if you would like anything added or modified to suit your requirements.