Link Search Menu Expand Document

Project 06 - California Housing Data Continued Exploration

Due Date: Wednesday 10/28 @ 6am Uploaded to Canvas

Overview

In this project, you’ll do some additional analysis of the California housing data set that we have been using in class. The project will be done in Google Colab, also as we have been doing in lecture.

Do the following:

  1. Open the Colab notebook that you’ve been using to follow along in lecture.
  2. Go to the File menu and choose Save a copy in Drive. This should open a new window or tab also in your browser.
  3. At the top of the new tab/window, double click on the name of the workbook and remove copy of and replace it with your first and last name.
  4. Remove all the text and code cells BELOW the cell where we imported the data.
  5. Perform the tasks/analysis or answer the questions that follow. It is imperative that you label each of your responses with the question ID (Q1 for example). If we can’t find your answer to a question, we will simply count off. If your answer to a question is in a code cell, label your responses by putting a comment above the code that corresponds to a question. If your answer to a question is in a text block, put the question ID on a separate line at the top of the response.

Q1. What is the average median age of all houses? Q2. How many households are in areas that have a median income above 3.87? (An area is represented by a single row in the data.) Q3. Create a bar chart that shows the number of areas counted by ocean_proximity. Q4. Create 2 scatter plots. One should contain only those areas that have an ocean_proximity value of NEAR BAY. The other should contain only those areas that have an ocean_proximity value of NEAR OCEAN. Q5. Create a scatter plot that shows the correlation between median_income and total_rooms.
Q6. Create an additional feature in the data set that represents the average number of bedrooms per household. Generate a histogram for this new feature. Q7. What is the correlation between average number of bedrooms per household and median house value?

Submission and Grading

Your project should be submitted as a zip file to Canvas in the appropriate PA05 location. The zip file should contain all files necessary to run the project, including but not limited to:

  1. your python code
  2. the secret_words.txt file
  3. the scoreboard.txt file

It will be graded according to the following rubric:

  • Source Code Quality: 30%
    • Proper use of vertical white space
    • Proper use of horizontal white space
    • Understandable variable names
    • File header comments with your name, SMU ID, and a brief explanation of the project.
  • Correctness: 30%
    • Does your project properly use the secret word in the input file to run the program?
  • User Interface: 40%
    • Are the interface steps logical?
    • Does the main game loop stop under the correct conditions?
    • Are the prompts to the user helpful?