Project 06 - California Housing Data Continued Exploration
Due Date: Wednesday 10/28 @ 6am Uploaded to Canvas
Overview
In this project, you’ll do some additional analysis of the California housing data set that we have been using in class. The project will be done in Google Colab, also as we have been doing in lecture.
Do the following:
- Open the Colab notebook that you’ve been using to follow along in lecture.
- Go to the
File
menu and chooseSave a copy in Drive
. This should open a new window or tab also in your browser. - At the top of the new tab/window, double click on the name of the workbook and remove
copy of
and replace it with your first and last name. - Remove all the text and code cells BELOW the cell where we imported the data.
- Perform the tasks/analysis or answer the questions that follow. It is imperative that you label each of your responses with the question ID (
Q1
for example). If we can’t find your answer to a question, we will simply count off. If your answer to a question is in a code cell, label your responses by putting a comment above the code that corresponds to a question. If your answer to a question is in a text block, put the question ID on a separate line at the top of the response.
Q1. What is the average median age of all houses? Q2. How many households are in areas that have a median income above 3.87? (An area is represented by a single row in the data.) Q3. Create a bar chart that shows the number of areas counted by ocean_proximity
. Q4. Create 2 scatter plots. One should contain only those areas that have an ocean_proximity value of NEAR BAY
. The other should contain only those areas that have an ocean_proximity value of NEAR OCEAN
. Q5. Create a scatter plot that shows the correlation between median_income and total_rooms.
Q6. Create an additional feature in the data set that represents the average number of bedrooms per household. Generate a histogram for this new feature. Q7. What is the correlation between average number of bedrooms per household and median house value?
Submission and Grading
Your project should be submitted as a zip file to Canvas in the appropriate PA05 location. The zip file should contain all files necessary to run the project, including but not limited to:
- your python code
- the secret_words.txt file
- the scoreboard.txt file
It will be graded according to the following rubric:
- Source Code Quality: 30%
- Proper use of vertical white space
- Proper use of horizontal white space
- Understandable variable names
- File header comments with your name, SMU ID, and a brief explanation of the project.
- Correctness: 30%
- Does your project properly use the secret word in the input file to run the program?
- User Interface: 40%
- Are the interface steps logical?
- Does the main game loop stop under the correct conditions?
- Are the prompts to the user helpful?