# Critical Thinking Exercise

This project utilizes the “Real Estate – Base” database. The purpose is twofold:

– Build critical thinking skills needed to structure data analysis appropriately for effective decision making.

– Analyze available data practically and skillfully in order to build an explanatory regression model.

The Real Estate – Base database includes the following variables for 101 homes (* NOTE: variables marked with an asterisk are stored as qualitative variables in the database):

a. *Unit# (An assigned database key)

b. *Type (H = House, C = Condo/Apartment)

c. *Location (1 through 10 – voting district where located)

d. *U/S/R (Urban vs. Suburban vs. Rural location)

e. Price (The price the home sold for in 2017)

f. Sq. Ft. (Heated/Cooled & Attached square footage)

g. Lot (Acres) (Acreage of property)

h. Garage (Number of attached covered and/or enclosed parking positions)

i. BRs (Number of qualified bedrooms)

j. Baths (Number of bathrooms – a bathroom with no tub or shower is counted as .5)

k. *Pool (No = No Access; HA = Shared Pool; AG = Above Ground; IG = In Ground)

l. Age (Age of home in rounded years at end of 2017)

Phase 1 of the project (homework for QM3345)

1. Create the following charts in Excel using the charting tools and the indicated variables in “Real Estate – Base.xlsx”:

a. Create a new tab in the spreadsheet called “Scatterplots”. After creating each scatterplot on the original tab, move it to the “Scatterplots” tab you created.

b. Create a scatterplot using the variables Price and Sq. Ft.

c. Create a scatterplot using the variables Price and Lot (Acres).

d. Create a scatterplot using the variables Price and Garage.

e. Create a scatterplot using the variables Price and BRs.

f. Create a scatterplot using the variables Price and Baths.

g. Create a scatterplot using the variables Price and Age.

2. What sort of relationship do you see between these variables based on the scatterplots?

a. Between Price and Sq. Ft. (Circle)?    No relationship    Weak    Moderate    Strong

b. Between Price and Lot (Circle)?    No relationship    Weak    Moderate    Strong

c. Between Price and Garage (Circle)?    No relationship    Weak    Moderate    Strong

d. Between Price and BRs (Circle)?    No relationship    Weak    Moderate    Strong

e. Between Price and Baths (Circle)?    No relationship    Weak    Moderate    Strong

f. Between Price and Age (Circle)?    No relationship    Weak    Moderate    Strong

3. In the Excel spreadsheet provided, use the Data Analysis Add-in to run a regression analysis with Price as the Dependent Variable and Lot, Garage, and BRs as the Independent Variables. Have Excel place the output on a new tab called “Regression Model”.

4. Provide the following from the Excel model (the “Regression Model” tab):

a. Coefficient of Determination (R-squared) ___________________

b. Y-Intercept for the Regression Model ___________________

c. Slope value for X1 (Lot) ___________________

d. Slope value for X2 (Garage) ___________________

e. Slope value for X3 (BRs) ___________________
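
Steps 3 and 4 can also be cross-checked outside Excel. The following is a minimal Python sketch, assuming NumPy is available. The eight rows of data are invented purely for illustration and are not taken from “Real Estate – Base.xlsx”, and the function name `fit_linear_regression` is hypothetical. It fits Price on Lot, Garage, and BRs by ordinary least squares and reports the three kinds of values requested in step 4.

```python
import numpy as np

# Hypothetical illustration rows (NOT taken from the Real Estate - Base file).
# Columns: Lot (Acres), Garage, BRs. Prices are made-up dollar values.
X = np.array([
    [0.25, 1, 2],
    [0.50, 2, 3],
    [0.30, 0, 2],
    [1.00, 2, 4],
    [0.75, 1, 3],
    [0.40, 2, 4],
    [1.50, 3, 5],
    [0.20, 0, 1],
], dtype=float)
price = np.array([150_000, 230_000, 140_000, 320_000,
                  245_000, 270_000, 410_000, 95_000], dtype=float)


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept.

    Returns (intercept, slopes, r_squared).
    """
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coefs
    ss_res = np.sum((y - fitted) ** 2)    # variation left unexplained
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation in y
    return coefs[0], coefs[1:], 1.0 - ss_res / ss_tot


intercept, slopes, r_squared = fit_linear_regression(X, price)
print(f"R-squared:   {r_squared:.4f}")
print(f"Y-intercept: {intercept:,.2f}")
for name, slope in zip(["Lot", "Garage", "BRs"], slopes):
    print(f"Slope for {name}: {slope:,.2f}")
```

To compare with the Excel output, replace `X` and `price` with the actual columns from the spreadsheet. Excel's regression tool also uses ordinary least squares, so the values should agree up to rounding.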

Phase 2 of the project (Critical Thinking and SAS® Model)

5. Do you think we need all three current Independent Variables in our regression model to predict changes in Price (Circle)?    Yes    No

Explain: _________________________________________________________________________

_______________________________________________________________________________

_______________________________________________________________________________

6. Which variable(s) would you remove (Circle)?    Lot Size    Garage    BRs

7. Of the following variables in the spreadsheet, which variable would you select next to add to the model (i.e., you think it would create a stronger prediction of Price)?

Type    Location    U/S/R    Sq. Ft.    Baths    Pool    Age

8. Run a SAS Regression Model on the Real Estate – Base database using Price as the Dependent Variable (Y). Include the original Independent Variables (minus any you removed in step 6) and add the variable you chose in step 7. Print your model output and turn it in with the assignment.

9. Provide the following from the SAS Model:

a. Coefficient of Determination (R-squared) ________________________

b. Y-Intercept for the Regression Model ________________________

c. Slope value for each of your Independent Variables ________________________

10. Did your SAS model provide a stronger Coefficient of Determination (Circle)? Yes No
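
When comparing the two models in step 10, keep in mind that in ordinary least squares, adding a predictor to a model that keeps all the previous ones cannot lower R-squared. A common companion measure is adjusted R-squared, which penalizes extra predictors; SAS regression output typically reports it as "Adj R-Sq". The sketch below is a minimal illustration of that formula, not part of the required deliverables. The function name and both R-squared values are hypothetical placeholders, not actual results; only the sample size of 101 homes comes from the assignment.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_HOMES = 101  # number of homes stated in the assignment

# Hypothetical R-squared values, for illustration only (not actual results).
excel_r2, excel_k = 0.60, 3  # e.g. Lot, Garage, BRs
sas_r2, sas_k = 0.61, 4      # e.g. the same three plus one added variable

print("Excel model adjusted R-squared:", adjusted_r_squared(excel_r2, N_HOMES, excel_k))
print("SAS model adjusted R-squared:  ", adjusted_r_squared(sas_r2, N_HOMES, sas_k))
```

If the adjusted value rises along with R-squared, the added variable is contributing more than its cost in model complexity.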

Critical Thinking Question:

11. A large real estate company is trying to use similar data plus their own sales data to forecast total sales for the coming year for each of their agents, and they have pulled data from their Finance records. They are trying to assemble the best data to build a regression model.

a. Would it make sense to use the same data as we used above in the SAS model? Why or why not?

__________________________________________________________________________________

__________________________________________________________________________________

b. Recommend three data elements you think they probably have available to help them predict sales for each of their salespeople.

1. ______________________________________________

2. ______________________________________________

3. ______________________________________________