3. The coating experiment, described in Exercise 7 of Chapter 7, studied the effect of different spray parameters on thermal spray coating properties. In the experiment, the authors attempted to produce high-quality alumina (Al₂O₃) coatings by controlling the fuel ratio (factor A at 1:2.8 and 1:2.0), carrier gas flow rate (factor B at 1.33 and 3.21 L s⁻¹), frequency of detonations (factor C at 2 and 4 Hz), and spray distance (factor D at 180 and 220 mm). To quantify the quality of the coating, the researchers measured multiple response variables. In this example we will examine the porosity (vol. %). The data are shown in the table below and can be downloaded from http://deanvossdraguljic.ietsandbox.net/DeanVossDraguljic/SAS-data.html.

    A  B  C  D   y_ijkl      A  B  C  D   y_ijkl
    2  2  2  2    5.95       1  2  2  2   12.28
    2  2  2  1    4.57       1  2  2  1    9.57
    2  2  1  2    4.03       1  2  1  2    6.73
    2  2  1  1    2.17       1  2  1  1    6.07
    2  1  2  2    3.43       1  1  2  2    8.49
    2  1  2  1    1.02       1  1  2  1    4.92
    2  1  1  2    4.25       1  1  1  2    6.95
    2  1  1  1    2.13       1  1  1  1    5.31

(a) [3pts] Run a model with ALL main effects and two-way interaction effects. Write down the SAS code and copy the ANOVA table from SAS.
(b) [2pts] The 95% confidence interval for the difference between the two main effects of B is ( ____ , ____ ).
(c) [2pts] Do you believe there are significant interaction effects between B and C? Answer yes or no.
(d) [3pts] Generate an interaction plot for the B*C interaction. State how the plot supports your answer in (c). [Hint: use lsmeans B*C to get the least squares estimates, then plot them using any software of your choice.]
(e) [3pts] The 90% confidence upper limit for the error variance σ² is ____. Show your work.
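As a starting point for parts (a), (b), (d) and (e), a minimal SAS sketch might look like the following. It assumes the data have already been read into a data set called `coating` with variables `A`, `B`, `C`, `D` and `y`; those names are illustrative and not given in the problem. The error degrees of freedom (5) follow from 16 observations minus the 11 parameters of the main-effects plus two-way-interactions model, and `ssE` is deliberately left missing so that it can be filled in from your own ANOVA table.

```sas
/* Assumed: WORK.COATING holds factors A B C D (levels 1/2) and response y. */
proc glm data=coating;
  class A B C D;
  model y = A|B|C|D@2;               /* all main effects and two-way interactions */
  lsmeans B / pdiff cl alpha=0.05;   /* CI for the difference between levels of B */
  lsmeans B*C;                       /* cell estimates for the B*C interaction plot */
run;
quit;

/* One-sided upper confidence limit for sigma^2: ssE / (10th percentile of chi-square). */
data var_limit;
  ssE = .;                           /* replace with the error sum of squares from PROC GLM */
  dfE = 5;                           /* 16 observations - 11 model parameters */
  upper90 = ssE / cinv(0.10, dfE);
run;

proc print data=var_limit;
run;
```

The bar notation with `@2` expands to every main effect and every two-factor interaction, which avoids typing the ten terms by hand.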
2. Consider the reaction time experiment described in Exercise 4 of Chapter 4.
a. [3pts] Write down the two-way complete model for the experiment. Remember to explain each term in the model.
b. [3pts] Find the sums of squares that are accounted for by the factors and their interaction, i.e., ssA, ssB and ssAB.
c. [5pts] Generate an interaction plot. Do you see an obvious interaction between the two factors? Carry out a formal hypothesis test for the interaction.
d. [3pts] Test the hypothesis that different elapsed times have the same effect on the reaction time.
e. [3pts] Find a 95% confidence interval for the difference between the average reaction time from the auditory cue and the average reaction time from the visual cue. [Hint: this is to compare the two main effects of factor cue.]
f. [3pts] Find an appropriate confidence interval for the difference between the auditory cue and the visual cue when the elapsed time is 5 seconds.
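A hedged SAS sketch for parts b through f is shown below. It assumes a data set named `react` with a cue variable `cue`, an elapsed-time variable `etime`, and a response `y`; none of these names come from the problem, so adjust them to match your own data step.

```sas
/* Assumed: WORK.REACT has cue (auditory/visual), etime (elapsed time), y (reaction time). */
proc sgplot data=react;
  /* Interaction plot: mean response at each elapsed time, one line per cue. */
  vline etime / response=y stat=mean group=cue markers;
run;

proc glm data=react;
  class cue etime;
  model y = cue etime cue*etime;        /* two-way complete model: ssA, ssB, ssAB */
  lsmeans cue / pdiff cl alpha=0.05;    /* difference between cue main effects */
  lsmeans cue*etime / pdiff cl;         /* cell differences, e.g. cues at a fixed time */
run;
quit;
```

Roughly parallel lines in the plot would suggest little interaction; the F test for `cue*etime` in the PROC GLM output is the formal check.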
2. An experiment is to be run with two factors: factor A with 2 levels and factor B with 4 levels. The experimenter would like to examine the pairwise differences between the four levels of factor B, with a simultaneous confidence level of 90%. The experimenter is confident that the two factors do not interact and will employ a two-way main-effects model. Furthermore, the experimenter believes that the MSE will be unlikely to exceed 25. Find the required sample size for the 90% simultaneous confidence intervals for the pairwise comparisons of the main effects of B to have a width of at most 10.
(a) [2pts, no partial credit] The required sample size is at least ____.
(b) [2pts] Provide the SAS code.
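One way to approach the calculation is to tabulate the Tukey interval width for a range of replication counts and pick the smallest one that meets the target. The sketch below assumes r observations per treatment combination, Tukey's method for the pairwise comparisons, and error degrees of freedom n - a - b + 1 for the main-effects model; the data set name `ss` and the search range are arbitrary choices.

```sas
data ss;
  a = 2; b = 4;            /* levels of factors A and B */
  mse = 25;                /* assumed upper bound on the error variance */
  alpha = 0.10;            /* 90% simultaneous confidence */
  do r = 2 to 30;          /* r = observations per treatment combination */
    df = a*b*r - a - b + 1;                     /* error df, main-effects model */
    q = probmc("RANGE", ., 1 - alpha, df, b);   /* studentized range critical value */
    /* Each level of B is averaged over a*r observations. */
    width = 2 * (q / sqrt(2)) * sqrt(mse * 2 / (a*r));
    output;
  end;
run;

proc print data=ss;
  var r df q width;
run;
```

The required r is the first row whose `width` is at most 10; the total sample size is then a*b*r.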
1. Consider the two-way main-effects model with two factors: $Y_{ijt} = \mu + \alpha_i + \beta_j + \epsilon_{ijt}$, $i = 1, 2, 3$; $j = 1, 2, 3$; $t = 1, 2, \ldots, r$. Answer True or False to the following statements.
(a) [1pt] $\mu + \alpha_1 + \beta_2$ is estimable.
(b) [1pt] $\mu + \alpha_1 + (\beta_1 + \beta_2)$ is estimable.
(c) [1pt] $\beta_1 - (\beta_2 + \beta_3)$ is estimable.
(d) [1pt] $\beta_1 - (\beta_2 + \beta_3)/2$ is estimable.
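A useful check for statements of this kind, written out as a short derivation: a function is estimable when it can be expressed as a linear combination of the cell expectations. The coefficients $b_{ij}$, $c_0$, $c_i$ and $d_j$ below are notation introduced here for illustration, and the argument assumes every treatment combination is observed.

```latex
\begin{align*}
\sum_{i}\sum_{j} b_{ij}\, E[Y_{ijt}]
  &= \sum_{i}\sum_{j} b_{ij}\,(\mu + \alpha_i + \beta_j) \\
  &= \Big(\sum_{i}\sum_{j} b_{ij}\Big)\mu
   + \sum_{i}\Big(\sum_{j} b_{ij}\Big)\alpha_i
   + \sum_{j}\Big(\sum_{i} b_{ij}\Big)\beta_j .
\end{align*}
% Hence c_0 \mu + \sum_i c_i \alpha_i + \sum_j d_j \beta_j is estimable
% exactly when the coefficients satisfy
\[
  \sum_{i} c_i \;=\; \sum_{j} d_j \;=\; c_0 .
\]
```

For each statement, read off the coefficient of $\mu$, the sum of the $\alpha$ coefficients, and the sum of the $\beta$ coefficients, and see whether the three agree.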
1. In a completely randomized design, there are two factors, A with two levels and B with three levels. Suppose the 6 treatment means are $\mu_{11} = 6$, $\mu_{12} = 10$, $\mu_{13} = 8$, $\mu_{21} = 5$, $\mu_{22} = 5$, $\mu_{23} = 5$. Note that these are the true treatment means and are supposed to be known. Answer the following questions.
a. [3pts] Are there interaction effects? Why? [Hint: Use the definition of interactions.]
b. [3pts] Find $\mu$, $\alpha_i$, $\beta_j$ and $(\alpha\beta)_{ij}$, $i = 1, 2$, $j = 1, 2, 3$, such that $\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$.
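Because the decomposition in part b is not unique without side conditions, one common choice is sketched here under the assumption of sum-to-zero constraints; the bar-and-dot notation for averages is introduced for illustration. Other constraints (for example, setting the last level of each factor to zero) give different but equally valid answers.

```latex
\begin{align*}
\mu &= \bar{\mu}_{..}, &
\alpha_i &= \bar{\mu}_{i.} - \bar{\mu}_{..}, \\
\beta_j &= \bar{\mu}_{.j} - \bar{\mu}_{..}, &
(\alpha\beta)_{ij} &= \mu_{ij} - \bar{\mu}_{i.} - \bar{\mu}_{.j} + \bar{\mu}_{..} .
\end{align*}
% No interaction means every interaction contrast vanishes:
\[
  \mu_{ij} - \mu_{ih} - \mu_{sj} + \mu_{sh} = 0
  \quad \text{for all } i \neq s,\; j \neq h .
\]
```

For part a, it is enough to find a single pair of rows and columns for which this contrast is nonzero.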
5. Find the following probabilities using contingency tables produced by the "proc freq" command:
1) The probability that a purchase was made by a customer from the UK.
2) The probability that a purchase was made in the 4th quarter.
3) The probability that a purchase was made in the 4th quarter among the UK customers.
4) The probability that a purchase was made by a non-UK customer in the 4th quarter.
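A minimal sketch of the tables involved, assuming the `sales1` data set and the `UK` and `quarter` variables created in the data step of question 1, and treating each row as one purchase (an assumption; you may prefer to count invoices instead):

```sas
proc freq data=sales1;
  tables UK quarter;     /* one-way tables: marginal probabilities */
  tables UK*quarter;     /* two-way table: joint and conditional probabilities */
run;
```

In the two-way table, the cell "Percent" estimates a joint probability (for example, non-UK and 4th quarter), while "Row Pct" estimates the probability of a quarter conditional on the UK indicator.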
1. Suppose you are generating the necessary variables to analyze the data using the following code. Carefully explain the new variables you generate in the data step.

    /* Here is some initial data step code to create variables */
    data sales1;
      set mysales;
      date = datepart(Invoicedate);
      yearmm = year(date)*100 + month(date);
      totalsale = UnitPrice*Quantity;
      month = month(date);
      quarter = qtr(date);
      itemID = 1*substr(StockCode, 1, 2);
      if itemID = . then delete;
      l_date = '31DEC2011'D;
      format date l_date mmddyy10.;
      if country = "United Kingdom" then UK = 1;
      else UK = 0;
      if totalsale = . then delete;
    run;
Student note: I have the code and the answers already. You just need to seed the code with my ID, run it, and give it back.

SAS Output

The SAS System
The CONTENTS Procedure

    Data Set Name         WORK.AIRBNB0
    Member Type           DATA
    Engine                V9
    Created               04/30/2023 22:55:34
    Last Modified         04/30/2023 22:55:34
    Data Representation   WINDOWS_64
    Encoding              wlatin1 Western (Windows)
    Observations          5000
    Variables             12
    Indexes               0
    Observation Length    96
    Deleted Observations  0
    Compressed            NO
    Sorted                NO

Engine/Host Dependent Information

    Data Set Page Size           65536
    Number of Data Set Pages     8
    First Data Page              1
    Max Obs per Page             681
    Obs in First Data Page       657
    Number of Data Set Repairs   0
    ExtendObsCounter             YES
    Release Created              9.0401M7
    Host Created                 X64_SRV19
    File Size                    576KB
    File Size (bytes)            589824

Alphabetic List of Variables and Attributes

     #  Variable              Type  Len  Format   Informat
     1  ListingMonth          Num   8    BEST12.  BEST32.
    12  PricePerNight         Num   8    BEST12.  BEST32.
     3  accommodates          Num   8    BEST12.  BEST32.
     4  bathrooms             Num   8    BEST12.  BEST32.
     5  bedrooms              Num   8    BEST12.  BEST32.
     6  beds                  Num   8    BEST12.  BEST32.
     7  guests_included       Num   8    BEST12.  BEST32.
     2  host_total_listings   Num   8    BEST12.  BEST32.
     8  minimum_nights        Num   8    BEST12.  BEST32.
     9  number_of_reviews     Num   8    BEST12.  BEST32.
    10  review_scores_rating  Num   8    BEST12.  BEST32.
    11  reviews_per_month     Num   8    BEST12.  BEST32.

The SURVEYSELECT Procedure

    Selection Method        Simple Random Sampling
    Input Data Set          AIRBNB0
    Random Number Seed      2129790
    Sample Size             2000
    Selection Probability   0.4
    Sampling Weight         2.5
    Output Data Set         AIRBNB1

The MEANS Procedure

    Variable                 N          Mean       Std Dev      Minimum       Maximum
    ListingMonth          2000     4.3483000     2.2512932    0.3000000    11.6000000
    host_total_listings   2000    53.3640000   200.9729822            0       1283.00
    accommodates          2000     4.8090000     3.2130485    1.0000000    32.0000000
    bathrooms             2000     1.4142500     0.7665284            0    11.0000000
    bedrooms              2000     1.8065000     1.2461277            0    12.0000000
    beds                  2000     2.4745000     2.0867503            0    32.0000000
    guests_included       2000     2.5795000     2.1042561    1.0000000    16.0000000
    minimum_nights        2000     3.7160000    11.4773346    1.0000000   365.0000000
    number_of_reviews     2000    51.6795000    63.3991286    1.0000000   583.0000000
    review_scores_rating  2000    95.3960000     6.0902159   20.0000000   100.0000000
    reviews_per_month     2000     2.3512300     1.9085478    0.0200000    12.5500000
    PricePerNight         2000   147.8865000   126.3896150    6.0000000   953.0000000
    hostclass             1888     1.6912076     0.7782735    1.0000000     3.0000000

The UNIVARIATE Procedure
Variable: ListingMonth

Moments

    N                 2000          Sum Weights        2000
    Mean              4.3483        Sum Observations   8696.6
    Std Deviation     2.25129324    Variance           5.06832127
    Skewness          0.34387476    Kurtosis           -0.2967753
    Uncorrected SS    47947         Corrected SS       10131.5742
    Coeff Variation   51.7741012    Std Error Mean     0.05034045

Basic Statistical Measures

    Location              Variability
    Mean     4.348300     Std Deviation         2.25129
    Median   4.300000     Variance              5.06832
    Mode     6.100000     Range                 11.30000
                          Interquartile Range   3.70000

Tests for Location: Mu0=0

    Test          Statistic        p Value
    Student's t   t   86.37786     Pr > |t|    <.0001
    Sign          M   1000         Pr >= |M|   <.0001
    Signed Rank   S   1000500      Pr >= |S|   <.0001

Quantiles (Definition 5)

    Level        Quantile
    100% Max     11.60
    99%          10.25
    95%           8.10
    90%           7.20
    75% Q3        6.10
    50% Median    4.30
    25% Q1        2.40
    10%           1.50
    5%            1.00
    1%            0.50
    0% Min        0.30

ECO520 Homework 5
Regression Analysis on Airbnb Price in Chicago

I. Airbnb Price in Chicago (Sample Data)
Let's work on the Airbnb price in Chicago.
Here are the selected variables:

    ListingMonth           The number of months since listing
    host_total_listings    The total number of listings by the host
    accommodates           Maximum number of people to stay
    bathrooms              The number of bathrooms
    bedrooms               The number of bedrooms
    beds                   The number of beds
    guests_included        The number of guests included in the price
    minimum_nights         Minimum nights per rent
    number_of_reviews      Total number of reviews for the rent unit
    review_scores_rating   The average score of the rating for the rent unit
    reviews_per_month      The number of reviews per month
    PricePerNight          Price per night

Here is the SAS code to load the data:

    filename webdat url "https://bigblue.depaul.edu/jlee141/econdata/eco520/airbnb2019.csv";
    /* Import Chicago Community data */
    PROC IMPORT OUT=airbnb0 DATAFILE=webdat DBMS=CSV REPLACE;
    RUN;
    proc contents; run;

    /* Create your own random sample data. Make sure to type your student ID as the seed number.
       Replace your_depaul_id with your student ID (only numbers). */
    proc surveyselect data=airbnb0 method=srs seed=your_depaul_id n=2000 out=airbnb1;
    run;

    /* The following code will create the class of host */
    data airbnb2;
      set airbnb1;
      if 0 < host_total_listings < 3 then hostclass = 1;
      else if 3 <= host_total_listings < 20 then hostclass = 2;
      else if host_total_listings >= 20 then hostclass = 3;
      /* More variables you would create */
    run;
    proc means; run;

1. In the airbnb2 data step, add the following new variables:
   1) the most popular hosts, who have more than 65 reviews, as popular_host.
   2) big family units, which accommodate more than 8 people, as big_unit.
   3) long-term rent units, which have a minimum of more than 7 nights, as longterm.
2. Find any outliers or missing cases in all variables. If necessary, remove the outliers or any missing cases. Show your work in SAS and explain it.
3. Use scatter plots to find potential variables that have a nonlinear relationship with price.
   Create the square of rooms, the square of beds, and the square of bathrooms. If necessary, create some squared or logarithmic variables to analyze the potential nonlinear relationships.
4. Machine Learning using Regression Analysis: Let's consider creating regression models using a training data set, saving the estimated models, and predicting the prices using the rest of the data as testing data. (Use the example we covered in the PowerPoint slides.) Make sure to include all class and dummy variables you created in 1.
   1) Split the airbnb2 data into 70% training data and 30% testing (validating) data with a seed number of 55555. Estimate regression models with PricePerNight as the dependent variable, using only the training data, with the following options:
      1. Your own best model
      2. Adjusted R square
      3. Stepwise
   2) Perform the out-of-sample prediction using the observations that were not used to estimate the regression models. Find the following statistics and compare the results. Which model is the best in terms of the following statistics?
      1. MSE (mean square error)
      2. RMSE (root mean square error)
      3. MPE (mean percentage error)
      4. MAE (mean absolute error)

All questions need to be typed with appropriate graphs and tables from SAS in a PDF file. Submit your SAS code as a separate text file.
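The split-and-score workflow in question 4 can be sketched as follows. This is one possible approach, not necessarily the one from the course slides: it relies on PROC REG producing predicted values for rows whose response is missing, so test rows are scored without being used in the fit. The data set names `airbnb_all`, `pred` and `test_err`, the variable `y_train`, and the three regressors are placeholders chosen for illustration; the sign convention for MPE (actual minus predicted) is also an assumption.

```sas
/* Flag a 70% training sample; OUTALL keeps every row and adds Selected (1 = training). */
proc surveyselect data=airbnb2 out=airbnb_all method=srs
                  samprate=0.70 seed=55555 outall;
run;

data airbnb_all;
  set airbnb_all;
  /* Hide the response on test rows so they are scored but not used in estimation. */
  if Selected = 1 then y_train = PricePerNight;
  else y_train = .;
run;

proc reg data=airbnb_all;
  model y_train = accommodates bedrooms bathrooms;   /* placeholder regressors */
  output out=pred p=yhat;
run;
quit;

data test_err;
  set pred;
  where Selected = 0 and yhat ne .;       /* out-of-sample rows only */
  err     = PricePerNight - yhat;
  sq_err  = err**2;
  abs_err = abs(err);
  pct_err = 100 * err / PricePerNight;
run;

proc means data=test_err n mean;
  var sq_err abs_err pct_err;             /* means give MSE, MAE, MPE */
run;
```

RMSE is the square root of the reported mean of `sq_err`. Repeating the PROC REG step with `selection=adjrsq` or `selection=stepwise` on the MODEL statement gives the other two candidate models to compare.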
3. Let's find the following using proc summary in SAS:
1) Who are the five most valuable customers to the business in terms of volume or sales amount?
2) What are the five most valuable itemIDs, i.e., those purchased most in terms of volume or total sales amount?
4. Using the univariate and graph commands in SAS, find the outliers among customers and item IDs on total sales. Carefully explain what to do with the outliers.
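For question 3, a rough sketch of the PROC SUMMARY approach is given below. It assumes the `sales1` data set from question 1 and a customer identifier named `CustomerID`; that variable name does not appear in the code shown in the assignment, so substitute whatever your data uses. The output names `cust_tot`, `sales_sum` and `qty_sum` are arbitrary.

```sas
/* Total sales and total quantity per customer. */
proc summary data=sales1 nway;
  class CustomerID;                 /* assumed variable name */
  var totalsale Quantity;
  output out=cust_tot sum=sales_sum qty_sum;
run;

proc sort data=cust_tot;
  by descending sales_sum;          /* use qty_sum to rank by volume instead */
run;

proc print data=cust_tot(obs=5);
  var CustomerID sales_sum qty_sum;
run;
```

Replacing `CustomerID` with `itemID` in the CLASS and VAR statements of the print step gives the corresponding ranking of items.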