STATS762 Regression for Data Science
Assignment 3
Due date: 10am, 1 June 2020
Instructions
• Please submit both your R Markdown document and the pdf file that it
generates. To create a pdf, start your R Markdown document with the
following lines (after making the appropriate changes):
---
title: "STATS 762 Assignment 3"
author: "Your Name, ID 1234567"
date: "Due: 10am, 1 June 2020"
output: pdf_document
---
• Call set.seed at the start of your R script so that the same output is
obtained when the simulation is rerun.
• All answers should be written with corresponding question numbers.
• Working must be shown.
• Each answer should be written out explicitly; R code on its own does
not constitute an answer.
For example, suppose the question asks for the average height of 6 trees: (1, 2, 1,
3, 1.5).
Good answer Bad answer
• If any of the above is not satisfied, a penalty may be applied.
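The seed and explicit-answer instructions above can be illustrated with a minimal sketch. It assumes that the five heights listed in the example are the data, and it uses an arbitrary seed value (762); the object names are illustrative only and are not part of the assignment.

```r
# Fix the random seed once, at the top of the script, so that simulated
# output is identical every time the document is knitted.
# The value 762 is an arbitrary choice.
set.seed(762)
first_draw <- rnorm(3)
set.seed(762)
second_draw <- rnorm(3)
same_draws <- identical(first_draw, second_draw)  # same seed, same draws

# Tree heights as listed in the example above.
heights <- c(1, 2, 1, 3, 1.5)
avg_height <- mean(heights)

# State the answer in words rather than leaving bare R output.
cat("The average height of the trees is", avg_height, "\n")
```

A bare `mean(heights)` with no accompanying sentence would show the working but would not, by the rule above, count as an answer.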
1. The spreadsheet avocado2.csv contains 338 historical records of avocado
sales in various markets in California, US. The attributes are as follows:
Total.Volume   Total number of avocados sold
AveragePrice   Average price of a single avocado
type           Production type: organic or conventionally produced
A researcher wants to investigate how sales volume relates to average
price and production type (organic/conventional). Total.Volume
is log-transformed in order to fit a linear regression model with AveragePrice
and type.
(a) Explain why log-transforming the total number of avocados sold is useful
for modelling a quantile using linear regression. [2 marks]
(b) Find a suitable linear regression model for the 0.2 quantile of log(Total.Volume)
and express the typical 0.2 quantile of the total number of avocados sold
for a given price and production type. [5 marks]
(c) Find a suitable linear regression model for the 0.8 quantile of log(Total.Volume)
and express the typical 0.8 quantile of the total number of avocados sold
for a given price and production type. [5 marks]
(d) Using your model, predict the 0.2 quantile of total sales for conventional
avocados priced at $1.2 and for organic avocados priced at $1.8. [1 mark]
(e) At what price of conventional avocados would 80% of markets
sell at most 5.4 million avocados? [3 marks]
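The workflow behind question 1 can be sketched as follows, under several assumptions. The data are simulated because avocado2.csv is not reproduced here; only the column names come from the question, and the generating coefficients are invented. The quantile fit minimises the check (pinball) loss directly with `optim`, which is a stand-in for a dedicated routine such as `rq` in the quantreg package. The helper names (`check_loss`, `fit_quantile`, `predict_volume`) are hypothetical.

```r
set.seed(762)

# Simulated stand-in for avocado2.csv; the coefficients used to generate
# Total.Volume are invented for illustration only.
n <- 338
avocado <- data.frame(
  AveragePrice = runif(n, 0.8, 2.2),
  type = factor(sample(c("conventional", "organic"), n, replace = TRUE))
)
avocado$Total.Volume <- exp(17 - 1.5 * avocado$AveragePrice -
                            2 * (avocado$type == "organic") + rnorm(n, sd = 0.5))

# Design matrix: intercept, price, and an indicator for organic avocados.
X <- model.matrix(~ AveragePrice + type, data = avocado)
y <- log(avocado$Total.Volume)

# Check (pinball) loss; its minimiser is the tau-th regression quantile.
check_loss <- function(beta, tau) {
  r <- y - drop(X %*% beta)
  sum(r * (tau - (r < 0)))
}

fit_quantile <- function(tau) {
  # Start from least squares, with the intercept shifted to the
  # tau-quantile of the least-squares residuals.
  ols <- lm.fit(X, y)
  start <- ols$coefficients
  start[1] <- start[1] + quantile(ols$residuals, tau)
  fit <- optim(start, check_loss, tau = tau,
               control = list(maxit = 5000, reltol = 1e-10))
  unname(fit$par)
}

beta20 <- fit_quantile(0.2)
beta80 <- fit_quantile(0.8)

# Quantiles are preserved by monotone transformations, so exponentiating the
# fitted quantile of log(Total.Volume) gives the quantile of Total.Volume.
predict_volume <- function(beta, price, organic) {
  exp(beta[1] + beta[2] * price + beta[3] * organic)
}
q20_conventional <- predict_volume(beta20, price = 1.2, organic = 0)
q20_organic <- predict_volume(beta20, price = 1.8, organic = 1)

# Conventional price at which the fitted 0.8 quantile equals a target volume.
target <- 5.4e6
price_at_target <- (log(target) - beta80[1]) / beta80[2]
```

The numbers this produces describe the simulated data only. The back-transformation step and the rearrangement for price are the parts that carry over to the real data.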
2. The spreadsheets (banktrain.csv and banktest.csv) relate to the
direct marketing campaigns of a bank. The marketing campaigns were
based on phone calls. Often, more than one contact with the same client was
required in order to assess whether the product (a bank term deposit) would
be subscribed to or not. The aim is to predict whether the client will subscribe
to a term deposit (variable y).
The attributes are as follows:
gender - gender (categorical: "male", "female")
age - age (numeric)
marital - marital status (categorical: "married", "divorced", "single")
education - education information of client (categorical: "unknown", "secondary", "primary", "tertiary")
default - credit account status (categorical: "yes", "no")
balance - average yearly balance, in euros (numeric)
housing - housing loan status (categorical: "yes", "no")
loan - personal loan status (categorical: "yes", "no")
contact - contact communication type (categorical: "unknown", "telephone", "cellular")
duration - last contact duration, in seconds (numeric)
campaign - number of contacts performed during this campaign and for this client (numeric)
previous - number of contacts performed before this campaign and for this client (numeric)
poutcome - outcome of the previous marketing campaign (categorical: "unknown", "other", "failure", "success")
y - Has the client subscribed to a term deposit? (categorical: "yes", "no")
We use the training data (banktrain.csv) to find a model and the test data
(banktest.csv) to examine the predictive performance of the model. Note that
the number of cross-validation folds is 10.
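As a minimal sketch of what 10-fold cross-validation involves, the code below splits a simulated training set into 10 folds and scores an unpenalised logistic regression on each held-out fold. Everything here is an assumption made for illustration: the data are simulated rather than read from banktrain.csv, only two of the listed predictors are used, y is coded 0/1 instead of "no"/"yes", and the error measure (mean binomial deviance) is one choice among several. A penalised fit would replace the `glm` call.

```r
set.seed(762)

# Simulated stand-in for banktrain.csv with two of the listed predictors;
# the generating model is invented for illustration.
n <- 500
train <- data.frame(
  duration = rexp(n, rate = 1 / 250),
  housing = factor(sample(c("yes", "no"), n, replace = TRUE))
)
eta <- -2 + 0.005 * train$duration - 0.8 * (train$housing == "yes")
train$y <- rbinom(n, size = 1, prob = plogis(eta))  # 1 = "yes", 0 = "no"

# Assign each row to one of 10 folds of equal size, in random order.
k <- 10
fold <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other nine folds, then score the held-out fold
# with the mean binomial deviance.
cv_deviance <- sapply(seq_len(k), function(j) {
  fit <- glm(y ~ duration + housing, family = binomial,
             data = train[fold != j, ])
  p <- predict(fit, newdata = train[fold == j, ], type = "response")
  p <- pmin(pmax(p, 1e-8), 1 - 1e-8)  # guard against log(0)
  obs <- train$y[fold == j]
  -2 * mean(obs * log(p) + (1 - obs) * log(1 - p))
})

# Cross-validation error: the average over the 10 held-out folds.
cv_error <- mean(cv_deviance)
```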
The function in make.r reformats a data set so that each categorical variable
is replaced by indicator variables corresponding to its levels. It returns
a list with two objects: the reformatted data (data) and a vector of group
memberships (gpname).
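The kind of reformatting described above can be sketched with base R's `model.matrix`. This is not the code in make.r, which is not shown here and may encode levels differently (for example, by keeping one indicator per level rather than dropping a baseline). The small data frame is made up, and `gpname` is reused only to mirror the description.

```r
# Small made-up example with two of the categorical variables listed above.
clients <- data.frame(
  age = c(35, 52, 41, 29),
  marital = factor(c("married", "single", "divorced", "single"),
                   levels = c("married", "divorced", "single")),
  housing = factor(c("yes", "no", "yes", "no"), levels = c("yes", "no"))
)

# model.matrix() expands each factor into 0/1 indicator columns; the first
# level of each factor is the baseline and gets no column of its own.
form <- ~ age + marital + housing
mm <- model.matrix(form, data = clients)
X <- mm[, -1]  # drop the intercept column

# The "assign" attribute records which original variable each column came
# from, which gives a group-membership vector for the indicator columns.
group <- attr(mm, "assign")[-1]
gpname <- attr(terms(form), "term.labels")[group]

colnames(X)
gpname
```

A group-membership vector of this kind is what allows a grouped penalty to keep or drop all indicators of one categorical variable together.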
(a) Using the training data, complete the following questions.
i. Using an appropriate penalty on the model complexity, find the
model that minimizes the cross-validation error. Show how you
found the model and describe it in terms of the client characteristics
included. [4 marks]
ii. Using an appropriate penalty on the model complexity, find
a parsimonious model. Show how you found the model and
describe it in terms of the client characteristics included. [4 marks]
(b) Estimate the predictive performance of each model using an appropriate
measure, and compare the two. [3 marks]
(c) Using your parsimonious model, describe the type of client who is
very likely to subscribe to a term deposit. [3 marks]
(d) If a marketing campaign were to focus on a single client characteristic,
which feature would be most likely to make the campaign succeed? [3 marks]
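For part (b), two common measures of predictive performance for a binary outcome are the misclassification rate and the area under the ROC curve (AUC). The sketch below computes both in base R from made-up outcomes and predicted probabilities; the 0.5 threshold, the vectors, and the helper name `auc` are assumptions for illustration, and other measures may be equally appropriate.

```r
# Made-up test-set outcomes (1 = "yes", 0 = "no") and predicted probabilities.
y_test <- c(0, 0, 1, 0, 1, 1, 0, 1)
p_hat  <- c(0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.55, 0.90)

# Misclassification rate when predicting "yes" for probabilities above 0.5.
pred_class <- as.numeric(p_hat > 0.5)
misclass_rate <- mean(pred_class != y_test)

# AUC via the rank (Mann-Whitney) formula: the proportion of
# (positive, negative) pairs in which the positive case has the
# higher predicted probability.
auc <- function(y, p) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
auc_value <- auc(y_test, p_hat)

cat("Misclassification rate:", misclass_rate, " AUC:", auc_value, "\n")
```

Computing the same measure for both models on banktest.csv puts them on a common footing for comparison.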