Assignment 1 Q1
A few more cool things about PCA (30 points)
For parts a) to c) below, please assume the following:
Let $X = (X_1 | \cdots | X_p)$ be an $n \times p$ random matrix such that $\mathrm{Var}((X^t)_i) = \Sigma \;\; \forall i$, i.e. $\Sigma$ is the covariance
matrix for row $i$ of $X$ (the $i$-th column of $X^t$).
Assume that $\Sigma$ is a positive definite matrix with normed eigenvalue decomposition $\Sigma = W \Lambda W^t$.
Question parts:
a. (10 points) Let $Y_i = W (X^t)_i$ be the $p$ vector of scores for the $i$-th row of $X$. Show that the PCA
representation preserves distance between the two vectors $(X^t)_i$ and $(X^t)_j$, i.e. that
$$\| (X^t)_i - (X^t)_j \| = \| Y_i - Y_j \|$$
where $\| u - v \| = \sqrt{(u - v)^t (u - v)}$. Hint: Use the properties of the various pieces of the eigenvalue
decomposition.
b. (10 points) Using the properties of traces of products of matrices and the definition of $\Sigma$ in part a),
show that:
$$\mathrm{tr}(\Sigma) = \mathrm{tr}(\Lambda)$$
showing that the sum of the eigenvalues is equal to the sum of the marginal variances.
c. (10 points) Assume that we generate a $p \times 1$ random vector $Z$ such that $Z_i \sim \mathrm{Normal}(0, 1)$ and
$\mathrm{Cov}(Z_i, Z_j) = 0 \;\; \forall i \neq j$. Let
$$V = W^t \Lambda^{1/2} Z$$
where $\Sigma = W \Lambda W^t$ as described at the beginning of this question.
i. What are $E(V)$ and $\mathrm{Var}(V)$?
ii. What is the distribution of $V_i$?
Please show your work in deriving the answers, but you may use standard results for the properties of Normal
random variables.
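A quick numerical sanity check can be useful before attempting the proofs in parts a) and b). The sketch below is not part of the assignment and does not replace the derivations: it builds an arbitrary positive definite covariance matrix (the dimension `p = 4`, the seed, and all object names are assumptions chosen for illustration), takes its eigendecomposition with base R's `eigen()`, and checks that multiplying by the orthogonal matrix of eigenvectors leaves Euclidean distances unchanged and that the two traces agree.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
p <- 4                           # illustrative dimension (assumption)

# Build an arbitrary positive definite covariance matrix Sigma.
A <- matrix(rnorm(p * p), p, p)
Sigma <- crossprod(A) + diag(p)  # A^t A + I is symmetric positive definite

# Normed eigenvalue decomposition: Sigma = W Lambda W^t.
eig <- eigen(Sigma, symmetric = TRUE)
W <- eig$vectors                 # orthonormal eigenvectors in the columns
Lambda <- diag(eig$values)

# Two arbitrary p-vectors standing in for (X^t)_i and (X^t)_j.
x_i <- rnorm(p)
x_j <- rnorm(p)

# Euclidean norm, ||u|| = sqrt(u^t u).
enorm <- function(u) sqrt(sum(u^2))

# Because W is orthogonal, both W and W^t preserve distances.
dist_x  <- enorm(x_i - x_j)
dist_w  <- enorm(W %*% x_i - W %*% x_j)
dist_wt <- enorm(t(W) %*% x_i - t(W) %*% x_j)

# Trace identity: sum of marginal variances versus sum of eigenvalues.
trace_sigma  <- sum(diag(Sigma))
trace_lambda <- sum(eig$values)

print(c(dist_x, dist_w, dist_wt))
print(c(trace_sigma, trace_lambda))
```

Agreement up to floating-point error is only evidence for one random draw; the written answer still needs the general argument based on $W^t W = W W^t = I$.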
Assignment 1 Q2
Analyzing wine data (30 points)
The data for this exercise come from a paper by Cortez et al. (2009)
(https://www.sciencedirect.com/science/article/abs/pii/S0167923609001377?via%3Dihub), in which the authors
tried to relate various chemical properties of red and white wine to perceived quality. For this question,
we will analyze only the data for the chemical properties, not the quality. Also, although the original paper
looked at both red and white wine, we will use only the data for the red.
The data can be read in via:
library(tidyverse)
wine_data <- read_csv("red_wine_data.csv") # Be sure this file is in your current working directory
glimpse(wine_data)
Rows: 1,599
Columns: 12
$ `fixed acidity`        <dbl> 7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7…
$ `volatile acidity`     <dbl> 0.700, 0.880, 0.760, 0.280, 0.700, 0.660, 0.600…
$ `citric acid`          <dbl> 0.00, 0.00, 0.04, 0.56, 0.00, 0.00, 0.06, 0.00,…
$ `residual sugar`       <dbl> 1.9, 2.6, 2.3, 1.9, 1.9, 1.8, 1.6, 1.2, 2.0, 6.…
$ chlorides              <dbl> 0.076, 0.098, 0.092, 0.075, 0.076, 0.075, 0.069…
$ `free sulfur dioxide`  <dbl> 11, 25, 15, 17, 11, 13, 15, 15, 9, 17, 15, 17, …
$ `total sulfur dioxide` <dbl> 34, 67, 54, 60, 34, 40, 59, 21, 18, 102, 65, 10…
$ density                <dbl> 0.9978, 0.9968, 0.9970, 0.9980, 0.9978, 0.9978,…
$ pH                     <dbl> 3.51, 3.20, 3.26, 3.16, 3.51, 3.51, 3.30, 3.39,…
$ sulphates              <dbl> 0.56, 0.68, 0.65, 0.58, 0.56, 0.56, 0.46, 0.47,…
$ alcohol                <dbl> 9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10.0, 9.5, 1…
$ quality                <dbl> 5, 5, 5, 6, 5, 5, 5, 7, 7, 5, 5, 5, 5, 5, 5, 5,…
The variables are self-evident from the names. We do not want to use the quality variable, so we can create a
new dataset without it via:
wine_data_chem <- wine_data %>% select(-quality)
head(wine_data_chem)
# A tibble: 6 x 11
  `fixed acidity` `volatile acidity` `citric acid` `residual sugar` chlorides
            <dbl>              <dbl>         <dbl>            <dbl>     <dbl>
1             7.4               0.7           0                 1.9     0.076
2             7.8               0.88          0                 2.6     0.098
3             7.8               0.76          0.04              2.3     0.092
4            11.2               0.28          0.56              1.9     0.075
5             7.4               0.7           0                 1.9     0.076
6             7.4               0.66          0                 1.8     0.075
# … with 6 more variables: `free sulfur dioxide` <dbl>,
#   `total sulfur dioxide` <dbl>, density <dbl>, pH <dbl>, sulphates <dbl>,
#   alcohol <dbl>
This is the data you should analyze.
a. (10 points) Using only scatterplots and the sample correlation matrix, summarize what you believe to
be the most interesting associations you observe among these characteristics. Show both the
plots and the summaries you generate to support your conclusions.
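One possible way to organise the numerical side of part a) is sketched below. The helper `top_correlations()` is a hypothetical name, not something supplied with the assignment; it uses only base R's `cor()` to rank variable pairs by absolute sample correlation. The small simulated data frame `demo` is an assumption included only so the sketch runs on its own; in practice the argument would be `wine_data_chem`.

```r
# Hypothetical helper: rank variable pairs by absolute sample correlation.
top_correlations <- function(df, n = 5) {
  r <- cor(df)                                # sample correlation matrix
  idx <- which(upper.tri(r), arr.ind = TRUE)  # each pair once, no diagonal
  out <- data.frame(
    var1 = rownames(r)[idx[, "row"]],
    var2 = colnames(r)[idx[, "col"]],
    r    = r[idx],
    stringsAsFactors = FALSE
  )
  out <- out[order(-abs(out$r)), ]            # strongest associations first
  rownames(out) <- NULL
  head(out, n)
}

# Simulated stand-in data (assumption) so the sketch is self-contained.
set.seed(1)
demo <- data.frame(a = rnorm(200))
demo$b <- demo$a + rnorm(200, sd = 0.3)       # strongly related to a
demo$c <- rnorm(200)                          # unrelated noise

print(top_correlations(demo, n = 3))

# With the assignment data one might run:
#   top_correlations(wine_data_chem)
#   pairs(wine_data_chem)   # base R scatterplot matrix of all variable pairs
```

Ranking by absolute value keeps strong negative associations near the top alongside strong positive ones; the scatterplots are still needed to judge whether a relationship is roughly linear or driven by a few outliers.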
b. (20 points) Perform a principal component analysis of this data using your preferred function. As part of
this analysis, please be sure to complete the following tasks:
- Report the eigenvalues for all 11 principal components.
- For the first two principal components, plot and interpret the components in terms of the original
variables. In particular, explain which variables are most highly correlated with each of these two
components and how these components differ from each other.
- Choose the smallest number of principal components that you believe can be used to summarize
the information from the data and justify your choice.
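The tasks in part b) could be approached along the following lines. This is a sketch under stated assumptions, not a prescribed solution: it uses base R's `prcomp()` on standardized variables (`scale. = TRUE`, i.e. a PCA of the correlation matrix), which is one reasonable choice when variables are measured on very different scales, and it runs on a simulated stand-in data frame `demo` so that it is self-contained. All object names are illustrative.

```r
# Simulated stand-in for wine_data_chem (assumption), with unequal scales.
set.seed(1)
n <- 300
z <- rnorm(n)
demo <- data.frame(
  v1 = z + rnorm(n, sd = 0.5),
  v2 = 10 * z + rnorm(n, sd = 5),
  v3 = rnorm(n),
  v4 = 0.1 * rnorm(n)
)

# PCA on standardized variables (equivalent to using the correlation matrix).
pca <- prcomp(demo, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components.
eigenvalues <- pca$sdev^2

# Proportion and cumulative proportion of variance explained.
prop_var <- eigenvalues / sum(eigenvalues)
cum_var  <- cumsum(prop_var)

# Correlations between the original variables and the first two components.
var_pc_cor <- cor(demo, pca$x[, 1:2])

print(round(eigenvalues, 3))
print(round(cum_var, 3))
print(round(var_pc_cor, 2))

# With the assignment data one might run:
#   pca <- prcomp(wine_data_chem, scale. = TRUE)
#   screeplot(pca, type = "lines")   # eigenvalues by component
#   biplot(pca)                      # scores and loadings on PC1 and PC2
```

The cumulative proportions and the scree plot support, but do not settle, the choice of how many components to keep; the justification asked for in the last task is still a judgment call.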
