Introduction to Statistical Machine Learning Assignment 3: Implementation of PCA and Kernel PCA

The University of Adelaide, School of Computer Science

Introduction to Statistical Machine Learning

Semester 2, 2019 Assignment 3: Implementation of PCA and Kernel PCA

DUE: 4 Nov. 2019, Monday 11:55 PM

Submission

Instructions and submission guidelines:

• You must sign an assessment declaration coversheet to submit with your assignment.

• Submit your assignment via the Canvas MyUni.

Kernel PCA and PCA

You are required to write code for performing PCA and kernel PCA (KPCA) on a given dataset.

After performing PCA/KPCA, you will use various classifiers to perform classification. You will

investigate the effect of applying PCA, KPCA and KPCA with different types of kernels.

Specifically, the tasks you need to complete are:

• Write code for performing PCA

• Apply PCA to the given dataset and then perform classification with the nearest

neighbour classifier. Analyse the performance change against different reduced

dimensions. (suggestion: from 256 to 10)

• Write code for performing KPCA. Use three kernel functions, Linear Kernel

(equivalent to PCA), Gaussian-RBF kernel and Hellinger kernel (See the lecture slides)

• Apply KPCA to the given dataset and then perform classification with nearest

neighbour classifier. Analyse the performance change against different reduced

dimensions. (suggestion: from 256 to 10)

• [Master student only] Append 256 noisy dimensions to the original data, that is, for

each sample xi, appending a 256-dimensional random feature vector to xi to make it

a 512-dimensional feature. Then repeat the KPCA experiments by using a linear SVM

as the classifier. Test and analyse the results. The following lines give an example code

(Matlab) for appending noise features

o % Let X be the original data

o [N,d] = size(X); o R = randn(N,d); % using Gaussian noise in this experiment

o X_ = [X,R];

Hint:

For Gaussian-RBF kernel, you need to set the bandwith hyper-parameter 𝜎2

. An empirical

approach is to choose it as 12𝑁2 ∑ ||𝑥𝑖 − 𝑥𝑗||2

𝑖𝑗 . You are encouraged to scale up and down this

empirical chosen parameter a little bit to find out the best performed hyper-parameter.

You can choose either Matlab, Python, or C/C++. I would personally suggest Matlab or

Python.

The PCA and KPCA part of your code should not rely on any 3rd-party toolbox. Only Matlab's

built-in API's or Python/ C/C++'s standard libraries are allowed. However, you can use 3

party implementation of linear SVM for your experiments.

You are also required to submit a report (<10 pages in PDF format), which should have the

following sections (report contributes 50% to the mark; code 50%):

• An algorithmic description of PCA and KPCA. (5%)

• Your understanding of PCA and KPCA (anything that you believe is relevant to this

algorithm) (5%)

• Some analyses of your implementation. You should plot an error curve against the number

of reduced dimensions. For KPCA, results from different kernels should be plotted on the

same figure with different colours. (20% for master students and 40% for undergraduate

students)

• You should present and analyse the results of adding noisy dimensions before performing

PCA/KPCA. For example, you can compare the impact of noisy dimensions to different

kernels in KPCA. (20% for master students only)

In summary, you need to submit (1) the code that implements PCA/KPCA and (2) a report in

PDF.

Data

You will use USPS dataset to test your model. It contains 10 classes with 7291 training

samples and 2007 testing samples. The feature dimensionality is 256. The training data can

be found in usps.libsvm and testing data can be found in usps.t.libsvm.

QQ：99515681
WeChat：codinghelp
Email：99515681@qq.com
Work Time：8:00-23:00

Hots

Help With Artificial Intelligence Meth... 2024-04-18
Ghostwriter Kxo206 Database Management... 2024-04-18
Ghostwriter Comp9417 Project: Multitas... 2024-04-18
Ghostwriter Bl5611 Drugdiscovery2024 T... 2024-04-18
Ghostwriter Comp5313/Comp4313—Large Sc 2024-04-18
Ghostwriter Aem 4500 / Econ 3860 / Aem... 2024-04-18
Ghostwriter Math 1151, Spring 2024 Wri... 2024-04-18
Ghostwriter 7Ssmm712 – Topics In Appli 2024-04-18
Ghostwriter Elec Eng 3088/7088 Compute... 2024-04-18
Ghostwriter Pols0010 Data Analysis Ter... 2024-04-18
Ghostwriter Econ 602: Course Projecthe... 2024-04-18
Help With Economics 253 - Spring 2024 ... 2024-04-18
Ghostwriter Artd 6151: Sustainability ... 2024-04-18
Help With Ifn647 Text, Web And Media A... 2024-04-18
Help With Cse340 Project 2: Parsinggh... 2024-04-18
Help With Mane - 4500 Modeling And Con... 2024-04-18
Help With Civil 750 - Timber Engineeri... 2024-04-18
Ghostwriter Qbus6860 Sustainable Energ... 2024-04-18
Ghostwriter 25721 Investment Managemen... 2024-04-18
Help With Eeee4123 Hdl For Programmabl... 2024-04-18

Programming Assignment Help！