
COMP SCI 4094/4194/7094 Assignment 3 Distributed Databases and Data Mining


Assignment 3
DUE: 23:59 Thursday 28th October
Important Notes
• Handins:
– The deadline for submission of your assignment is 23:59 Thursday 28th October, 2021.
– You must do this assignment individually and make individual submissions.
– Your program should be coded in C++ and pass test runs on 4 test files. The sample
input and output files are downloadable in “Assignments” of the course home page
(https://myuni.adelaide.edu.au/courses/64886/assignments/238277).
– You need to use svn to upload and run your source code in the web submission system
following “Web-submission instructions” stated at the end of this sheet. You should
attach your name and student number in your submission.
– Late submissions will attract a penalty: the maximum mark you can obtain will be
reduced by 25% per day (or part thereof) past the due date or any extension you are
granted.
• Marking scheme:
– 16 marks for testing on 4 random tests: 4 marks per test.
For undergraduate students, your code should cluster the flows by Manhattan distance:
1 mark for Flow.txt
3 marks for KMedoids.txt (3 marks for absolute value)
For postgraduate students, you should design a suitable code structure or API
that makes the code more flexible, so that it can easily be switched from
Manhattan distance to Euclidean distance. You should write these functions in
your code:
1 mark for Flow.txt
2 marks for KMedoids.txt (Use Manhattan distance to cluster)
1 mark for KMedoidsE.txt (Use Euclidean distance to cluster)
– 4 marks for code style. (Put your ID, your name, and whether you are a postgraduate
or undergraduate student in the header comment of your code.)
– Note: If your code is found not to implement the computation tasks required
in this assignment, you will receive zero marks regardless of the correctness of the
test output.
If you have any questions, please send them to the student discussion forum. This way you
can all help each other and everyone gets to see the answers.
The assignment
In this assignment you are required to code a traffic packet clustering engine that clusters raw
network packets into different applications, such as http and smtp. To accomplish this, a
data preprocessing module and a clustering module should be implemented.
You will have two input files, and you should print two (undergraduate) or three (postgraduate)
output files.
0.1 Input File:
The input file1 contains a distance threshold and the raw network packet information, that is,
seven attributes of a packet: source address, source port, destination address, destination port,
protocol, arrival time, and packet length.
1. Input file1.txt is sample traffic flow information, which looks like:
src addr src port dst addr dst port protocol arrival time packet length
202.234.224.254 49880 31.65.181.210 80 6 115258 52
202.234.224.254 49880 31.65.181.210 80 6 115307 52
202.234.35.144 55256 74.39.124.220 443 6 115310 46
119.188.179.82 50592 150.79.7.129 80 6 115314 40
202.234.224.254 49880 31.65.181.210 80 6 115341 52
119.188.179.82 50592 150.79.7.129 80 6 115350 40
119.188.179.82 50592 150.79.7.129 80 6 115363 40
2. Input file2.txt contains a number K on the first line and, on the next line, K integers
that give the initial set of K medoids. It looks like:
1 (k=1)
0 (Start from index 0, as the initial start medoid)
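As an illustration of the second input format, here is a minimal sketch of reading K and the K initial medoid indexes from a stream. The function name readInitialMedoids is hypothetical, and returning an empty vector on malformed input is an assumption, not something the sheet specifies.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: reads K, then K medoid indexes, from `in`.
// Returns an empty vector if the input is malformed (an assumption).
std::vector<int> readInitialMedoids(std::istream& in) {
    int k = 0;
    if (!(in >> k) || k <= 0) return {};
    std::vector<int> medoids;
    medoids.reserve(k);
    for (int i = 0; i < k; ++i) {
        int index;
        if (!(in >> index)) return {};
        medoids.push_back(index);
    }
    return medoids;
}
```

In a full program the stream would typically be a std::ifstream opened on the second command-line argument; a std::istringstream works equally well for local testing.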
0.2 Output File:
You should print out:
for undergraduate students:
1. Flow.txt (for data preprocessing result, 1 mark per test)
2. KMedoids.txt (for clustering result by Manhattan distance, 2 marks for absolute value, 1
mark for details).
for postgraduate students:
1. Flow.txt (for data preprocessing result, 1 mark per test)
2. KMedoids.txt (for clustering result by Manhattan distance, 2 marks).
3. KMedoidsE.txt (for clustering result by Euclidean distance, 1 mark).
What you need to do:
In the data preprocessing module, your program should prepare the flow data for clustering
from the raw packet data. Two steps are involved. First, merge the packets into flows
by this rule: a network flow includes at least TWO packets with the same source address, source
port, destination address, destination port, and protocol. Then calculate two clustering features
for each flow: the average transferring time and the average packet length.
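The merging rule can be sketched as follows. This is only one possible layout: the Packet and FlowPackets structs and the groupIntoFlows name are hypothetical, and numbering flows by the position of their first packet is an assumption that happens to agree with the sample outputs in this sheet.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// One raw packet: the seven attributes of the first input file.
struct Packet {
    std::string srcAddr;
    int srcPort;
    std::string dstAddr;
    int dstPort;
    int protocol;
    long arrivalTime;  // microseconds
    int length;
};

// The packets of one flow, kept in input order.
struct FlowPackets {
    std::vector<long> arrivalTimes;
    std::vector<int> lengths;
};

typedef std::tuple<std::string, int, std::string, int, int> FlowKey;

// Groups packets by 5-tuple and drops groups with fewer than two packets.
// Flows are returned in order of first appearance (an assumption).
std::vector<FlowPackets> groupIntoFlows(const std::vector<Packet>& packets) {
    std::map<FlowKey, std::size_t> indexOf;  // 5-tuple -> position in `all`
    std::vector<FlowPackets> all;
    for (const Packet& p : packets) {
        FlowKey key(p.srcAddr, p.srcPort, p.dstAddr, p.dstPort, p.protocol);
        std::map<FlowKey, std::size_t>::iterator it = indexOf.find(key);
        if (it == indexOf.end()) {
            it = indexOf.emplace(key, all.size()).first;
            all.emplace_back();
        }
        all[it->second].arrivalTimes.push_back(p.arrivalTime);
        all[it->second].lengths.push_back(p.length);
    }
    std::vector<FlowPackets> flows;
    for (const FlowPackets& f : all)
        if (f.arrivalTimes.size() >= 2) flows.push_back(f);
    return flows;
}
```

A std::map is used for the lookup so the sketch compiles under -std=c++11 without a custom hash for the tuple key.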
In the clustering module, you need to apply the k-medoids algorithm (course slides Chapter
10, not the book's random method) to find the minimum number of clusters such that the sum of
the distances from each flow to its medoid is less than the given threshold. Note: the clustering
features come from the data preprocessing module, and the distance measure is Manhattan distance.
For your convenience, below is the framework of the k-medoids algorithm which you should
follow:
We will use the PAM algorithm on ClusBasic.pdf page 20:
https://myuni.adelaide.edu.au/courses/64886/discussion_topics/602515
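For orientation only, here is a rough sketch of a deterministic PAM-style swap loop. It is NOT taken from ClusBasic.pdf: it assumes a steepest-descent variant that applies the single best (medoid, non-medoid) swap per pass, and the order in which the course slides visit candidates may differ, which can change the final medoids. The names Point, totalCost, and pamSwap are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x; double y; };

double manhattan(const Point& a, const Point& b) {
    return std::fabs(a.x - b.x) + std::fabs(a.y - b.y);
}

// Absolute error: sum of each point's distance to its nearest medoid.
double totalCost(const std::vector<Point>& pts, const std::vector<int>& medoids) {
    double sum = 0.0;
    for (const Point& p : pts) {
        double best = std::numeric_limits<double>::max();
        for (int m : medoids) best = std::min(best, manhattan(p, pts[m]));
        sum += best;
    }
    return sum;
}

// Assumed variant: each pass tries every (medoid, non-medoid) swap and
// applies the one that lowers the cost the most; stops when none does.
// `medoids` must be non-empty. Returns the final absolute error.
double pamSwap(const std::vector<Point>& pts, std::vector<int>& medoids) {
    double cost = totalCost(pts, medoids);
    bool improved = true;
    while (improved) {
        improved = false;
        double bestCost = cost;
        std::size_t bestPos = 0;
        int bestCandidate = -1;
        for (std::size_t i = 0; i < medoids.size(); ++i) {
            for (int o = 0; o < static_cast<int>(pts.size()); ++o) {
                if (std::find(medoids.begin(), medoids.end(), o) != medoids.end())
                    continue;  // o is already a medoid
                std::vector<int> trial = medoids;
                trial[i] = o;
                double c = totalCost(pts, trial);
                if (c < bestCost - 1e-9) {
                    bestCost = c;
                    bestPos = i;
                    bestCandidate = o;
                }
            }
        }
        if (bestCandidate >= 0) {
            medoids[bestPos] = bestCandidate;
            cost = bestCost;
            improved = true;
        }
    }
    return cost;
}
```

Because a swap is accepted only when it strictly lowers the cost, the loop always terminates. Recomputing the full cost for every trial is simple but slow; it is adequate for a few dozen flows.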
Example
Sample traffic flow information
src addr src port dst addr dst port protocol arrival time packet length
202.234.224.254 49880 31.65.181.210 80 6 115258 52
202.234.224.254 49880 31.65.181.210 80 6 115307 52
202.234.35.144 55256 74.39.124.220 443 6 115310 46
119.188.179.82 50592 150.79.7.129 80 6 115314 40
202.234.224.254 49880 31.65.181.210 80 6 115341 52
119.188.179.82 50592 150.79.7.129 80 6 115350 40
119.188.179.82 50592 150.79.7.129 80 6 115363 40
Data preprocessing module
Firstly, we should identify different flows (different flows have different source and destination
addresses).
In the above traffic flow information, there are two flows: the first, second, and fifth packets
belong to the first flow (index 0); the fourth, sixth, and seventh packets belong to the second
flow (index 1).
The average transferring time of the first flow
= ((arrival time of fifth packet - arrival time of second packet) + (arrival time of second packet - arrival time of first packet)) ÷ (3 - 1)
= ((115341 - 115307) + (115307 - 115258)) ÷ 2 = 41.5.
The average length of the first flow = (Σ packet length) ÷ 3 = (52 + 52 + 52) ÷ 3 = 52.
Similarly, the average transferring time of the second flow = 24.5, and the average length of
the second flow = 40.
(Arrival times are in microseconds (µs).)
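The two features from the worked example can be computed with a short helper. This is a sketch under the assumption that a flow's arrival times are stored in input order and that the flow has at least two packets; the names FlowFeatures and computeFeatures are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct FlowFeatures {
    double avgTransferTime;  // X value
    double avgLength;        // Y value
};

// Assumes at least two packets, with arrivalTimes in arrival order.
FlowFeatures computeFeatures(const std::vector<long>& arrivalTimes,
                             const std::vector<int>& lengths) {
    double gapSum = 0.0;
    for (std::size_t i = 1; i < arrivalTimes.size(); ++i)
        gapSum += static_cast<double>(arrivalTimes[i] - arrivalTimes[i - 1]);
    double lengthSum = 0.0;
    for (int len : lengths) lengthSum += len;
    FlowFeatures f;
    f.avgTransferTime = gapSum / static_cast<double>(arrivalTimes.size() - 1);
    f.avgLength = lengthSum / static_cast<double>(lengths.size());
    return f;
}
```

For the first sample flow, arrival times {115258, 115307, 115341} and lengths {52, 52, 52} give 41.5 and 52, matching the calculation above.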
Clustering module
We use Manhattan distance to measure the distance between flows. In our sample, the distance
between the two flows is |41.5 - 24.5| + |52 - 40| = 17 + 12 = 29.
Example Output
First, output the flows produced by the data preprocessing module, one flow per line: the
index, the average transferring time (X value), and the average length (Y value).
ID X Y
In this case, Flow.txt should contain:
0 41.50 52.00
1 24.50 40.00
Round X and Y to 2 decimal places. You can use:
cout << fixed << setprecision(2) << 3.1415926;
or
printf("%0.2f", 3.1415926);
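As a small sketch of the Flow.txt line format, the helper below builds one line with the stream manipulators mentioned above. The name formatFlowLine is hypothetical, and the single-space separator is an assumption read off the sample output.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats one Flow.txt line: index, then X and Y with two decimals.
std::string formatFlowLine(int index, double x, double y) {
    std::ostringstream out;
    out << index << ' ' << std::fixed << std::setprecision(2) << x << ' ' << y;
    return out.str();
}
```

Writing each returned string followed by a newline to an std::ofstream opened on Flow.txt would reproduce the two sample lines.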
After running k-medoids, you will get K clusters.
You should then write the KMedoids.txt file.
It has K+2 lines. The first line is the absolute-error criterion (the first line is the important
one; the other lines help you debug). The next line lists the indexes of the K medoids. Each of
the following K lines lists, in ascending order, the indexes of the flows in the corresponding
medoid's cluster.
29.00 (Absolute-error of the clustering, 2 decimal places)
0 (Medoid is 0)
0 1 (This cluster includes 2 flows, index 0 and index 1)
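The K+2 line layout can be sketched for a fixed set of medoids as follows. The names Flow and formatClusters are hypothetical, and breaking distance ties in favour of the earlier medoid in the list is an assumption that the sheet does not state.

```cpp
#include <cmath>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

struct Flow { double x; double y; };

// Builds the KMedoids.txt text: error line, medoid line, one line per cluster.
// Each flow joins its nearest medoid (Manhattan); ties go to the earlier medoid.
std::string formatClusters(const std::vector<Flow>& flows,
                           const std::vector<int>& medoids) {
    if (medoids.empty()) return "";
    std::vector<std::vector<int> > members(medoids.size());
    double error = 0.0;
    for (std::size_t i = 0; i < flows.size(); ++i) {
        std::size_t best = 0;
        double bestDist = -1.0;
        for (std::size_t m = 0; m < medoids.size(); ++m) {
            const Flow& c = flows[medoids[m]];
            double d = std::fabs(flows[i].x - c.x) + std::fabs(flows[i].y - c.y);
            if (bestDist < 0.0 || d < bestDist) { bestDist = d; best = m; }
        }
        members[best].push_back(static_cast<int>(i));  // ascending by construction
        error += bestDist;
    }
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << error << '\n';
    for (std::size_t m = 0; m < medoids.size(); ++m)
        out << (m ? " " : "") << medoids[m];
    out << '\n';
    for (std::size_t m = 0; m < members.size(); ++m) {
        for (std::size_t j = 0; j < members[m].size(); ++j)
            out << (j ? " " : "") << members[m][j];
        out << '\n';
    }
    return out.str();
}
```

With the two sample flows and medoid 0, the returned text is the three lines 29.00, 0, and 0 1.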
For postgraduate students, you should design a suitable code structure or API so that the code
is more flexible. It should be easy to switch from Manhattan distance to Euclidean
distance. You should write these functions in your code.
Tips: you can use object-oriented, class-based, or other well-organized methods.
You should print KMedoidsE.txt; its structure is the same as that of KMedoids.txt.
https://en.wikipedia.org/wiki/Euclidean_distance
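One possible way to make the metric swappable is to pass it as a function object. The sketch below is only an illustration of that idea, not a required design; the names Feature, Distance, and absoluteError are hypothetical.

```cpp
#include <cmath>
#include <functional>
#include <vector>

struct Feature { double x; double y; };

// Any callable taking two features and returning a distance.
typedef std::function<double(const Feature&, const Feature&)> Distance;

double manhattan(const Feature& a, const Feature& b) {
    return std::fabs(a.x - b.x) + std::fabs(a.y - b.y);
}

double euclidean(const Feature& a, const Feature& b) {
    double dx = a.x - b.x;
    double dy = a.y - b.y;
    return std::sqrt(dx * dx + dy * dy);
}

// Absolute error under whichever metric the caller supplies.
double absoluteError(const std::vector<Feature>& flows,
                     const std::vector<int>& medoids,
                     const Distance& dist) {
    double total = 0.0;
    for (const Feature& f : flows) {
        double best = -1.0;  // -1 means "no medoid seen yet"
        for (int m : medoids) {
            double d = dist(f, flows[m]);
            if (best < 0.0 || d < best) best = d;
        }
        if (best > 0.0) total += best;
    }
    return total;
}
```

Calling absoluteError(flows, medoids, manhattan) on the two sample flows with medoid 0 gives 29, and passing euclidean instead gives about 20.81, in line with Example1. The same Distance parameter could be threaded through the clustering routine so that only the call site changes between KMedoids.txt and KMedoidsE.txt.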
Web-submission instructions
• First, type the following command, all on one line (replacing xxxxxxx with your student
ID):
svn mkdir --parents -m "DDDM"
https://version-control.adelaide.edu.au/svn/axxxxxxx/2021/s2/dddm/assignment3
• Then, check out this directory and add your files:
svn co https://version-control.adelaide.edu.au/svn/axxxxxxx/2021/s2/dddm/assignment3
cd assignment3
svn add KMedoidsUG.cpp (or KMedoidsPG.cpp)
· · ·
svn commit -m "assignment3 solution"
• Next, go to the web submission system at:
https://cs.adelaide.edu.au/services/websubmission/
Navigate to 2021, Semester 2, Distributed Databases and Data Mining, Assignment 3.
Then, click Tab “Make Submission” for this assignment and indicate that you agree to the
declaration. The automark script will then check whether your code compiles. You can
make as many resubmissions as you like. If your final solution does not compile you won’t
get any marks for this solution.
• Note:
1. Please follow the forms in sample output files.
2. Your local file path will not work with our web-submission system.
3. We have prepared ten test files in the web-submission system. When you submit your
program, random test files will be allocated to you.
4. The auto-marker script compiles and runs the file named "KMedoidsUG.cpp" or "KMedoidsPG.cpp" using the following commands (please submit only one cpp file, named KMedoidsUG.cpp or KMedoidsPG.cpp):
g++ -std=c++11 KMedoidsUG.cpp -o runKMedoids (for undergraduate students)
g++ -std=c++11 KMedoidsPG.cpp -o runKMedoids (for postgraduate students)
./runKMedoids network_packets.txt initial_medoids.txt
In this assignment, you need to read two files, network_packets.txt (network packet traffic information) and initial_medoids.txt (initial medoids), which are generated
randomly by the system.
5. Absolute-error is the total of the Manhattan distances. K-medoids aims to reduce
the distance between each point and the medoid of its cluster.
6. Your code should follow the default order of the K-Medoids algorithm. If you do not use
the default order, your absolute value may be right while the details in KMedoids.txt are
wrong.
7. If your answer is close to the standard absolute value, we will accept it. E.g., if the
standard absolute value is 8223.23 and your absolute value is 8222.11, we will accept
your answer.
8. If you have any questions on assignment 3, you can ask at this link:
https://myuni.adelaide.edu.au/courses/64886/discussion_topics/602515 . Tips: If you have
an accuracy problem in the final absolute-error, first try resubmitting your code (because
the data is randomly generated). If that does not fix the accuracy problem, you can post it on
the discussion board and I will judge it manually.
You should print two or three output files as shown in the following two examples.
Example1
input:File1.txt
src addr src port dst addr dst port protocol arrival time packet length
202.234.224.254 49880 31.65.181.210 80 6 115258 52
202.234.224.254 49880 31.65.181.210 80 6 115307 52
202.234.35.144 55256 74.39.124.220 443 6 115310 46
119.188.179.82 50592 150.79.7.129 80 6 115314 40
202.234.224.254 49880 31.65.181.210 80 6 115341 52
119.188.179.82 50592 150.79.7.129 80 6 115350 40
119.188.179.82 50592 150.79.7.129 80 6 115363 40
input:File2.txt
1
0
output:Flow.txt
0 41.50 52.00
1 24.50 40.00
output:KMedoids.txt
29.00
0
0 1
output for postgraduate:KMedoidsE.txt
20.81
0
0 1
Example2
input:file1.txt
src addr src port dst addr dst port protocol arrival time packet length
61.43.24.146 80 133.227.178.71 55651 6 115164 1500
223.139.34.184 57258 203.146.250.47 80 6 115167 40
118.162.252.133 8100 150.79.7.129 80 6 115178 52
163.39.157.71 52864 199.252.216.15 443 6 115181 436
125.96.202.102 80 202.31.174.9 36122 6 115185 185
202.234.224.254 49880 31.65.181.210 80 6 115189 52
61.211.145.45 61611 150.79.7.129 80 6 115222 40
202.234.224.254 49880 31.65.181.210 80 6 115226 52
163.39.157.71 52864 199.252.216.15 443 6 115230 1426
163.39.157.71 52865 199.252.216.15 443 6 115233 436
118.91.103.40 53186 150.79.7.129 80 6 115244 52
133.244.153.246 54194 165.143.250.152 443 6 115247 52
202.234.224.254 49880 31.65.181.210 80 6 115251 52
163.39.157.71 52865 199.252.216.15 443 6 115254 1426
202.234.224.254 49880 31.65.181.210 80 6 115258 52
202.234.224.254 49880 31.65.181.210 80 6 115307 52
202.234.35.144 55256 74.39.124.220 443 6 115310 378
119.188.179.82 50592 150.79.7.129 80 6 115314 40
202.234.224.254 49880 31.65.181.210 80 6 115320 52
202.234.224.254 49880 31.65.181.210 80 6 115326 52
202.234.224.254 49880 31.65.181.210 80 6 115331 52
202.234.35.144 50070 173.199.56.254 80 6 115335 40
54.221.15.83 443 150.79.179.172 60804 6 115349 52
202.145.203.99 443 163.39.7.122 53326 6 115435 1500
133.227.171.14 52147 121.131.234.16 80 6 115439 818
131.14.216.241 24153 54.43.88.212 80 6 115443 1496
202.145.203.99 443 163.39.7.122 53326 6 115447 1188
203.146.250.47 80 5.98.62.124 47610 6 115461 1460
69.192.0.189 80 202.234.225.187 59368 6 115469 1500
69.192.0.189 80 202.234.225.187 59368 6 115491 1500
202.234.228.45 58507 38.249.43.123 443 6 115494 1500
163.39.110.212 49700 204.93.161.172 443 6 115501 819
126.71.29.111 61782 203.146.247.176 80 6 115512 40
126.71.29.111 61782 203.146.247.176 80 6 115516 40
131.14.216.241 24153 54.43.88.212 80 6 115519 1496
203.146.250.47 80 113.63.133.249 39564 6 115573 1500
202.231.242.67 49448 131.226.8.6 80 6 115576 249
157.210.227.245 60827 66.36.161.252 80 6 115580 52
203.146.250.47 80 113.63.133.249 39564 6 115584 1500
69.192.0.189 80 202.234.225.187 59368 6 115588 1500
69.192.0.189 80 202.234.225.187 59368 6 115597 1500
175.84.22.21 41639 150.42.176.170 54756 6 115601 60
219.80.177.15 33814 150.79.7.129 80 6 115605 52
202.234.228.45 58507 38.249.43.123 443 6 115609 1500
131.14.216.241 24153 54.43.88.212 80 6 115664 1496
163.39.157.71 52867 199.252.216.15 443 6 115751 1426
163.39.157.71 52867 199.252.216.15 443 6 115755 436
163.39.157.71 52864 199.252.216.15 443 6 115763 436
133.244.153.246 54194 165.143.250.152 443 6 115766 52
163.39.157.71 52864 199.252.216.15 443 6 115809 1426
131.14.216.241 24153 54.43.88.212 80 6 115815 1496
202.234.35.13 52171 185.213.144.150 80 6 115831 52
173.199.56.233 80 202.234.224.241 59801 6 115878 1500
173.199.56.233 80 202.234.224.241 59801 6 115893 1500
113.113.137.159 61396 150.79.7.129 80 6 115904 40
199.252.216.15 443 163.39.157.71 52864 6 115907 64
131.14.216.241 24153 54.43.88.212 80 6 115991 1496
199.252.216.15 443 163.39.157.71 52864 6 116014 52
133.244.153.246 54194 165.143.250.152 443 6 116049 52
96.227.76.37 3242 133.250.150.37 445 6 116075 48
131.14.216.241 24153 54.43.88.212 80 6 116084 1496
96.16.24.215 443 202.234.35.13 62476 6 116222 60
163.39.157.71 52865 199.252.216.15 443 6 116226 1426
131.14.216.241 24153 54.43.88.212 80 6 116229 1496
163.39.157.71 52865 199.252.216.15 443 6 116275 436
131.14.216.241 24153 54.43.88.212 80 6 116279 495
182.158.75.63 80 133.244.234.48 50169 6 116287 1490
61.210.137.135 56413 150.79.7.129 63190 6 116291 40
163.39.157.71 52867 199.252.216.15 443 6 116298 436
163.39.157.71 52867 199.252.216.15 443 6 116329 1426
133.244.153.246 54194 165.143.250.152 443 6 116333 52
131.14.188.92 34705 204.93.161.172 443 6 116349 52
211.73.188.247 443 202.234.35.13 36955 6 116358 1500
211.73.188.247 443 202.234.35.13 36955 6 116365 1500
211.73.188.247 443 202.234.35.13 36955 6 116400 1500
211.73.188.247 443 202.234.35.13 36955 6 116404 1500
126.71.29.111 61782 203.146.247.176 80 6 116415 40
202.234.228.45 58507 38.249.43.123 443 6 116423 1500
203.146.250.47 80 199.48.187.153 58554 6 116427 52
133.244.153.246 56862 31.65.185.141 443 6 116484 40
23.225.11.237 80 202.31.174.9 18622 6 116498 1430
23.225.11.237 80 202.31.174.9 18622 6 116501 1430
173.199.56.233 80 202.234.224.241 59801 6 116518 1500
173.199.56.233 80 202.234.224.241 59801 6 116522 1500
182.104.251.244 64598 150.79.7.129 80 6 116525 40
182.104.251.244 64598 150.79.7.129 80 6 116528 40
133.244.153.246 56862 31.65.185.141 443 6 116539 40
173.199.56.233 80 202.234.224.241 59801 6 116542 1500
173.199.56.233 80 202.234.224.241 59801 6 116566 1500
182.104.251.244 64598 150.79.7.129 80 6 116569 40
202.234.228.45 58507 38.249.43.123 443 6 116573 1500
173.199.56.233 80 202.234.224.241 59801 6 116576 1500
133.244.153.246 54194 165.143.250.152 443 6 116582 52
173.199.56.233 80 202.234.224.241 59801 6 116586 1500
106.127.152.45 56799 150.79.7.129 80 6 116589 40
124.44.132.23 443 202.231.242.67 49557 6 116601 294
211.3.241.186 14457 150.79.7.129 80 6 116605 40
223.139.34.184 57258 203.146.250.47 80 6 116610 40
200.98.164.214 3966 133.250.168.37 445 6 116615 48
199.252.216.15 443 163.39.157.71 52864 6 116623 52
199.252.216.15 443 163.39.157.71 52864 6 116630 52
133.244.153.246 56862 31.65.185.141 443 6 116641 40
61.43.24.146 80 133.227.178.71 55651 6 116645 1500
61.43.24.146 80 133.227.178.71 55651 6 116651 1500
223.25.5.131 64680 150.79.7.129 80 6 116654 40
175.167.20.236 10595 150.79.176.180 54762 6 116658 52
107.133.162.38 443 163.39.5.198 57375 6 116672 569
183.172.222.56 39620 150.79.7.129 80 6 116675 52
202.234.228.45 58507 38.249.43.123 443 6 116678 1500
183.172.222.56 39620 150.79.7.129 80 6 116682 52
183.172.222.56 39620 150.79.7.129 80 6 116688 52
183.172.222.56 39620 150.79.7.129 80 6 116692 52
183.172.222.56 39620 150.79.7.129 80 6 116696 52
183.172.222.56 39620 150.79.7.129 80 6 116699 52
183.172.222.56 39620 150.79.7.129 80 6 116703 52
183.172.222.56 39620 150.79.7.129 80 6 116706 52
183.172.222.56 39620 150.79.7.129 80 6 116709 52
183.172.222.56 39620 150.79.7.129 80 6 116713 52
183.172.222.56 39620 150.79.7.129 80 6 116716 52
133.244.153.246 56862 31.65.185.141 443 6 116727 40
203.146.253.28 6881 60.26.1.79 45729 6 116741 52
27.178.159.198 4419 150.79.7.129 80 6 116748 40
183.172.222.56 39620 150.79.7.129 80 6 116751 52
183.172.222.56 39620 150.79.7.129 80 6 116755 52
183.172.222.56 39620 150.79.7.129 80 6 116759 52
183.172.222.56 39620 150.79.7.129 80 6 116762 52
163.39.157.71 52864 199.252.216.15 443 6 116766 1426
163.39.157.71 52864 199.252.216.15 443 6 116769 436
133.244.153.246 56862 31.65.185.141 443 6 116773 40
65.119.5.150 80 150.79.7.11 52758 6 116777 192
182.104.251.244 64598 150.79.7.129 80 6 116781 40
163.39.157.71 52865 199.252.216.15 443 6 116788 436
126.71.29.111 61782 203.146.247.176 80 6 116796 40
202.234.228.45 58507 38.249.43.123 443 6 116801 1500
133.244.153.246 56862 31.65.185.141 443 6 116804 40
163.39.157.71 52865 199.252.216.15 443 6 116811 1426
163.39.157.71 52867 199.252.216.15 443 6 116818 436
61.43.24.146 80 133.227.178.71 55651 6 116822 1500
61.43.24.146 80 133.227.178.71 55651 6 116831 1500
126.71.29.111 61782 203.146.247.176 80 6 116838 40
133.244.153.246 54194 165.143.250.152 443 6 116841 52
163.39.157.71 52867 199.252.216.15 443 6 116844 1426
133.244.153.246 56862 31.65.185.141 443 6 116851 40
199.252.216.15 443 163.39.157.71 52864 6 116860 64
199.252.216.15 443 163.39.157.71 52864 6 116863 52
61.43.24.136 80 133.227.178.71 55658 6 116871 1500
202.234.228.45 58507 38.249.43.123 443 6 116875 1500
203.146.250.47 80 199.48.187.153 58554 6 116878 990
61.43.24.136 80 133.227.178.71 55658 6 116882 1500
40.17.153.225 443 133.244.144.247 22150 6 116885 434
223.139.34.184 57258 203.146.250.47 80 6 116888 40
223.139.34.184 57258 203.146.250.47 80 6 116892 40
133.244.153.246 56862 31.65.185.141 443 6 116898 40
36.10.160.187 64334 150.79.177.11 62064 6 116915 40
182.158.75.33 80 157.210.199.11 11540 6 116918 64
175.161.50.49 61316 150.79.7.129 80 6 116921 52
133.244.153.246 56862 31.65.185.141 443 6 116925 40
202.234.228.45 58507 38.249.43.123 443 6 116935 1500
133.244.153.246 54194 165.143.250.152 443 6 117021 52
133.244.153.246 56862 31.65.185.141 443 6 117028 40
23.238.55.225 80 202.234.35.13 54750 6 117031 1500
23.238.55.225 80 202.234.35.13 54750 6 117034 1500
133.244.153.246 56862 31.65.185.141 443 6 117038 40
113.150.148.134 9051 150.42.177.43 54756 6 117048 52
133.244.153.246 56862 31.65.185.141 443 6 117111 40
113.5.21.232 5328 150.79.7.129 80 6 117125 40
163.39.157.71 52864 199.252.216.15 443 6 117129 1426
163.39.157.71 52864 199.252.216.15 443 6 117133 436
163.39.157.71 52865 199.252.216.15 443 6 117136 436
133.244.153.246 56862 31.65.185.141 443 6 117193 40
1.106.21.96 1946 150.79.7.129 80 6 117204 40
133.244.153.246 54194 165.143.250.152 443 6 117207 52
60.36.215.88 51464 150.79.7.129 80 6 117212 52
163.39.157.71 52865 199.252.216.15 443 6 117215 1426
173.199.56.233 80 202.234.224.241 59801 6 117218 1500
173.199.56.233 80 202.234.224.241 59801 6 117225 1500
133.244.153.246 56862 31.65.185.141 443 6 117245 40
202.234.227.137 58409 89.57.134.9 80 6 117248 40
69.192.0.189 80 202.234.225.187 59368 6 117251 1500
202.234.227.137 58409 89.57.134.9 80 6 117258 40
69.192.0.189 80 202.234.225.187 59368 6 117261 1500
202.234.227.137 58076 89.57.134.158 80 6 117266 40
131.14.158.108 62531 216.19.170.177 80 6 117269 40
202.234.227.137 58409 89.57.134.9 80 6 117301 40
23.234.243.99 443 203.146.254.83 61708 6 117304 52
133.244.153.246 56862 31.65.185.141 443 6 117311 40
199.252.216.15 443 163.39.157.71 52864 6 117316 64
199.252.216.15 443 163.39.157.71 52864 6 117319 52
23.234.243.101 80 203.146.254.83 61718 6 117324 52
211.3.241.186 14457 150.79.7.129 80 6 117331 40
133.227.127.204 53917 103.238.115.79 80 6 117335 40
61.111.37.246 49473 150.79.7.129 80 6 117357 52
133.244.153.246 56862 31.65.185.141 443 6 117371 40
23.234.243.101 80 203.146.254.83 61712 6 117375 52
23.234.243.99 443 203.146.254.83 61711 6 117381 52
131.14.92.245 60000 54.20.141.183 443 6 117384 40
101.105.131.251 62325 203.146.240.134 80 6 117388 52
118.91.103.40 53186 150.79.7.129 80 6 117467 52
118.91.103.40 53186 150.79.7.129 80 6 117471 52
23.234.243.101 80 203.146.254.83 61719 6 117474 52
23.234.243.99 443 203.146.254.83 61710 6 117478 52
133.244.153.246 56862 31.65.185.141 443 6 117481 40
31.65.185.129 443 202.31.174.9 51205 6 117484 1430
31.65.185.129 443 202.31.174.9 51205 6 117487 1430
31.65.185.129 443 202.31.174.9 51205 6 117495 1430
31.65.185.129 443 202.31.174.9 51205 6 117502 1430
133.244.153.246 56862 31.65.185.141 443 6 117506 40
31.65.185.129 443 202.31.174.9 51205 6 117510 1430
222.165.41.192 55767 150.79.7.129 80 6 117514 40
23.234.243.101 80 203.146.254.83 61704 6 117518 52
23.234.243.101 80 203.146.254.83 61702 6 117521 52
23.234.243.99 443 203.146.254.83 61709 6 117525 52
23.234.243.101 80 203.146.254.83 61703 6 117531 52
202.126.14.111 80 150.79.179.98 59791 6 117569 52
150.33.47.65 8932 150.79.7.129 80 6 117576 40
133.227.127.204 53917 103.238.115.79 80 6 117587 40
133.244.153.246 56862 31.65.185.141 443 6 117590 40
23.234.243.99 443 203.146.254.83 61706 6 117603 52
163.39.157.71 52867 199.252.216.15 443 6 117606 436
23.234.243.99 443 203.146.254.83 61707 6 117617 52
163.39.157.71 52867 199.252.216.15 443 6 117633 1426
133.244.153.246 56862 31.65.185.141 443 6 117644 40
133.244.153.246 54194 165.143.250.152 443 6 117694 52
82.102.13.136 1364 133.250.174.99 445 6 117698 48
163.39.157.71 52864 199.252.216.15 443 6 117702 1426
163.39.157.71 52864 199.252.216.15 443 6 117705 436
163.39.157.71 52865 199.252.216.15 443 6 117718 1426
163.39.157.71 52865 199.252.216.15 443 6 117722 436
183.52.183.141 993 202.231.242.67 36931 6 117803 52
133.244.153.246 56862 31.65.185.141 443 6 117809 40
133.244.153.246 56862 31.65.185.141 443 6 117814 40
203.48.9.248 23617 150.79.7.129 80 6 117829 40
64.120.227.69 443 163.39.158.247 58667 6 117869 1426
133.244.153.246 56862 31.65.185.141 443 6 117878 40
131.14.92.245 60000 54.20.141.183 443 6 117881 40
119.125.248.106 10777 133.250.156.245 50356 6 117900 48
133.244.153.246 54194 165.143.250.152 443 6 117968 52
27.178.159.198 4419 150.79.7.129 80 6 117980 40
133.244.153.246 56862 31.65.185.141 443 6 117983 40
64.120.227.69 443 163.39.158.247 58667 6 117994 1426
202.234.224.241 59801 173.199.56.233 80 6 118007 40
125.51.122.124 32415 150.79.7.129 80 6 118016 40
61.211.145.45 61611 150.79.7.129 80 6 118021 40
133.244.153.246 56862 31.65.185.141 443 6 118024 40
203.48.9.248 23620 150.79.7.129 80 6 118043 40
202.234.224.241 59801 173.199.56.233 80 6 118062 40
202.234.224.241 59801 173.199.56.233 80 6 118065 40
23.11.86.235 80 157.210.156.203 47406 6 118069 40
115.109.126.31 42535 150.79.7.129 80 6 118075 40
133.244.153.246 56862 31.65.185.141 443 6 118078 40
114.125.195.70 17294 150.79.7.129 80 6 118092 52
133.244.153.246 54194 165.143.250.152 443 6 118180 52
133.244.153.246 56862 31.65.185.141 443 6 118193 40
61.43.24.136 80 133.227.178.71 55658 6 118196 1500
163.39.157.71 52864 199.252.216.15 443 6 118201 1426
163.39.157.71 52864 199.252.216.15 443 6 118220 436
110.135.17.73 48692 150.79.7.129 80 6 118420 52
110.135.17.73 48692 150.79.7.129 80 6 118424 52
110.135.17.73 48692 150.79.7.129 80 6 118427 52
150.33.47.65 8932 150.79.7.129 80 6 118439 40
27.178.159.198 4419 150.79.7.129 80 6 118447 40
103.246.81.74 47751 203.146.247.176 80 6 118451 64
133.244.153.246 56862 31.65.185.141 443 6 118491 40
61.243.110.158 10035 150.79.7.129 80 6 118507 40
61.243.110.158 10035 150.79.7.129 80 6 118511 40
157.210.154.200 54684 31.65.191.15 443 6 118514 558
23.225.11.237 80 202.31.174.9 18622 6 118518 1430
133.244.153.246 56862 31.65.185.141 443 6 118521 40
23.225.11.237 80 202.31.174.9 18622 6 118524 1430
23.225.11.237 80 202.31.174.9 18622 6 118566 1430
23.225.11.237 80 202.31.174.9 18622 6 118569 1430
23.225.11.237 80 202.31.174.9 18622 6 118573 1430
23.225.11.237 80 202.31.174.9 18622 6 118576 1430
23.225.11.237 80 202.31.174.9 18622 6 118580 1430
133.244.153.246 56862 31.65.185.141 443 6 118587 40
23.225.11.237 80 202.31.174.9 18622 6 118590 1430
133.244.153.246 54194 165.143.250.152 443 6 118601 52
23.225.11.237 80 202.31.174.9 18622 6 118606 1430
27.178.159.198 4419 150.79.7.129 80 6 118609 40
157.210.145.83 53131 66.36.161.181 443 6 118613 359
23.225.11.237 80 202.31.174.9 18622 6 118616 1430
150.33.47.65 8932 150.79.7.129 80 6 118620 40
150.33.47.65 8932 150.79.7.129 80 6 118623 40
150.33.47.65 8932 150.79.7.129 80 6 118627 40
27.178.159.198 4419 150.79.7.129 80 6 118630 40
27.178.159.198 4419 150.79.7.129 80 6 118635 40
27.178.159.198 4419 150.79.7.129 80 6 118639 40
211.73.188.247 443 202.234.35.13 36955 6 118653 1500
211.73.188.247 443 202.234.35.13 36955 6 118657 303
61.111.37.246 49473 150.79.7.129 80 6 118660 40
118.162.252.133 6095 150.79.7.129 80 6 118665 40
133.244.153.246 56862 31.65.185.141 443 6 118675 40
163.39.157.71 52865 199.252.216.15 443 6 118714 1426
163.39.157.71 52865 199.252.216.15 443 6 118718 436
106.43.9.102 2840 150.79.7.129 80 6 118725 40
221.2.4.133 58658 150.79.7.129 873 6 118729 52
163.39.157.71 52864 199.252.216.15 443 6 118733 436
83.217.137.20 62919 150.79.118.25 2821 6 118739 1458
163.39.157.71 52864 199.252.216.15 443 6 118742 1426
31.65.185.129 443 202.31.174.9 51205 6 118746 1430
input:file2.txt
12
1 12 13 15 17 21 22 23 27 29 31 36
output:Flow.txt
0 416.75 1500.00
1 575.00 40.00
2 273.92 931.00
3 20.29 52.00
4 2799.00 40.00
5 316.82 931.00
6 1113.50 52.00
7 304.91 52.00
8 12.00 1344.00
9 119.43 1370.88
10 358.40 1500.00
11 205.86 1500.00
12 331.50 40.00
13 11.00 1500.00
14 268.86 931.00
15 149.67 1500.00
16 201.71 56.50
17 459.80 1300.50
18 451.00 521.00
19 73.03 40.00
20 192.55 1430.00
21 85.33 40.00
22 726.00 40.00
23 6.21 52.00
24 315.17 40.00
25 662.50 1500.00
26 3.00 1500.00
27 26.50 40.00
28 252.00 40.00
29 1303.00 46.00
30 497.00 40.00
31 252.40 1430.00
32 262.75 40.00
33 125.00 1426.00
34 29.00 40.00
35 3.50 52.00
36 4.00 40.00
output:KMedoids.txt
1635.12
32 4 18 0 13 1 20 25 27 17 6 2
7 12 16 24 28 32
4
18
0 10
8 13 26
1 22 30
9 11 15 20 31 33
25
3 19 21 23 27 34 35 36
17
6 29
2 5 14
output for postgraduate:KMedoidsE.txt
1547.02
32 4 18 0 1 13 25 11 2 29 9 27
7 12 16 24 28 32
4
18
0 10 17
1 22 30
13 26
25
11 15 20 31
2 5 14
6 29
8 9 33
3 19 21 23 27 34 35 36
1 Background
A data preprocessing module and a clustering module should be implemented, the structure is
illustrated below: