Part 1
Generate a set S of 500 points (vectors) in 2-dimensional Euclidean space. Use the Euclidean distance to measure the distance between any two points. Write a program to find all the outliers in your set S and print them out. If there are no outliers, your program should say so. Use any programming language of your choice.
https://careerfoundry.com/en/blog/data-analytics/how-to-find-outliers/
Next, remove the outliers from S, and call the resulting set S’.
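The assignment leaves the outlier criterion open. One possible sketch in Python scores each point by its mean Euclidean distance to its k nearest neighbours and flags any score above Tukey's upper fence (Q3 + 1.5 x IQR). The Gaussian generator, the neighbour count `k=5`, the fence factor, and all function names here are assumptions made for illustration, not requirements of the assignment.

```python
import math
import random
import statistics


def generate_points(n=500, seed=0):
    """Draw n points from a 2-D standard normal distribution (an assumed choice)."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def knn_scores(points, k=5):
    """Score each point by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def find_outliers(points, k=5, factor=1.5):
    """Return indices of points whose score exceeds Q3 + factor * IQR."""
    scores = knn_scores(points, k)
    q1, _, q3 = statistics.quantiles(scores, n=4)
    fence = q3 + factor * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > fence]


if __name__ == "__main__":
    S = generate_points()
    outlier_idx = set(find_outliers(S))
    if outlier_idx:
        print("Outliers:")
        for i in sorted(outlier_idx):
            print(S[i])
    else:
        print("No outliers found.")
    # S' is S with the flagged points removed.
    S_prime = [p for i, p in enumerate(S) if i not in outlier_idx]
```

Working with indices rather than point values keeps duplicate points from being removed by accident. Other criteria, such as a z-score on the distance to the centroid, would fit the assignment equally well.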
Part 2
(1) Write a program that implements the hierarchical agglomerative clustering algorithm taught in the class to cluster the points in S’ into k clusters, where k is a user-specified parameter value.
(2) Repeat Part 1 and step (1) above on two additional, different datasets.
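The exact algorithm taught in the class is not reproduced here, so the following is only a minimal sketch of the usual bottom-up procedure: start with every point in its own cluster and repeatedly merge the two closest clusters until k remain. The function names and the default nearest-points cluster distance are assumptions; any function taking two lists of points can be passed as `cluster_distance`.

```python
import math


def nearest_points_distance(a, b):
    """Cluster distance: smallest Euclidean distance between a point in a and one in b."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerative(points, k, cluster_distance=nearest_points_distance):
    """Merge the two closest clusters repeatedly until only k clusters remain."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]
    return clusters


if __name__ == "__main__":
    demo = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for cluster in agglomerative(demo, k=2):
        print(cluster)
```

This naive version recomputes every cluster distance on each pass, which is easy to follow but slow in pure Python for 500 points. Caching pairwise point distances in a matrix is a common speed-up.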
Notes on the hierarchical agglomerative clustering algorithm
In determining the distance between two clusters, consider each of the following definitions in turn:
- the distance between the nearest two points in the two clusters,
- the distance between the farthest two points in the two clusters,
- the average distance between points in the two clusters,
- the distance between the centers of the two clusters.
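The four definitions above can be sketched as small Python functions that each take two clusters (lists of 2-D points) and return a number. The function names are illustrative, and "center" is assumed here to mean the centroid, that is, the coordinate-wise mean of the cluster's points.

```python
import math
from itertools import product


def nearest_distance(a, b):
    """Distance between the nearest two points in the two clusters."""
    return min(math.dist(p, q) for p, q in product(a, b))


def farthest_distance(a, b):
    """Distance between the farthest two points in the two clusters."""
    return max(math.dist(p, q) for p, q in product(a, b))


def average_distance(a, b):
    """Average distance over all cross-cluster pairs of points."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid(cluster):
    """Coordinate-wise mean of a cluster (assumed meaning of 'center')."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def center_distance(a, b):
    """Distance between the centers of the two clusters."""
    return math.dist(centroid(a), centroid(b))


if __name__ == "__main__":
    a = [(0, 0), (0, 2)]
    b = [(3, 0), (3, 2)]
    for fn in (nearest_distance, farthest_distance, average_distance, center_distance):
        print(fn.__name__, fn(a, b))
```

Because all four share the same signature, a clustering routine can accept any one of them as a parameter and be rerun once per definition.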
Use the definition that yields the best performance where the performance is measured by the Silhouette coefficient.
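For reference, the Silhouette coefficient of a point is (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster; the overall score is the mean over all points. The sketch below follows that standard definition and assigns 0 to points in single-point clusters, a common convention. The function name and the list-of-lists input format are assumptions.

```python
import math


def silhouette_coefficient(clusters):
    """Mean silhouette value over all points; clusters is a list of lists of points."""
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")
    scores = []
    for ci, cluster in enumerate(clusters):
        for idx, p in enumerate(cluster):
            if len(cluster) == 1:
                scores.append(0.0)  # convention for single-point clusters
                continue
            # a: mean distance to the other points in the same cluster
            a = sum(math.dist(p, q) for j, q in enumerate(cluster) if j != idx)
            a /= len(cluster) - 1
            # b: smallest mean distance to the points of any other cluster
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for cj, other in enumerate(clusters)
                if cj != ci
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    good = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
    bad = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
    print(silhouette_coefficient(good), silhouette_coefficient(bad))
```

Values close to 1 indicate compact, well-separated clusters, so one way to choose among the cluster-distance definitions is to cluster once with each and keep the result with the highest score.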