Learning Outcome(s):
Demonstrate a wide range of clustering, estimation, prediction, and classification algorithms to solve a specific program or application.
Question One
Using the cosine similarity formula, find the similarity between Document 1 (A) and Document 2 (B), whose values are given below:
Document 1: [1, 1, 1, 1, 1, 0] (let’s refer to this as A)
Document 2: [1, 1, 1, 1, 0, 1] (let’s refer to this as B)
A and B are two vectors in a 6-dimensional vector space.
Given formula: Cosine similarity (CS) = (A . B) / (||A|| ||B||)
3 Marks
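As a way to check a hand calculation, the given formula can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed solution method; the helper name `cosine_similarity` is an assumption introduced here for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return (a . b) / (||a|| ||b||) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))        # A . B
    norm_a = math.sqrt(sum(x * x for x in a))     # ||A||
    norm_b = math.sqrt(sum(y * y for y in b))     # ||B||
    return dot / (norm_a * norm_b)

A = [1, 1, 1, 1, 1, 0]  # Document 1
B = [1, 1, 1, 1, 0, 1]  # Document 2

similarity = cosine_similarity(A, B)
print(round(similarity, 4))
```

Here A . B = 4 and ||A|| = ||B|| = sqrt(5), so the result should be 4 / 5.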
Learning Outcome(s):
Demonstrate a wide range of clustering, estimation, prediction, and classification algorithms to solve a specific program or application.
Question Two
1000 people (350 aged 20 or younger and 650 older than 20) were asked, “Which take-out food do you prefer – junk food or healthy food?”
The results were:
| | Junk food | Healthy food |
|---|---|---|
| Ages <= 20 | 225 | 125 |
| Ages > 20 | 350 | 300 |
Calculate the chi-square statistic.
Note:
The expected value of each cell is calculated using the following equation:
Expected value = (row total × column total) / grand total
2.5 Marks
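The calculation can be sketched in Python as follows. This is a hedged sketch that assumes the standard Pearson chi-square statistic without continuity correction, with each expected count taken as row total times column total divided by the grand total; the function name `chi_square` is an assumption introduced here.

```python
# Observed counts from the survey table.
observed = [
    [225, 125],  # Ages <= 20: junk food, healthy food
    [350, 300],  # Ages > 20: junk food, healthy food
]

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (no correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of age group and preference.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

chi2 = chi_square(observed)
print(round(chi2, 3))
```

For this table the four expected counts are 201.25, 148.75, 373.75, and 276.25, and every observed count differs from its expected count by 23.75.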
Learning Outcome(s):
Demonstrate a wide range of clustering, estimation, prediction, and classification algorithms to solve a specific program or application.
Question Three
What is the Manhattan distance between the points shown below? Fill in the table with the appropriate Manhattan distances. As an example, the distance between points A and C is already entered in its table cell.
| | A | B | C | D |
|---|---|---|---|---|
| A | | | | |
| B | | | | |
| C | 13 | | | |
| D | | | | |
0.5 Marks
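The Manhattan distance between two points is the sum of the absolute differences of their coordinates. The sketch below fills a distance table of the same shape. The coordinates in `points` are hypothetical placeholders, not the points from the question; they were chosen only so that the A to C distance equals the 13 given as the example.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between points p and q."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical coordinates (assumption): substitute the real points here.
points = {"A": (1, 2), "B": (4, 6), "C": (8, 8), "D": (3, 1)}

# Pairwise distance table, indexed as table[row][column].
table = {
    r: {c: manhattan(points[r], points[c]) for c in points}
    for r in points
}

for name, row in table.items():
    print(name, row)
```

The table is symmetric and has zeros on the diagonal, so only the cells below (or above) the diagonal need to be computed by hand.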
Learning Outcome(s):
Employ data mining and data warehousing techniques to solve real-world problems.
Question Four
Apply the discretization filter to the iris dataset. (Note: the iris dataset can be loaded directly into WEKA from the “C:\Program Files\Weka-3-8\data” folder.) After applying the discretization filter, list all the features.
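Discretization replaces each numeric attribute with a small set of intervals. As a rough illustration of the idea only, the sketch below applies equal-width binning, one common discretization scheme. It is not WEKA's implementation; the helper `equal_width_bins` and the sample values are assumptions made for illustration.

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        labels.append(min(idx, n_bins - 1))  # the maximum falls in the last bin
    return labels

# Illustrative numeric values (assumption), not the full iris attribute column.
sepal_lengths = [4.3, 5.1, 5.8, 6.4, 7.9]
bins = equal_width_bins(sepal_lengths, 3)
print(bins)
```

After discretization each numeric feature keeps its name but takes interval labels rather than raw measurements as its values.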