Cosine similarity in data mining

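What is cosine similarity?

Cosine similarity is a measure of how similar two files/documents are. Each file is represented as a vector of numbers (for example, word counts), and the similarity is the cosine of the angle between the two vectors.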

Example of cosine similarity:

What is the similarity between two files, file 1 and file 2?

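Formula: cos(file 1, file 2) = (file 1 · file 2) / (||file 1|| * ||file 2||)

Here file 1 · file 2 is the dot product of the two vectors, and ||file 1|| and ||file 2|| are their lengths (Euclidean norms).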

file 1 = (0, 3, 0, 0, 2, 0, 0, 2, 0, 5)

file 2 = (1, 2, 0, 0, 1, 1, 0, 1, 0, 3)

file 1 · file 2  =  0*1 + 3*2 + 0*0 + 0*0 + 2*1 + 0*1 + 0*0 + 2*1 + 0*0 + 5*3  

                     =  25

||file 1|| = (0*0 + 3*3 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0 + 2*2 + 0*0 + 5*5)^0.5

           = (42)^0.5 = 6.481

||file 2|| = (1*1 + 2*2 + 0*0 + 0*0 + 1*1 + 1*1 + 0*0 + 1*1 + 0*0 + 3*3)^0.5

           = (17)^0.5 = 4.123

cos(file 1, file 2) = 25 / (6.481 * 4.123) = 0.94
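The hand calculation above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the function name cosine_similarity and the variable names file_1 and file_2 are chosen here for illustration and do not come from any particular tool.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product: sum of the element-wise products.
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean norm (length) of each vector.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# The two example vectors from the worked calculation.
file_1 = [0, 3, 0, 0, 2, 0, 0, 2, 0, 5]
file_2 = [1, 2, 0, 0, 1, 1, 0, 1, 0, 3]

similarity = cosine_similarity(file_1, file_2)
print(round(similarity, 2))  # 0.94
```

A value close to 1 means the two vectors point in nearly the same direction, so the two files are very similar. For non-negative vectors such as word counts, the result always lies between 0 and 1.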


Fazal Rehman Shamil