Cosine similarity in data mining

Last modified on December 9th, 2018 at 9:17 pm

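What is cosine similarity?

Cosine similarity is a measure of how similar two files/documents are. Each document is represented as a vector of numbers (for example, word counts), and the similarity is the cosine of the angle between the two vectors. For non-negative vectors such as word counts, the value ranges from 0 (nothing in common) to 1 (the vectors point in the same direction).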

Example of cosine similarity:

What is the similarity between two files, file 1 and file 2?

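Formula: cos(file 1, file 2) = (file 1 · file 2) / (||file 1|| * ||file 2||)

Here "·" is the dot product of the two vectors and ||x|| is the length (Euclidean norm) of vector x, i.e. the square root of the sum of its squared components.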

file 1 = (0, 3, 0, 0, 2, 0, 0, 2, 0, 5)

file 2 = (1, 2, 0, 0, 1, 1, 0, 1, 0, 3)

file 1 · file 2 = 0*1 + 3*2 + 0*0 + 0*0 + 2*1 + 0*1 + 0*0 + 2*1 + 0*0 + 5*3

                = 25

||file 1|| = (0*0 + 3*3 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0 + 2*2 + 0*0 + 5*5)^0.5

           = (42)^0.5 = 6.481

||file 2|| = (1*1 + 2*2 + 0*0 + 0*0 + 1*1 + 1*1 + 0*0 + 1*1 + 0*0 + 3*3)^0.5

           = (17)^0.5 = 4.123

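cos(file 1, file 2) = 25 / (6.481 * 4.123) = 0.94 (approximately)

A value this close to 1 means the two files are very similar.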
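
The calculation above can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original tutorial: the function name cosine_similarity and the variable names file_1 and file_2 are illustrative choices, and the check for a zero-length vector is an added assumption, since the formula is undefined in that case.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    # Dot product: sum of the element-wise products.
    dot = sum(x * y for x, y in zip(a, b))

    # Euclidean norms: square root of the sum of squared components.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    # Assumption: reject zero vectors, because the formula would divide by zero.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    return dot / (norm_a * norm_b)


# The two example vectors from the worked example.
file_1 = [0, 3, 0, 0, 2, 0, 0, 2, 0, 5]
file_2 = [1, 2, 0, 0, 1, 1, 0, 1, 0, 3]

similarity = cosine_similarity(file_1, file_2)
print(round(similarity, 2))  # prints 0.94
```

Comparing a vector with itself, for example cosine_similarity(file_1, file_1), should give a value of 1 (up to floating-point rounding), which is a quick sanity check for any implementation.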


Prof. Fazal Rehman Shamil
Researcher, Publisher of International Journal Of Software Technology & Science ISSN: 2616-5325
Instructor, SEO Expert, Web Programmer and poet.