OVERLAP COSINE vs BINARY COSINE SIMILARITY FOR CO-AUTHOR COCITATION
This page shows crossplots of overlap cosine similarities against binary cosine similarities from 27 collections of papers covering various subjects. For each collection of papers, the 200 most cited reference authors were used. A paper-by-reference-author matrix, O(p,ra), was constructed. Element o(i,j), the (i,j)th element of O(p,ra), is equal to the number of times paper i cites reference author j.
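As an illustration of the matrix described above, here is a minimal sketch of how O(p,ra) could be built from per-paper citation lists. The function name build_matrix, the input layout (one author name per citation), and the toy data are assumptions made for this example; they are not taken from the original collections.

```python
from collections import Counter

def build_matrix(papers, authors):
    """Build the paper-by-reference-author count matrix O.

    papers  : list of lists; each inner list holds one reference-author
              name per citation made by that paper (hypothetical layout).
    authors : ordered list of reference authors, one per column.
    Returns a list of rows where O[i][j] is the number of times
    paper i cites reference author j.
    """
    column = {name: j for j, name in enumerate(authors)}
    O = [[0] * len(authors) for _ in papers]
    for i, cited in enumerate(papers):
        for name, count in Counter(cited).items():
            if name in column:  # ignore authors outside the chosen set
                O[i][column[name]] = count
    return O

# Toy data: three papers and three reference authors (made up).
papers = [
    ["Smith", "Smith", "Jones"],
    ["Jones", "Lee"],
    ["Smith", "Lee", "Lee", "Lee"],
]
authors = ["Smith", "Jones", "Lee"]
O = build_matrix(papers, authors)
```

In this toy case, row 0 of O records that the first paper cites Smith twice and Jones once.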
Assume the general formula s(i,j)= A(i,j)/sqrt[B(i)*B(j)] defines the similarity between author i and author j.
For binary cosine similarity, A(i,j) is the number of papers citing both author i and author j. B(i) and B(j) are the numbers of papers citing author i and author j, respectively.
For overlap cosine similarity, A(i,j)=sum[min(o(k,i),o(k,j))], with k summed over all papers. Thus A(i,j) is the sum over all papers of the overlap in the number of citations to author i and author j within each paper. B(i) and B(j) are the sums over all papers of the number of times in each paper that author i and author j were cited, respectively.
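The two definitions above can be sketched in a few lines of Python. This is only an illustrative reading of the formulas, not the code used to produce the crossplots. The function names, the toy matrix, and the choice to return 0.0 when an author has no citations are assumptions.

```python
from math import sqrt

def binary_cosine(O, i, j):
    """s(i,j) with A = papers citing both authors, B = papers citing each."""
    a = sum(1 for row in O if row[i] > 0 and row[j] > 0)
    b_i = sum(1 for row in O if row[i] > 0)
    b_j = sum(1 for row in O if row[j] > 0)
    # Assumption: similarity is 0.0 if either author is never cited.
    return a / sqrt(b_i * b_j) if b_i and b_j else 0.0

def overlap_cosine(O, i, j):
    """s(i,j) with A = sum of per-paper min counts, B = total citation counts."""
    a = sum(min(row[i], row[j]) for row in O)
    b_i = sum(row[i] for row in O)
    b_j = sum(row[j] for row in O)
    # Assumption: similarity is 0.0 if either author is never cited.
    return a / sqrt(b_i * b_j) if b_i and b_j else 0.0

# Toy matrix: rows are papers, columns are reference authors (made up).
O = [
    [2, 1, 0],
    [0, 1, 1],
    [1, 0, 3],
]

s_binary = binary_cosine(O, 0, 1)    # 1 / sqrt(2 * 2) = 0.5
s_overlap = overlap_cosine(O, 0, 1)  # 1 / sqrt(3 * 2)
```

For authors 0 and 1 in the toy matrix, only the first paper cites both, so A = 1 under both definitions. The measures differ in the denominator: binary cosine counts citing papers (2 and 2), while overlap cosine counts citations (3 and 2).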
The overlap cosine and binary cosine s values are crossplotted below. Overlap cosine s is on the x axis and binary cosine s is on the y axis.
These papers were collected from ISI's Web of Science product in the period from January 2002 to December 2003 by using queries and seed references.
