Dice's Coefficient

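Formula:
  • S = ( 2 * Nt ) / ( Nx + Ny )

where Nt is the number of bigrams (pairs of adjacent letters) shared by the two words, and Nx and Ny are the number of bigrams in the first and second word.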
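
A worked instance of the formula may help. The word pair "night" and "nacht" is chosen here purely for illustration and is not taken from the original post. Each word has four character bigrams, {ni, ig, gh, ht} and {na, ac, ch, ht}, and the only one they share is "ht":

```latex
\[
S = \frac{2 N_t}{N_x + N_y} = \frac{2 \cdot 1}{4 + 4} = 0.25
\]
```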

AWK code:

# Obtain the number of bigrams in the intersection between Word1 and Word2.
# A shared bigram counts as many times as it occurs in both words
# (the smaller of its two counts).
function Intersection(BigramsWord1, BigramsWord2,    Bigram, Nt) {
    Nt = 0;
    for (Bigram in BigramsWord1) {
        if (Bigram in BigramsWord2) {
            if (BigramsWord1[Bigram] <= BigramsWord2[Bigram])
                Nt = Nt + BigramsWord1[Bigram];
            else
                Nt = Nt + BigramsWord2[Bigram];
        }
    }
    return Nt;
}


# Obtain the bigrams of Word and return how many there are.
# BigramsWord maps each bigram to its number of occurrences in Word.
function ObtainBigrams(Word, LettersWord, BigramsWord,    i, WordLength, NumberBigrams) {
    # Clear counts left over from a previous call; otherwise they
    # would accumulate from one input line to the next.
    split("", BigramsWord);
    NumberBigrams = 0;
    WordLength = length(Word);
    for (i = 1; i < WordLength; i++) {
        # Each pair of adjacent letters forms one bigram.
        BigramsWord[LettersWord[i] LettersWord[i + 1]]++;
        NumberBigrams++;
    }
    return NumberBigrams;
}

# Obtain Dice's coefficient between two words
function DICE(Word1, Word2) {
    # Splitting on "" yields one letter per array element. This is a
    # common extension (e.g. in gawk); POSIX leaves it unspecified.
    split(Word1, LettersWord1, "");
    split(Word2, LettersWord2, "");
    Nx = ObtainBigrams(Word1, LettersWord1, BigramsWord1);
    Ny = ObtainBigrams(Word2, LettersWord2, BigramsWord2);
    Nt = Intersection(BigramsWord1, BigramsWord2);
    if ((Nx + Ny) > 0)
        return (2 * Nt) / (Nx + Ny);
    else
        return 0;
}

# Each input line holds two comma-separated words.
BEGIN {
    FS = ",";
}

{
    Word1 = $1;
    Word2 = $2;
    print DICE(Word1, Word2);
}

Example:
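
A minimal, self-contained sketch of how the computation can be exercised from a shell, assuming comma-separated word pairs on standard input as in the script above. The names `dice`, `CountBigrams` and `Shared`, and the sample word pairs, are illustrative assumptions rather than part of the original post. This variant takes bigrams with `substr` instead of splitting the word into letters, which avoids relying on `split` with an empty separator.

```shell
#!/bin/sh
# Sketch only: Dice coefficient over character bigrams, one "word1,word2" pair per line.
# dice, CountBigrams and Shared are hypothetical names chosen for this example.
dice() {
    awk -F',' '
    # Fill Bigrams with the bigram counts of Word; return the number of bigrams.
    function CountBigrams(Word, Bigrams,    i, n) {
        split("", Bigrams)                 # clear counts from the previous line
        n = length(Word) - 1
        for (i = 1; i <= n; i++)
            Bigrams[substr(Word, i, 2)]++
        return (n > 0) ? n : 0
    }
    # Size of the multiset intersection of two bigram tables.
    function Shared(A, B,    k, t) {
        t = 0
        for (k in A)
            if (k in B)
                t += (A[k] <= B[k]) ? A[k] : B[k]
        return t
    }
    {
        Nx = CountBigrams($1, X)
        Ny = CountBigrams($2, Y)
        S = (Nx + Ny > 0) ? 2 * Shared(X, Y) / (Nx + Ny) : 0
        print S
    }'
}

# prints 0.25, 1 and 0, one value per line
printf '%s\n' 'night,nacht' 'abc,abc' 'a,b' | dice
```

The last pair shows the guard against division by zero: single-letter words have no bigrams, so the coefficient is reported as 0.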
