Measuring Semantic Similarity between Words Using Web Search Engines
The University of Tokyo
Hongo, 7-3-1, Tokyo, Japan
National Institute of Advanced Industrial Science & Technology
Sotokanda, 1-18-13, Tokyo, Japan
The University of Tokyo
Hongo, 7-3-1, Tokyo, Japan
Copyright is held by the World Wide Web Conference Committee (IW3C2).
Distribution of these papers is limited to classroom use, and personal use by others.
WWW 2007, May 8-12, 2007, Banff, Alberta, Canada.
Abstract
Semantic similarity measures play important roles in information retrieval and natural language processing. Previous work on Semantic Web related applications such as community mining, relation extraction, and automatic metadata extraction has used various semantic similarity measures. Despite the usefulness of semantic similarity measures in these applications, robustly measuring semantic similarity between two words (or entities) remains a challenging task. We propose a robust semantic similarity measure that uses the information available on the Web to measure similarity between words or entities. The proposed method exploits page counts and text snippets returned by a Web search engine. We define various similarity scores for two given words P and Q, using the page counts for the queries P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using automatically extracted lexico-syntactic patterns from text snippets. These different similarity scores are integrated using support vector machines to leverage a robust semantic similarity measure. Experimental results on the Miller-Charles benchmark dataset show that the proposed measure outperforms all the existing Web-based semantic similarity measures by a wide margin, achieving a correlation coefficient of 0.834. Moreover, the proposed semantic similarity measure significantly improves the F-measure in a community mining task and in an entity disambiguation task, thereby verifying its capability to capture semantic similarity using Web content.
Categories & Subject Descriptors
H.3.3 [Information Systems]: Information Search and Retrieval
General Terms: Algorithms
Keywords: semantic similarity, Web mining
1 Introduction
The study of semantic similarity between words has long been an integral part of information retrieval and natural language processing. Semantic similarity between entities changes over time and across domains. For example, apple is frequently associated with computers on the Web. However, this sense of apple is not listed in most general-purpose thesauri or dictionaries. A user who searches for apple on the Web may be interested in this sense of apple and not in apple as a fruit. New words are constantly being created, and new senses are assigned to existing words. Manually maintaining thesauri to capture these new words and senses is costly, if not impossible.
We propose an automatic method to measure semantic similarity between words or entities using Web search engines. Because of the vast number of documents and the high growth rate of the Web, it is difficult to analyze each document separately and directly. Web search engines provide an efficient interface to this vast information. Page counts and snippets are two useful information sources provided by most Web search engines. The page count of a query is the number of pages that contain the query words. The page count for the query P AND Q can be considered as a global measure of co-occurrence of words P and Q. For example, the page count of the query ``apple'' AND ``computer'' in Google is far greater than that of ``banana'' AND ``computer''. The considerably larger page count for ``apple'' AND ``computer'' indicates that apple is more semantically similar to computer than banana is.
Despite its simplicity, using page counts alone as a measure of co-occurrence of two words presents several drawbacks. First, page count analysis ignores the position of a word in a page. Therefore, even though two words appear in a page, they might not be related. Secondly, the page count of a polysemous word (a word with multiple senses) might combine all of its senses. For example, the page count for apple contains pages about apple as a fruit and apple as a company. Moreover, given the scale and noise of the Web, some words might occur arbitrarily, i.e., by random chance, on some pages. For these reasons, page counts alone are unreliable when measuring semantic similarity.
Snippets, brief windows of text extracted by a search engine around the query term in a document, provide useful information regarding the local context of the query term. Semantic similarity measures defined over snippets have been used in query expansion , personal name disambiguation  and community mining . Processing snippets is also efficient because it obviates the need to download web pages, which can be time consuming depending on the size of the pages. However, a widely acknowledged drawback of using snippets is that, because of the huge scale of the Web and the large number of documents in the result set, only the snippets for the top-ranking results of a query can be processed efficiently. The ranking of search results, and hence of snippets, is determined by a complex combination of various factors unique to the underlying search engine. Therefore, there is no guarantee that all the information we need to measure semantic similarity between a given pair of words is contained in the top-ranking snippets.
This paper proposes a method that considers both page counts and lexico-syntactic patterns extracted from snippets, thereby overcoming the problems described above.
For example, let us consider the following snippet from Google for the query Jaguar AND cat.
Here, the phrase ``is the largest'' indicates a hypernymic relationship between Jaguar and cat. Phrases such as ``also known as'', ``is a'', ``part of'', and ``is an example of'' all indicate various semantic relations. Such indicative phrases have been applied to numerous tasks with good results, such as hyponym extraction  and fact extraction . From the previous example, we form the pattern X is the largest Y, where we replace the two words Jaguar and cat by two wildcards X and Y.
Our contributions in this paper are twofold:
- We propose an automatically extracted lexico-syntactic patterns-based approach to compute semantic similarity using text snippets obtained from a Web search engine.
- We integrate different web-based similarity measures using WordNet synsets and support vector machines to create a robust semantic similarity measure. The integrated measure outperforms all existing Web-based semantic similarity measures in a benchmark dataset. To the best of our knowledge, this is the first attempt to combine both WordNet synsets and Web content to leverage a robust semantic similarity measure.
The remainder of the paper is organized as follows. In section 2 we discuss previous work related to semantic similarity measures. We then describe the proposed method in section 3. Section 4 compares the proposed method against previous Web-based semantic similarity measures and several baselines on a benchmark dataset. In order to evaluate the ability of the proposed method to capture semantic similarity between real-world entities, we apply it in a community mining task. Finally, we show that the proposed method is useful for disambiguating senses of ambiguous named entities, and conclude the paper.
2 Related Work
Semantic similarity measures are important in many Web-related tasks. In query expansion [5,25,40] a user query is modified using synonymous words to improve the relevancy of the search. One method to find appropriate words to include in a query is to compare previous user queries using semantic similarity measures. If there exists a previous query that is semantically related to the current query, it can be suggested either to the user or used internally by the search engine to modify the original query.
Semantic similarity measures have been used in Semantic Web related applications such as automatic annotation of Web pages , community mining [23,19], and keyword extraction for inter-entity relation representation .
Semantic similarity measures are necessary for various applications in natural language processing such as word-sense disambiguation , language modeling , synonym extraction , and automatic thesaurus extraction . Manually compiled taxonomies such as WordNet and large text corpora have been used in previous work on semantic similarity [16,31,13,17]. Regarding the Web as a live corpus has recently become an active research topic. Simple, unsupervised models demonstrably perform better when n-gram counts are obtained from the Web rather than from a large corpus [14,15]. Resnik and Smith  extracted bilingual sentences from the Web to create a parallel corpus for machine translation. Turney  defined a point-wise mutual information (PMI-IR) measure using the number of hits returned by a Web search engine to recognize synonyms. Matsuo et al.  used a similar approach to measure the similarity between words and applied their method in a graph-based word clustering algorithm.
Given a taxonomy of concepts, a straightforward method to calculate similarity between two words (concepts) is to find the length of the shortest path connecting the two words in the taxonomy . If a word is polysemous then multiple paths might exist between the two words. In such cases, only the shortest path between any two senses of the words is considered for calculating similarity. A problem that is frequently acknowledged with this approach is that it relies on the notion that all links in the taxonomy represent a uniform distance.
Resnik  proposed a similarity measure using information content. He defined the similarity between two concepts C1 and C2 in the taxonomy as the maximum of the information content of all concepts that subsume both C1 and C2. The similarity between two words is then defined as the maximum of the similarity between any concepts that the words belong to. He used WordNet as the taxonomy; information content is calculated using the Brown corpus.
Li et al.  combined structural semantic information from a lexical taxonomy and information content from a corpus in a nonlinear model. They proposed a similarity measure that uses shortest path length, depth, and local density in a taxonomy. Their experiments reported a high Pearson correlation coefficient on the Miller and Charles  benchmark dataset. They did not evaluate their method in terms of similarities among named entities. Lin  defined the similarity between two concepts in terms of the information that is common to both concepts and the information contained in each individual concept.
Recently, some work has been carried out on measuring semantic similarity using Web content. Matsuo et al.,  proposed the use of Web hits for extracting communities on the Web. They measured the association between two personal names using the overlap (Simpson) coefficient, which is calculated based on the number of Web hits for each individual name and their conjunction (i.e., AND query of the two names).
Sahami et al.,  measured semantic similarity between two queries using snippets returned for those queries by a search engine. For each query, they collect snippets from a search engine and represent each snippet as a TF-IDF-weighted term vector. Each vector is normalized and the centroid of the set of vectors is computed. Semantic similarity between two queries is then defined as the inner product between the corresponding centroid vectors. They did not compare their similarity measure with taxonomy-based similarity measures.
Chen et al.  proposed a double-checking model using text snippets returned by a Web search engine to compute semantic similarity between words. For two words P and Q, they collect snippets for each word from a Web search engine. Then they count the occurrences of word P in the snippets for word Q and the occurrences of word Q in the snippets for word P. These values are combined nonlinearly to compute the similarity between P and Q. This method depends heavily on the search engine's ranking algorithm. Although two words P and Q might be very similar, there is no reason to believe that one can find Q in the snippets for P, or vice versa. This observation is confirmed by the experimental results in their paper, which report zero similarity scores for many pairs of words in the Miller and Charles  dataset.
1 Outline
We propose a method which integrates both page counts and snippets to measure semantic similarity between a given pair of words. In section 3.2, we define four similarity scores using page counts. We then describe an automatic lexico-syntactic pattern extraction algorithm in section 3.3. We rank the patterns extracted by our algorithm according to their ability to express semantic similarity. We use two-class support vector machines (SVMs) to find the optimal combination of page-counts-based similarity scores and top-ranking patterns. The SVM is trained to classify synonymous word-pairs and non-synonymous word-pairs. We select synonymous word-pairs (positive training examples) from WordNet synsets. Non-synonymous word-pairs (negative training examples) are automatically created using a random shuffling technique. We convert the output of the SVM into a posterior probability. We define the semantic similarity between two words as the posterior probability that they belong to the synonymous-words (positive) class.
2 Page-count-based Similarity Scores
Page counts for the query P AND Q can be considered as an approximation of the co-occurrence of two words (or multi-word phrases) P and Q on the Web. However, page counts for the query P AND Q alone do not accurately express semantic similarity. For example, Google returns a larger page count for ``car'' AND ``apple'' than for ``car'' AND ``automobile''. Although automobile is more semantically similar to car than apple is, page counts for the query ``car'' AND ``apple'' are more than four times greater than those for the query ``car'' AND ``automobile''. One must consider the page counts not just for the query P AND Q, but also for the individual words P and Q to assess semantic similarity between P and Q.
We modify four popular co-occurrence measures: Jaccard, Overlap (Simpson), Dice, and PMI (point-wise mutual information), to compute semantic similarity using page counts.
For the remainder of this paper we use the notation H(P) to denote the page count for the query P in a search engine. The WebJaccard coefficient between words (or multi-word phrases) P and Q is defined as,

WebJaccard(P, Q) = 0 if H(P ∧ Q) ≤ c, and H(P ∧ Q) / (H(P) + H(Q) − H(P ∧ Q)) otherwise.

Therein, P ∧ Q denotes the conjunction query P AND Q. Given the scale and noise in Web data, it is possible that two words appear on some pages purely accidentally. In order to reduce the adverse effects attributable to such random co-occurrences, we set the WebJaccard coefficient to zero if the page count for the query P ∧ Q is less than a threshold c.
Similarly, we define WebOverlap as,

WebOverlap(P, Q) = 0 if H(P ∧ Q) ≤ c, and H(P ∧ Q) / min(H(P), H(Q)) otherwise.

WebOverlap is a natural modification of the Overlap (Simpson) coefficient. We define the WebDice coefficient as a variant of the Dice coefficient,

WebDice(P, Q) = 0 if H(P ∧ Q) ≤ c, and 2 H(P ∧ Q) / (H(P) + H(Q)) otherwise.

We define WebPMI as a variant form of PMI using page counts,

WebPMI(P, Q) = 0 if H(P ∧ Q) ≤ c, and log₂( (H(P ∧ Q)/N) / ((H(P)/N)(H(Q)/N)) ) otherwise. (4)
Here, N is the number of documents indexed by the search engine. Probabilities in Eq. 4 are estimated according to the maximum likelihood principle. To calculate PMI accurately using Eq. 4, we must know N, the number of documents indexed by the search engine. Although estimating the number of documents indexed by a search engine  is an interesting task in itself, it is beyond the scope of this work. In the present work, we set N = 10^10 according to the number of indexed pages reported by Google.
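As a concrete sketch of the four definitions above, the scores can be computed from raw page counts as follows. The page counts, the threshold c, and the index size N used here are illustrative assumptions, not values from our experiments.

```python
import math

def web_scores(h_p, h_q, h_pq, n_docs=10**10, c=5):
    """Page-count-based similarity scores for a word pair (P, Q).

    h_p, h_q : page counts for the individual queries P and Q
    h_pq     : page count for the conjunctive query P AND Q
    n_docs   : assumed number of documents indexed by the engine (N)
    c        : threshold that guards against random co-occurrences
    """
    if h_pq <= c:  # too few co-occurrences: treat as noise
        return {"WebJaccard": 0.0, "WebOverlap": 0.0,
                "WebDice": 0.0, "WebPMI": 0.0}
    return {
        "WebJaccard": h_pq / (h_p + h_q - h_pq),
        "WebOverlap": h_pq / min(h_p, h_q),
        "WebDice": 2 * h_pq / (h_p + h_q),
        "WebPMI": math.log2((h_pq / n_docs)
                            / ((h_p / n_docs) * (h_q / n_docs))),
    }
```

Note that all four scores collapse to zero below the threshold c, mirroring the guard in each definition.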
3 Extracting Lexico-Syntactic Patterns from Snippets
Snippets are returned by search engines alongside the search results. They provide valuable information regarding the local context of a word. We extract lexico-syntactic patterns that indicate various aspects of semantic similarity from snippets. For example, consider the following text snippet returned by Google for the query ``Toyota'' AND ``Nissan'': ``Toyota and Nissan are two major Japanese car manufacturers.''
Here, the relationship between Toyota and Nissan is that they are both car manufacturers. Identifying the exact set of words that convey the semantic relationship between two entities is a difficult problem which requires deeper semantic analysis. However, such an analysis is not feasible considering the numerous ill-formed sentences we need to process on the Web. In this paper, we propose a shallow pattern extraction method to capture the semantic relationship between words in text snippets.
Our pattern extraction algorithm is illustrated in Figure 3. Given a set S of synonymous word-pairs, the GetSnippets function returns a list of text snippets for the query ``A'' AND ``B'' for each word-pair (A, B) in S. In each snippet, we replace the two words in the query by two wildcards, X and Y. For each snippet returned by GetSnippets, the function GetNgrams extracts word n-grams. We select n-grams which contain exactly one X and one Y. For example, the snippet in Figure 2 yields the patterns X and Y, X and Y are, and X and Y are two. Finally, the function CountFreq counts the frequency of each extracted pattern. The procedure described above yields a set of patterns with their frequencies in text snippets obtained from a search engine. It considers the words that fall between X and Y as well as words that precede X and succeed Y.
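A minimal sketch of this extraction step is shown below. Tokenization here is naive whitespace splitting with light punctuation stripping, and the snippet is the Toyota/Nissan example; the paper's GetNgrams operates on real search-engine snippets.

```python
from collections import Counter

def extract_patterns(snippet, word_a, word_b, max_n=5):
    """Replace the two query words with wildcards X and Y, then count
    n-grams (n = 2..max_n) that contain exactly one X and one Y."""
    tokens = []
    for t in snippet.lower().split():
        bare = t.strip(".,;:()")
        if bare == word_a.lower():
            tokens.append("X")
        elif bare == word_b.lower():
            tokens.append("Y")
        else:
            tokens.append(bare)
    patterns = Counter()
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram.count("X") == 1 and gram.count("Y") == 1:
                patterns[" ".join(gram)] += 1
    return patterns

pats = extract_patterns(
    "Toyota and Nissan are two major Japanese car manufacturers",
    "Toyota", "Nissan")
```

On this snippet the function recovers exactly the patterns named in the text: X and Y, X and Y are, and X and Y are two.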
To leverage the pattern extraction process, we select pairs of synonymous nouns from WordNet synsets. For polysemous nouns we selected the synonyms for the dominant sense. The pattern extraction algorithm described in Figure 3 yields a very large number of unique patterns, most of which occur only a few times. It is impossible to train a classifier with such numerous sparse patterns. We must measure the confidence of each pattern as an indicator of synonymy. For that purpose, we employ the following procedure.
First, we run the pattern extraction algorithm described in Figure 3 with a set of non-synonymous word-pairs and count the frequency of the extracted patterns. We then use a test of statistical significance to evaluate the probable applicability of a pattern as an indicator of synonymy. The fundamental idea of this analysis is that, if a pattern appears significantly more often in snippets for synonymous words than in snippets for non-synonymous words, then it is a reliable indicator of synonymy.
To create a set of non-synonymous word-pairs, we select two nouns from WordNet arbitrarily. If the two selected nouns do not appear together in any WordNet synset, we select them as a non-synonymous word-pair. We repeat this procedure until we obtain the required number of non-synonymous word-pairs.
For each extracted pattern v, we create a contingency table, as shown in Table 1, using its frequency pv in snippets for synonymous word pairs and its frequency nv in snippets for non-synonymous word pairs. In Table 1, P denotes the total frequency of all patterns in snippets for synonymous word pairs and N is the same in snippets for non-synonymous word pairs.

| |v|other than v|All|
|Freq. in snippets for synonymous word pairs|pv|P − pv|P|
|Freq. in snippets for non-synonymous word pairs|nv|N − nv|N|

Using the information in Table 1, we calculate the χ² value for each pattern v as,

χ²(v) = ( (P + N) (pv(N − nv) − nv(P − pv))² ) / ( P N (pv + nv)(P + N − pv − nv) ).
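Under the 2×2 contingency table of Table 1, the statistic has a closed form; a sketch with the same variable names follows (the sample values in the test are illustrative only):

```python
def chi_square(pv, nv, P, N):
    """Chi-square statistic for pattern v from the Table 1 contingency table.

    pv : frequency of v in snippets for synonymous word pairs
    nv : frequency of v in snippets for non-synonymous word pairs
    P  : total frequency of all patterns in synonymous snippets
    N  : total frequency of all patterns in non-synonymous snippets
    """
    numer = (P + N) * (pv * (N - nv) - nv * (P - pv)) ** 2
    denom = P * N * (pv + nv) * (P + N - pv - nv)
    return numer / denom if denom else 0.0
```

A pattern that is equally frequent in both snippet sets scores zero, while a pattern skewed toward synonymous snippets scores high, which is exactly the ranking criterion used here.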
Before we proceed to the integration of patterns and page-counts-based similarity scores, it is necessary to introduce some constraints to the development of semantic similarity measures. Evidence from psychological experiments suggests that semantic similarity can be context-dependent and even asymmetric [39,22]. Human subjects have reportedly assigned different similarity ratings to word-pairs when the two words were presented in reverse order. However, experimental results investigating the effects of asymmetry report that the average difference in ratings for a word pair is small . In this work, we assume semantic similarity to be symmetric. This is in line with previous work on semantic similarity described in section 2. Under this assumption, we can interchange the query word markers X and Y in the extracted patterns.
4 Integrating Patterns and Page Counts
In section 3.2 we defined four similarity scores using page counts. Section 3.3 described a lexico-syntactic pattern extraction algorithm and ranked the patterns according to their ability to express synonymy. In this section we describe how we construct a robust semantic similarity measure by integrating all the similarity scores and patterns described in the previous sections.
For each pair of words (A, B), we create a feature vector F as shown in Figure 4. First, we query Google for ``A'' AND ``B'' and collect the snippets. Then we replace the query words A and B with two wildcards X and Y, respectively, in each snippet. The function GetNgrams extracts n-grams from the snippets. We select n-grams having exactly one X and one Y, as we did in the pattern extraction algorithm in Figure 3. Let GoodPats be the set of patterns selected based on their χ² values in section 3.3. The function SelectPatterns then selects the n-grams which appear in GoodPats. In Normalize, we normalize the count of each pattern by dividing it by the total count of all observed patterns. This function returns a vector of patterns in which each element is the normalized frequency of the corresponding pattern in the snippets for the query ``A'' AND ``B''. We append the similarity scores calculated using page counts in section 3.2 to create the final feature vector F for the word-pair (A, B). This procedure yields a feature vector comprising the four page-counts-based similarity scores and the normalized frequencies of the selected lexico-syntactic patterns. We form such feature vectors for all synonymous word-pairs (positive training examples) as well as for non-synonymous word-pairs (negative training examples). We then train a two-class support vector machine with the labelled feature vectors.
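The vector-building step can be sketched as follows. The pattern names and the ordering of the appended page-count scores are illustrative assumptions.

```python
def make_feature_vector(pattern_counts, selected_patterns, page_count_scores):
    """Build the feature vector for one word pair.

    pattern_counts    : dict mapping extracted patterns to raw frequencies
    selected_patterns : ordered list of the top-ranked patterns
    page_count_scores : the four page-count-based similarity scores
    """
    counts = [pattern_counts.get(p, 0) for p in selected_patterns]
    total = sum(counts)
    normalized = [c / total if total else 0.0 for c in counts]
    # Pattern frequencies first, then the page-count-based scores.
    return normalized + list(page_count_scores)
```

A pattern absent from the snippets simply contributes a zero feature, so every word pair maps to a vector of the same fixed dimensionality.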
Once we have trained an SVM using synonymous and non-synonymous word pairs, we can use it to compute the semantic similarity between two given words. Following the same method we used to generate feature vectors for training, we create a feature vector F for the given pair of words (A, B) between which we need to measure semantic similarity. We define the semantic similarity SemSim(A, B) between A and B as the posterior probability that the feature vector F belongs to the synonymous-words (positive) class.
Because an SVM is a large-margin classifier, its output is the distance from the decision hyperplane. However, this is not a calibrated posterior probability. We use a sigmoid function to convert this uncalibrated distance into a calibrated posterior probability (see  for a detailed discussion on this topic).
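A sketch of this calibration step is shown below. The sigmoid parameters A and B are illustrative; in practice they are fitted to held-out data, as in Platt scaling.

```python
import math

def svm_posterior(decision_value, A=-2.0, B=0.0):
    """Convert an uncalibrated SVM decision value (signed distance from
    the hyperplane) into a posterior probability of the positive class."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))
```

With a negative A, larger positive distances from the hyperplane map monotonically to probabilities closer to 1, and a point on the hyperplane maps to 0.5.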
We conduct two sets of experiments to evaluate the proposed semantic similarity measure. First we compare the similarity scores produced by the proposed measure against Miller-Charles benchmark dataset. We analyze the behavior of the proposed measure with the number of patterns used as features, the number of snippets used to extract the patterns, and the size of the training dataset. Secondly, we apply the proposed measure in two real-world applications: community mining and entity disambiguation.
1 The Benchmark Dataset
We evaluate the proposed method against the Miller-Charles  dataset, a dataset of 30 word pairs rated by a group of human subjects. The word pairs are rated on a scale from 0 (no similarity) to 4 (perfect synonymy). The Miller-Charles dataset is a subset of Rubenstein-Goodenough's  original dataset of 65 word pairs. Although the Miller-Charles experiment was carried out many years later than Rubenstein-Goodenough's, the two sets of ratings are highly correlated. Therefore, the Miller-Charles ratings can be considered a reliable benchmark for evaluating semantic similarity measures.
2 Pattern Selection
Examples of top-ranking patterns (Table 2) include: ``X, Y :'', ``X or Y'', ``X Y for'', ``X . the Y'', ``with X ( Y'', ``X and Y are'', and ``X of Y''.
We trained a linear kernel SVM with the top pattern features (ranked according to their χ² values) and calculated the correlation coefficient against the Miller-Charles benchmark dataset. Results of the experiment are shown in Figure 5. In Figure 5 a steep improvement of correlation with the number of top-ranking patterns is apparent; it reaches a maximum at 200 features. With more than 200 patterns, correlation drops below this maximum. Considering that the patterns are ranked according to their ability to express semantic similarity and that the majority of patterns are sparse, we selected only the top-ranking 200 patterns for the remaining experiments.
Features with the highest linear kernel weights are shown in Table 2 alongside their χ² values. The weight of a feature in the linear kernel can be considered a rough estimate of the influence it imparts on the final SVM output. WebDice has the highest kernel weight, followed by a series of pattern-based features. WebOverlap, WebJaccard, and WebPMI are not shown in Table 2 because of space limitations. It is noteworthy that the pattern features in Table 2 agree with intuition. Lexical patterns (e.g., X or Y, X and Y are, X of Y) as well as syntactic patterns (e.g., bracketing, comma usage) are extracted by our method.
We experimented with different kernel types, as shown in Table 3. The best performance is achieved with the linear kernel. When higher-degree kernels such as quadratic (polynomial of degree 2) and cubic (polynomial of degree 3) are used, correlation with the human ratings decreases rapidly. Second best is the radial basis function (RBF) kernel. For the rest of the experiments in this paper we use the linear kernel.
3 Semantic Similarity
|Word Pair||Miller-Charles||WebJaccard||WebDice||WebOverlap||WebPMI||Sahami ||CODC ||Proposed|
We score the word pairs in the Miller-Charles dataset using the page-count-based similarity scores defined in section 3.2, the Web-based semantic similarity measures proposed in previous work (Sahami , Chen ), and the proposed method (SemSim). Results are shown in Table 4. All figures, except those for the Miller-Charles ratings, are normalized into values in the range [0, 1] for ease of comparison. Pearson's correlation coefficient is invariant under a linear transformation. The proposed method earns the highest correlation of 0.834 in our experiments. It shows the highest similarity score for the word-pair magician and wizard. The lowest similarity is reported for cord and smile. Our reimplementation of the Co-occurrence Double Checking (CODC) measure  indicates the second-best correlation. The CODC measure is defined as,

CODC(P, Q) = 0 if f(P@Q) = 0, and exp( log[ (f(P@Q)/H(P)) × (f(Q@P)/H(Q)) ]^α ) otherwise. (7)
Therein, f(P@Q) denotes the number of occurrences of P in the top-ranking snippets for the query Q in Google, and H(P) is the page count for query P. α is a constant in the CODC model; it is set to 0.15 according to Chen et al. . The CODC measure reports zero similarity scores for many word-pairs in the benchmark. One reason for this sparsity in the CODC measure is that even though two words in a pair are semantically similar, we might not always find P among the top snippets for Q (and vice versa). As is apparent from the definition of the CODC measure in Eq. 7, it returns zero under these conditions. The ranking of snippets, and hence the value of f(P@Q), depends directly upon the search engine's specifications. A search engine considers various factors such as novelty, authority, link structure, and user preferences when ranking search results. Consequently, the CODC measure is influenced by these factors.
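Reading the exponent in the usual way, exp(log(x^α)) = x^α, the CODC measure reduces to the product of the two double-check ratios raised to the power α; a sketch under that assumption (the counts fed to it are hypothetical):

```python
def codc(f_p_at_q, f_q_at_p, h_p, h_q, alpha=0.15):
    """Co-occurrence Double-Checking measure of Chen et al.

    f_p_at_q : occurrences of P in the top snippets for the query Q
    f_q_at_p : occurrences of Q in the top snippets for the query P
    h_p, h_q : page counts for P and Q
    """
    if f_p_at_q == 0 or f_q_at_p == 0:
        return 0.0  # double-check fails: zero similarity
    return ((f_p_at_q / h_p) * (f_q_at_p / h_q)) ** alpha
```

The hard zero in the first branch is exactly the source of the sparsity discussed above: if either word is absent from the other's top snippets, the score collapses to zero regardless of the page counts.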
The similarity measure proposed by Sahami et al.  is placed third. This method uses only snippets when calculating semantic similarity. Among the four page-counts-based measures, WebPMI garners the highest correlation. Overall, the results in Table 4 suggest that similarity measures based on snippets are more accurate than those based on page counts in capturing semantic similarity.
4 Taxonomy-Based Methods
|Li et al. (2003)|
|Jiang & Conrath (1998)|
Table 5 presents a comparison of the proposed method to the WordNet-based methods. The proposed method outperforms simple WordNet-based approaches such as edge-counting and information content measures. It is comparable with the Lin (1998) , Jiang & Conrath (1998) , and Li (2003)  methods. Unlike the WordNet-based methods, the proposed method requires no hierarchical taxonomy of concepts or sense-tagged definitions of words.
Therefore, in principle the proposed method can be used to calculate semantic similarity between named entities, etc., which are not listed in WordNet or other manually compiled thesauri. However, considering the high correlation between human subjects, there is still room for improvement.
5 Accuracy vs. Number of Snippets
We computed the correlation with the Miller-Charles ratings for different numbers of snippets to investigate the effect of the number of snippets used to extract patterns upon the semantic similarity measure. The experimental results are presented in Figure 6. From Figure 6 it is apparent that, overall, the correlation coefficient improves with the number of snippets used for extracting patterns. The probability of finding better patterns increases with the number of processed snippets. That enables us to represent each pair of words with a rich feature vector, resulting in better performance.
6 Training Data
We used synonymous word pairs extracted from WordNet synsets as positive training examples and automatically generated non-synonymous word pairs as negative training examples to train a two-class support vector machine in section 3.4. To determine the optimum combination of positive and negative training examples, we trained a linear kernel SVM with different combinations of positive and negative training examples and evaluated accuracy against the human ratings in the Miller-Charles benchmark dataset. Experimental results are summarized in Figure 7. The maximum correlation coefficient is achieved for a particular combination of positive and negative training examples. Moreover, Figure 7 reveals that correlation does not improve beyond a certain number of positive and negative training examples. Therefore, we can conclude that a relatively small number of examples is sufficient to leverage the proposed semantic similarity measure.
Measuring semantic similarity between
named entities is vital in many applications such as query
expansion , entity
disambiguation (e.g. namesake disambiguation) and community
mining . Because
most named entities are not covered by WordNet, similarity
measures that are based on WordNet cannot be used directly in
these tasks. Unlike common English words, named entities are
being created constantly. Manually maintaining an up-to-date
taxonomy of named entities is costly, if not impossible. The
proposed semantic similarity measure is appealing for these
applications because it does not require pre-compiled taxonomies.
7 Community Mining
In order to evaluate the performance of the proposed measure in capturing the semantic similarity between named entities, we set up a community mining task. We select personal names from five communities: tennis players, golfers, actors, politicians, and scientists, with an equal number of names from each community, from the open directory project (DMOZ). For each pair of names in our dataset, we measure their similarity using the proposed method and the baselines. We use group-average agglomerative hierarchical clustering (GAAC) to cluster the names in our dataset into five clusters.
Initially, each name is assigned to a separate cluster. In subsequent iterations, the group-average agglomerative clustering process merges the two clusters with the highest correlation. The correlation between two clusters A and B is defined over their merger Γ as,

Corr(Γ) = (1/2) × (1 / (|Γ|(|Γ| − 1))) Σ_{(u,v)∈Γ} sim(u, v)

Here, Γ is the merger of the two clusters A and B, |Γ| denotes the number of elements (persons) in Γ, and sim(u, v) is the semantic similarity between two persons u and v in Γ. We terminate the GAAC process when exactly five clusters are formed. We adopt this clustering method with different semantic similarity measures to compare their accuracy in clustering people who belong to the same community.
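The merge criterion, the mean pairwise similarity inside a candidate merged cluster, can be sketched as follows; `sim` stands for any of the similarity measures discussed above, and the constant-valued `sim` in the test is illustrative only.

```python
from itertools import combinations

def cluster_correlation(merged_cluster, sim):
    """Average pairwise semantic similarity inside a merged cluster.

    merged_cluster : list of names in the candidate merger
    sim            : function sim(u, v) returning similarity of two names
    """
    pairs = list(combinations(merged_cluster, 2))
    if not pairs:
        return 0.0  # singleton cluster: nothing to average
    return sum(sim(u, v) for u, v in pairs) / len(pairs)
```

At each GAAC iteration, the pair of clusters whose union maximizes this quantity is merged, until five clusters remain.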
We employed the B-CUBED metric  to evaluate the clustering results. The B-CUBED evaluation metric was originally proposed for evaluating cross-document co-reference chains. We compute precision, recall and -score for each name in the data set and average the results over the dataset. For each person in our data set, let us denote the cluster that belongs to by . Moreover, we use to denote the affiliation of person , e.g., ``Tiger Woods"``Tennis Player". Then we calculate precision and recall for person as,
Since we selected an equal number of people from each of the five categories, the denominator in Formula 10 is identical for all the names p.
The F-score of person p is then defined as,

F(p) = (2 × precision(p) × recall(p)) / (precision(p) + recall(p)).
Overall precision, recall, and F-score are computed by averaging over all the names p in the dataset. Here, the dataset is the set of names selected from the Open Directory Project.
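The B-CUBED computation above can be sketched as follows, assuming dictionaries mapping each name to its predicted cluster id and to its gold affiliation (the names and data layout are illustrative, not from the paper):

```python
def b_cubed(cluster_of, affiliation_of):
    """Per-name B-CUBED precision/recall/F, averaged over the dataset."""
    names = list(cluster_of)
    precisions, recalls, fscores = [], [], []
    for p in names:
        # People placed in the same cluster as p.
        cluster_mates = [q for q in names if cluster_of[q] == cluster_of[p]]
        # Cluster-mates of p that share p's true affiliation.
        correct = [q for q in cluster_mates
                   if affiliation_of[q] == affiliation_of[p]]
        prec = len(correct) / len(cluster_mates)
        rec = len(correct) / sum(1 for q in names
                                 if affiliation_of[q] == affiliation_of[p])
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        fscores.append(f)
    n = len(names)
    return sum(precisions) / n, sum(recalls) / n, sum(fscores) / n
```

A perfect clustering (every cluster pure and complete) yields precision, recall, and F-score of 1.0; mixing communities within a cluster lowers precision, while splitting a community across clusters lowers recall.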
Experimental results are shown in Table 6. The proposed method achieves the highest entity clustering accuracy in Table 6, and its improvement in F-score is statistically significant (Tukey HSD test). Sahami et al.'s snippet-based similarity measure and the WebJaccard, WebDice, and WebOverlap measures yield similar clustering accuracies.