Evaluating Anaphora Resolution Algorithms

| Author(s) | Algorithm Type | Key Features & Characteristics | Tested Results | Corpus Used | Paper(s) |
|---|---|---|---|---|---|
| Lappin S. and Leass H. | Pronominal Resolution Using Salience Measures | Operates on in-depth syntactic information. Salience-based discourse model: weighting factors include context recency, subject/direct-object emphasis, etc. Uses equivalence classes to maintain coreference. | Recall=85%, Precision=85-87% | 5 computer manual texts containing 82,000 words with 560 pronominal anaphors | "An Algorithm For Pronominal Anaphora Resolution" (1994) |
| Kennedy C. and Boguraev B. | Modified Lappin/Leass Model for Pronominal Resolution | Flat morpho-syntactic analysis using the output of a POS tagger. Less robust than the Lappin/Leass algorithm. Does not require full syntactic parsing of the text. Uses COREF classes and salience weights similar to Lappin/Leass. | Accuracy=75% | 27 random Web documents; 231/306 anaphors correctly resolved | "Anaphora in a Wider Context: Tracking Discourse Referents" |
| Srinivas B. and Baldwin B. | Supertag Representation for Proper-Noun Resolution | LTAG formalism: syntactic trees of sentences. Each word is associated with a number of supertags, one for each syntactic configuration it may appear in. Uses supertag disambiguation to select the appropriate one, then uses the established dependencies among supertags for resolution. | Recall=32%, Precision=79% (proper-noun resolution only) | 1000 sentences (source unknown) | "Exploiting Super tag Representation for Fast Coreference Resolution" (1996) |
| Azzam, Humphreys, Gaizauskas | Focus-Based Approach for Pronoun Resolution | Assumes that anaphors generally refer to the current discourse focus. Uses focus registers and a stack to keep track of states and events. Weakness: too reliant on the accuracy and completeness of grammatical information. No significant performance increase over the simple heuristic-based approach. | Recall=55.4%, Precision=70-75% | 30 Wall Street Journal articles, averaging 462 words. Total number of coreferences = 1627. | "Quantitative Evaluation of Coreference Algorithms in an Information Extraction System" (2000) |
| Mitkov R. | Knowledge-Poor Approach for Pronominal Resolution | Uses text preprocessed by a POS tagger and some syntactic constraints to score antecedent candidates. Scope of candidate antecedents limited to 2 sentences before/after the anaphor. | Accuracy=89.7% (ratio of precision to recall) | Random sample text from an English technical manual (141 pages). Out of 71 pronouns, 48 are anaphorically relevant. | "Robust Pronoun Resolution With Limited Knowledge" (1998) |
| Williams S. | Rule-Based Resolution of Non-Reference Noun Phrases (NRNP) and Reference Noun Phrases (RNP) in unrestricted text | For summarization purposes. Uses rules and knowledge bases of names, titles, and general knowledge. | Accuracy=76% (after 3 estimates), 61% (after first estimate) | Collection of sentences from newspaper articles and New Scientist journal | "Rule-Based Reference Resolution for Unrestricted Text using POS tagging and NP parsing" (1996) |
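
The Tested Results column mixes recall, precision, and accuracy figures, so it may help to see how these measures relate. The following is a minimal sketch, not taken from any of the listed papers: the `ResolutionCounts` class and its field names are hypothetical, and the sketch assumes the usual definitions (precision = correct / attempted, recall = correct / total annotated anaphors). It also assumes that the Kennedy and Boguraev system attempted all 306 anaphors, which the table does not state.

```python
from dataclasses import dataclass


@dataclass
class ResolutionCounts:
    """Hypothetical tally of one evaluation run (names are illustrative)."""
    correct: int    # anaphors resolved to the right antecedent
    attempted: int  # anaphors the system tried to resolve
    total: int      # anaphors annotated in the test corpus

    def precision(self) -> float:
        # Share of attempted resolutions that were correct.
        return self.correct / self.attempted if self.attempted else 0.0

    def recall(self) -> float:
        # Share of all annotated anaphors that were correctly resolved.
        return self.correct / self.total if self.total else 0.0


# Kennedy and Boguraev row: 231 of 306 anaphors correctly resolved.
# Assumption: every anaphor was attempted, so precision, recall,
# and accuracy all reduce to the same ratio.
kb = ResolutionCounts(correct=231, attempted=306, total=306)
kb_accuracy = kb.recall()
print(f"Kennedy/Boguraev accuracy: {kb_accuracy:.1%}")  # prints 75.5%

# Made-up counts (not from any paper) showing how a cautious system
# can score higher on precision than on recall.
toy = ResolutionCounts(correct=8, attempted=10, total=20)
print(f"toy precision={toy.precision():.0%}, recall={toy.recall():.0%}")
```

When a system declines to resolve some anaphors, precision and recall diverge, which is one plausible reading of rows such as Srinivas and Baldwin (Recall=32%, Precision=79%). Rows that report a single accuracy figure are not directly comparable to rows that report the two measures separately.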