THE COLLEGE OF NEW JERSEY
Computer Science Department
CMSC 485 03 : Special Topics : Question Answering Systems
Spring 2003


About the course :

The goal of information extraction (IE) is to design systems that analyze only those text passages containing the information the system has been asked to find. Such systems do not attempt a comprehensive analysis of every text document; they purposely skip irrelevant information.

Textual Question Answering (QA) systems represent the most recent trend in information extraction from free on-line text sources. The goal is to build systems that can find the answer to a natural-language question in a large collection of on-line text documents. In contrast to information retrieval systems, which return a set of documents as the result of a simple word-based search, QA systems identify the exact passages in the relevant documents that constitute the answer. In addition, the subject of the natural-language questions is unrestricted.
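
The contrast between document retrieval and passage-level answering can be illustrated with a toy sketch. It is not taken from the course materials: the function names, the stopword list, and the sample documents are all assumptions made for illustration, and simple word overlap stands in for the far richer analysis a real QA system performs.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "is", "what", "who", "when", "where", "in"}

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_documents(question, documents):
    """IR-style search: indices of documents sharing any content word with the question."""
    terms = tokenize(question) - STOPWORDS
    return [i for i, doc in enumerate(documents) if terms & tokenize(doc)]

def best_passage(question, documents):
    """Toy QA-style search: the single sentence with the highest word overlap."""
    terms = tokenize(question) - STOPWORDS
    best, best_score = None, 0
    for doc in documents:
        # Split each document into sentences at ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            score = len(terms & tokenize(sentence))
            if score > best_score:
                best, best_score = sentence, score
    return best

# Made-up sample collection and question.
documents = [
    "Trenton is the capital of New Jersey. It lies on the Delaware River.",
    "New Jersey has many colleges. Ewing is a township near Trenton.",
    "Albany is the capital of New York.",
]
question = "What is the capital of New Jersey?"

print(retrieve_documents(question, documents))  # every document matches some word
print(best_passage(question, documents))        # one sentence is singled out
```

In this sketch the word-based search returns all three documents, while the passage search narrows the result to the one sentence that contains the answer.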

Topics to be addressed include, but are not limited to:
    - Question answering language resources (LR) and scientific algorithm developments
    - Guidelines, standards, specifications, models and best practices for question answering LR
    - Methods, tools, and procedures for the acquisition, creation, management, access, distribution, and use of question answering LR
    - LR and evaluation and benchmarking of question answering systems and algorithms for tasks including:
        - Advanced question analysis
        - Answer discovery and integration
        - Answer explanation and presentation generation
        - Interactive question answering

Possible joint products to be created include:
    - List of existing resources and ones under development (with planned release dates)
    - Updates to ARDA Q&A Roadmap (www-nlpir.nist.gov/projects/duc/papers/qa.Roadmap-paper_v2.doc)
    - List of evaluation methods and benchmarks for question answering systems
    - List of unresolved research problems and/or areas in question answering
    - Shared knowledge of research groups and efforts

Prerequisites : Advanced Algorithms + Instructor's Permission


Class Time:
 
Section 03 : Monday, Wednesday 9:30-10:50 at Holman Hall 128.


Textbooks:
 
          I.
"Modern Information Retrieval"
by: R. Baeza-Yates, B. Ribeiro-Neto
Published by Addison Wesley
ISBN 0-201-39829-X
          II.
"Mathematical Foundations of Information Retrieval"
by: S. Dominich
Published by Kluwer Publishing
ISBN 0-7923-6861-4


Instructor:
 
          Dr. Miroslav Martinovic.


E-mail Address :
 
          mmmartin@tcnj.edu


Telephone :
 
          (609) 771-2789.


Office :
 
          Holman Hall 243.


Office Hours :
 
Monday : 9-9:30; 12:30-2:30
Wednesday : 9-9:30; 12:20-1:20 (by appt. only)
Thursday : 9-11; 12-3 (Research Presentation Appointments)


Grading Policy:
 

Attendance, Class Participation and Effort : 20%
Topic or Paper Presentation : 35%
         Topic/Paper List
         Paper critique and presentation guidelines
Project with Presentation and Demo : 45%
         Project with guidelines and resources (Courtesy of C. Cardie)

 
 
 



 
 
CMSC 485 03
Tentative Schedule



 
Week 1 and 2 
Introduction to Corpus-Based Question Answering

What is corpus-based Q&A ?
Evaluations of Q&A Systems : TREC
Current Approaches to Q&A
NLP & IR for Q&A Systems
Semantics in Q&A Systems

Slides (transparencies used in class)

Courtesy of : C. Monz and M. de Rijke



 
Week 2 and 3
What's in Store for Question Answering ? Ask Jeeves

"Take-home" messages when considering Q&A task
Some anecdotes and a few statistics
Prognostications

Slides (transparencies used in class)

Courtesy of : J.B. Lowe



 
Week 4 
Web Information Retrieval : Google's Success Paper presentation and critique.
Papers/Google/icde.pdf



 
Week 5 
Essential Properties of Information Retrieval : NLP for IR Paper presentation and critique.
Papers/NLPforIR/NLP-IR.pdf


Week 6
NLP Tools : Generic Retrieval Systems (SMART System) Paper presentation and critique with a demonstration session.
Papers/SMART/SmartCourse.html



 
  Week 7 
NLP Tools : Part-of-Speech Tagger (Eric Brill's Part-of-Speech Tagger) Paper presentation and critique, tagger installation and demonstration. 
Paper : Papers/POSTagger/aaai94-tagger.ps
Resource directory : ~mmmartin/Information Retrieval/EricBrill'sTagger/



 
Week 8 
NLP Tools : Parsers (Apple Pie Parser for English) Paper presentation and critique with a demonstration session.
Papers : Papers/APParser/manual.ps, Papers/APParser/APParser.htm
Resource directory (springfield) : /projects/mmmartin/Information Retrieval/NYU Parser/



 
Week 9 
NLP Tools : Electronic Lexicons (WordNet) Paper presentation and critique with a demonstration session.
Documentation : http://www.cogsci.princeton.edu/~wn/doc.shtml 
Resource directory : ~mmmartin/www/CMSC485/Papers/WordNet/



 
Week 10 and 11 
Advanced Question Answering : Plenty of Challenges to Go Around

AQUAINT Program
Introducing ARDA
Advanced Question Answering
     Multiple Approaches
     AQUAINT Program
     Challenges from AQUAINT Perspective
Some Final Thoughts

Slides (transparencies used in class)

Courtesy of : ARDA and J.D. Prange



 
Week 12 and 13 
Issues, Tasks and Program Structures to Roadmap Research in Q&A

Issues in Q&A Research
    Question Classes: Need for question taxonomies
    Question Processing: Understanding, Ambiguities, Implicatures and Reformulations
    Context and Q&A
    Data Sources for Q&A
    Answer Extraction: Justification and Evaluation of Answer Correctness
    Answer Formulation
    Real Time Question Answering
    Interactive Q&A
    Advanced Reasoning for Q&A
    User Profiling for Q&A
    Collaborative Q&A
Milestones in the Program
Evaluation Framework

Slides (transparencies used in class)

Courtesy of : J. Burger et al.



 
Week 13 
Named Entity Recognition Paper presentation and critique.
Paper Resource Directory : Papers/NER/



 
Week 14 
Anaphora Resolution Paper presentation and critique.
Paper Resource Directory : Papers/Anaphora/


Project Presentations and Demos
Week 14