CMSC485 03 Special Topics : Question Answering Systems

Spring 2003

Course Project

Due at the beginning of the scheduled presentation, on Monday, April 28 or Wednesday, April 30


Goal for the assignment: to gain basic experience in the design, implementation, and evaluation of question-answering (QA) systems. The project is fairly open-ended. You will work in a project team of two to implement a QA system that operates in the standard TREC QA framework: the input to the system is a question, and the output is a ranked list of five guesses for the answer. No human intervention is allowed in deriving answers.


For the assignment, we are providing a QA corpus that contains a set of questions and the expected answer(s) for each question. Since we can't make available to you the actual 9GB TREC collection used in the TREC QA studies, we will instead provide the top 20 documents retrieved by the Smart IR system (from a similarly large text collection) for each question in the corpus. Answers to each question are to be extracted from these 20 documents. Note that, for some questions, none of the 20 retrieved documents may contain the answer.

As noted above, the project is completely open-ended: you are free to build whatever components you'd like to include in your QA system and are free to use any publicly available software that you wish. You can even share components that you build with others in the class.

The primary caveats are that your system cannot use the answers provided and that you must make clear in the write-up which components you used but did not write yourself.
Assume that your system has entered the 50-byte (short answer) QA track, so all answers should be 10 or fewer words in length. In addition, the output for each question should be the following:
    question# document-id answer-text(for top-ranked guess)
    question# document-id answer-text(for second guess)
    question# document-id answer-text(for third guess)
    question# document-id answer-text(for fourth guess)
    question# document-id answer-text(for fifth guess)
The document-id refers to the document where the answer string was found. Use "nil" as the answer-text if your system finds no answer for a particular question.
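
The output format above can be produced with a few lines of code. The following is only a minimal Python sketch: the function name format_answers is hypothetical, and two choices in it are assumptions that the format description does not settle, namely padding with "nil" lines when fewer than five guesses are available and writing "nil" in the document-id position of those lines.

```python
def format_answers(question_id, guesses, max_guesses=5, max_words=10):
    """Return the output lines for one question.

    guesses: ranked list of (document_id, answer_text) pairs, best guess first.
    """
    lines = []
    for doc_id, text in guesses[:max_guesses]:
        # Enforce the 10-word limit by truncating longer answer strings.
        words = text.split()[:max_words]
        if words:
            lines.append(f"{question_id} {doc_id} {' '.join(words)}")
    # Assumption: pad with "nil" lines so every question has five lines.
    while len(lines) < max_guesses:
        lines.append(f"{question_id} nil nil")
    return lines
```

Calling format_answers with a question number and a ranked list of guesses returns five strings that can be written to the output file one per line.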


What is provided :
    questions.txt: the questions (http://www.tcnj.edu/~mmmartin/CMSC485/Project/questions.txt).
            Feel free to change the format of this file if it makes automatic processing of the questions easier. Alternatively, you can use the questions as they appear in
            the "answers" file described below. In either case, you will need to keep around the question number to include as part of the answers file that your system produces.
    answers.txt: all answers found by TREC assessors for each question (http://www.tcnj.edu/~mmmartin/CMSC485/Project/answers.txt).
            The format of this file should be fairly clear. For each question, the file contains: (1) one line with the question number, (2) one line with the question, (3) a list of document
            IDs followed by answer strings, one per line, and (4) a blank line that separates the information for each question. Feel free to modify the format of this file if it's easier for your
            system to process.
    top 20 documents retrieved for each question: A gzipped tar file with the top 20 documents retrieved for each question by Smart can be downloaded from
            http://www.tcnj.edu/~mmmartin/CMSC485/Project/top-20.tar.gz. (WinZip should open this file as well.)
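
As an illustration of the answers.txt layout described above, here is a minimal Python sketch of a reader for that file. It assumes that records are separated by blank lines, that the first line of each record is kept verbatim as the question-number key, and that each answer line is a document ID, whitespace, then the answer string. Check these assumptions against the real file before relying on them; the function name parse_answers is hypothetical.

```python
import re


def parse_answers(text):
    """Parse the contents of the answers file into a dict.

    Returns: question-number line -> {"question": str,
                                      "answers": [(doc_id, answer), ...]}
    """
    records = {}
    # Records are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # not a complete record
        number, question = lines[0], lines[1]
        answers = []
        for ln in lines[2:]:
            # Assumption: document ID first, the rest is the answer string.
            parts = ln.split(None, 1)
            answers.append((parts[0], parts[1] if len(parts) > 1 else ""))
        records[number] = {"question": question, "answers": answers}
    return records
```

A typical use would be parse_answers(open("answers.txt").read()), after which the question number can be carried through to the output file.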


Implementation hints :
    Start simple!! Select some really, really dumb strategy to produce answers for each question just to make sure that you will have something to evaluate and to turn in. Only after you can do that should you proceed to something more sophisticated. It's fine to try a strategy very different from anything discussed in class. It's even fine if the system that you produce does terribly in terms of performance. You just need to be able to argue (in your write-up) why the strategy that you investigated MIGHT have worked. One possibility is to use the Lemur (http://www-2.cs.cmu.edu/~lemur/) or Smart system to implement a passage retrieval strategy for question answering. Another is to focus on one type of question, e.g. "who" questions, and develop a strategy specifically for that question type.
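
To make "start simple" concrete, here is one possible dumb baseline, sketched in Python under stated assumptions: it scores every sentence in the retrieved documents by how many non-stopword question terms it shares with the question, and returns the first 10 words of the best-scoring sentences as guesses. The stopword list, the sentence splitter, and the names tokenize and baseline_guesses are all illustrative choices, not a prescribed method.

```python
import re

# A tiny, illustrative stopword list; a real system would use a longer one.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "was", "who", "what",
             "when", "where", "which", "how", "did", "do", "does"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def baseline_guesses(question, documents, n=5, max_words=10):
    """documents: dict mapping document id -> document text.

    Returns up to n (doc_id, snippet) pairs, best guess first.
    """
    q_terms = set(tokenize(question)) - STOPWORDS
    scored = []
    for doc_id, text in documents.items():
        # Naive sentence split on whitespace after ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            overlap = len(q_terms & set(tokenize(sentence)))
            if overlap:
                snippet = " ".join(sentence.split()[:max_words])
                scored.append((overlap, doc_id, snippet))
    # Stable sort: ties keep the order in which sentences were seen.
    scored.sort(key=lambda item: -item[0])
    return [(doc_id, snippet) for _, doc_id, snippet in scored[:n]]
```

A baseline like this will often return the right sentence but the wrong 10 words, which is exactly the kind of weakness worth analyzing before building something more sophisticated.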


What to turn in :
    1. A description of your QA system. Enough detail should be provided so that, in theory at least, I could re-implement it. The description should explain each component in your QA system, the steps that your system takes to answer a question, any additional on-line sources of information used by the system, etc. Make clear which components of the system you built yourself vs. downloaded from elsewhere vs. got from another student in the course.
    2. The output file of answers produced by your system for the questions from the development corpus that we provided. The answers should be in the format described above.
    3. An evaluation (e.g. using the mean reciprocal rank evaluation measure) and analysis of your system's performance on the questions from the development corpus provided. How well did the system work? What worked? What didn't work? Can you say anything about which component is strongest/weakest?
    4. A detailed walk-through of what your system did to handle one question (any one) in the corpus.
    5. The output from your system for the question selected in (4) above. Enough information should be included in the output to convince me that the system is following the steps described in (1).
It is not necessary to submit your code, but I may ask to see it in cases where the system description is unclear.
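
For the evaluation in item 3, mean reciprocal rank (MRR) gives each question a score of 1/rank for the highest-ranked correct guess among the five (0 if none is correct) and averages those scores over all questions. The Python sketch below assumes a guess counts as correct when it contains one of the expected answer strings, ignoring case. That matching rule is a simplification chosen for illustration, and the function names are hypothetical.

```python
def reciprocal_rank(guesses, expected, max_guesses=5):
    """Return 1/rank of the first correct guess, or 0.0 if none is correct."""
    expected = [e.lower() for e in expected if e]
    for rank, guess in enumerate(guesses[:max_guesses], start=1):
        g = guess.lower()
        # Assumption: correct means an expected string appears in the guess.
        if any(e in g for e in expected):
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(system_output, answer_key):
    """system_output: question id -> ranked list of answer strings.
    answer_key:    question id -> list of expected answer strings.
    """
    if not answer_key:
        return 0.0
    total = sum(reciprocal_rank(system_output.get(qid, []), expected)
                for qid, expected in answer_key.items())
    return total / len(answer_key)
```

Questions missing from the system output score 0 here, so the average is always taken over every question in the answer key.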


Presentation guidelines :
    1. The presentation should be a 35-minute talk.
    2. An additional 5-minute question session should follow the talk.
    3. The talk should include a simple demonstration, which is not to exceed 15 minutes in length.