Clay v1.3.x Requirements Specification

Document Summary

This document specifies in a succinct manner the requirements for Clay v1.3.x.

Document Version and Status

This is version 0.2 of this document, and it is pending approval. The latest copy can be found at http://www.tcnj.edu/~assistf/clay/clay-1.3/docs/requirements.spec.html

Document Language and Keywords

This document uses the words MUST, MAY, WILL, SHOULD, & NOT in a specific manner.

If the system MUST support a requirement, then the system is incomplete without the implementation of the requirement. If the system MAY support a requirement, then the system WILL work with or without the implementation of the requirement. If the system WILL support the implementation of the requirement, then the implementation of that requirement forms part of the core functionality of the system. WILL is similar to MUST. If the system SHOULD support the requirement, then the system's core functionality is complete without the requirement, but for normal use the requirement is to be implemented.

If the word NOT is used in conjunction with the previous four keywords, it reverses the sense of the meaning. For example, if the system MUST NOT support a requirement, the system is considered broken if it does. If the system MAY NOT support a requirement, the system is complete with or without the requirement's implementation, and that implementation is not generally guaranteed to be present. If the system WILL NOT support the implementation of a requirement, then the system is complete without that functionality. If the system SHOULD NOT support a requirement, it is considered poor practice to support the implementation of that requirement.

Several key terms are defined here:

  • Servlet - a Java object that represents a protocol-independent server-side thread of execution.
  • Tomcat - a Servlet Container written in Java (manages a Servlet's lifecycle)
  • SOAP - a protocol for exchanging arbitrarily complex data objects in a language independent manner
  • Client - a software program that communicates with Clay to edit a piece of media.
  • Peer - a user of a client

Project Focus:

The vision for this project is to use the Tomcat {Catalina} servlet container to "host" a number of servlets that act as the publisher for changes to a specific "document". The servlets contain the master copy of the "source web", which is a graph of nodes, the parsed semantic representation of some media. A generic ClaySOAPServlet class is written to provide an implementation of the protocol that we design.

In this way, we can concentrate on the protocol, storage, and client interaction without worrying about the server environment.

The following figure depicts a high level view of this architecture:

Clay v1.3 Environment


Requirements Table of Contents

  1. Primary Functional Requirement
  2. Secondary Functional Requirement
  3. Tertiary Functional Requirements
  4. Major Limitations
  5. Background Research Requirements
  6. Protocol Requirements
    • Security Requirements
    • Stability Requirements
    • Stateful-Session Requirements
    • Interoperability Requirements
    • SOAP
    • Message Control Center
    • Unsolicited Notification or Response
    • Asynchronous
    • Publish-Subscribe
    • Request-Response
    • Query language
    • Constants
    • Actions
  7. Locking and Synchronization Requirements
  8. Performance Requirements
  9. Security Requirements
    • Integrity of edited media
    • Availability of edited media
    • Administrative Requirements
    • Access Control Requirements
      • Authentication
      • Authorization
    • Encryption Requirements
  10. Environmental Requirements
    • JDK/JVM version
    • Application Directory Layout
    • Application Packaging Structure
    • Application Scripts
    • Data storage support (LDAP/JDBC)
    • Additional Libraries
    • Hardware Requirements
    • Network Requirements
    • Support for CVS and test server
  11. Architectural Requirements
    • Embedded Servlet Container: Tomcat
    • SOAPServlet
    • Source Code eXchange Protocols [CStore v2]
      • SCSP
      • SCTP
      • data store
      • schema document
      • schema container
      • source-web
      • compiler compiler
      • Dr. Neff
      • object access control
    • Tomcat Environment [included support]
  12. Design Requirements
    • Code Refactoring
      • datautil.jar
      • logging.jar
      • textutils.jar
      • java web services development pack
  13. Overall Project & Personnel Requirements
    • People Involved
      • Dr. Wolz
      • Bob Hutzel
      • Maciej Nowacki
      • M. Locasto
      • Jon King
    • Individual Motivations and Requirements
      • Part wanted
      • Part needed
      • Want in?
      • Want out?
      • Individual Comments or Suggestions
    • Rules of Interaction
      • How to communicate
      • When to meet
    • Focus of Research
    • Focus of Software
    • Define and Enumerate Research Problems
  14. Timeline Requirements
  15. Testing Requirements
  16. Maintenance Requirements

Requirements

Primary Functional Requirement

The primary functional requirement for this software is to provide a server environment and protocol that support concurrent remote editing of the same document or media by a group of trusted peers. The peers MAY use dramatically different clients to interact with the media. Therefore, the SOAP protocol WILL be used to support clients implemented in different languages. This MUST replace the current RMI method of client-server communication, which limits the clients to Java clients. The protocol and server environment MUST be implemented so as to protect against the infinite "replay" or "echo" effect, in which a change sent back to the client that published it is published again without end.

Secondary Functional Requirement

The secondary functional requirement for this software is to provide centralized storage of the edited media and to keep track of the versions of the media by saving snapshots and keeping audit data.

Tertiary Functional Requirements

The third functional requirement for this software is to provide some standardized communications services "built-in" to the protocol. These standard communications services WILL be chat and whiteboard. A client application MAY elect to disregard these two services (for example, if the client is editing media that has nothing to do with text or images) but the server MUST provide them. These services MAY be configured to be off for a given server if the client does NOT require them.

Major Limitations

This section WILL detail the limitations required by the system. It is NOT an explanation of the limitations of a design or implementation.

There are currently no required limitations, because I cannot think of any besides "this program won't attempt to save the world."

Background Research Requirements

Literature Review

Synchronized editing of documents has been a goal of software engineering for a while. Chat is easiest to accomplish (ICQ, Unix "talk", instant messaging programs) because that problem domain doesn't actually edit the same area - it only appears to do so.

Many complex and specialized systems have been developed to aid in creating an environment where people can edit a document at the same time over a network. The primary driver of all this research is the assumption that such interaction WILL lead to greater productivity and communication, and therefore reduce the time and cost of producing a finished work.

A literature review of this work MUST be accomplished and WILL address the following questions:

  1. What is the primary driver for concurrent remote editing?
  2. What systems already support concurrent remote editing? (plus ease of use)
  3. Do systems exist to remotely edit different types of media (other than text)?
  4. Would a remote editing environment actually achieve what it proposes to achieve?
  5. What do current systems lack? What is the current focus of research in this area?
  6. Who are the major people in this field?
  7. How are these systems tested?
  8. Are there easier ways to achieve the end result (human-level protocols, division of labor, project management)?
  9. What other current tools support asynchronous communications? (email, mailing lists, CVS, web publishing content management (Dreamweaver, Interwoven Teamsite))
  10. Identify the relative advantages and disadvantages of using asynchronous communications.

Review old version of Clay (1.2.x)

http://www.tcnj.edu/~assistf/clay/mikeylo/clay.engineering.html [very complete detail of how Clay 1.1.9 and 1.2 work, a MUST read, even if it's bloody boring]

http://www.tcnj.edu/~assistf/clay/mikeylo/clay.soap.servlet.html [explains new thrust for 1.3, but some ideas have evolved, don't take for gospel truth]

Review WebDAV

WebDAV is a method of publishing content via the web and HTTP. It MAY offer some significant insight to us when we consider how to design our protocol.

http://www.webdav.org/

Review Servlet Spec and Servlet API and Servlet Tutorial

http://jakarta.apache.org/tomcat/tomcat-4.0-doc/servletapi/index.html [the Servlet API Javadocs, for a general sense of "what" Servlets are and why we can extend them to implement our own SOAP-based protocol]

http://java.sun.com/docs/books/tutorial/servlets/index.html [the java tutorial on Servlets. A must read.]

Review and download Tomcat

http://jakarta.apache.org/tomcat/ [check out docs for and download Tomcat 3.3.1 and Tomcat 4.1.9]

Read XML tutorials

There are MANY XML tutorials out there. Read them so you understand how to use XML (and correct syntax). Know at least the basics of parsing and writing XML documents via something like JDOM or JAXP.

Here are three good links to get you started:

Review SOAP Specification

http://www.w3.org/TR/SOAP/ [SOAP specification, at least a cursory glance and then a second, more in-depth reading]

http://java.sun.com/ [get the WSDP, web services developer pack, final release, for Java-SOAP stuff]

Protocol Requirements

Security Requirements

See the section on overall security requirements.

Stability Requirements

The protocol SHOULD be stable. That is, it SHOULD be fairly straightforward to implement and test the core functionality, and an implementation SHOULD respond well to load. Both HTTP and LDAP SHOULD be used as examples. The protocol SHOULD be fail-fast and unambiguous. The stability of the protocol will be measured by the performance of a ProjectServlet and client pair that implement the protocol. The protocol specification SHOULD be small and light.

Stateful-Session Requirements

The protocol MUST support server-side session tracking. The protocol MUST be a stateful one (as opposed to HTTP, which is stateless).
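A minimal sketch of what server-side session tracking could look like, written in present-day Java rather than the 1.4 SDK named in this document. The SessionRegistry class and its method names are hypothetical and are not part of any Clay design; the point is only that the server, not the client, holds the state that makes a message legal.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: maps opaque session ids to the peer that owns them.
class SessionRegistry {
    private final Map<String, String> peerBySession = new ConcurrentHashMap<>();

    // Opens a session for an already-authenticated peer and returns its id.
    String open(String peerName) {
        String id = UUID.randomUUID().toString();
        peerBySession.put(id, peerName);
        return id;
    }

    // A message is accepted only if it carries a live session id.
    boolean isValid(String sessionId) {
        return sessionId != null && peerBySession.containsKey(sessionId);
    }

    // Returns the peer that owns the session, or null if the session is unknown.
    String peerFor(String sessionId) {
        return sessionId == null ? null : peerBySession.get(sessionId);
    }

    // Closing a session invalidates it for all later messages.
    void close(String sessionId) {
        if (sessionId != null) {
            peerBySession.remove(sessionId);
        }
    }
}
```

A client would receive the id from open() after login and attach it to every later message; the server rejects anything for which isValid() returns false.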

Interoperability Requirements

The protocol MUST be language independent and NOT rely on any features in any given programming language. Any client or server SHOULD be able to talk the protocol. The protocol WILL consist of packets composed of structured XML (SOAP envelopes).

SOAP

As mentioned above, the protocol WILL use SOAP envelopes to communicate between client and server.
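As an illustration only, the following sketch wraps a payload in a SOAP 1.1 envelope using the JAXP classes that ship with the JDK. The EnvelopeSketch class, the clay:Action element, its code attribute, and the urn:example:clay namespace are all assumptions made for this example; the actual message layout is left to the protocol design.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class EnvelopeSketch {
    // SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical namespace for Clay payload elements.
    static final String CLAY_NS = "urn:example:clay";

    // Wraps an action code and a text payload in a SOAP envelope and returns the XML text.
    static String build(int actionCode, String payload) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
            Element body = doc.createElementNS(SOAP_NS, "soap:Body");
            Element action = doc.createElementNS(CLAY_NS, "clay:Action");
            action.setAttribute("code", Integer.toString(actionCode));
            action.setTextContent(payload); // special characters are escaped on output

            doc.appendChild(envelope);
            envelope.appendChild(body);
            body.appendChild(action);

            // Serialize the DOM tree to a string without the XML declaration.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build envelope", e);
        }
    }
}
```

Because the envelope is plain XML, a client written in any language with an XML parser can read it, which is the property the interoperability requirement asks for.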

Message Control Center

Each server that implements the protocol WILL act as a message passing and control center.

Unsolicited Notification or Response

The protocol MUST support unsolicited notifications or responses. That is, once a client has established a legal session with the server, the server MUST be able to send updates to the client as long as the session is valid and the connection between client and server has NOT been closed.

Asynchronous

The protocol is asynchronous. Client events and server events are NOT synchronized on any flag or timing, but rather on spontaneous events.

Publish-Subscribe

Due to the event-model nature of the client-server environment involved in collaborative remote editing, the protocol MUST be implemented in SOAP using some sort of Publish-Subscribe mechanism, in which clients subscribe to a channel (server) for a certain project and receive updates from all other clients of the master copy of the object (editable media) that resides on the server. Each client MAY publish a change to the server, and the server WILL propagate the change to each client. The server MUST NOT send the update to the client it originated from.
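The propagation rule can be sketched in a few lines of present-day Java. The Subscriber interface and ProjectChannel class are hypothetical names chosen for this example, and delivery is shown as a direct method call rather than a SOAP message; what the sketch shows is only that the publishing client is skipped, which is what prevents the "echo" effect.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical callback through which the server pushes updates to a client.
interface Subscriber {
    void onUpdate(String change);
}

// One channel per project; it fans a published change out to every other subscriber.
class ProjectChannel {
    private final Map<String, Subscriber> subscribers = new LinkedHashMap<>();

    synchronized void subscribe(String clientId, Subscriber subscriber) {
        subscribers.put(clientId, subscriber);
    }

    synchronized void unsubscribe(String clientId) {
        subscribers.remove(clientId);
    }

    // Propagates the change to all subscribers except the one that published it,
    // so a client never receives an echo of its own update.
    // Returns the number of clients the change was delivered to.
    synchronized int publish(String originId, String change) {
        int delivered = 0;
        for (Map.Entry<String, Subscriber> entry : subscribers.entrySet()) {
            if (!entry.getKey().equals(originId)) {
                entry.getValue().onUpdate(change);
                delivered++;
            }
        }
        return delivered;
    }
}
```

With clients "a" and "b" subscribed, a call to publish("a", ...) reaches only "b".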

Request-Response

The protocol MUST NOT be a request-response based model, even though we MAY define a request and response object. A request and response are NOT guaranteed to be paired. Instead, a Request or Response MUST carry an Action that MAY or MAY NOT require further processing from either the client or the server.

Query language

The protocol MAY support a simple query language for locating objects on the server. This SHOULD NOT be considered for phase 1. JNDI or other lookup mechanisms are preferred. Since the protocol SHOULD be kept small and light, adding query functionality beyond a basic GET or login authorization is overkill. Besides, all protocol actions SHOULD be specified via constants and a ClayAction object.

Constants

Integer constants defining individual actions make the protocol very light and quick. These integers can be stored either in an XML file (more configurable, less compile) or as constants in a Java class.

Actions

A ClayAction class WILL encapsulate the data and action code (constant) that the client or server SHOULD take.
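One possible shape for the constants and the ClayAction class is sketched below. Only the name ClayAction comes from this document; the ActionCodes and ActionDispatcher classes, the four codes, their values, and the String payload are assumptions made for illustration.

```java
// Hypothetical action codes; the real values would be fixed by the protocol design.
final class ActionCodes {
    static final int LOGIN = 1;
    static final int GET = 2;
    static final int PUBLISH = 3;
    static final int LOGOUT = 4;

    private ActionCodes() {}
}

// Immutable pairing of an action code with the data it applies to.
final class ClayAction {
    private final int code;
    private final String data;

    ClayAction(int code, String data) {
        this.code = code;
        this.data = data;
    }

    int getCode() { return code; }

    String getData() { return data; }
}

class ActionDispatcher {
    // The receiver switches on the integer code alone, so a message does not
    // need to be matched to an earlier request before it can be processed.
    static String handle(ClayAction action) {
        switch (action.getCode()) {
            case ActionCodes.LOGIN:
                return "open session for " + action.getData();
            case ActionCodes.GET:
                return "send current copy of " + action.getData();
            case ActionCodes.PUBLISH:
                return "apply and propagate " + action.getData();
            case ActionCodes.LOGOUT:
                return "close session for " + action.getData();
            default:
                throw new IllegalArgumentException("unknown action code " + action.getCode());
        }
    }
}
```

Keeping the codes in a Java class trades configurability for compile-time checking; loading the same integers from an XML file would be the other option the Constants section mentions.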

Locking and Synchronization Requirements

Most systems that edit the same document at the same time MUST "lock" the file for writing so as to guarantee that the write was successful (even if it is overwritten after a save operation). Unless the protocol that is designed and proposed manages to define a very fine granularity for write tasks or uses some other mechanism to fool the user into thinking they are editing the same document, Clay WILL need to deal with many locking and synchronization issues. In addition, each ProjectServlet is by definition a multi-threaded server.

Since Java Threads are to be used heavily, a good understanding of them and their synchronization issues SHOULD be undertaken.

http://java.sun.com/docs/books/tutorial/essential/threads/

Clay WILL NOT attempt to preserve a peer's updates in face of conflicting editing from another peer. For example, if a peer publishes a change and Clay propagates that change and the change is immediately overwritten by another peer, the first change is lost.

Clay MUST keep an audit log (change list, log of the publish actions) of each change to a document.
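The two rules above (the latest publish wins, and every publish is logged) can be sketched together. AuditedDocument is a hypothetical class, node values are plain strings, and the log is kept in memory; all three are simplifications for the example, written in present-day Java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical master copy: the latest publish wins, and every publish is logged.
class AuditedDocument {
    private final Map<String, String> nodes = new HashMap<>();
    private final List<String> auditLog = new ArrayList<>();

    // Overwrites the node unconditionally. An earlier conflicting change is lost
    // from the document but remains visible in the audit log.
    synchronized void publish(String peer, String nodeId, String newValue) {
        String oldValue = nodes.put(nodeId, newValue);
        auditLog.add(peer + " set " + nodeId + ": " + oldValue + " -> " + newValue);
    }

    synchronized String read(String nodeId) {
        return nodes.get(nodeId);
    }

    // Returns a read-only snapshot so callers cannot alter the log.
    synchronized List<String> auditLog() {
        return Collections.unmodifiableList(new ArrayList<>(auditLog));
    }
}
```

The synchronized methods serialize publishes from the servlet's many threads, so the log order matches the order in which changes were applied.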

Note that there MAY be user-object level granularity for the source-web; that is, as required by the Security Requirements, certain objects MAY only be written to by certain users. This may provide some metric or heuristic for simplifying certain cases of synchronized editing (1 privileged writer, n readers).

Performance Requirements

The server SHOULD scale well to 400 concurrent users. Response time on a T1 line SHOULD NOT exceed 1.5 seconds at 400 users. At 800 users, a response time of 5 seconds is acceptable. At 100 users or fewer, response time SHOULD be less than 500 milliseconds.
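For load testing, the figures above could be encoded as a simple lookup. This sketch makes two assumptions that the requirement does not state: a load between two named figures gets the budget of the next higher figure, and each budget is treated as an inclusive upper bound. ResponseBudget is a hypothetical helper, not part of Clay.

```java
class ResponseBudget {
    // Returns the response-time budget in milliseconds for a given load,
    // or -1 when the load is beyond the figures the requirement names.
    static long budgetMillis(int concurrentUsers) {
        if (concurrentUsers <= 100) return 500;   // 100 users or fewer
        if (concurrentUsers <= 400) return 1500;  // up to 400 users
        if (concurrentUsers <= 800) return 5000;  // up to 800 users
        return -1;                                // no stated target
    }

    // True when a measured response time meets the budget for that load.
    static boolean withinBudget(int concurrentUsers, long observedMillis) {
        long budget = budgetMillis(concurrentUsers);
        return budget >= 0 && observedMillis <= budget;
    }
}
```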

Security Requirements

Integrity of edited media

Since remote editing goes over the network, it is susceptible to an attack that can modify the media and propagate that change to all clients. The integrity of the media must be protected. The media updates (and initial feed) SHOULD be transmitted encrypted and with some check bits or hash that guarantees the integrity of the media or update.
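One way to realize the "check bits or hash" is a keyed hash (HMAC), sketched here with the standard javax.crypto classes. HMAC-SHA256 and the shared secret key are choices made for this example, not something this document prescribes, and UpdateIntegrity is a hypothetical class name. A plain unkeyed hash would detect accidental corruption but not an attacker who can recompute it, which is why the sketch uses a key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class UpdateIntegrity {
    // Computes an HMAC-SHA256 tag over the update using a key shared by client and server.
    static byte[] tag(byte[] key, String update) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(update.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable or key rejected", e);
        }
    }

    // Recomputes the tag and compares it with the received one.
    // A false result means the update or the tag was altered in transit.
    static boolean verify(byte[] key, String update, byte[] receivedTag) {
        return MessageDigest.isEqual(tag(key, update), receivedTag);
    }
}
```

The sender would transmit the update together with tag(key, update); the receiver calls verify() before applying or propagating the change.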

Availability of edited media

The media SHOULD be served as quickly as the client can pull it. DOS attacks SHOULD be accounted for - since the protocol is stateful, we have a better chance of denying new connections and still serving our authorized customers.

Administrative Requirements

There are five levels of access control (roles) for the overall Clay server software.

  1. Clay-Tomcat admin: responsible for installation and uptime of the Clay-Tomcat server itself.
  2. Outside resources admin: a database or LDAP directory admin
  3. Clay developer: responsible for developing parts of the Clay server software.
  4. Clay Project developer: owns a server context and develops a specialized application on top of the ClaySOAP protocol.
  5. Project user: a user (who MAY have many roles, refer to the below requirements on tuples and authentication/authorization) of a ProjectServlet or application. They are a client who connects to the server to interact with other clients.

Access Control Requirements

Authentication

Every user MUST be authenticated by a login procedure against an LDAP directory, simple XML file, or relational database. Every user is represented as a 4-tuple {username,password,application,role}.

Authorization

Based on the values in the authentication 4-tuple, a user MAY be authorized as a certain role in a certain application.
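The 4-tuple and the authorization check can be sketched as follows, in present-day Java. Credential and CredentialStore are hypothetical names, and the in-memory set stands in for the LDAP directory, XML file, or database. The password is compared as plain text only to keep the example short; a real store would hold a salted hash instead.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// The 4-tuple {username, password, application, role} from the requirement.
final class Credential {
    final String username;
    final String password;
    final String application;
    final String role;

    Credential(String username, String password, String application, String role) {
        this.username = username;
        this.password = password;
        this.application = application;
        this.role = role;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Credential)) return false;
        Credential c = (Credential) other;
        return Objects.equals(username, c.username)
            && Objects.equals(password, c.password)
            && Objects.equals(application, c.application)
            && Objects.equals(role, c.role);
    }

    @Override
    public int hashCode() {
        return Objects.hash(username, password, application, role);
    }
}

// Hypothetical in-memory stand-in for the directory or database.
class CredentialStore {
    private final Set<Credential> known = new HashSet<>();

    void add(Credential credential) {
        known.add(credential);
    }

    // A user is authorized for a role in an application only if the exact tuple is on record.
    boolean authorize(String username, String password, String application, String role) {
        return known.contains(new Credential(username, password, application, role));
    }
}
```

Because the role is part of the tuple, the same user can hold different roles in different applications by having one tuple per (application, role) pair.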

Encryption Requirements

The system MAY include the option to encrypt traffic between the client and server using strong encryption (TLS/SSL), which is generally available as part of the Java SDK. In addition, the system MAY use strong encryption between the authentication mechanism and the applications (ProjectServlets) and between the media storage facility and the applications (ProjectServlets) or any other tier and the ProjectServlets.

It is suggested that the initial version design provide for encryption capabilities, but not implement those capabilities until basic functionality is completed.

Environmental Requirements

JDK/JVM version

The server software MUST be built with the Java SDK 1.4.x for the JVM/JRE 1.4.x.

Application Directory Layout

			$CLAY_HOME=c:\clay\clay-1.3

			$CLAY_HOME/docs
			$CLAY_HOME/logs
			$CLAY_HOME/conf
			$CLAY_HOME/tomcat
				$CLAY_HOME/tomcat/bin
				$CLAY_HOME/tomcat/classes
				$CLAY_HOME/tomcat/common
					$CLAY_HOME/tomcat/common/classes
					$CLAY_HOME/tomcat/common/lib
				$CLAY_HOME/tomcat/conf
				$CLAY_HOME/tomcat/lib
				$CLAY_HOME/tomcat/logs
				$CLAY_HOME/tomcat/server
					$CLAY_HOME/tomcat/server/classes
					$CLAY_HOME/tomcat/server/lib
				$CLAY_HOME/tomcat/temp
				$CLAY_HOME/tomcat/webapps/
					$CLAY_HOME/tomcat/webapps/clayapp-n
					$CLAY_HOME/tomcat/webapps/clayapp-n/WEB-INF
					$CLAY_HOME/tomcat/webapps/clayapp-n/WEB-INF/web.xml
					$CLAY_HOME/tomcat/webapps/clayapp-n/WEB-INF/lib
				$CLAY_HOME/tomcat/work
			$CLAY_HOME/src
				$CLAY_HOME/src/edu
				$CLAY_HOME/src/edu/tcnj
				$CLAY_HOME/src/edu/tcnj/cs
				$CLAY_HOME/src/edu/tcnj/cs/clay
					$CLAY_HOME/src/edu/tcnj/cs/clay/core
					$CLAY_HOME/src/edu/tcnj/cs/clay/data
					$CLAY_HOME/src/edu/tcnj/cs/clay/protocol
					$CLAY_HOME/src/edu/tcnj/cs/clay/server
					$CLAY_HOME/src/edu/tcnj/cs/clay/util

			

Application Packaging Structure

The package structure for Clay begins with the prefix edu.tcnj.cs.clay. There are then five primary packages (core, data, protocol, server, and util) that form the nucleus of the Clay server. Clay WILL utilize Tomcat as the basic server environment, and the code in the previously named packages WILL serve to implement the design solutions to the requirements stated in this document.

Note that this package structure WILL be contained within a single JAR file named clay-1.3.x.

This JAR file WILL be included in $CLAY_HOME/tomcat/common/lib so that all applications (ProjectServlets) can use the code in these packages. The API definition of a ProjectServlet WILL be provided in edu.tcnj.cs.clay.server.ProjectServlet.

While this JAR file WILL define the support architecture for Clay (above Tomcat's server environment), developers of ProjectServlets MAY use the WEB-INF/lib and WEB-INF/classes directories as provided by Tomcat and the Servlet specification to house their own specialized data structures, code, beans, workers, etc. Each Servlet Context receives its own environment (Classloaders, temp directories, etc.) and is defined by a folder in $CLAY_HOME/tomcat/webapps/ (or a <Context> entry in server.xml).

In general, the packages WILL be used like so:

  • core - concrete definition of abstract classes and interfaces declared in server and protocol
  • data - JavaBeans and other easily serializable entities that represent objects in outside resources
  • protocol - interfaces that define how Clay WILL talk SOAP (SOAPRequest,SOAPResponse)
  • server - interfaces (SOAPServlet and ProjectServlet) defining the API that developers WILL use in order to develop a ProjectServlet.
  • util - general purpose utilities (pools, workers, static utility classes, special URL objects, connectors)

Application Scripts

Compilation scripts for Clay MUST be written (or, the developers MAY use Jakarta's Ant build tool) to compile the Clay core package/JAR. Once this JAR file is placed in Tomcat's $CLASSPATH, the normal Tomcat start/stop scripts can be used to control the running Tomcat instance.

Data storage support (LDAP/JDBC)

Because the SCXP WILL be developed in parallel with Clay, initial support for authentication information and media storage MAY use either some LDAP directory (via JNDI) or some relational database (via JDBC) or both. It is suggested that LDAP be used for authentication information and JDBC used for storage of media (as a BLOB).

Additional Libraries

The Java SDK v1.4.x SHOULD have any additional libraries needed, except for the Java-SOAP utilities, which are available in the WSDP.

Hardware Requirements

The Clay server (+Tomcat) SHOULD run well on a Pentium II, 500 MHz processor with 64 MB of RAM.

The Clay server (+Tomcat) SHOULD run very well on a Pentium III, 1 GHz processor with 128 MB of RAM.

The Clay server (+Tomcat) SHOULD run exceptionally well on a Sun Solaris server or dual Pentium III @ 1.x GHz with 512 MB of RAM.

Network Requirements

The speed of the networks connecting the peers is of paramount concern. If a peer does not have a fast connection, their message queue could back up, and their updates would not be published in anything close to "real-time".

Support for CVS and test server

A test hardware platform SHOULD be designated for demo purposes (Springfield?)

Architectural Requirements

Embedded Servlet Container: Tomcat

Tomcat WILL provide most of the "server" environment; the core Clay package will act as an add-on library to Tomcat that provides interpretation for the SOAP protocol that the group designs. Familiarity with Tomcat is required in order to have a clear idea of where Tomcat leaves off and Clay picks up.

SOAPServlet

The SOAPServlet and ProjectServlet are the primary interfaces for a server process that coordinates synchronous communication between editing clients.

Source Code eXchange Protocols [CStore v2]

The SCXP are an ambitious attempt to provide a next-generation storage mechanism for editable media (especially textual source code).

The current proposal for SCXP is available at: http://www.cs.columbia.edu/~locasto/projects/scp/protocol.proposal.html. A short summary is provided below regarding its role in Clay.

As far as "database" stuff goes with Clay, there is plenty of it...from user authentication (storing encrypted passwords, roles, and whatnot) to Directory access via JNDI/LDAP to the Source Code Exchange stuff I'm trying to outline. The major challenge there is creating some abstraction of a "source-web" (basically a Graph) that can be stored in a number of different data stores and then have a single copy pulled into the ProjectServlet and modified there via a Publish-Subscribe type protocol.

SCSP

Source Code Storage Protocol defines the basic mechanism for storage of code and media.

SCTP

Source Code Transport Protocol defines how to transport and query source code.

Data Store

The underlying physical storage of the media is unimportant. The only requirements are that the physical store WILL NOT be a bottleneck to performance and that it WILL support live updates and check-in/out of media according to the SCTP and SCSP.

Schema Document

A Schema document specifies the grammar for a language, or a set of key relationships that a particular type of media MUST follow.

Schema Container

The object that implements the SCSP WILL act as a container for many schema documents. As such, it is responsible for managing the lifecycle for the schema documents and the parsers it generates from them.

Source-web

A source-web (which applies much more to text that has some semantic meaning than to simple media like sound and image) is the semi-compiled and parsed Tree/Graph of all the objects in the media. These objects are assumed to have some overall relationship between them. This relationship (or rather, the set of rules for this relationship) is documented in the schema document.
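A source-web can be pictured as a directed graph of named objects. The sketch below is one possible in-memory form, assuming string node ids and an adjacency list; the SourceWeb class and its methods are hypothetical, and the real structure would depend on the schema document for the media in question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SourceWeb {
    // Adjacency list: node id -> ids of the nodes it refers to.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    void addNode(String id) {
        edges.computeIfAbsent(id, k -> new ArrayList<>());
    }

    // Records that one object refers to another, creating either node if needed.
    void link(String from, String to) {
        addNode(from);
        addNode(to);
        edges.get(from).add(to);
    }

    // Nodes reachable from start in depth-first order, start included.
    // Unknown start nodes yield an empty list; cycles are visited once.
    List<String> reachableFrom(String start) {
        List<String> seen = new ArrayList<>();
        visit(start, seen);
        return seen;
    }

    private void visit(String id, List<String> seen) {
        if (!edges.containsKey(id) || seen.contains(id)) {
            return;
        }
        seen.add(id);
        for (String next : edges.get(id)) {
            visit(next, seen);
        }
    }
}
```

A traversal like reachableFrom() is the kind of query a ProjectServlet might use to find every object affected by a change to one node.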

Compiler Compiler

The "Compiler Compiler" is a phrase relayed to us by Dr. Wolz, who recalled that Dr. Goldberg had told students to create such a beast.

The storage mechanism WILL essentially be a factory for parsers for different languages, that is, a program that WILL generate any given parser or perhaps a compiler.
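The factory idea can be shown in miniature. In this sketch a regular expression stands in for a schema document and a simple recognizer stands in for a generated parser; both are deliberate simplifications, and the SchemaContainer class and its methods are hypothetical. It illustrates only the lifecycle: a parser is generated from its schema on first use, cached, and discarded when the schema is replaced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

class SchemaContainer {
    private final Map<String, String> schemas = new HashMap<>();
    private final Map<String, Predicate<String>> parsers = new HashMap<>();

    // Registers (or replaces) the schema for a language and drops any
    // parser that was generated from the previous schema.
    void register(String language, String grammarRegex) {
        schemas.put(language, grammarRegex);
        parsers.remove(language);
    }

    // Generates the recognizer on first use and reuses it afterwards.
    Predicate<String> parserFor(String language) {
        String grammar = schemas.get(language);
        if (grammar == null) {
            throw new IllegalArgumentException("no schema registered for " + language);
        }
        return parsers.computeIfAbsent(language, k -> {
            Pattern compiled = Pattern.compile(grammar);
            return text -> compiled.matcher(text).matches();
        });
    }
}
```

A real implementation would generate a parser that builds the source-web rather than one that only accepts or rejects input.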

Dr. Neff

Dr. Neff's research into the MiniOO Compiler and AST can shed some valuable light onto how to create and maintain the storage mechanism.

Object Access Control

Since people produce objects (code) it WILL be very important to enforce access controls on individual pieces of the source web. An object can have many different owners and permissions associated with it. Careful thought MUST be given to access control and how this affects the use of the code in other people's code.
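A per-object access check might look like the following sketch, in present-day Java. The Permission enum, the ObjectAcl class, and the default-deny rule are assumptions made for this example; this document does not yet define how owners and permissions are modeled.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical permission set for a single object in the source-web.
enum Permission { READ, WRITE }

class ObjectAcl {
    // object id -> (user -> permissions granted on that object)
    private final Map<String, Map<String, Set<Permission>>> grants = new HashMap<>();

    void grant(String objectId, String user, Permission permission) {
        grants.computeIfAbsent(objectId, k -> new HashMap<>())
              .computeIfAbsent(user, k -> EnumSet.noneOf(Permission.class))
              .add(permission);
    }

    // Default deny: anything not explicitly granted is refused.
    boolean allows(String objectId, String user, Permission permission) {
        Map<String, Set<Permission>> byUser = grants.get(objectId);
        if (byUser == null) {
            return false;
        }
        Set<Permission> permissions = byUser.get(user);
        return permissions != null && permissions.contains(permission);
    }
}
```

Granting WRITE on an object to a single user and READ to everyone else gives the "1 privileged writer, n readers" case mentioned under Locking and Synchronization Requirements.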

Tomcat Environment [included support]

There is another section in this document about what Tomcat provides. Also, see the Tomcat website.

Tomcat provides a server environment in which we can run our "ProjectServlets".

Design Requirements

Code Refactoring

datautil.jar

A simple JDBCConnectionPool and its associated Factory, along with a simple utility to wrap SQL queries and commands and log them. Depends on logging.jar.

logging.jar

A simple logging utility with support for logging filters.

textutils.jar

A collection of simple utilities for parsing generated SQL statements with some error correction, as well as translating HTML to escaped HTML.

java web services development pack

Some JAR files from Sun that enable Java programs to speak SOAP and SOAP-RPC.

Overall Project & Personnel Requirements

The topics and roles for people listed below SHOULD be addressed in the first meeting or before the semester starts. The only requirement here is that the team has an honest working relationship. The topics in this section are meant to spur communication.

People Involved

Dr. Wolz
Bob Hutzel
Maciej Nowacki
M. Locasto
Jon King
Al Audick?

Individual Motivations and Requirements

Part wanted
Part needed
Want in?
Want out?
{Dis}satisfaction with current design process/leadership
Individual Comments or Suggestions

Rules of Interaction

How to communicate

Via email to the group/list with subject [clay 1.3]+topic for filtering purposes.

When to meet

Suggested bi-weekly or weekly meetings where each requirement and design point is discussed and resolved, and action items prepared. MUST provide meeting day and time: XXXXX

Focus of Research

The focus of this research is three-fold:

  • Address the requirements specified in this document
  • Learn to perform research and software engineering on a complex topic in a large team setting.
  • Fulfill the functional requirements, test the results, and document the conclusions.

Focus of Software

See primary, secondary, and tertiary requirements. In addition, the software will attempt to provide both performance and security. The Performance Requirements and Security Requirements sections give more detail.

Define and Enumerate Research Problems

There are many individual good problems to chew on that do NOT necessarily require a deep understanding of the rest of the application. Therefore, it SHOULD be possible for many students to perform a one semester contract with the group to broaden the understanding of a particular area or implement a particular mechanism, or otherwise support the group's research.

Timeline Requirements

Decide on lifecycle for 1.3.x and how long design, implementation, testing, and writeup WILL take. Split over 2 semesters?

This is a proposed timeline and a layout of the project milestones:

Milestone Summary | Milestone Date | Milestone Completed? | Completed By
Requirements Specification | 9.1.02 | Y | Michael Locasto
Core Design Specification | 9.1.02 | N | Michael Locasto
Everyone in group familiar with Tomcat | 9.20.02 | N | all
Detailed Design Specification | 9.22.02 | N | ??
Implementation of a minimal ProjectServlet running in Tomcat, speaking SOAP | 10.02.02 | N | ??
Midterm Review | 10.10.02 | N | all
Performance review of minimal ProjectServlet | 10.12.02 | N | all
Summary of Literature Review | 10.20.02 | N | ??
Approve Maintenance (Lifecycle) Document | 10.25.02 | N | all
Finalize implementation | 11.01.02 | N | all
Approve Testing Document | 11.05.02 | N | ??
Review Goals for next semester | 11.15.02 | N | all
Test Clay, Write up results | 11.25.02 | N | ??
Approve Semester Summary Documents (res/pres papers) | 12.01.02 | N | all

Testing Requirements

See Bob Hutzel's testing document for the basis of general testing requirements.

The server WILL need to be stress and load tested. The network components MUST be stressed.

Maintenance Requirements

A maintenance plan MUST be written after the design document is approved.


Copyright © 2002 Michael E. Locasto [8.19.02]