Notes from Workshop on Future Directions of Data
Structures.
U. Wolz, March 25, 1997
Disclaimer: the following is NOT intended to be a
definitive summary of the activities of the workshop. It is heavily
biased by my personal perspective. I would hope that it can be used
as a starting point for a more objective report on the proceedings.
Summary of impressions (not in any particular order):
General Conclusions:
- The first two courses are about programming and computational
problem solving and to some extent Software Engineering (w/o the
managerial issues.)
- The courses addresses programming in the small and medium,
leaving in the large for later. The object-oriented approach
should be used as a basis for this. There was a fair amount of
controversy over where some topic come in, CS 1 or CS 2. The focus
seemed to be on where students should be at the end of CS 2. Note
that many schools had a "CS 3" for programming in the
large/software engineering as well as a "CS 7" for more
theoretical algorithm analysis.
- Abstraction is key to the second course. Implied but never
really discussed, the first course offers basic tools, the second
course introduces notion of abstractions and their implementation.
The question we kept asking was "but which abstractions should
they learn, to what level of detail, and to what end."
- My concern was with the confusion between language driven
curriculum and language in service of curriculum. I kept hearing
"I've given up N, in order to cover M". N tended to be a concept
(e.g. graphs) M tended to be language "feature" e.g. templates in
C++. To be fair, at one point in our small group discussion we
brought a C++ feature issue (deep vs. shallow copy) up to the
conceptual level. However, in later discussion with the whole
group this did not emerge as an obvious part of CS 2 content.
- The prevailing language at the conference seemed to be C++,
not necessarily by choice. Although cogent arguments were made for
using C++, the prevailing point was that this is the language
students wanted/needed. Does this mean Java will supersede it? We
asked this question, but didn't discuss it.
- As an extension of point 5, we COULD NOT get away from the
language issue. People's own frustration at mastering C++ came
out. Yet we continue to assume we will use this as an introductory
language. Little discussion of Java as an alternative. I would
have liked more discussion from Gore and the Reeks about Eiffel.
They did quietly point out that the inheritance problems(which are
SO ugly in C++) are irrelevant to Eiffel where inheritance is
natural.
- At a number of places the WAY we teach was discussed rather
than WHAT we teach. Mentorship, apprenticeship, practicum, close
student, faculty contact, closed collaborative labs all came up.
There seemed to be general agreement that these courses needed to
be "non-lecture" based. Students learn by doing, not by listening.
- There was almost no discussion of continuity and a cohesive
thread in the program, beyond the very vague term "abstraction."
No one seemed to be particularly concerned with a collection of
ideas (e.g. encapsulation, pointers, ADTs (although no real
agreement on what these were). The question WHY are students doing
this was either taken for granted, or lost in the focus on topics.
I'm not sure which, but suspect the former. I think this one is
critical to programs that want to attract women, assuming one buys
into the problem of small isolated topics driving women away. We
never really said that the reason we all like teaching CS 2 is
that the idea of abstraction is "really cool." Many people
admitted that this is a fun course to teach. But we never really
said why.
- It was heartening to see that analysis was taken seriously, in
the context of trade-offs between implementations. Formal analysis
was viewed as not nearly as important as having intuitions about
space and time complexity.
Conclusions for CS 2 at TCNJ:
- The object paradigm is important for both courses:
encapsulation first, and inheritance (if done well) second. To do
objects right in the second course we need to do it in the first.
Our current approach is far to fragmented: too much time spent on
learning C/C++ in the second course. Two languages (Java and then
C++) would create a conceptual thread and still address the
problem of a "favored language."
- We need to make a clear distinction between concept, use of
concept and implementation of concept. We should teach students to
the use of the black box, to know when to use the black box, and
then later show them what is inside.
- Informal analysis is critical. Students should be well versed
in doing intuitive analysis. This circles back to the trade-offs
issue. In both CS 1 and CS 2 the ability to decide which tools to
use under what circumstances is key. Discrete mathematics as a
vehicle for aiding this analysis is essential. I think this is
what we do differently: by closely articulating the three courses
we specifically address how theory contributes to practice.
The following is a re-organization of my notes from the
workshop.
Michael Mann
Focused on three areas. These kept recurring in later discussions:
- Teaching large classes with diverse backgrounds (I
extrapolated this to mean nature of student/teacher interaction.
His point was that he needed to find ways to learn what was going
on in individual's heads.) He touched on:
- (C vs. C++)
- Special tracks for students w/o proper background
- Honors projects
- Close contact in one on one session.
- A more minor concern was his remark that recursion is hard to
teach. Will address this elsewhere in these notes.
- Language & PDE's, he asked to what degree do we focus on
these.
- modifying the course to accommodate C++ Why?
- Language focus vs. concept focus.
- There seemed to be a tension between concepts and language.
Furthermore, what does it mean to "build from scratch" especially
in C++.
- Fundamental Issues what is the content of the course?:
- how are the DTs implemented
- STL etc.
- Dynamic Structures
- Abstractions vs. Implementations
- Building from scratch (allocations deallocations)
- classes, constructors etc.
- Algorithm analysis.
- What is the relationship between: CS 1, CS 2 and Discrete
Math:
formal courses vs. include discrete in various places.
- Perspective of the course: Science vs. engineering:
- what about pre and post conditions (program correctness)
- Intuitions vs. Formal proofs.
- The course was originally a taxonomy of data structures
- architecture's of programming (destructors, constructors,
iterators)
Jacob Gore
Presentation focused on the object paradigm and why you should use
it. His point was that the paradigm should service the concepts. OOP
for its own sake (e.g. worrying about templates) looses sight of the
purpose of the course. My conclusion was that the problem is the bias
of a heavily typed language like C++. In an untyped language like
CLOS you don't run into these conceptual problems. In particular:
- how do you say it vs. how do you express ideas
- Can you do OOP w/o inheritance, why bother if you aren't going
to.
- He presented the following.
Class B can be a subclass of class A only if under all
circumstances an object of class B can be used instead of an
object of class A.B inherits from A means: "a B can act as a A"
- He described the SE issues as follows:
Pre/post conditions- contract between supplier and user of
class
between supplier of class and supplier of subclass
this makes testing easier
- Performance: trade-offs between multiple implementations
My comments on Mann/Gore presentations:
Are we creating ambiguities:
- Language considerations vs. concepts: pushing things into the
curriculum because of the language
- Are CS 1 and CS 2 programming courses? bite the bullet they
are. Period.
- Engineering vs. Science: what are we focusing on anyway. How
to or why?
Other issues:
- What is the language for? Expressing ideas.
- Inheritance: use it early
- Intuitions on time complexity,
The following are my notes on the small group discussion. We
focused on:
- Paradox of programming vs. theory
- SE as a part
- Application vs. theory
We have to teach programming, but low status, but important, but a
learning vehicle.
- What IS Programming:
- What do you implement?
- Main themes: abstractions
- tools vs. concepts
- Need to use objects to capture abstractions
We went around the room and asked: What ARE the postconditions for
CS 2, what do we want students to come away with?
- Appreciation for encapsulation and modularization
- Algorithmic development (a body of basic tools)
- A collection of tools, they should do design, not invention
- Ability give a problem to implement it: go through the cycle:
- characterize
- describe solution
- implement
- debug
- (not going to get this elsewhere)
- They should understand how principles are applied to
engineering (trade-offs)
- mastery of one language & one environment achievement vs.
flexibility
- Different ways of doing the same thing: trade-offs
- Independent programmers
SE principles
Abstractions
- SE archeology: finding the right tools. Trade-offs: tools to
make judicious choices within the task at hand
- precognition and understanding of abstractions
- Construction of a class (C++) encapsulation
- Learn how to read a problem statement
- Learn debugging vs. learning to use a debugger
- Because of C++ don't talk about inheritance because don't want
to spend the two weeks on virtual functions.
We then asked what to avoid:
- I don't want the language to drive (interfere with) the course
- Long lectures
- viewing analysis as extraneous
- the student who don't care
- programming is a craft avoid over abundance of language
features.
- avoid magic (oh well, stuff happens)
Interesting Points from the whole group discussion:
- Good design comes from experience, & experience comes from
bad design
- Don't be ashamed of teaching programming
- Designing is knowing what the building blocks are, and how to
use them.
- Software Architecture: we want to teach students how to do
DESIGN, not INVENTION. how do you do it is critical
- Tools and techniques: shared knowledge
The afternoon small group discussions were to focus on :
- Concrete things
- what must go into the curriculum
- what might go into the curriculum
Our group, having been articulate in the morning was less
productive with regard to the task at hand. We never did articulate
what we thought should be in the curriculum. One important point for
me was the discussion of deep vs. shallow copy. It is necessary to
explain this to students if they are going to "build" objects in C++.
But if you don't focus on the big picture (dynamic vs. static
storage) then it is just an awkward piece of "gunk" for them.
Although the concept is important, does it belong in CS 2? We never
really answered that. This again returns to the them of language
driving content or content driving content. Especially since the
other groups didn't come up with this as a major item of CS 2, I
believe it is an example of language driving content. Copy
constructors are iky in C++. Period.
Results from afternoon small groups: other two groups:
These are my text-based transcriptions of the slides that were
shown:
First Group:
- Goal is to develop an independent apprentice
practitioner:
design
code
test
debug experiment
group coordination
documentation
- Degree to which Analysis is incorporated
informal time/space
making informed choices
appreciation for design choice
- Need to distinguish interface from implementation
read, understand, use spec. for:
common container classes
apply common storage techniques and relevant algorithms
- Abstractions
Functional abstractions (black box)
Data abstractions (encapsulation and info hiding)
- Intro to Programming in the Large
use of libraries
group work
Second group
- Programming considerations
The following should be included in the curriculum:
tree
BST
one other (e.g. expression)
traversals
binary search
read/write/modify/write interfaces
implement linked structures
[linear?, nonlinear?]
map/dictionary
alternate implementation that differs in complexity
recursion
linear structure/ADT
stack, queue, list
(able to implement)
sorts (fundamental algorithm)
n^2, nlogn
bigO
average, worst case
The following should fall outside the curriculum:
balanced tree
backtracking
priority queues
hashing
My comments on these:
ADT has well defined interface
Which is an ADT and which is an implementation
An assumption here is that algorithms are <= n squared.
What about applications that are exponential or
uncomputable?
What are the basic operations? Find, merge sort organize
- Abstraction :
Include:
ADT
encapsulation
information hiding/access
data orthogonal to structure [e.g. generic]
iterative enhancement
use provided components/data structure
Exclude:
project oriented (inheritance, polymorphism
iterator
information hiding [build]
- Controversies: no consensus within the group regarding
place in the curriculum. (AKA Gunk?)
component libraries e.g. STL, JGL
design that students do
group based programming
build from scratch
patterns, software architecture
there are only 4 languages (Java, C++, Ada95, Eiffel)
Summary of the day:
There was remarkable consensus (at least from the outspoken
members of the workshop) on the general goals of CS 2 (as articulated
by our group in the morning), and in the specifics of content as
presented by the other two groups in the afternoon. We even agreed on
the "controversies" those things that we couldn't agree upon.) All in
all it was a very productive and enlightening meeting.