Ursula Wolz
Department of Computer Science
The College of New Jersey (formally Trenton State College)
Hillwood Lakes Box 4700
Trenton, NJ 08650
+1 609 730 1180
email: wolz@tcnj.edu
Abstract
A novel approach to teaching Computer Science I and II is presented. The emphasis is on a "goal-centered" approach in which students learn the foundational concepts of computer science by focusing on computational goals rather than language functionality. Although this approach diminishes the focus on programming language it still requires careful consideration of both a first and second language through which to explore computational ideas. This is especially important in the age of programming within distributed environments with graphical user interfaces. The "event-driven" model has the potential to radically change the way programming constructs are presented. The ability of a language to adequately support events becomes critical. This paper reports on our efforts during the past five years. We have used a variety of languages and programming paradigms in both CS I and CS II. We report here on our successes and failures as well as our analysis of the appropriateness of the "hot" languages, Java and C++.
INTRODUCTION
The last five years have seen dramatic changes in the way introductory computer science is taught. Some changes have occurred at a conceptual level, others are merely a switch from using Pascal to C and C++ as a first programming language. Regardless of whether the move was conceptual, language driven or both, the changes have brought a new set of complex issues to computer science curriculum development. The promoters of Java proclaim it will solve many of the problems raised by these issues. Its detractors, many of whom were burned by too quick a rush to C and C++ (henceforth C/C++), dismiss it as yet another fad.
The focus on language is misguided because it forces an emphasis on the mechanics of expressing key ideas rather than focusing on the key ideas themselves. The first programming language is merely a vehicle for exploring foundational concepts. The dilemma in computer science education is that presenting those concepts creates a conflict between balancing breadth and depth. In recent years the problem has been compounded. First, the explosion of interest and hasty adoption of C/C++ and now Java keep the language rather than the concept issues at the forefront. Second, the event-driven model, essential to both network and graphical user interface programming has created a conceptual shift in how the parts of a program fit together.
This paper reports on our experiences during the past five years. We have used three different languages for the first course in Computer Science: Pascal, Scheme and Mathematica, and have kept careful records of effectiveness for the latter two. We provide our students with a "goal-centered" approach in which the tasks one can accomplish with a program are emphasized[7,8]. We stress the trade-offs between solutions to problems. Within this context we still expect our students to learn to program. Our intent whenever possible is to avoid, de-emphasize or at least downplay the "minutia" associated with any particular programming language. We have also given careful thought to the use of C/C++ and Java as a first language based on our approach. We will report on that analysis here. We will not use Java in the classroom until Fall 1997 because the environments for using it have not been sufficiently stable for beginning programmers. However, the lessons we have learned from using four other languages (Pascal, C/C++, Scheme and Mathematica) have provided a basis for a rigorous analysis of Java.
Our curriculum focuses on breadth in a first semester lab-based course in which students learn major concepts by programming in a high level language that minimizes minutia. We have used both Scheme and Mathematica. They allowed us to introduce a variety of programming paradigms since imperative, functional and object-oriented concepts can be expressed in both without horrible attention to detail. Java may be an alternative.
The second semester emphasizes the "guts" of computing, addressing the underlying sequential machine. Problem solutions that were expressed in a very abstract form in the first semester are re-examined for efficiency within the more machine-based imperative form using C/C++. We report here on our success and failures with C and with C++. We also discuss why Java may not be the ideal language for this course, especially within a curriculum that does not expose students to lower level programming (for example in an assembly language).
A new consideration for foundations courses is the event-driven model which is critical to graphical user interfaces (GUIs) and distributed computing[6]. There are two problems with the traditional introduction to programming. First, the classic interactive menu loop imbedded in a "main" function provides the wrong mind set for beginning programmers[9]. The interactive loop creates a dialogue between the user and the program which is controlled by the program. In the event-driven model the user controls the dialogue. The program waits for the user to initiate events and then reacts to the event. The basic organization of the program changes. It becomes far more modular, which should benefit foundations courses. However the second problem is that techniques for programming GUIs have not matured sufficiently for general principles to have emerged. GUI programming is still primarily an exercise in controlling machine dependent minutia. The Java developers claim they have addressed this problem.
Throughout this paper claims will be based on experience at our institution that is the result of extensive data collection and ongoing analysis. We have collected student work, exams, questionnaires and conducted interviews over the past three years. The findings point toward considering Java as a first programming language and for the short term at least, still introducing C and C++. Rather than taking a stand for or against Java, and C and C++, this paper presents the pros and cons within the context of our perspective of a goal-centered approach.
WHAT ARE THE FOUNDATIONS OF CS?
In most academic fields a foundations course provides the student with an overview of the field. Possible exceptions are English composition and Foreign Languages where the emphasis is on teaching skills in using a language. Computer science is the rare field that attempts to do both. Introductory courses are no longer "programming language" courses in which students become proficient at programming. Curricula for such courses are expected to include an overview of good problem solving style, and also introduce the core concepts of the field [1,4,5,7].
The question is: are we trying to do too much? Should we separate the teaching of programming from the foundational core? Some institutions have adopted this approach, requiring students to take a "baby" course in programming before embarking on the computer science sequence. Yet such courses are routinely promoted as outside the core curriculum [3,4].
The problem can be reduced to a question of depth versus breadth. One cannot hope to develop a broad appreciation for the field while simultaneously developing sophisticated proficiency in programming. Formal languages, especially as implemented on computers require an incredible attention to minute detail, what we term the "minutia." Ideally, to achieve breadth, one ought to be able to avoid it altogether. However, to achieve depth one must study the subtlety of the minutia. For example, proponents of strict data typing will argue that appreciation for the concept is essential to understand space efficiency. Yet, is it necessary for novice programmers to know the distinctions between ten types of integers before they have ever composed a program? Programming language choice is therefore critical, because some languages simply contain more minutia than others. A core sequence should instill in students an appreciation for programming, it should not get bogged down in the minutia. The question is whether any language and its implemented environment can support the needs of a range of users from novice to expert. Perhaps a solution lies in introducing more than one language.
A GOAL-CENTERED APPROACH
At our institution we have developed a "goal-centered" approach to the teaching of programming within the context of two foundations courses. A full curriculum is available at our web site (http://www.trenton.edu/~dummlib). Our focus is on the goals or tasks that can be accomplished by writing a computer program. Although these courses assume no prior knowledge of programming, they do not focus on programming, particular languages or even particular language paradigms. For example, students are explicitly taught the distinction between using a functional, imperative and object-oriented approach. The emphasis in the courses is on task satisfaction. For example, instead of teaching about "loops" we present computational tasks that can be solved with repetitive constructs. A fundamental theme is that there is always more than one way to do a task, and analysis of the trade-offs between solutions is stressed. The first course emphasizes concepts, the second addresses machine implementation of those concepts.
For example, in the first course lists, and list manipulation are introduced as a means of solving tasks involving collections of data. We present students with five simple tasks that are essential to list processing: 1) Computing the items in a list (for example summing the items); 2) Modifying all of the items in a list (for example incrementing their value); 3) Collecting a sublist of the list (for example, finding all items with a specified property - search and item deletion are forms of this task); 4) Inserting items into a list; 5) Sorting a list. Both recursion and iteration are discussed as techniques for traversing a list. Students manipulate the abstractions without attending to the detail of the representation within the machine.
The second course attends to machine implementation and seriously considers efficiency issues. For example, in the second course both arrays and dynamically linked nodes are presented as implementation alternatives for lists. Time and space efficiency considerations are discussed as well as trade-offs between iterative and recursive solutions.
Our approach avoids dealing with details until they are relevant, in contrast to a traditional introduction to list traversal using Pascal or C in which students may be confronted with arrays, pointers, data typing and notoriously bug prone iteration as they struggle with the abstract concept . Because we start with languages in which recursion is easily expressed, we address repetition from a divide and conquer standpoint: a problem can be reduced to the first case in combination with solving the problem on the remaining cases. Each of the five types of list processing tasks has a simple recursive structure regardless the language used.
Finally, is it necessary to consider the implementation level? For example, does it really matter whether a list construct is implemented as an array or a dynamic node structure. For a foundations course the answer is obviously yes. Therefore it is necessary to expose students to a language that allows them to see the distinctions between these implementations.
Language choice has four possibilities. 1) Choose one that supports both the abstract and implementation levels well. The promoters of Java claim it does, however we are not yet convinced. 2) Choose a language that supports abstractions well, but that may have limitations in supporting the implementation level. Scheme is an example, and w Java has the potential to be. 3) Choose a language that supports the implementation level, but may introduce unnecessary minutia at the abstract level. C and C++ are examples. 4) Choose languages appropriate to each level.
FOUNDATIONS IN A RAPIDLY CHANGING FIELD
As little as five years ago most students learned to in text-based environments. Today one can expect most students to learn to program within GUI environments that are linked to the World Wide Web.. GUIs are wonderful to work in but horrible to program, especially when tied to the web. We will focus here on GUIs here. Incorporating GUI programming into a core course has two problems: Programming GUIs is an exercise in memorizing detail and the event model dramatically changes simple interaction.
The first problem can be addressed with languages like Java that streamline access to major GUI tools like windows, and dialog boxes[2]. As multi-media matures, standard concepts will emerge. In the mean time, Java holds real promise for giving novices access to interactivity.
The second problem is conceptual. The event model has produced a new framework in which to consider interactivity. It requires writing isolated objects that trust the operating system to support them. This is radically different from the Pascal/C need for a "main" function that contains an interactive loop. This model encouraged students to use simple "input" and "output" functions to solicit data and display results. This technique is not only obsolete, but is detrimental to learning the event model. It discourages, rather than encourages students to think in a modular fashion. Rather than encouraging students to focus on the task at hand it forces them to mix tasks. For example, the classic solution for summing a list of numbers is to solicit numbers from a user until the user declares the list is complete. A running tally is kept while simultaneously checking for bad input and end of data. In the event-driven model, interactive tasks are irrelevant to the primary task of summing. The method responsible for summing may assume it has received proper data, leaving other methods (procedures or objects) to handle the end of data and bad input events.
The event-driven model in concept has great advantages for foundations courses, however in practice it is still mired in minutia. Techniques for implementing the model are dependent upon the underlying operating system and typically require esoteric function calls, especially in C and C++ implementations. Java claims to have simple techniques for controlling events and is therefore a good candidate for addressing these concerns. But even in Java, setting up all the pieces for event-driven interaction is even more complex than a "main" function with a menu loop.
The ideal scenario for the novice programmer is an environment in which code fragments can be tested in isolation. The "get/eval/print" loop of the interpreted functional languages is good for this approach, since it avoids the complex "wrapper" required by the traditional menu loop or when setting up event handlers. The disadvantage is that students do not get experience writing "applications." Instead of writing complete, stand-alone programs they may only be exposed to writing pieces of programs. On a conceptual level this may not be a problem. The prejudice for writing a complete stand-alone application comes from the Pascal and C bias that a complete program had to be compiled, the data had to be loaded (from the user or a file) and results had to be displayed. Requiring first semester students to manage this complexity is inappropriate in the event-driven model. In fact, we have found that first asking students to write simple interactive loops makes it harder form them to grasp the event model later
THE FIRST SEMESTER COURSE
The objective of our first semester course is to present the key abstractions of the field and encourage students to write real programs that exploit those abstractions. Our first choice for a language was Scheme on Macintosh computers. Macintoshes were chosen because of the rapidity with which students can master the environment. Scheme was chosen because of its clean interpreted environment in which students can write small, powerful programs almost immediately.
We used Scheme in academic years 92-93 and 93-94. Students were able to use the environment to write simple programs within the first week of the course. Recursive solutions were being written within the first three weeks. We found this learning curve to be significantly flatter than that of our Pascal/DOS-based course. Scheme was a good choice for teaching modularization and recursion. By giving them the functionality for stacks, queues and trees, students were able to write solutions using these abstractions.
The interpreted environment allowed us to avoid having to teach interactive techniques early on. Students wrote functions and fed test data as input. The rich editing environment on the Mac allowed them to cut and paste the input to function calls with data sets defined within the environment. Late in the course they were taught how to write simple interactive programs; after they had developed the discipline to concentrate on the task at hand.
A side benefit of the Macintosh Scheme environment was that students could construct lab reports very naturally. Code could be developed and commented as a narrative in a window. The output appeared in a second window, and could be inserted into the code window as comments. We saw a marked improvement in the quality of lab reports and analysis of program writing.
Our attempt to introduce the concepts of structures and objects had mixed results. Students gained an appreciation for "record" structures by creating imbedded lists and specialized functions for data retrieval. The form of these however proved to be less natural than with Pascal "records." Students couldn't quite see the point of creating specialized functions (creating walls around the data abstraction) when it was easy to gain direct access to the desired value using Scheme primitives. Scheme supports an object-oriented overlay to which students were introduced. They found this frustratingly restrictive, and again, couldn't see the point of the extra baggage.
We encountered two major problems with Scheme that are caused by its heavy functional flavor. First, it has an unnatural notation for expressing iteration and array indices, and second, it does not adequately support interactivity. Both are tasks that are well suited to sequential execution which is not well supported in Scheme because the formalism is awkward.
Students found the notation for both iteration and interactivity to be confusing. Worse, although students were writing complex programs, they did not become excited about the programs they were writing. The interface between Scheme and the Macintosh Tool Kit required a sophisticated understanding of data typing. Students complained that the programs they were writing were far less interesting than those produced in a non-majors' course using Visual Basic.
In academic years 94-95 and 95-96 we used Mathematica as a first programming language. Mathematica includes powerful functionality for expressing solutions functionally, interactivity, and with some added definitions, in an object-oriented manner. It must be stressed that we do not teach Mathematica, we use it as a vehicle for teaching computer science. Consequently, much of the power of Mathematica is never introduced to our students.
Our successes here matched those using Scheme: modularity, recursion, the advantage of the interpreted environment were successfully taught. Mathematica had the added advantage of its "notebook" front end in which text, function calls and function definitions (calls to define the function) can all be mixed and formatted in a window and later saved and printed. Students' lab reports became better organized than they had been in Scheme. Students found expressing structure and object concepts to be far more straightforward than in Scheme. The subtle distinction in notation between the languages proved to provide a huge advantage for Mathematica. Mathematica also proved to be better for creating a bridge to array-based concepts. It has interactive constructs (for, while, do) that mimic the notations of C or Pascal. It uses a far more natural array indexing notation than Scheme. This has allowed us to present the different expressive forms for a repetitive problem in the first semester. By comparing the expressions, students learn to see the critical parts of a "loop" much more clearly. Mathematica proved to be more cumbersome than Scheme for creating interactive programs because doing so draws the student out of the notebook environment. In a sense one is trapped in the notebook model, which is neither the classic menu environment nor an event-driven one.
Java has the potential to address these problems. It forces modularity through its demand for class definitions, it has a notional form that easily expresses functional and imperative paradigms, and obviously the object-oriented one. It supports lists with a clean notation at a high level. It has a relatively easy to use tool kit for expressing interactivity. It contains hooks for doing very impressive internet code, and consequently is highly motivating from both a GUI and distributed computing standpoint. Its disadvantage is that to date environments to support it are primitive. It requires a rather absurd wrapper around a "main" object that contains references to concepts that would confound a novice. However, assuming truly interpreted forms of the language become available within a rich supportive environment, it is worth consideration. On balance, its ease of use outweighs its demand for attention to minutia.
THE SECOND SEMESTER COURSE
Our second semester course peels the onion. Concepts learned in the first course are re-addressed but from the perspective of how the constructs are built (for example, lists from arrays or pointers), and how solutions can be expressed more efficiently by more directly accessing the machine. The first course is characterized as "what one can do", the second as "how it works."
We used variations of C and C++ since 1993. In academic year 92-93 Pascal as a follow up to using Scheme. This combination did not work well. Students who had successfully learned to write functions that returned complex values in Scheme were derailed by Pascal's inability to return complex data types. The elegance achievable in Scheme was not possible in Pascal.
In academic year 93-94 we naively taught C with the standard IO library "stdio." Later in the course we tried to introduce objects via C++. This proved to be too much syntax (minutia) for a single semester course. In academic year 94-95 we used Mathematica in the first course and tried going directly into C++ with a heavy emphasis on objects. This too proved overwhelming because the amount of minutia associated with writing even a simple application from an object standpoint in C++ is huge. Our experience taught us that C++ is a terrible language for beginners if used in the spirit in which it was intended. The ideal of object-orientedness promoted by SmallTalk is thwarted in C++ because the notational minutia overwhelms the programming process. Since C is implicitly supported within C++, there is far too much temptation for beginning programmers to resort to imperative rather than object-oriented style. A part of our problem stemmed from an environment that had an outstanding debugger for studying pointer-based code, but that did not adequately present good visualizations of objects.
In academic year 95-96 compromised in order to focus on studying implementations. We used a C++ compiler, but did not require students to fanatically define their solutions within objects. We chose this route because the standard IO package "IOstream" contains less minutia, and admittedly less functionality for our purposes than "stdio". We did however require students to use pre-defined object classes for their programs. This proved particularly useful when comparing implementations of data types. For example, a list class can be implemented using a variety of array and pointer techniques. Students saw the power of data encapsulation when their surface applications could link any one of these implementations. They saw the efficiency issues first hand by collecting experimental results from different implementations.
Our current C/C++ hybrid is proving successful because of its flexibility. Although Java will support some access to the machine, it is still too protective of the programmer. For example, at some point the principle of recycling should be studied: memory allocated must be de-allocated.
SUMMARY
The goal of an introductory sequence in computer science should be to present the foundations of the field. Becoming a proficient programmer is secondary. Instruction in programming should not create a mind set that precludes learning new languages and paradigms.
Our first group of students taught within the new curriculum have just graduated. Almost all of our current juniors and seniors hold part time jobs in industry. Interviews with them, their advanced course instructors and their employers provide us with insights into our success. In our previous Pascal-based curriculum, a minority of our students had extremely high proficiency in a single environment. Our current students exhibit less knowledge of the details of any particular language or environment. This is seen as a disadvantage by instructors of upper level courses who relied on students extensive experience in a single language. However, the current students seem to be far more adept at mastering new concepts, working in new environments and learning new languages. This is viewed by employers and the students themselves as a huge advantage.
Regarding the issue of language choice, we believe the Mathematica/C++ combination is adequate but certainly not perfect. It is possible that Java can supplant Mathematica because it supports the high level expressibility we need for a task-centered approach.
BIBLIOGRAPHY
[1] ACM Curriculum Committee on Computer Science Computing
Curricula 1991, Report of the ACM/IEEE-CS Joint Curriculum Task
Force, ACM Press, 1991.
[2] Cornell, G and C. Hostrmann, Core Java, Mountain View CA, SunSoft Press, 1996
[3] Decker, R., Hirshfield S. The Analytical Engine, An Introduction to Computer Science Using HyperCard 2.1 . Boston, MA: PWS Publishing 1994.
[4] Decker, R. & Hirshfeld, S. Top-Down Teaching: Object-Oriented Programming in CS 1. SIGCSE Bulletin , Vol. 25, no. 1, 1993.
[5] Osborne, M.& Johnson, J. An Only Undergraduate Course in Object-Oriented Technology. SIGCSE Bulletin , Vol 25, no. 1, 1993.
[6] Shneiderman, B. Designing the User Interface Strategies for Effective Human-Computer Interaction. Reading, MA: Addison Wesley, 1992.
[7] Wolz, U. and E. Conjura, Integrating Mathematics And Programming Into A Three Tiered Model For Computer Science Education, SIGCSE Bulletin , Vol. 26, no. 1, 1995.
[8] Wolz, U. and E. Conjura, Abstraction to Implementation: A Two Stage Introduction To Computer Science, Proceedings of the 1994 National Educational Computing Conference , Boston MA, June, 1994.
[9] Wolz, U., S. Weisgarber, D Domen and M. McAuliffe, Teaching Introductory Programming in the Multi-Media World, Proceedings of the ACM SIGCSE/SIGCUE Conference on Integrating Technology in Computer Science Education . June 1996.