This document presents a rationale for a programming language, O'Haskell, that has been consciously designed with these considerations in mind. O'Haskell is defined by conservatively extending the purely functional language Haskell with the following features:
The informal object model is naturally concurrent, due to the simple fact that real world objects communicate and ``execute their methods'' in parallel. On this informal plane, the intuition behind an object is also entirely reactive: its normal, passive state is only temporarily interrupted by active phases of method execution in response to external requests. Concurrent object-oriented languages, however, generally introduce a third form of state for an object that contradicts this intuition: the active, but indefinitely indisposed state that results when an object executes a call to a disabled method.
The view of indefinite blocking as a transparent operational property dates back to the era of batch-oriented computing, when interactivity was a term yet unheard of, and buffering operating systems had just become widely employed to relieve the programmer from the intricacies of synchronization with card-readers and line-printers. Procedure-oriented languages have followed this course ever since, by maintaining the abstraction that a program environment is essentially just a subroutine that can be expected to return a result whenever the program so demands. Selective method filtering is the object-oriented continuation of this tradition, now interpreted as ``programmers are more interested in hiding the intricacies of method-call synchronization, than preserving the intuitive responsiveness of the object model''.
Some tasks, like the standard bounded buffer, are arguably easier to implement using selective disabling and queuing of method invocations. But this help is deceptive. For many clients that are themselves servers, the risk of becoming blocked on a request may be just as bad as being forced into using polling for synchronization, especially in a distributed setting that must take partial failures into account. Moreover, what to the naive object implementor might look like a protocol for imposing an order on method invocations, is really a mechanism for reordering the invocation-sequences that have actually occurred. In other words, servers for complicated interaction protocols become disproportionately easy to write using selective filtering, at the price of making the clients extremely sensitive to temporal restrictions that may be hard to express, and virtually impossible to enforce.
Existing concurrent object-oriented languages tend to address these complications with an even higher dose of advanced language features, including path expressions [3], internal concurrency with threads and locks [4, 5], delegated method calls [6], future and express mode messages [7], secretary objects [8], explicit queue-management [6, 9], and reification/reflection [10]. O'Haskell, the language we put forward in this document, should be seen as a minimalistic reaction to this trend.
Concurrent execution is introduced in O'Haskell by invoking asynchronous methods, which let the caller continue immediately instead of waiting for a result. A blocking method call must generally be simulated by supplying a callback method to the receiver, which is the software equivalent of enclosing a stamped, self-addressed envelope within a postal request. Two additional features of O'Haskell contribute to making this convention practically feasible. Firstly, methods have the status of first-class values, which means that they can be passed as parameters, returned as results, and stored in data structures. Secondly, the specification of callbacks, their arguments, and other prescribed details of an interaction interface, can be concisely expressed using the statically safe type system of the language.
There is, however, an important subcase of the general send/reply pattern that does not require the full flexibility of a callback. If the invoked method is able to produce a reply in direct response to its current invocation, then the invoking object may safely be put on hold for this reply, since there is nothing but a fatal error condition that may keep it from being delivered. For such cases, O'Haskell offers a synchronous form of method definitions, thus making it possible to syntactically identify the value-returning methods of foreign objects that truly can be called just as if they were subroutines.
With these contrasting forms of communication in mind, we may draw an analogy to the cost-effective, stress-reducing coordination behaviour personified by a top-rated butler(!). Good butlers ask their masters to standby only if a requested task can be carried out immediately and under full personal control (or, if necessary, using only the assistance of equally trusted servants). In all other cases, a good butler should respond with something like an unobtrusive ``of course'' to assure the master that the task will be carried out without the need for further supervision, and in applicable cases, dutifully note where and how the master wants any results to be delivered. Only an extraordinarily bad butler would detain his master with a ``one moment, please'', and then naively spend the rest of the day waiting for an event (e.g. a phone call) that may never come.
Reactive objects in O'Haskell allow only ``good butlers'' to be defined. We believe that this property is vital to the construction of software that is robust even in the presence of such complicating factors as inexperienced users, constantly changing system requirements, and/or unreliable, long-latency networks like the Internet. As an additional bonus, reactive objects are simple enough to allow both a straightforward formal definition and an efficient implementation. Still, the model of reactive objects is sufficiently abstract to make concurrency an intuitive and uncomplicated ingredient in mainstream programming, something which cannot be said about many languages which, like Java, offer concurrency only as an optional, low-level feature.
Such a change in semantics can have, and should have, consequences to the whole structure of the user of a service. This fact is well illustrated by programs that utilize Sun's distributed file system NFS. For pragmatic reasons, Sun decided not to change the file system interface when going from a local implementation to a distributed design, sensitive to partial failures. Since file system clients accordingly did not need to change either, most programs that access NFS remain totally unprepared for the fact that a remote file server may go down, which shows up to the user as a total loss of responsiveness in those situations [11]. Another example on the same topic is given by common web-browsers. These programs are generally carefully designed to allow continuous interaction even if a remote web server is not responding. However, most browser implementations fail to notice the fact that name-lookup of web addresses usually takes the shape of a remote service as well, and hence their user-interfaces freeze completely in the event of an inaccessible name server. Such a mistake would not be possible in a language where indefinite blocking cannot be masqueraded as a simple method call.
In fact, revealing the potential for indefinite blocking by requiring a callback is just a continuation of the argument that operational properties should be captured by types. Monadic programmers are already used to the idea that changing a pure function by adding side-effects should not be possible without simultaneously changing the type of the function as well. Reactive objects are simply an application of this idea to the temporal aspects of computing.
Having that said, we would also like to point out that even established object-oriented practice is characterized by a desire to keep the message-passing metaphor free from blocking complications as far as possible, e.g. by attempting to concentrate all uses of potentially blocking operations to the head of a so-called event-loop. Programming styles developed with these considerations in mind are thus immediately expressible in O'Haskell, even without the introduction of callbacks. In fact, many common services do not even return a reply (e.g. methods that implement event-signaling or simple data pipelining), so in these cases simple asynchronous communication would suffice. The consequence of preserving reactivity primarily manifests itself in the encoding of classical synchronization primitives like semaphores and channels. Programming with explicit synchronization operations seems to require a major rethink in O'Haskell, a topic which is illustrated on the O'Haskell programming examples page.
Accordingly, all but the most spartan object-oriented languages give basic types such as numbers some special treatment that more closely matches our intuition. But there is a wealth of mathematically inspired structures that suffer from the same conceptual mismatch - these include pairs, lists, records, functions, and algebraic datatypes. Encoding such structures in terms of stateful objects is not only tiresome and error-prone, it may also prohibit efficient and safe sharing of data.
As an example, consider an encoding of vectors in space in terms of objects. In such an encoding, arbitrary vector values cannot simply be written as canonical expressions when needed, because values are represented by object references that must be created at some point in time by an instantiation command. Moreover, ordinary equality on these references actually makes it possible to discriminate values on basis of their location, i.e. to distinguish between this origin and that origin, etc. Hence the programmer must make sure that special purpose code gets called whenever a mathematical notion of vector equality is desired. And not the least, the programmer must also resist the temptation to implement a vector operation by destructively updating the state of the called object. If not, the meaning of vector values will indeed vary with the operations being performed, which is just as illogical for vectors as it is for numbers. Yet in an imperative language, destructive updating may often be the only reasonable implementation alternative, especially in the encoding of slightly more complex structures such as dynamic lists. Sharing stateful values with unknown code is obviously unsafe from this point of view, all the more if the unknown code happens to be a concurrently executing process. So this leaves the programmer with the choice of either conservatively cloning local objects that represent invariant values before sending them off to an unknown context, or (which is more likely) circumscribing stateful messages with informal restrictions on what can be done with them, and then simply hope for the best.
This property also goes under the name of referential transparency, and is often considered to be the defining characteristic of a purely functional language. Referential transparency enables the use of equational reasoning as a program verification technique, which indeed is a very strong argument in favour of a declarative programming style. Additional benefits commonly attributed to functional programming include a succinct notation, an efficient form of declaration by pattern-matching, and the important ability to treat functions themselves as first-class data values [14].
Purely functional programs are however inherently weak in expressing interaction. Their calculator-like model of computation intuitively suggests that input is a single value that somehow happens to be available at program initiation, and that output can be limited to a single value returned at termination. Interacting objects can at best be modeled indirectly, but then only by means of encodings that reintroduce terms like state, change, and identity in the model - i.e. the very notions that advocates of functional programming deliberately have shun. Besides that, a fundamental problem with the functional model is that it cannot host the non-determinism of concurrently executing objects without a substantial repercussion on either its semantics or pragmatics [15, 16].
The functional and object-oriented paradigms achieve conceptual efficiency by means of purification in terms of either values or objects. Yet both views are indispensable in modern programming [17]. Values and value-oriented programming capture the fact that the computer indeed is a fast and flexible calculator, whereas objects and object-oriented programming focus on the storage capacity of computers, and the need for well-structured interaction with this storage. Value-oriented programming manifests computation by means of expressions that denote a result, while object-oriented programming uses commands that, when executed, will cause an effect. Values in this sense represent ideas, or abstractions in our minds, that have a name but no definition. Objects, on the other hand, represent our interpretation of concrete things in the world around us, as we understand them in terms of their constituent parts and their dynamic behaviour. The programming style for computing with values is thus inherently declarative, whereas the nature of object-oriented programming is intrinsically imperative. And just as declarative thinking and imperative doing are complementary aspects of our daily life, so are the declarative and imperative styles when it comes to modern computer programming. Fortunately, the discovery of monads has made it possible for a programming language to be truly declarative and truly imperative at the same time. We will capitalize on this achievement in our integration of value- and object-oriented programming in O'Haskell.
Monads have been successfully utilized to incorporate a quite diverse set of classically imperative features in Haskell, including traditional I/O [20], graphical user interfaces [24, 25], first-class pointers [20], concurrent processes and synchronization variables [13], and exceptions [26]. The common denominator in all these proposals is the unifying property of a stratified semantics, that limits the effect of imperative commands to the top-level of a program, while leaving the purely declarative semantics of expression evaluation unaffected. This means that a monadic command under execution may very well request the evaluation of an expression, but an expression being evaluated can never trigger the execution of a command.
O'Haskell brings the world of object-based concurrency into the realm of monadic functional programming. Compared to other monad-based Haskell extensions, O'Haskell represents a higher level of abstraction, by its use of objects instead of pointers, and methods instead of peeks and pokes as the basic imperative building blocks. The high-level flavour of O'Haskell is also evident in its syntax for monadic programming, that improves on Haskell by also taking the need for unconvoluted use of assignable state variables into account.
The essence of the monadic approach is the insight that doing something and merely thinking about doing something are radically different activities. Translated into programming terms, this means that an imperative command is also considered to be a first-class declarative value, although a value of an abstract type (the monad) that just represents the hypothetical effect of a command, should it ever be executed. Command values can thus be declaratively combined, named, parameterized, and calculated with just like any other value, since the evaluation of a command expression is kept entirely distinct from the actual realization of the effect it represents.
All commands in O'Haskell, including those that refer or assign to state variables, enjoy the status of first-class values. Furthermore, since any command can be individually named, the name of a method invocation may equally well be considered the name of the actual method it invokes, just as the name of an object creation command may be identified with the class of objects it instantiates. Hence we are able to promote even methods, as well as object templates, to full-blown value status in O'Haskell (O'Haskell actually uses the term template instead of class to avoid confusion with the established Haskell concept of type classes).
This last property makes parameterization over callback methods easy. In fact, higher-order parameterization as a general programming technique has the potential of facilitating very simple solutions to many problems that immediately seem to lead to quite contorted encodings in established object-oriented languages. Examples of this include parameterization of an object template over (i) a set of neighbouring objects, (ii) other templates from which private but unknown sub-objects may be created, or (iii) a single command denoting a yet undefined aspect of an object's behaviour. Other speculative examples are methods creating object instances from a supplied template parameter on behalf of external clients, methods taking a communication pattern as a command argument, or simply a method parameterized over a function value that expresses some unknown computation on the local state. This flexibility should be compared to the complicated exercise in inheritance and override that is required in an object-oriented language like Java to express even such a simple concept as a callback parameter.
The communication interface returned when an object is created is also
a full-fledged value in O'Haskell, that may or may not reveal
information about the identity of the object behind the interface.
Most likely an object interface is defined as a record of method
values, but it might also be a tuple of such records, or even a
function that returns different interfaces depending on a supplied
argument such as a password! Separating values from stateful objects
opens up many interesting possibilities, as we hope our coding
examples will demonstrate.
The predominance of a particular kind of type-forming operator in each school is not a coincidence, though [29]. Programming is essentially the task of manipulating data structures, and different parts of a program take the roles of either data-producers or data-consumers in this play. The functional approach to programming is to ask ``how is data constructed?''. This leads to a style of programming where the data constructors are considered primitive, and where data-consuming code accordingly is defined by pattern-matching over these constructors. The object-oriented programmer instead starts out by asking ``what can we do with the data?'', which fosters the view that it is the data selectors that should be seen as foundational, and that the code of a data-producer consequently should be defined as an enumeration of implementations for these selectors. (Here already lies the key to the distinctive form of data abstraction that goes under the name object-orientation: the producer of a data object gets full control over what the result of consuming the object will be - the consumer is effectively hindered from making that decision on basis of how the object was constructed.)
The preference for a certain form of data structures also reflects a particular view of where extensibility is most needed. The functional style ensures that adding another data-consumer (a function) is a strictly local change, whereas adding a new data-producer requires extending a datatype with a new constructor, and thus a major overhaul of every pattern-matching function definition. Likewise, in object-oriented programming the addition of a new data-producer (a new class) is a cinch, while the introduction of another data-consumer means the extension of an record type with a new selector, a global undertaking with implications to potentially all existing classes.
To alleviate these drawbacks to some extent, functional and object-oriented traditions prescribe their own custom methods. Parametric polymorphism in functional languages allows functions to be defined uniformly for arbitrary sets of data constructors, including constructors not yet visualized. Object-oriented subtyping, on the other hand, facilitates incremental definition of record types, so that old data-consuming code still can be reused even on data objects defined by a larger set of selectors than what was originally envisaged. The latter openness to future refinements in design is actually the intuition behind the unmistakable is-a-relation that forms a central part of the biological metaphor permeating object-oriented classification [30]. However, as is well known, nothing really precludes the merits of subtyping from being applied to the definition of datatypes as well; or for that matter, polymorphism to be utilized in conjunction with records [31].
As a hybrid language, O'Haskell attempts to bring the design issues that accompany constructor- or selector-based programming to the fore, so that the programmer can choose the right kind of data structure for each task. Hence the type system of O'Haskell offers equal opportunities for working with labeled sums and labeled products. Polymorphism is available in both cases, and so is the possibility of defining types incrementally, (i.e. subtyping is supported for both parameterized records and parameterized datatypes). Furthermore, neither kind of data structure is tied to any particular choice between declarative and imperative programming, nor is the availability of type inference dependent on whether a datatype or a record type is used. However, type inference in conjunction with subtyping holds its special set of problems, which have been given a somewhat unconventional, yet very effective solution in O'Haskell.
Still, although complete algorithms for polymorphic subtype inference exist, practical language implementations that take advantage of these are in short supply. The main reason seems to be that ``the algorithms are inefficient and the output, even for relatively simple input expressions, appears excessively long and cumbersome to read'' [44]. Many attempts have been made at simplifying the output from these algorithms, but they have only partially succeeded, since the problem in its generality seems to be intractable, both in theory and practice [44, 41].
However, even if type simplification were not an issue, there is an inherent conflict between generality and succinctness in polymorphic subtyping that is not present in the original Hindley/Milner type system. While the principal type of (say) a Standard ML expression is also the syntactically shortest type, the existence of subtype constraints in polymorphic subtyping generally makes a principal type longer than its instances. In particular, the principal type for a given expression may be substantially more complex than the simplest type general enough to cover an intended set of instances! Thus, type annotations, which give the programmer direct control over the types of expressions, are likely to play a more active role in languages with polymorphic subtyping than they do in Haskell or Standard ML, irrespective of advances in simplification technology.
In the design of a type system for O'Haskell, we have taken a pragmatic standpoint and embraced type annotations as a fact of life. This has enabled us to focus on the much simpler problem of partial polymorphic subtype inference. As one of the main contributions of the language, we present an inference algorithm for our type system that always infers types without subtype constraints, if it succeeds. This is a particularly interesting compromise between implicit and explicit typing, since such types possess the desirable property of being syntactically shorter than their instances, even though they might not be most general. We might say that the algorithm favours readability over generality, leaving it to the programmer to put in type annotations where this strategy is not appropriate.
Our inference algorithm is based on an approximating constraint solver, that resorts to ordinary unification when two type variables are compared. An exact characterization of which programs the algorithm actually is able to accept is still an open problem, but we prove, as a lower bound, that it is complete with respect to the Hindley/Milner type system, as well as the basic type system of Java. The algorithm is furthermore both efficient and easy to implement, and as our programming examples will indicate, explicit type-annotations are rarely needed in practice.
We believe that working with named and parameterized types, whose subtype relationships are defined by declaration, has the immediate benefit of giving the programmer full control over the type structure of a program. This can be a valuable tool in the design phase of a large system, and it also offers greater protection from certain logical errors. Furthermore, name inequivalence is more in line with both the way datatypes are treated in Haskell and other functional languages, and with the object-oriented notion of subtyping between named classes. And not the least, the ability to refer to types by name has a big impact on the readability of the output from our inference algorithm.
[Back to the O'Haskell homepage]