July 15, 2004

1.0 Defining Characteristics

There is really nothing new in OO development. One can trace every characteristic of OO development to some good programming practice or elegant technique that already existed in bygone days. The main thing that is unique about OO development is the overall package. It is the way those practices are tied together that makes OO development unique and provides a synergy that makes the whole greater than the sum of its parts. In other words, it is about the way good practices play together.

Nonetheless there are some fundamental characteristics that are crucial to OO development just because of the emphasis placed upon them or the way they play with other practices. A number of authors who should know better claim that one can't do OO development without inheritance and polymorphism. That's basically a crock. Both inheritance and polymorphism are secondary features that are enabled by more fundamental characteristics. So what are the fundamental characteristics of OO development? So glad you asked...

Abstraction. This is the biggie. It is absolutely impossible to do OO development without abstracting some problem space. Basic OO technique involves identifying problem space entities that are relevant to the problem in hand and then abstracting their intrinsic properties that are necessary to solve the problem in hand. When abstracting properties one does it in terms of responsibilities for knowing something or doing something.

Encapsulation. Encapsulation localizes related properties for a single entity. It does so by exposing responsibilities to other entities without exposing how the responsibility is met. This is somewhat more subtle than implementation hiding (below) because supporting a responsibility may not require an implementation at all. For example, to satisfy a responsibility for knowing its age, a Person object could compute it from its DOB and the current date rather than providing a data store. Encapsulation is a generic term that incorporates several other OO ideas like implementation hiding, information hiding, decoupling, logical indivisibility, and cohesion. The best way to think of encapsulation is as a package or structure for providing other fundamental OO characteristics.
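
The Person example can be sketched in Python. This is only a minimal illustration under stated assumptions: the class name Person comes from the text, while the method name `age`, the optional `today` parameter, and the private `_date_of_birth` attribute are invented for the sketch.

```python
from datetime import date
from typing import Optional


class Person:
    """Exposes a responsibility for knowing its age without storing an age."""

    def __init__(self, date_of_birth: date) -> None:
        # Hidden detail: clients never learn how (or whether) an age is stored.
        self._date_of_birth = date_of_birth

    def age(self, today: Optional[date] = None) -> int:
        """Return the age in whole years as of `today` (defaults to the current date)."""
        if today is None:
            today = date.today()
        dob = self._date_of_birth
        # Subtract one year if this year's birthday has not happened yet.
        birthday_passed = (today.month, today.day) >= (dob.month, dob.day)
        return today.year - dob.year - (0 if birthday_passed else 1)
```

A client asks only for `age()`. Whether the answer is computed or read from a data store could change later without the client noticing.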

Implementation Hiding. While an object exposes its responsibilities, it hides the implementation of How it meets its responsibilities. That is, the responsibility defines What the object knows or does while the implementation defines How it knows or does it. Typically this is achieved through an interface. If this sounds hauntingly familiar, it should. OO development, particularly OOA, borrowed heavily from requirements analysis.

Information Hiding. Many people treat information hiding as essentially synonymous with 'implementation hiding' or, alternatively, apply 'implementation hiding' only to behavior responsibilities and apply 'information hiding' to knowledge responsibilities. In my opinion neither is quite correct. Information hiding concerns hiding information about the object itself. Most of the time that is information about its implementation. However, it can include other information, such as collaborations. As a general rule the client of an object responsibility should not have any notion of who else the object may collaborate with. That is, the client should not have to understand the rules and policies that govern the service object's participation in other relationships. Thus the fact that the relationship may be implemented through a referential attribute is a separate issue.

Cohesion. The notion of cohesion is closely linked to encapsulation and logical indivisibility (below). The basic idea is that a subsystem or object should provide a cohesive view of the underlying problem space entity. Cohesion, though, is more about the semantics of the entity. One abstracts intrinsic properties of the entity without regard to context. In addition, those property abstractions should be logically related in the context of the problem in hand.

Decoupling. This is a general term that a number of OO characteristics are designed to support. The basic idea is that maintainability is directly linked to the amount of knowledge one entity has of another. The more carnal the knowledge, the more likely that both will have to be touched when requirements change. In OO development we are most concerned with implementation coupling where a collaborating object depends on How another object does something. However, it also applies to other circumstances, such as two objects having copies of the same data. One can argue that a primary goal of OO development is to minimize coupling among software entities.

Separation of Message and Method. This is one mechanism for providing encapsulation. By separating the message the client sends to an object from the method that responds to it, one hides the information and implementation of the receiving object. This allows the client to generate messages as announcements of what it has done. The OO developer can then route that message to whatever other object should respond to the message. At the OOA/D level this routing can be done at a different level of abstraction than the implementation of the sender's method that generates the message.
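
A rough Python sketch of the idea, with hypothetical class names (`Door`, `Lamp`) and a hypothetical registration method (`when_closed`) that do not come from the text. The sender only announces what it has done, and the routing to a responding method is decided outside both classes. Python still delivers the call synchronously, so this shows only the separation of names and routing, not the full OOA/D model.

```python
class Door:
    """Sender: announces what it has done; knows nothing about who responds."""

    def __init__(self) -> None:
        self.is_open = True
        self._closed_handlers = []

    def when_closed(self, handler) -> None:
        # Routing is registered from outside the sender's implementation.
        self._closed_handlers.append(handler)

    def close(self) -> None:
        self.is_open = False
        # The message is an announcement ("I closed"), not a command.
        for handler in self._closed_handlers:
            handler()


class Lamp:
    """Receiver: its method is named for what it does, independent of any sender."""

    def __init__(self) -> None:
        self.lit = True

    def switch_off(self) -> None:
        self.lit = False


def wire_up() -> tuple:
    door, lamp = Door(), Lamp()
    # The developer decides, at a level above both classes, who cares about the message.
    door.when_closed(lamp.switch_off)
    return door, lamp
```

Rerouting the announcement to some other object means changing only `wire_up`; neither `Door` nor `Lamp` is touched.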

Interfaces. This is another mechanism for providing encapsulation. This is the primary mechanism available at the 3GL level because of the procedural message passing and type systems employed to implement the OOPLs. However, the same basic idea applies as for separation of message and method: the interface defines what type properties (responsibilities) are accessible while the method implementation remains hidden.

[Unfortunately this is an imperfect mechanism because the message is defined as the method signature by the 3GL type system. Since we name methods by what they do the message becomes an imperative by virtue of naming conventions. (In OOA/D one defines the class interface separately from the class rather than as part of its definition so the namespace separation is preserved.) Fortunately this is relatively harmless so long as the OOA/D is done properly. That is, separation of message and method in the OOA/D eliminates dependencies that might creep in if one goes to OOP directly.]
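
At the OOPL level the interface idea might look like the following Python sketch. All names here (`Account`, `LedgerAccount`, `RunningTotalAccount`, `pay_in`) are hypothetical and chosen only for illustration. Note that, as described above, the method names double as the messages.

```python
from abc import ABC, abstractmethod


class Account(ABC):
    """Interface: the responsibilities a client may rely on."""

    @abstractmethod
    def deposit(self, cents: int) -> None:
        ...

    @abstractmethod
    def balance(self) -> int:
        ...


class LedgerAccount(Account):
    """Keeps every deposit and sums them on demand."""

    def __init__(self) -> None:
        self._entries = []

    def deposit(self, cents: int) -> None:
        self._entries.append(cents)

    def balance(self) -> int:
        return sum(self._entries)


class RunningTotalAccount(Account):
    """Keeps only a running total."""

    def __init__(self) -> None:
        self._total = 0

    def deposit(self, cents: int) -> None:
        self._total += cents

    def balance(self) -> int:
        return self._total


def pay_in(account: Account, amounts) -> int:
    # The client is written against the interface only.
    for cents in amounts:
        account.deposit(cents)
    return account.balance()
```

The client function `pay_in` cannot tell which implementation it was handed, which is the point: What is exposed, How is hidden.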

Peer-to-Peer Collaboration. When objects communicate they do so directly on a peer-to-peer basis. There is none of the hierarchical flow of control that characterised "spaghetti code". That is, there are no "high-level" controller objects corresponding to upper nodes in a classic functional decomposition tree that coordinate the sequencing of other objects' operations. One way to think of this is that objects correspond to the leaf processes in a functional decomposition tree and they talk directly to one another.

Logical Indivisibility. Peer-to-peer collaboration can only work if one has a flexible view of logical indivisibility. In OO development logical indivisibility is defined at three levels: subsystem, class, and responsibility. A subsystem encapsulates a single logically indivisible subject matter. A class encapsulates properties from a single logically indivisible problem space entity. A class responsibility encapsulates a single, logically indivisible property. The flexibility lies in abstraction. At each level the notion of indivisibility depends upon the level of abstraction. The level of abstraction, in turn, depends upon the problem context.

Self-containment. When objects are abstracted from a problem space the properties are supposed to be intrinsic properties that are independent of the context of a particular problem solution. A corollary to that is that object properties are self-contained. That means that the responsibility can be fully specified without reference to behaviors in any other objects. As a practical matter that means that any object responsibility can be exhaustively unit tested without implementing any other object behaviors (or providing an active stub in the test harness). This characteristic is crucial to avoiding hierarchical dependencies.

Classes. In OO development the intrinsic properties of a problem space entity are abstracted as knowledge and behavior responsibilities. Classes group entities based upon all members sharing a unique suite of responsibilities. So the OO notion of 'class' maps directly into the notion of a set. OO just uses properties to define the members of the set.

[At the OOP level OOA/D classes are mapped into the ubiquitous 3GL type systems. Types are defined in terms of properties that can be accessed. The mapping is 1:1 but it results in a substantial difference in viewpoint. OOA/D classes are about set membership while OOPL type systems are about property access.]

Subclassing. OO subclassing is an extension of the class idea that deals with entities that are rather similar yet different enough to warrant being in different classes. Basically one defines a class for a large group of entities based upon some common set of characteristic properties that they all have. But there may be subsets of those members that each have additional properties. That presents a conundrum if one defines a class as a unique suite of properties common to all of the entities. To get around that the OO approach introduces the notion of specialization where the subsets form their own subclasses where the special properties are in common.

Thus the superclass that is the union of the subclass subsets defines the common properties for all members and the subclasses represent unique suites of specialized properties. This is often referred to as generalization (superclass) and specialization (subclass). However, a unique thing about OO subclassing is that all of the classes in the subclassing relation exist to resolve the properties of a single instance of the root superclass. That is, unlike Data Modeling using Entity Relationship Diagrams for RDB schemas, the superclasses cannot be instantiated independently of the subclasses. This is why the OO subclassing relation is commonly known as an "is-a" relation; a member of a leaf subclass is-a member of every superclass and resolves the properties of every superclass from which it is a direct descendant. [There are some additional rules for forming OO subclassing relations that further constrain construction compared to Data Modeling. But that's a more detailed issue...]

Inheritance. Inheritance itself is a relatively simple concept. Inheritance is simply a set of rules for resolving the properties of a leaf object in a subclassing relation. The subclassing relation itself is essentially just a Venn Diagram in tree form that identifies subsets of members having the same properties. The tree format is used because a single 2D mapping would be, at best, confusing and, at worst, topologically impossible. Basically inheritance is just a set of simple rules for "walking" the tree to collect all the relevant properties for a single member of the root superclass based upon its subclass membership.
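
The "walking the tree" idea can be made visible with a small hand-rolled Python sketch. The classes (`Vehicle`, `Truck`, `DumpTruck`), the `properties` attribute, and the helper `resolve_properties` are all hypothetical. Python already resolves attributes itself via each class's `__mro__`; the explicit walk below is only for illustration.

```python
class Vehicle:
    # Properties common to every member of the root superclass.
    properties = ("wheels",)


class Truck(Vehicle):
    # Additional properties shared only by this subset of vehicles.
    properties = ("payload",)


class DumpTruck(Truck):
    properties = ("bed_angle",)


def resolve_properties(cls) -> list:
    """Collect properties from the root superclass down to the leaf subclass."""
    collected = []
    # __mro__ lists the leaf first and `object` last, so walk it in reverse.
    for klass in reversed(cls.__mro__):
        # vars() sees only what this class declares itself, not what it inherits.
        collected.extend(vars(klass).get("properties", ()))
    return collected
```

Calling `resolve_properties(DumpTruck)` gathers the properties declared at every level of the tree for a single leaf member, which is all the inheritance rules have to do.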

Inclusion Polymorphism. In OO development several forms of polymorphism are supported. However, most people are referring to this particular form of polymorphism when they don't provide a qualifier. Polymorphism in general is about substitution of behaviors in the OO context. When a general behavior responsibility is accessed through a superclass, the descendant subclasses can provide a unique implementation for that behavior. Thus the actual behavior implementation will be different (substituted) based upon which subclass the actual member in hand belongs to.

Using the word 'implementation' is somewhat misleading here. One usually does not have different implementations of exactly the same semantic behavior responsibility in multiple subclasses where the implementations all produce the same results. For example, one usually does not implement a generic superclass 'sort' responsibility with an insertion sort in one subclass and a Quicksort in another subclass. Instead one normally has the subclasses implement a different behavior that produces different detailed results for a more generic superclass responsibility.

It is that ability to substitute behaviors that produce different detailed results that provides great power to inclusion polymorphism. Note, though, that inclusion polymorphism is enabled by inheritance (which provides the rules for resolving substitution) and inheritance is enabled by the special way OO provides subclassing. In addition, inclusion polymorphism could not work well unless side effects were eliminated by self-containment, implementations were hidden, and interfaces provided decoupling. So, while inclusion polymorphism may be fairly ubiquitous, at least at the OOP level, I don't see it as a definitive characteristic.
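
As a minimal sketch of behavior substitution with different detailed results, consider a generic "compute weekly pay" responsibility. The `Employee` hierarchy, the method name `weekly_pay_cents`, and the 52-week simplification are assumptions made for the example, not something taken from the text.

```python
from abc import ABC, abstractmethod


class Employee(ABC):
    """Superclass: declares the generic responsibility only."""

    @abstractmethod
    def weekly_pay_cents(self) -> int:
        ...


class SalariedEmployee(Employee):
    def __init__(self, annual_salary_cents: int) -> None:
        self._annual_salary_cents = annual_salary_cents

    def weekly_pay_cents(self) -> int:
        # Specialized behavior: a fixed fraction of the annual salary.
        return self._annual_salary_cents // 52


class HourlyEmployee(Employee):
    def __init__(self, rate_cents: int, hours_worked: int) -> None:
        self._rate_cents = rate_cents
        self._hours_worked = hours_worked

    def weekly_pay_cents(self) -> int:
        # Specialized behavior: depends on the hours actually worked.
        return self._rate_cents * self._hours_worked


def payroll_total(employees) -> int:
    # The client accesses the responsibility through the superclass only.
    return sum(e.weekly_pay_cents() for e in employees)
```

The two subclasses are not alternative algorithms for the same result; each produces a different detailed result for the same generic responsibility, and the client in `payroll_total` does not care which one it gets.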

June 28, 2004

2.0 Why do OO development?

I did my first program on a plug board in '57. That was an interesting era where Assembly language was the Silver Bullet that would solve the Software Crisis. Between changing vacuum tubes and programming plug boards I had such a character-building experience that I didn't go back to software development until the late '60s when 3GLs were commonly available and computers were solid state.

The '60s and '70s were known as the Hacker Era. Back then 'hacker' was a complimentary term that described someone who could produce prodigious amounts of code in a short time and who had an almost supernatural ability to get it back up and running when it broke. I had a couple of good years in the era when I hit 100 KLOC. Fortunately that code is long gone and no one will ever see it again.

Alas, by the late '70s people figured out that if the code didn't break in the first place there would be no need for supernatural powers to fix it. Perhaps more important, if the code were written better there would be no need for those indispensable hackers who were the only ones who could decipher the code. That was when the term 'hacker' became a pejorative.

The first systematic attempt to eliminate hackers appeared in the form of Structured Programming that provided a collection of good practices for writing 3GL code. That was quickly followed by Structured Design and Structured Analysis, both of which introduced more abstract graphical representations of programs. The dominant design technique became top-down functional decomposition where the solution was started with a very simple and general statement of the problem solution and then one successively decomposed that solution into more detailed levels. Each statement of functionality was collected as a node in an inverted "tree" whose lowest leaves were logically indivisible.

The impact of SA/SD/SP was enormous. Defect rates dropped from 150/KLOC to 5/KLOC. In addition, productivity for large projects where multiple programmers had to coordinate efforts improved greatly. Instead of 1000 programmers working for 10 years to produce 1 MLOC, 200 programmers could do the same job in 2-5 years.

Alas, there was still a problem. Writing new code was one thing, but maintaining old code was quite another. Depending on whose data one examined it took 5-20 times more effort to modify existing code than to write it originally. As a result 60-80% of all developer effort was expended in maintaining existing software.

There were a lot of problems that led to the Maintainability Gap but they could be broadly categorized as having two root causes: uncontrolled access to state variables and hierarchical implementation dependencies. State variable access was primarily a defect problem as data was modified in unexpected ways at unexpected times during execution. That resulted in additional test and repair cycle time when one modified existing code because it was difficult to predict how changes would affect untouched code that happened to access the same data.

Hierarchical implementation dependencies resulted in the legendary "spaghetti code". That was because the leaf nodes in the functional decomposition tree were at a very fine level of abstraction -- essentially arithmetic or logical operators in the 3GL. It was simply too tedious to cobble together lengthy sequences of such atomic operations to do complex tasks. However, the higher-level nodes in the functional decomposition tree quite conveniently captured such sequences as descendants. Since these nodes were systematically derived they had defined functional semantics. That allowed them to be reused (i.e., accessed by "clients" in different parts of the application that happened to need the same sequence of leaf operations).

That sort of reuse through accessing higher-level functions was a boon to developers and led to the notion of "procedural development" because it made excellent use of the core characteristic of 3GLs, block structuring around procedures. The problem, though, was that the functional decomposition "tree" now became a lattice where each node potentially had multiple ancestors (clients) as well as multiple descendants. It was that fanout of dependency that led to spaghetti code.

The dependencies existed because in top-down functional decomposition the lower-level functions are extensions of their parent higher-level function. That is, the specification of the higher-level function included the specification of the lower-level functions. Thus any contract between the client and the higher-level function depended upon the specification of the entire descendant tree of functions. So if one changed the specification of a lower-level function, the specification of all of its higher-level ancestors was also changed.

That was no problem so long as the access structure was a pure tree. That's because the change was probably triggered by a need to change the specification of a higher-level function and implementing the fix in the lower-level function was simply the easiest place to do it. However, when one has a lattice, the higher-level functions have multiple clients. If only one client wants the change, the other clients may be broken by the change. Worse, there can be a client at any level of ancestry in the tree, so the change may break clients that are not even direct clients of the original higher-level function. The result was a disaster for maintainability because every change for one client could potentially break a host of other clients. Fixing things to keep all clients happy often resulted in major surgery to the tree or very complex parameterization that complicated the functions.

In the '80s and '90s two very different approaches to software construction evolved to address these problems. One was functional programming, which grew up in the scientific programming arena. In functional programming persistent state variables are completely eliminated; all state is passed as function arguments and results. However, the hierarchical structure remained. That was because in scientific arenas algorithms are primarily defined mathematically so they tend to be quite stable. Therefore hierarchical dependencies were not very relevant because there were no client changes to accommodate.
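
To make "all state is passed as function arguments and results" concrete, here is a tiny sketch in a functional style. It uses Python rather than a functional language, and the names `apply_deposit` and `final_balance` are hypothetical.

```python
from functools import reduce


def apply_deposit(balance: int, amount: int) -> int:
    # No persistent variable is modified; the new state is the return value.
    return balance + amount


def final_balance(opening: int, deposits) -> int:
    # State is threaded through the computation as arguments and results.
    return reduce(apply_deposit, deposits, opening)
```

Nothing outside these functions can be changed at an unexpected time, because there is no shared state variable to change.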

In addition functional programming introduced a number of features (e.g., sophisticated parametric polymorphism, mixins, etc.) that allowed the construction of very compact and elegant programs in a computational environment. Typically functional programs are very intuitive to construct and they are often integer factors more compact than programs employing other construction techniques. So in the rare event that the program does have to be changed, it is no big deal if it is rewritten rather than simply modified. At the same time the problems of global data access are completely eliminated.

The second new construction approach was OO development. It grew up in IT and R-T/E where requirements are highly volatile. In addition, the problem spaces are not defined with mathematical precision so there is a gap between the customer view of the problem and the computing space view. So OO development sought to address improved mapping between the customer space and the computing space in addition to managing global data and minimizing hierarchical dependencies.

The priorities OO development placed on these goals were quite different from those placed by functional programming. Basically the priorities (1 is highest) were:

                           | functional      | OO           |
---------------------------+-----------------+--------------+
global data management     | 1               | 3            |
---------------------------+-----------------+--------------+
hierarchical dependencies  | 2               | 1            |
---------------------------+-----------------+--------------+
customer space mapping     | none            | 2            |
---------------------------+-----------------+--------------+

These priorities are no surprise, given the quite different problem domains in which they evolved. However, they had profound effects upon the construction paradigms -- to the point that the two approaches are fundamentally incompatible. Thus any attempt to mix & match features across the approaches is doomed to defeat the benefits of either approach.

The OO approach addresses hierarchical dependencies by completely eliminating the tree. One still does functional decomposition but only to identify the leaf nodes. Once the leaf nodes are identified, the tree essentially disappears. This works because several OO features play together...

Logical indivisibility. In OO development we have a very flexible view of logical indivisibility. There are basically three levels at which it applies: subsystem, class, and responsibility. A subsystem represents a large scale encapsulation of a single subject matter. A class represents the encapsulation of a single problem space entity. A responsibility represents the encapsulation of an atomic element of knowledge or behavior.

In all cases the notion of indivisibility depends upon the level of abstraction one needs to solve the problem in hand. For example, it is not uncommon for a single responsibility in one subsystem to expand into an entire class or even an entire subsystem outside the context of the given subsystem's subject matter. This flexibility avoids the box one gets into when the notion of 'indivisible' is tied to something like 3GL arithmetic operators.

Peer-to-peer collaboration. In the OO approach objects collaborate directly with one another rather than through higher-level controllers. IOW, at a given level of abstraction, all entities are peers and communicate directly with one another. Such collaboration is supported by the notion of relationships between entities. Such relationships are very important in OO development because they provide a static structure on which message addressing is based. That structure is independent of the semantics of individual classes.

Separation of message and method. In OOA/D the message that one object sends is a quite different thing than the method with which the receiving object responds. That allows messages to be generated independently of external context; they simply announce that the sender has done something. It is up to the developer to determine who cares about what happened enough to provide a response. In UML that can be done at the level of Interaction Diagrams, which is a higher level of abstraction than individual object implementations.

[Alas, the OOPLs don't provide a similar separation. That's because they are 3GLs and they have to make compromises with the computational model at that level of abstraction. So the message identifier is also the responding method's identifier because the 3GLs all employ procedural message passing. However, if the OOA/D has been done properly the methods and collaborations will have been defined so that this is benign.]

Encapsulation and Implementation Hiding. These work to ensure that the implementations are properly decoupled. That is, the specification of a responsibility only needs to be defined in terms of the intrinsic rules and policies implicit in the responsibility. In other words, one should be able to exhaustively unit test an object method without implementing any other behaviors. More important, one can be confident that the state of the application will be the same after executing that method in situ in the application as it would be after executing that method in a unit test.

Asynchronous behavior model. In the OOA/D behavior is assumed to be asynchronous. That is, one assumes there is an arbitrary delay between the time a message is issued and when it is consumed (i.e., a behavior responds). (Because message and method are not separated in the OOPLs, the OOP model is synchronous.) This makes it somewhat more difficult to construct correct OOA/D models, but it yields huge dividends in maintainability and robustness. If one can't count on something happening immediately after issuing a message, one can't very well make the sender count on something specific having happened as it continues to execute.
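
One way to picture the arbitrary delay is a message queue that holds messages until some later dispatch. The sketch below is only an assumption about how such a model could be simulated in Python; `MessageQueue`, `Sensor`, and `Logger` are hypothetical names and nothing here is prescribed by any OOA/D notation.

```python
from collections import deque


class MessageQueue:
    """Holds messages until dispatched, simulating an arbitrary delivery delay."""

    def __init__(self) -> None:
        self._pending = deque()

    def send(self, handler, *args) -> None:
        # Issuing a message only enqueues it; nothing responds yet.
        self._pending.append((handler, args))

    def dispatch_all(self) -> None:
        # Messages are consumed later, in the order they were issued.
        while self._pending:
            handler, args = self._pending.popleft()
            handler(*args)


class Logger:
    def __init__(self) -> None:
        self.lines = []

    def record(self, text: str) -> None:
        self.lines.append(text)


class Sensor:
    def __init__(self, queue: MessageQueue, on_sampled) -> None:
        self._queue = queue
        self._on_sampled = on_sampled  # routing is supplied from outside
        self.last_value = None

    def sample(self, value: int) -> None:
        self.last_value = value
        self._queue.send(self._on_sampled, f"sampled {value}")
        # From here on, sample() must not assume any receiver has responded.
```

Because `sample` cannot rely on the logger having run, its own specification stays independent of what the receiver does.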

When combined properly these features all ensure a very high degree of decoupling of implementations that completely eliminates the hierarchical dependencies of spaghetti code. Ironically, if one looks at the method call graph of a typical OO application it looks even worse than the rats' nests from procedural applications. That's because logical indivisibility, separation of concerns, cohesion, and other OO practices tend to produce a lot of small abstractions with limited individual responsibilities that are highly interconnected due to peer-to-peer messaging. Don't worry about it. Those call graphs represent message traffic, not dependencies. One of the prices one pays for eliminating implementation dependencies is a lot more messages between a lot more entities.

On the global data front the OO approach essentially still allows it. Any public knowledge attribute of any object is available for access by any other object connected to it over some relationship path. So, in effect, all public attributes are global. However, the OO approach does address the issue by providing support for much better management of global data. Again, several features play together...

Encapsulation and implementation hiding. The data has one owner so anyone who wants to modify the data must talk to that owner. This raises the level of abstraction of access to that of collaboration. While that may not seem important, it enables other techniques for controlling access, such as the Observer design pattern. The real value, though, is forcing the developer to think about accessing data in terms of entities and their collaborations.

The biggest advantage lies in encapsulation of the rules and policies for modifying data in particular objects. Just as some object "owns" the data, so does some specific object "own" the rules for changing it. Very often this naturally leads to localizing those rules and policies in a single object rather than being littered all over the application. This also separates the issue of who owns the rules from the issue of when they should be executed.
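
A small Python sketch of a single data owner that also owns the rules for changing the data, with an Observer-style notification. The `Thermostat` class, its setpoint limits, and the `subscribe` method are hypothetical choices for the illustration.

```python
class Thermostat:
    """Owns the setpoint and the rules for changing it."""

    MIN_SETPOINT = 10
    MAX_SETPOINT = 30

    def __init__(self, setpoint: int) -> None:
        self._setpoint = setpoint
        self._observers = []

    @property
    def setpoint(self) -> int:
        # Read access is public; write access goes through change_setpoint().
        return self._setpoint

    def subscribe(self, observer) -> None:
        self._observers.append(observer)

    def change_setpoint(self, value: int) -> None:
        # The rule for modifying the data lives with the data's owner.
        if not self.MIN_SETPOINT <= value <= self.MAX_SETPOINT:
            raise ValueError(f"setpoint {value} out of range")
        self._setpoint = value
        # Observer pattern: interested parties are told about the change.
        for observer in self._observers:
            observer(value)
```

The range rule appears in exactly one place, so a change to it touches one object rather than every client that happens to write the setpoint.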

Design by contract. Once the rules for modifying data are encapsulated one must address the issue of when they need to be executed. In the OO approach this comes down to design-by-contract (DbC). Before a behavior can execute some set of preconditions must prevail in the application. One of those is usually that some other behavior had to be executed immediately before the one in hand. This is the classical procedural view where the solution is a sequence of operations.

What OO adds to the pot is the notion that the DbC preconditions include conditions concerning state variables. That is, a precondition of execution is that all of the data that the behavior needs has been properly updated and is consistent. So one generates the message to execute the behavior only where one is sure that all the DbC conditions have been fulfilled. That segues to...

Peer-to-peer messaging and logical indivisibility. Because behavior responsibilities are logically indivisible they can be daisy-chained with messages to form sequences. Since that daisy-chaining depends on DbC to ensure the preconditions are satisfied and because OO includes data integrity and consistency in the preconditions, one can formally validate that the sequence one constructs is correct. Or, as a practical matter, one can construct the generation of peer-to-peer messages at the UML Interaction Diagram level in a manner that ensures DbC is satisfied.
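
The role of state in DbC preconditions can be sketched as follows. The `Order` class and its attributes are hypothetical. In the approach described above the sender is responsible for issuing the message only when the preconditions hold, so the explicit run-time checks here are just a guard added to make the contract visible.

```python
class Order:
    def __init__(self) -> None:
        self.paid = False
        self.address = None
        self.shipped = False

    def record_payment(self) -> None:
        self.paid = True

    def set_address(self, address: str) -> None:
        self.address = address

    def ship(self) -> None:
        # Preconditions are stated in terms of state variables, not just call order.
        if not self.paid:
            raise RuntimeError("precondition violated: order not paid")
        if self.address is None:
            raise RuntimeError("precondition violated: no shipping address")
        self.shipped = True
        # Postcondition: the order is now marked as shipped.
        assert self.shipped
```

The message that triggers `ship` should be generated at the point in the solution where both payment and address are known to be in place, whichever order they were established in.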

Synchronous knowledge model. In the OO approach one accesses data on an as-needed basis. This works well with DbC because it simplifies the specification of execution preconditions. It also allows lower-level implementations at the OOD/P level to ensure data integrity when things start to get squirrelly (i.e., when one introduces stuff like concurrent threads). The scope of integrity and consistency is limited to the scope of the executing method.

The assumption of synchronous access is necessary so the developer can maintain sanity. If there were arbitrary delays between when data was requested and when it was delivered, trying to deal with data integrity issues would be mind boggling. So in the OOA/D one assumes a synchronous access view. Then if one must deal with actual delays, such as distributed data or paused threads, life is much simpler because it is relatively easy to enforce integrity over method scope as if there were no delay.

[Note that this is a pure methodological constraint. There are situations -- such as snapshots of data streams being collected in parallel -- where one must collect the data first in a consistent manner before invoking the method to process it. So passing knowledge as message data packets (e.g., method arguments) is sometimes necessary. But in a well-formed OO application, the method will always navigate directly to the data owner and extract the data on an as-needed basis unless there is an explicit constraint to the contrary.]

So while the OO approach certainly doesn't eliminate all of the problems of persistent state, it goes a long way towards making them more manageable. Given elimination of spaghetti code and a much better mapping between customer and computing spaces due to problem space abstraction, it is probably a reasonable trade-off.

June 07, 2004

3.0 When to use OO

There are on the order of a dozen basic approaches to software development and within each approach there are different methodologies. The number of methodologies for a given approach can number in the hundreds for approaches that are popular. Each approach has specific advantages and disadvantages relative to particular problem spaces and goals.

The Object-Oriented approach is one of the more general approaches as far as applicable problem spaces are concerned. It can usually be employed with equal facility for IT, R-T/E, and scientific applications. However, within those spaces there are certain types of processing for which OO development is not appropriate.

One example is pure algorithmic processing. Generally procedural or functional programming approaches will be better suited. That's because procedural and functional programming are very closely mapped to the computational model of Turing and von Neumann. Algorithmic processing is expressed at the mathematical level in terms of computation so the intuitive fit is much better. In addition, mathematical algorithms are invariant while the benefit of OO lies in managing requirements change.

So how can OO development be appropriate for scientific programming at all? The answer is that an application usually involves a lot more than executing a single algorithm. There is usually a user-friendly UI and some amount of persistence. There may be interoperability issues, such as integrating with CAD/CAE tools, statistical packages, etc. Today's complex scientific problems often involve multiple algorithms that need to play together (i.e., "glue" must be supplied). Often complex problem-specific set-up processing is required, such as providing a good basic feasible solution for a linear programming algorithm. So the OO approach can be useful in scientific applications for the substantial "boiler plate" that necessarily surrounds the mathematically defined algorithms. (With good application partitioning one can even switch development approaches across subsystems so the algorithmic portion can be encapsulated in a subsystem and developed, say, procedurally.)

Another example of where OO development is not very appropriate -- though the RAD marketeers would have us believe their products are OO -- is the sort of CRUD/USER pipeline applications between RDB and UI that are fairly common in IT. [Create, Retrieve, Update, Delete and Update, Sort, Extract, and Report] All of the interesting stuff where OO could be useful has already been automated by the RAD IDEs so there is really very little left to abstract. However, IT is a huge field so there is plenty of opportunity to apply the OO approach outside of data entry applications.

Another arena where OO development has limited value is in language translators (cross-compilers) and similar applications where each application function is myopic in that it is independent of what any other function does. Thus translating Java statements to C# statements is done pretty much in a linear fashion on a statement-by-statement basis dictated by the grammar productions. The OO approach only shines when there are many relationships among many entities that each involve complex collaborations. OTOH, generalizing such an engine so that it can translate between, say, any two LALR languages given input BNF definitions is ideally suited to OO development. That's because one can abstract the invariants of grammars and translation at a higher level than individual languages and those invariants will have complex interactions.
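The "myopic" flavor of statement-by-statement translation can be sketched roughly as follows. This is a deliberately toy illustration in Python: the RULES table and the function names are hypothetical, and a real translator would be driven by grammar productions rather than by a few regular expressions. What it does show is that each statement is handled independently of every other statement, so there is very little to abstract.

```python
import re

# Hypothetical rewrite rules for a tiny subset of Java -> C#.
# Each rule looks only at the text of the statement in front of it.
RULES = [
    (re.compile(r"\bSystem\.out\.println\("), "Console.WriteLine("),
    (re.compile(r"\bboolean\b"), "bool"),
    (re.compile(r"\bString\b"), "string"),
]


def translate_statement(statement: str) -> str:
    """Apply every rule to one statement, ignoring all other statements."""
    for pattern, replacement in RULES:
        statement = pattern.sub(replacement, statement)
    return statement


def translate(statements: list[str]) -> list[str]:
    """Translate linearly, one statement at a time."""
    return [translate_statement(s) for s in statements]
```

For example, translate(["boolean ok = true;", "System.out.println(ok);"]) rewrites each line on its own; no rule ever needs to collaborate with another.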

A more important way to understand OO applicability is in terms of goals. All software development has the goal of building a correct application; if it is incorrect it just isn't finished yet. However, there are many other goals that vary with business and development context. Performance, speed of development, maintainability, reliability, and reuse are just a few of the possible goals. Each of these will be given different weight in a given development environment and how appropriate OO development is in that environment will depend on those weightings.

The primary goal of OO development is to provide maintainable software over time. It is not only the primary goal, it is far ahead of all the rest. The OO approach was designed to address the recognized maintainability problems of the Hacker Era ('50s and '60s) where making changes took 10-50 times more effort than writing the original and the SA/SD/SP Era ('70s and '80s) where changes took 5-10 times more effort. Thus the OO approach is ideally suited to any environment where requirements are volatile either during the development or over the application life.

The second most important goal, at least originally in OO development's formative years in the '70s, was to provide a direct mapping to the problem space. The abstraction that is systematic and ubiquitous in OO development was expressly designed to provide that mapping. There were two reasons. One was to bridge the gap between natural language requirements in the customer's terms and the very disciplined computational model. The idea was that OOA models could provide a bridge for computer-illiterate customers to validate the rigorous specifications needed for the computational model. (Alas, that never really worked out very well; OOA notations carry too much semantic baggage for the customers to learn.)

The second reason was based on the notion that customers don't like change any more than software developers do. So customers will accommodate change in a fashion that causes the least disruption to their existing processes and infrastructures. If the software structure closely parallels the customer infrastructures, then the software should also be minimally disrupted by change because the customer has already figured out the least painful path. This would be especially true if one extracts invariants from the customer space to abstract as the software "skeleton". Thus the OO approach uses problem space abstraction as a crucial tool for providing long-term structural stability. (Fortunately this reason has lived up to the initial expectations over the years.)

A third goal was reuse. Logical encapsulation, implementation hiding, and decoupling interfaces all enable reuse. Originally the hubris of the '70s focused on class level reuse and that was quite successful for computing space entities (String, Array, Stack) that were mathematically defined. It was less effective for problem space objects that tended to be highly complex and loosely defined with myriads of views. However, large scale reuse at the component and subsystem level is alive and quite well because one can tailor the class level abstractions that implement the component or subsystem to that specific context. In addition, the component or subsystem semantics itself is limited to the nature of the subject matter.

A fourth goal is improved reliability. I'm not sure if this was a Founding Fathers' goal, but it seems to have worked out that way. When we first tried OO development we did pilot projects to evaluate it and we collected a ton of data. The most surprising single thing was that our defect rate was reduced roughly 50%. I honestly can't point to any demonstrable reason why, though there are a number of plausible reasons (e.g., encapsulation forces one to think in a highly focused manner and that focus might reduce defect insertions). However, hard data is hard data is hard data...

Other goals address issues like ease-of-construction and efficient mapping to the computational model. However, these were definitely tertiary. Thus OO applications usually have acceptable performance and they can be built in reasonable time, but typically they can't compete with other approaches that treat those goals as primary. One way this is manifested is that OOA/D/P is not as intuitive as other software development approaches so there tends to be a longer learning curve to do it properly. [That doesn't mean some people will never be able to "get" it. Anyone who can spell C can learn OOA/D/P; it will just take them a little longer to get good at it than it takes for C.]

[A lot of NIH shops will jump on performance as a reason not to use OO development. I would point out that OO is now used extensively in R-T/E where performance is usually pretty important because the processors are quite dumb. In addition, most serious performance problems live in fundamentally poor design, not cycle counting. And cycle counting problems can often be optimized locally. I spent the better part of two decades doing OO R-T/E development and there were only a couple of places where we had to resort to Assembly at the method level. Finally, if one uses translation, one can target a non-OOPL for code generation if one needs instruction level optimization.]

Unfortunately that learning curve carries a significant cost. Years ago I was at a social event and got into a conversation with a stranger who was also a software developer. The conversation basically went:

Him: So you do OO. We are just starting out using it. We're rewriting an 18 MLOC R-T/E system from scratch doing OO.

Me: Great. What methodology are you using?

Him (looking at me as if just discovering I had Alzheimer's): Uh, you know... Objects... UML...

Me: I meant, what sort of analysis and design approach are you using?

Him: I'm not sure; the instructor didn't mention a specific name.

Me: Hmmm. What sort of training are you getting?

Him: We had a week's course on C++ and a couple of our June Grads took OO courses in college.

Me: I meant, what sort of consulting and mentoring will you have?

Him: We didn't have the budget for that.

Me: And your shop size is...?

Him: About 150 people.

Me: Well, good luck. Excuse me, I need a refill...

That project is going to crash and burn with absolute certainty. The only question is how long it will take to realize it is doomed. Going into a major project using a sea change like OO development without adequate training is a guaranteed disaster. Sadly, it will probably end up in the annals of Great OO Failures even though OO development will have nothing directly to do with the failure.

[It never ceases to amaze me that a company will spend $50K evaluating copy machines and training AAs for half a day to use them yet the same company won't hesitate to let developers apply an entirely new development approach that they know nothing about to a project that might kill the company if it failed. But I digress...]

In summary, consider OO development if your requirements are volatile, your applications are long-lived, you want to improve reliability, and/or a significant portion of your developers' time is spent doing maintenance to existing applications.

If so (for most shops the answer here is: who doesn't?), then make sure the shop isn't in a niche where the OO approach isn't very useful. If that's OK, then don't bet the farm by committing a major project to it without proper training.
