September 28, 2004
Subclassing is a very simple concept but it is one of the most poorly understood in OO development. The problem is that people tend to merge together subclassing, inheritance, and polymorphism when talking about subclassing. (It gets even worse in an OOPL context where subtyping is introduced.) The reality is that subclassing, inheritance, and polymorphism are three entirely different things. While some are enabled by others in an OO context, they should be kept straight in one's mind.
Subclassing. Subclassing is about set membership. A class defines a set of entities. The set is defined in terms of properties that every member of the set has. No member of the set may have fewer properties than the other members of the set have. However, some members of the class may have additional properties. In that case, all members who have exactly the same additional properties comprise a subset of the members of the original class. We can associate that subset of members with a new class that defines the additional properties that they uniquely share in addition to the properties already identified for the original class (called a superclass) from which the subset was derived.
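The set view above can be sketched with plain sets, before any classes are involved. This is only an illustration: the entity names and property names are invented, and Python is used simply as a convenient notation.

```python
# Hypothetical entities, each described by the set of properties it has.
entities = {
    "bambi": {"name", "weight", "top_speed"},
    "thumper": {"name", "weight", "top_speed"},
    "tweety": {"name", "weight", "wingspan"},
}

# The original class (superclass) is defined by the properties
# that every member of the set has.
superclass_properties = set.intersection(*entities.values())


def subset_with(extra):
    """Members whose additional properties are exactly `extra`."""
    return {
        who
        for who, props in entities.items()
        if props - superclass_properties == extra
    }


# Each group of members sharing the same additional properties is a subclass.
runners = subset_with({"top_speed"})
fliers = subset_with({"wingspan"})
```

Both `runners` and `fliers` are subsets of the original set of entities, which is all the subclassing relationship claims.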
That's it folks! That's all there is to subclassing. The tree diagram we see in a UML Class Diagram just keeps track of sets of members. For those of you who did not sleep through your set theory classes, an OO subclassing relationship is just a Venn Diagram in tree form that defines boundaries around entities (dots on the Venn Diagram) in terms of superclasses and subclasses. We use the tree display because it makes it easier to explicitly keep track of the properties that define the classes. (To include the specific properties in a Venn Diagram rather than entities would create a topological problem that only a mathematician could love.)
However, there is a special feature unique to OO subclassing. If you have any experience with Data Modeling, please forget it because OO subclassing represents a very major divergence from Data Modeling. That difference lies in instantiating objects. In Data Modeling, each class can be instantiated independently and the subclassing relationship represents a special sort of association between the entities instantiated in each set. That is not the case for OO subclassing. In the OO situation only one object at a time is instantiated for the entire tree. That object is always a member of a particular leaf subclass in the tree and its properties represent the union of all properties in a direct line of ascent to the root superclass. This is why an OO subclassing relationship is commonly called an is-a relationship. Every object IS A member of every set in the line of ascent.
[Some brain-dead languages like C++ allow one to create instances of superclasses. Essentially this is an unrestricted license for foot-shooting. Don't do that!]
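As a rough sketch of the instantiation rule, here is how it might look in Python, using the standard `abc` module to keep the non-leaf classes from being instantiated. The `Animal`/`Mammal`/`Gazelle` hierarchy and the `describe()` method are made up for the illustration.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    @abstractmethod
    def describe(self) -> str:
        """Declared at the root; supplied only by leaf subclasses."""


class Mammal(Animal):
    pass  # adds nothing concrete, so it is still abstract


class Gazelle(Mammal):
    def describe(self) -> str:
        return "gazelle"


def can_instantiate(cls) -> bool:
    """True if cls() succeeds; abstract classes raise TypeError."""
    try:
        cls()
        return True
    except TypeError:
        return False


g = Gazelle()  # one object for the whole tree, created at a leaf

# The classes in the object's direct line of ascent, leaf first
# (ABC and object are Python bookkeeping, not part of the model).
line_of_ascent = [
    c.__name__ for c in type(g).__mro__ if c not in (ABC, object)
]
```

Here `isinstance(g, Mammal)` and `isinstance(g, Animal)` are both true for the single object `g`, which is the is-a relationship, while `can_instantiate(Animal)` and `can_instantiate(Mammal)` are false.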
Inheritance. Inheritance is an equally simple concept. Because only one object at a time is instantiated for the entire tree, we need a set of rules for determining exactly what the object's properties are. Inheritance supplies those rules. They are also childishly simple: the properties are the union I mentioned above. So all one has to do is run one's finger up the line of ascent from the subclass to the root superclass and one has all the properties. That's all there is to inheritance; nothing more! It happens to be enabled by OO's unique view of instantiation and the use of the handy tree format that conveniently organizes the properties for us.
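The "run one's finger up the line of ascent" rule can be written out by hand. The sketch below is deliberately manual: each hypothetical class lists only the properties it adds in a `properties` attribute (an invention for this example), and a helper takes the union. A real OOPL does this lookup for you.

```python
class Prey:
    properties = {"weight"}


class Mammal(Prey):
    properties = {"fur_color"}


class Gazelle(Mammal):
    properties = {"top_speed"}


def inherited_properties(cls):
    """Union of the properties declared by each class in the line of ascent."""
    result = set()
    for ancestor in cls.__mro__:  # Gazelle, Mammal, Prey, object
        # vars() looks only at what this class itself declares.
        result |= vars(ancestor).get("properties", set())
    return result


gazelle_properties = inherited_properties(Gazelle)
```

`gazelle_properties` contains `weight`, `fur_color`, and `top_speed`, while `inherited_properties(Prey)` contains only `weight`.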
Polymorphism. There are actually several different sorts of polymorphism available in OO development, and the one relevant in this context is known as inclusion polymorphism. The basic idea is that if a client is accessing a property defined in a superclass (i.e., one that all members of descending subclasses have), one can substitute the implementation of that property between subclasses. Thus a member of one subclass can have a different implementation of the property than a member of another subclass derived from the given superclass. That is transparent to the client doing the accessing because it doesn't know which subclass is in hand; the client only "sees" the superclass.
This is usually not terribly interesting for knowledge responsibilities because there are only so many ways to implement a data value. However, when dealing with behavior responsibilities this can be an enormously powerful tool because we can be quite liberal about what we mean by implementation. For example:
            1     attacks     1
[Predator] --------------------- [Prey]
                                 +attacked()
                                    A
                                    |
               +--------------------+----------------+
               |                    |                |
        [Brontosaurus]          [Gazelle]        [Pheasant]
        +stomp()                +run()           +takeFlight()
Here the superclass [Prey] defines a behavior responsibility that all members of the class have. However, each subclass may implement that responsibility quite differently, as indicated. [I have taken some liberties with the UML notation here in addition to geologic time. UML allows the substitution semantics but not so vividly.] In effect the "implementation" for the attacked() behavior is very different to the point where one is really substituting behaviors rather than simple implementations (e.g., different implementations of a linear programming algorithm).
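One way the diagram might be rendered in code is sketched below in Python. The class and method names come from the diagram (with `takeFlight()` spelled `take_flight()` to suit Python); the `attack()` method on `Predator` and the string return values are assumptions added so the substitution is visible.

```python
from abc import ABC, abstractmethod


class Prey(ABC):
    @abstractmethod
    def attacked(self) -> str:
        """The responsibility every member of Prey has."""


class Brontosaurus(Prey):
    def attacked(self) -> str:
        return self.stomp()

    def stomp(self) -> str:
        return "stomp"


class Gazelle(Prey):
    def attacked(self) -> str:
        return self.run()

    def run(self) -> str:
        return "run"


class Pheasant(Prey):
    def attacked(self) -> str:
        return self.take_flight()

    def take_flight(self) -> str:
        return "takeFlight"


class Predator:
    def attack(self, prey: Prey) -> str:
        # The client only "sees" Prey; it never asks which subclass it has.
        return prey.attacked()


responses = [
    Predator().attack(p) for p in (Brontosaurus(), Gazelle(), Pheasant())
]
```

`Predator.attack()` contains no reference to any subclass, yet `responses` holds three different behaviors, one per subclass.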
This ability to substitute behaviors transparently for the client is a very powerful technique, especially for reifying complex *:* relationships in OOD/P. [One can argue that all the patterns in the GoF book ("Design Patterns" by Gamma et al.) exist to reify *:* relationships where participation is complex and dynamic.] It is also very useful for dependency management during OOP to overcome the physical coupling problems shared by most OOPLs. However, it isn't very common in OOA because the notion of behavioral substitution is pretty rare in most customer problem spaces.
The key concept behind inclusion polymorphism is also quite simple. One simply substitutes behaviors between subclasses by providing access through a common superclass interface that is accessed by clients. It is enabled by the OO view of both subclassing, where the properties are defined and organized, and inheritance, which provides the rules for determining which substitution is made.
Before leaving the general topic of subclassing, let me point out two important rules that apply rather uniquely to good OOA/D practice:
The members of a subclass must be a disjoint set relative to other subclasses of the same superclass.
The union of all subclasses' members must be a complete set of the members of the superclass.
The first rule essentially means that an object cannot be a member of two different subsets in direct line of descent from one superclass. The second rule means that one cannot create an object that just has superclass properties as a superclass instance. The rationale behind both these rules is that they allow unambiguous mapping when resolving access to properties when employing polymorphic access and they limit the opportunities for inserting defects when doing maintenance. The examples where one gets into trouble tend to be convoluted and involve LSP (Liskov Substitution Principle), so I won't go into them here and you will have to trust me on this.
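Since both rules are statements about sets, they can be checked mechanically. The following is a small sketch in Python with invented member names; the helper names `is_disjoint` and `is_complete` are mine, not standard terminology.

```python
def is_disjoint(subsets):
    """Rule 1: no entity appears in more than one sibling subclass."""
    seen = set()
    for members in subsets:
        if seen & members:
            return False
        seen |= members
    return True


def is_complete(superclass_members, subsets):
    """Rule 2: every superclass member belongs to some subclass."""
    return set().union(*subsets) == superclass_members


# Hypothetical members of a superclass and three candidate partitions.
prey = {"bronto_1", "gazelle_1", "gazelle_2", "pheasant_1"}

good = [{"bronto_1"}, {"gazelle_1", "gazelle_2"}, {"pheasant_1"}]
overlapping = [{"bronto_1", "gazelle_1"}, {"gazelle_1", "gazelle_2"}, {"pheasant_1"}]
incomplete = [{"bronto_1"}, {"gazelle_1", "gazelle_2"}]
```

Only `good` passes both checks. `overlapping` breaks the first rule because `gazelle_1` sits in two subclasses, and `incomplete` breaks the second because `pheasant_1` would be a bare superclass instance.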