Relationships are very important to OO development. They provide a basic structure for collaboration. Perhaps more important, they allow business rules and policies to be enforced through static structure in the application rather than explicitly in executable program statements. This can result in a significant reduction in the amount of code. That, in turn, can improve reliability by reducing the number of opportunities to insert defects. Finally, separating relationship instantiation from relationship navigation (addressing messages) can lead to a more robust and maintainable application. Therefore it is important to get relationships right in the OOA/D.
There are two broad categories of relationships in OO development: associations and subclassing. The remainder of this post is concerned with associations. Another set of posts will deal with subclassing relationships.
An OO association defines rules for participation in collaborations. That is, a relationship in a Class Model defines constraints on which members of two classes are allowed to collaborate. Those constraints are basic static structural features of the application. Relationships are described by four characteristics.
Identity. Each relationship has a unique identity. In UML this is a name, known as a discriminator, attached to the relationship. This identity is typically only necessary in translation systems that do full automatic code generation from PIM models. Otherwise, its main use lies in communicating among team members.
Multiplicity. Each relationship end indicates how many participants of that class are allowed for each participant at the other end of the relationship. The crucial consideration is whether there may be exactly one participant or whether there can be multiple participants. In PIM modeling this is all that matters, so in UML the possible "values" of multiplicity are '1' or '*' (read here as: one or more, because participation is mandatory unless marked otherwise). This is important because in the Relational Data Model exactly one participant is handled differently than multiple participants. As a practical matter, it means that in the implementation one must support either a single reference or a collection of references.
UML allows one to define multiple participants more explicitly (e.g., '5'). However, that is only relevant to addressing nonfunctional requirements. Such information is useful for optimizing the kind of collection one uses and how one manages memory. So that sort of information is relevant to OOD models that are PSMs created from a PIM, but not to the PIM itself.
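To make the single-reference versus collection distinction concrete, here is a minimal Python sketch of a 1:* association. The Customer and Order classes and the place_order function are hypothetical names invented for illustration; they are not part of any particular PIM or translation tool.

```python
class Customer:
    def __init__(self, name):
        self.name = name
        # '*' end: one Customer participates with many Orders,
        # so this end is implemented as a collection of references.
        self.orders = []


class Order:
    def __init__(self, order_id, customer):
        self.order_id = order_id
        # '1' end: each Order participates with exactly one Customer,
        # so this end is implemented as a single reference.
        self.customer = customer


def place_order(customer, order_id):
    """Instantiate both ends of the relationship in one place."""
    order = Order(order_id, customer)
    customer.orders.append(order)
    return order
```

Note that the choice of a plain list is itself an OOD decision. A more explicit multiplicity such as '5' might argue for a fixed-size structure instead, but the PIM only needs to know that the end is "many".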
Conditionality. Each relationship end indicates whether participation by members of the class at that end is mandatory or not. The default is mandatory (unconditional) participation. In UML one indicates a conditional relationship by prefixing the multiplicity with "0.." as in 0..1 or 0..* (read as: zero or more). Conditionality is important because conditional relationships require code to be supplied to test whether the relationship is active or not and that code must be supplied in every context where the relationship is navigated. Such code is usually undesirable so I will address how to reify conditional relationships to make them unconditional in a subsequent post.
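The cost of a conditional end shows up quickly in code. The following Python sketch assumes a hypothetical Car class with a 0..1 relationship to a hypothetical Driver class; every function that navigates the relationship has to repeat the same test.

```python
from typing import Optional


class Driver:
    def __init__(self, name: str) -> None:
        self.name = name


class Car:
    def __init__(self, plate: str) -> None:
        self.plate = plate
        # 0..1 end: a Car may or may not currently have a Driver.
        self.driver: Optional[Driver] = None


def driver_name(car: Car) -> str:
    # The relationship must be tested before it is navigated.
    if car.driver is None:
        return "<no driver>"
    return car.driver.name


def can_start(car: Car) -> bool:
    # The same test appears again in a second navigation context.
    return car.driver is not None
```

With only two navigation contexts the duplication looks harmless, but each additional context is another place where the check can be forgotten.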
Role. In UML each relationship end has an associated text qualifier that describes the role that the participants at that end play relative to the participants at the other end. Roles have no effect on the implementation of a PIM, but they are extremely important to communicating how the PIM relates to the problem space.
While some authors suggest that a role only needs to be placed on one side of the relationship, I am a strong advocate of placing roles on both sides. To make dual roles useful they each need to capture something unique and useful relevant to the problem in hand. In particular, symmetric roles -- such as "creates", "is created by" -- should be avoided because the second role adds no new insight. Often the exercise of uncovering unique roles will improve one's understanding of the problem space and it will certainly improve the understanding of the maintainers who will follow.
The notion of roles segues into the more general topic of identifying relationships. Ultimately all relationships are abstracted from the problem domain. When one looks for relationships one seeks logical connections between classes of entities. While an OOA/D relationship is used as a vehicle for rigorous constraint specification for participation, the existence of the relationship itself is a matter of logical connection. Roles provide a bridge between that logical connection in the problem domain and the constraints on the connection in the computing space.
Though UML supports notation for three or more participating classes, only binary associations are in the PIM profiles, at least for translation. That's because the UML notation for ternary and higher associations is ambiguous in some implementation contexts and requires specific decisions to be made during the implementation. So a PIM association should be a connection between exactly two classes. However, an association can be reflexive (i.e., members of a single class are related to other members of the same class). This still qualifies as binary because the individual members are not related to themselves.
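A reflexive association can be sketched in Python as follows. The Employee class, its "reports to" and "supervises" roles, and the assign_supervisor function are assumptions made up for this example. Both ends are treated as conditional here, since the person at the top has no supervisor and some employees supervise no one.

```python
class Employee:
    def __init__(self, name):
        self.name = name
        # 0..1 end, role "reports to": another member of the same class.
        self.supervisor = None
        # 0..* end, role "supervises": other members of the same class.
        self.reports = []


def assign_supervisor(employee, supervisor):
    """Instantiate the reflexive relationship between two distinct members."""
    if employee is supervisor:
        # The association is still binary: a member is related to
        # other members of the class, never to itself.
        raise ValueError("an employee cannot supervise themselves")
    if employee.supervisor is not None:
        # Drop the old link so both ends stay consistent.
        employee.supervisor.reports.remove(employee)
    employee.supervisor = supervisor
    supervisor.reports.append(employee)
```

The check against self-reference is the only thing that distinguishes this from an ordinary association between two different classes.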
One should only identify the minimum set of relationships that one needs to solve the problem. In practice what this means is that there should be a path of binary associations between any two classes whose members need to collaborate. (This is a necessary condition, but sometimes not sufficient.) So one should proceed systematically from the most obvious to the least obvious. Clearly the most obvious connections are those where a physical connection exists in the problem domain (e.g., tracks between railroad stations). After that it is a matter of what connections seem more prominent in the problem space.
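The path condition can be checked mechanically if one treats the Class Model as an undirected graph whose nodes are classes and whose edges are binary associations. The Python sketch below does that with a breadth-first search. The function name and the class names in the usage example are hypothetical, and the check covers only the necessary condition, not whether the path carries the right constraints.

```python
from collections import deque


def has_association_path(associations, start, goal):
    """Return True if a chain of binary associations links start to goal.

    associations is an iterable of (class_a, class_b) name pairs.
    """
    # Build an undirected adjacency map from the binary associations.
    neighbors = {}
    for a, b in associations:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search from the start class.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return True
        for nxt in neighbors.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical model: Station -- Track -- Train, plus Customer -- Ticket.
model = [("Station", "Track"), ("Track", "Train"), ("Customer", "Ticket")]
station_reaches_train = has_association_path(model, "Station", "Train")
station_reaches_ticket = has_association_path(model, "Station", "Ticket")
```

In this made-up model Station and Train can collaborate by navigating through Track, while Station and Ticket cannot collaborate until some further association is abstracted from the problem space.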