Blog root page

previous post in category

Suppose we have a situation where different arithmetic expressions are used to perform a particular computation based on different contexts. The classic approach to this problem is the GOF Strategy pattern:

* R1 1 [Client] ------------------- [Expression] + evaluate (...) A | +--------------+-----------+----... | | | [Expr1] [Expr2] [Expr3]

where a Client passes the operands to Expression.evaluate (...) and the subclass implementation executes the appropriate sequence of operations against those operands. One instantiates R1 so that Client gets the correct arithmetic expression for the context.

There are a couple of problems here. One is passing the right data; this model assumes all the expressions operate on exactly the same operands. Of course one can get around this with some sort of self-defining data packet (e.g., an XML string). However, that means Client must know which data packet needs to be defined, which defeats the transparent polymorphic dispatch. Alternatively, one can just supply the superset of all possible operands and let the subclass select which ones it actually wants to eat.

Another problem is that modifying an existing expression or adding a new one requires surgery to the [Expression] tree. If that is likely to happen very often, one would like to avoid changing the code. [An example might be the utility I mentioned in the Depreciation example. A user-friendly utility would do the computations for the user. But to do that, it needs the arithmetic expression. One possibility is for the user to create it "on the fly" using an expression generator similar to the ones found in RAD query generators.]

We can express the different arithmetic expressions in data if we step back and look at the invariants of expression evaluation. If we do that we realize that expressions are composed of operations and operations that are combined in an ordered sequence. In fact, Reverse Polish Notation (RPN) provides exactly that sort of ordering in a stack where the two operands for each operator are pushed first, followed by the operator. Thus the algebraic expression

(A + B) * (C - D)

would be expressed in RPN as the stack entries:

* + A B - C D

I won't go into the details here, but evaluating the expression is done by successively popping the entries on a FIFO basis and processing them appropriately depending upon whether it is operator or operand. That leads to a rather generic model:

[+] [-] [*] | | | +----------+---------+---... | R4 - V [Operation] + compute (operand, operand) | * | composed of | R2 |-------------------+ | | | | * valuates 1 | 1 | [Client] ------------------- [Expression] [RPNStack] R1 + evaluate () | | 1 | | | | | R3 |-------------------+ | | composed of | * [Operand] + type + value

The [RPNStack] association object orders the [Operation] and [Operand] instances. Expression.evaluate() "walks" the stack and invokes Operation.compute(...) whenever the operands in hand are both simple values. Operand.type is only needed to define the memory image (e.g., integer vs. real) when interpreting Operand.value in a strongly typed environment.

The issue in this context is not how the evaluation works; that's a pedestrian algorithm. The issue here is that the expression is defined purely in terms of data, relationships, and the order in which elements were added to the RPNStack. [Operand] is a pure specification object. All of that can be done by a GoF Factory based upon external data. (While [Operator] is subclassed, the subclassing is limited to the relevant arithmetic operators, which should be quite stable.)

[Note that this is a simplified version. One can reduce the algorithmic content of Expression.evaluate() by making [Operand] a Gof Composite pattern. That will deal elegantly will complex RPN structures.]

The key point here is that be stepping up to the invariants of expression evaluation, one can provide very generic behaviors in a structure that will allow arbitrarily complex arithmetic expressions to be evaluated in a complete ad hoc fashion.