ValueModel

We usually think of values as being the attributes of objects, or sometimes we think of them as being special classes. For example, an Employee object will have attributes like salary and hiring date, and these values will be instances of classes like Money or Date, which we might consider value classes.

But there are other ways that objects can represent values. VisualWorks has a class called ValueModel that represents a single value. Not only can clients of a ValueModel read and (usually) write its value, they can become its dependents and be notified when it changes. GUI widgets usually depend on a single ValueModel. So, if an Employee object stores its salary in a ValueModel then a text widget can depend on the salary and be notified when it changes.

The most common ValueModel is a ValueHolder, which is just a container of a value. Instead of storing its attribute in an instance variable, an object can store its attribute in a ValueHolder, which can be stored in an instance variable. A read or a write to the instance variable must then get converted to a message to the ValueHolder. This lets clients depend on the attribute and be notified when it changes.
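The idea can be sketched outside Smalltalk as well. The following Python sketch is illustrative only and is not the VisualWorks implementation; the names `add_dependent` and `salary_holder` are assumptions made for the example. It shows an attribute stored in a holder, with reads and writes forwarded to the holder so that a dependent is notified on each change.

```python
class ValueHolder:
    """Holds one value and notifies dependents when it changes (illustrative)."""

    def __init__(self, value=None):
        self._value = value
        self._dependents = []  # callables invoked with the new value

    def add_dependent(self, callback):
        self._dependents.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for notify in self._dependents:
            notify(new_value)


class Employee:
    def __init__(self, salary):
        # The attribute lives in a holder rather than directly in a field.
        self.salary_holder = ValueHolder(salary)

    @property
    def salary(self):
        # A read of the attribute becomes a message to the holder.
        return self.salary_holder.value

    @salary.setter
    def salary(self, amount):
        # A write of the attribute becomes a message to the holder.
        self.salary_holder.value = amount


employee = Employee(50000)
displayed = []  # stands in for a text widget that depends on the salary
employee.salary_holder.add_dependent(displayed.append)
employee.salary = 55000
```

After the assignment, the stand-in widget has received the new salary without the Employee knowing anything about it.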

Most other ValueModels are adapters. For example, a date adapter converts a value that is a date to a value that is a string. Thus, a date adapter might translate between a text widget, which expects a string, and a ValueHolder containing a date.
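A minimal Python sketch of such an adapter follows. It is an assumption-laden illustration, not the VisualWorks class: the name `DateToStringAdapter` is hypothetical, and ISO date strings are used only because the standard library parses them directly.

```python
from datetime import date


class ValueHolder:
    """Minimal container for a single value (illustrative)."""

    def __init__(self, value):
        self.value = value


class DateToStringAdapter:
    """Presents a date-valued model as a string-valued one (hypothetical name)."""

    def __init__(self, subject):
        self.subject = subject  # the model that actually holds the date

    @property
    def value(self):
        # Reading converts the date to the string a text widget expects.
        return self.subject.value.isoformat()

    @value.setter
    def value(self, text):
        # Writing parses the widget's string back into a date.
        self.subject.value = date.fromisoformat(text)


hire_date = ValueHolder(date(1997, 1, 1))
adapter = DateToStringAdapter(hire_date)
text_before = adapter.value
adapter.value = "1998-01-01"
```

The widget side only ever sees strings, while the holder only ever contains dates.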

The most interesting ValueModels are ComputedValues, which define a value in terms of others. ComputedValues are often defined as a function of other ValueModels and, as such, use the dependents feature of ValueModels to stay up to date with them. Many values can be based on other values. For example, PROFIT is really a function: the amount of SALES minus the total COSTS. Whenever SALES or COSTS changes, so does PROFIT. Such a function is represented by a ComputedValue, and this one in particular by a BlockValue (a subclass of ComputedValue). BlockValues are special ComputedValues that use a Smalltalk block as their function. Whenever the value is needed, the block is evaluated.

In addition to BlockValues, which compute their functions through Smalltalk blocks, there are also QueryValues, which compute their values from queries. This allows the model to be hooked directly into the database without writing specific code to transfer values from queries to ValueModels.

The inheritance diagram for the ValueModels is shown in Figure 1. Although the actual VisualWorks® ValueModel hierarchy has several additional classes, only the important ones for the framework are shown.

The default protocol for ValueModel is very limited. It only contains the value message to return its value. But since many ValueModels are also used in formulas by ComputedValues, it should be easy to combine them to form the formulas. We added methods for arithmetic functions like + and - to ValueModel that automatically create a ComputedValue. The result is that instead of creating a BlockValue for PROFIT with code such as:

    BlockValue
        block: [:sales :costs | sales - costs]
        arguments: (Array with: salesHolder with: costsHolder)

we can define the profit as "salesHolder - costsHolder". The definition of - in ValueModel is

    - aValue
        ^BlockValue
            block: [:x :y | x - y]
            arguments: (Array with: self with: aValue)

We added to ValueModel all of the basic arithmetic operations and a few operations for string and date manipulation. The result is that it is easy to define new ValueModels from old ones.
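The same composition can be sketched in Python, where operator overloading plays the role of the added arithmetic methods. This is a rough analogue under stated assumptions, not the framework's code: `add_dependent` and `changed` are illustrative names, and the block is always re-evaluated on read.

```python
class ValueModel:
    def __init__(self):
        self._dependents = []

    def add_dependent(self, callback):
        self._dependents.append(callback)

    def changed(self):
        for notify in self._dependents:
            notify()

    # Arithmetic on models builds a new computed model instead of a number.
    def __sub__(self, other):
        return BlockValue(lambda x, y: x - y, [self, other])

    def __add__(self, other):
        return BlockValue(lambda x, y: x + y, [self, other])


class ValueHolder(ValueModel):
    def __init__(self, value):
        super().__init__()
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        self.changed()


class BlockValue(ValueModel):
    def __init__(self, block, arguments):
        super().__init__()
        self._block = block
        self._arguments = arguments
        for argument in arguments:
            # Depend on every argument so changes propagate to our dependents.
            argument.add_dependent(self.changed)

    @property
    def value(self):
        # Evaluate the block with the current values of the arguments.
        return self._block(*(argument.value for argument in self._arguments))


sales = ValueHolder(1000)
costs = ValueHolder(400)
profit = sales - costs  # a BlockValue, not a number
profit_before = profit.value
notifications = []
profit.add_dependent(lambda: notifications.append(profit.value))
costs.value = 250
```

Changing `costs` notifies `profit`, which in turn notifies its own dependent with the recomputed value.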

Figure 1: Object structure diagram for ValueModel

ValueHolders have a direct reference to their value and, as a result, have a value: method to set this value. This message is mainly used by interface widgets, but can be used programmatically to change the value. ComputedValues also add a few messages to the basic ValueModel protocol that determine how they compute their values. The value can be computed eagerly or on demand by sending the eagerEvaluation: message. Values that are associated with queries turn off eager evaluation, since queries can take seconds to compute. Both BlockValue and QueryValue extend ComputedValue's interface only by adding messages to initialize their blocks and queries.
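The difference between the two evaluation modes can be illustrated with a small Python sketch. The caching behavior shown here is an assumption about what eager and on-demand evaluation mean, not a description of how VisualWorks implements eagerEvaluation:, and all names are hypothetical.

```python
class Holder:
    """A settable value that notifies dependents on change (illustrative)."""

    def __init__(self, value):
        self._value = value
        self._dependents = []

    def add_dependent(self, callback):
        self._dependents.append(callback)

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        for notify in self._dependents:
            notify()


class ComputedValue:
    def __init__(self, function, source, eager=True):
        self._function = function
        self._source = source
        self._eager = eager
        self._stale = True
        self._cached = None
        self.evaluations = 0  # counts how often the function actually ran
        source.add_dependent(self._source_changed)
        if eager:
            self._recompute()

    def _recompute(self):
        self._cached = self._function(self._source.get())
        self._stale = False
        self.evaluations += 1

    def _source_changed(self):
        if self._eager:
            self._recompute()   # eager: pay the cost on every change
        else:
            self._stale = True  # on demand: just remember the result is stale

    def get(self):
        if self._stale:
            self._recompute()
        return self._cached


source = Holder(1)
eager = ComputedValue(lambda x: x * 2, source, eager=True)
lazy = ComputedValue(lambda x: x * 2, source, eager=False)
for n in (2, 3, 4):
    source.set(n)
lazy_runs_before_read = lazy.evaluations
lazy_result = lazy.get()
```

With an expensive function such as a database query, the on-demand variant runs once when the value is read rather than once per change.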

Queries

Not all information for an application will be stored in memory. Instead, some of it is stored externally in a database. Whenever a value is needed, the database is queried and the value returned. For example, consider a payroll system. There might be one function that lists the number of hours worked by an employee during a time interval. For such a system, there would be an interface that allows the user to enter the employee id and date range. Once they are entered, the database is queried for the number of hours worked.

Relational databases have their own language for specifying queries. Many times the language is SQL. For our application to query a SQL database, it must send its commands as SQL statements. A default SQL statement might look like:

SELECT <fields>
FROM <tables>
WHERE <conditions>
GROUP BY <fields>
ORDER BY <fields>

The WHERE, GROUP BY, and ORDER BY parts of the statement are optional. For our hours worked example above, we would have an SQL statement such as:

SELECT SUM(hours)
FROM time_cards
WHERE employee_id = '12345' AND 
date < '1/1/98' AND 
date >= '1/1/97'

While we could model each query as a string, many queries have similar parts, and these parts might change over time. For example, we might have several queries that share the same WHERE clause specifying that the records returned should be within a date range. If we needed to add another condition to that clause, we would need to change every such string in the code. Clearly, this is undesirable.

Another way to model the queries might be to make an object that holds each query part as a string and then concatenates them when we execute the query, but we would like to include other Smalltalk objects besides strings in our expressions. Instead of constructing a string from code like "'orderNumber = ', orderNumberHolder value printString", we would rather construct the expression from code like "salesTable orderNumber = orderNumberHolder". When the query is evaluated, the expression is turned into the appropriate SQL statement string. With plain strings, we would instead have to rebuild the string whenever the orderNumberHolder ValueModel changes.

Instead of using strings to model the queries, we chose QueryObjects to model queries and QueryExpressions to model the individual expressions (e.g., WHERE clauses, GROUP BY clauses, etc.). QueryObjects then construct their SQL statements at runtime using the QueryExpressions.

The results returned by a query are inherently table-like. They have rows, which are separated into fields. Many databases even allow you to create database views from a query. These views can then be used in other queries as if they were real tables. We want to have a similar feature in our QueryObjects, but instead of having to create database views for each query, we want to be able to use QueryObjects in other QueryObjects.

The fundamental type of QueryObject is a TableQuery. TableQueries represent tables in the database and are the basic building blocks for all other queries. They correspond to the tables listed in the FROM clause of a SQL statement. Evaluating a TableQuery by itself just returns all the records in the table. Whenever multiple tables are listed in the FROM clause of a SQL statement, they are "joined" together. This operation is represented by a JoinQuery. JoinQueries join two QueryObjects to form one QueryObject.

We also need QueryObjects to specify the other clauses of a SQL query. For the SELECT part, we use a ProjectionQuery, since the SELECT part tells the database what fields to project in the result. The WHERE clause is modeled by a SelectionQuery, since it tells the database which rows to select. The ORDER BY and GROUP BY clauses are represented by OrderQuery and GroupQuery respectively. Since all of these queries also deal with expressions, they have QueryExpressions that create the SQL code for their respective clauses.

In addition to the SQL syntax described above, there is an additional keyword that can appear in a SELECT statement. You can use the "DISTINCT" keyword to specify that all records returned by the query should be unique. This is modeled in our system by a DistinctQuery which wraps another query and returns only the unique rows.
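The way wrapper queries layer clauses onto a table can be sketched in Python. This is a deliberately simplified illustration, not the framework's design: conditions are plain strings standing in for QueryExpressions, the `parts` dictionary is an invented intermediate form, and JoinQuery, OrderQuery, and GroupQuery are omitted.

```python
class QueryObject:
    def parts(self):
        raise NotImplementedError

    def to_sql(self):
        # Assemble the statement from the collected clause parts.
        parts = self.parts()
        sql = "SELECT "
        if parts["distinct"]:
            sql += "DISTINCT "
        sql += parts["select"] + " FROM " + parts["from"]
        if parts["where"]:
            sql += " WHERE " + " AND ".join(parts["where"])
        return sql


class TableQuery(QueryObject):
    def __init__(self, table):
        self.table = table

    def parts(self):
        # A bare table selects every field of every row.
        return {"select": "*", "from": self.table, "where": [], "distinct": False}


class WrapperQuery(QueryObject):
    def __init__(self, wrapped):
        self.wrapped = wrapped


class SelectionQuery(WrapperQuery):
    def __init__(self, wrapped, condition):
        super().__init__(wrapped)
        self.condition = condition

    def parts(self):
        parts = self.wrapped.parts()
        parts["where"] = parts["where"] + [self.condition]
        return parts


class ProjectionQuery(WrapperQuery):
    def __init__(self, wrapped, fields):
        super().__init__(wrapped)
        self.fields = fields

    def parts(self):
        parts = self.wrapped.parts()
        parts["select"] = ", ".join(self.fields)
        return parts


class DistinctQuery(WrapperQuery):
    def parts(self):
        parts = self.wrapped.parts()
        parts["distinct"] = True
        return parts


time_cards = TableQuery("time_cards")
in_range = SelectionQuery(
    SelectionQuery(time_cards, "date < '1/1/98'"), "date >= '1/1/97'")
employees = DistinctQuery(ProjectionQuery(in_range, ["employee_id"]))
sql = employees.to_sql()
```

Because each wrapper only adjusts the parts it is responsible for, `in_range` can be reused as the base of any number of other queries.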

In addition to the QueryObjects defined above, there are a couple of additional QueryObjects that don't have a direct SQL mapping. A RenamingQuery renames the fields of another query. This is useful for achieving consistency among field names. The final type of query is an ImmediateQuery. This is a special type of query decorator that evaluates and caches the results of its wrapped query. Instead of letting the database compute the values for the overall query, ImmediateQueries signify that part of the calculation should be performed in Smalltalk. They can be used for performance optimizations and also when values from one database must be merged with values from another database. Since ImmediateQueries signify that part of the query calculation should be performed in Smalltalk, we can present a unified query model that can straddle several databases without the developer needing to write special code.

Figure 2: Structural diagram for QueryObject

Figure 2 shows the structural diagram for QueryObjects. The query operations have been split out under a WrapperQuery, which defines some common behavior for all operational queries. The queries that need QueryExpressions have been further split out under ExpressionWrapperQuery.

QueryObjects support a protocol to retrieve values from the database through the value, valueIfAbsent:, values, and valuesAsObjects messages. Both the value and valueIfAbsent: messages expect the query to return zero or one row from the database, whereas values and valuesAsObjects can return zero or more rows. If a query sent value or valueIfAbsent: returns more than one row, an error is raised. The valuesAsObjects message is used when you wish to return the values from the query as Smalltalk data model objects, and not as arrays.
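The single-row versus multi-row contract can be sketched in Python as follows. `StubQuery` is a hypothetical stand-in whose rows come from a list rather than a database, and returning `None` from `value` when no row exists is an assumption; the text does not say what value answers in that case.

```python
class StubQuery:
    """Stands in for a QueryObject; rows come from a list, not a database."""

    def __init__(self, rows):
        self._rows = rows

    def values(self):
        # Zero or more rows are always acceptable here.
        return list(self._rows)

    def value_if_absent(self, default_block):
        rows = self.values()
        if len(rows) > 1:
            raise ValueError("query returned more than one row")
        # Evaluate the default only when no row came back.
        return rows[0] if rows else default_block()

    def value(self):
        # Assumption: an empty result yields None.
        return self.value_if_absent(lambda: None)


hours = StubQuery([(40,)])
nothing = StubQuery([])
too_many = StubQuery([(1,), (2,)])

try:
    too_many.value()
    raised = False
except ValueError:
    raised = True
```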

In addition to the value retrieval protocol, there are also methods that return fields from the query so that they can be used to create QueryExpressions. Two main methods provide this support: @@ and fieldNames. The @@ message returns a QueryExpression that represents the field whose name is given as its argument, and the fieldNames message returns the list of field names that are available to the query.

There are also several "helper" methods defined by QueryObjects. These methods allow you to create new QueryObjects based on the receiver. For example, you can join two query objects together by using the join: message.

In addition to the public protocol for retrieving values, creating QueryExpressions, and creating new QueryObjects, there is also a private protocol for converting the QueryObjects into SQL by interpreting them. The main method that builds the query is the buildQuery Template Method in QueryObject. It uses the answerBlock, selectBlock, orderByBlock, and groupByBlock methods to help build the query. Each of these builds a specific part of the query.

QueryExpressions

As mentioned in the previous section, QueryExpressions specify the expressions for the different parts of the SQL query. A query such as

SELECT employee_id, SUM(hours)
FROM time_cards
WHERE (date < '1/1/98') AND (date >= '1/1/97')
GROUP BY employee_id
ORDER BY employee_id

has four expressions (the FROM line is formed from JoinQueries, not expressions). Looking at each expression closely, we see that they are almost in Smalltalk syntax. Renaming a few of the logical operators to their Smalltalk equivalents (e.g., AND becomes &), we can convert everything except for the functions. Functions with one argument are easily converted to unary messages, and functions with more arguments are converted to keyword messages. Once we convert the functions into messages, we see that each expression is a series of message sends to a field in a table. Therefore, we can represent each expression as a parse tree of message and field nodes. Since queries can also refer to constant values, we also need parse nodes for constant values. These nodes can either hold a constant such as 100, or hold a ValueModel that holds the constant. If the value node holds a ValueModel, then when the query is evaluated, the current value of the ValueModel is used.
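A parse tree of this kind can be sketched in Python. The class names `Field`, `Constant`, and `Message` and the `to_sql` method are illustrative assumptions. The actual framework delegates SQL generation to Lens, the quoting here is naive and not safe for real input, and only unary and one-argument messages are handled.

```python
class Holder:
    """Minimal stand-in for a ValueModel."""

    def __init__(self, value):
        self.value = value


class Field:
    def __init__(self, name):
        self.name = name

    def to_sql(self):
        return self.name


class Constant:
    """Holds either a plain constant or a Holder that is read at evaluation time."""

    def __init__(self, value):
        self._value = value

    def to_sql(self):
        current = self._value.value if isinstance(self._value, Holder) else self._value
        if isinstance(current, str):
            return f"'{current}'"  # naive quoting, for illustration only
        return str(current)


class Message:
    def __init__(self, receiver, selector, *arguments):
        self.receiver = receiver
        self.selector = selector
        self.arguments = arguments

    def to_sql(self):
        if not self.arguments:
            # A unary message maps to a one-argument SQL function.
            return f"{self.selector}({self.receiver.to_sql()})"
        (argument,) = self.arguments  # this sketch handles a single argument
        return f"({self.receiver.to_sql()} {self.selector} {argument.to_sql()})"


end_date = Holder("1/1/98")
where = Message(
    Message(Field("date"), "<", Constant(end_date)),
    "AND",
    Message(Field("date"), ">=", Constant("1/1/97")),
)
total = Message(Field("hours"), "SUM")

before = where.to_sql()
end_date.value = "1/1/99"  # the tree is untouched; only the held value changes
after = where.to_sql()
```

The same tree produces different SQL text after the holder changes, which is the behavior the value nodes exist to provide.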

Figure 3: QueryExpression's structural diagram

Figure 3 shows the structural diagram for QueryExpressions. In addition to the three types of parse nodes, there is also a RenamedFieldQueryExpression class that is used together with the RenamingQuery. Since each field of the answer of a RenamingQuery can refer to many fields of its wrapped query, we need a reference to the expression that created that field. A RenamedFieldQueryExpression holds onto the original expression for the renamed field. For example, given the query above, we might want to rename the fields of the answer to be "employee_id" and "hours_worked". For such a query we would need two RenamedFieldQueryExpressions: one for the "employee_id" expression and one for the "SUM(hours)" expression.

There are three different protocols for QueryExpressions. One is used for easily forming the expressions. It consists mainly of redefining the doesNotUnderstand: message. This makes it easy to construct the parse trees simply by executing Smalltalk code. Whenever the doesNotUnderstand: message is received, the QueryExpression constructs a MessageQueryExpression with itself as the receiver. Although the doesNotUnderstand: mechanism can handle most messages, there are a few that must be overridden explicitly since they are already defined by Object (e.g., isNil).
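Python offers a rough analogue of this technique through `__getattr__`, which is called only when normal attribute lookup fails. The sketch below is an illustration under that analogy, with hypothetical class names, and is not the Smalltalk implementation. It also shows the same wrinkle: `==` is already defined on every Python object, so it never reaches the fallback and must be overridden explicitly.

```python
class QueryExpression:
    def __getattr__(self, selector):
        # Reached only when normal lookup fails, much like doesNotUnderstand:.
        if selector.startswith("__"):
            raise AttributeError(selector)  # leave Python's own protocols alone

        def capture(*arguments):
            # Record the message instead of executing it.
            return MessageExpression(self, selector, arguments)

        return capture

    # == is defined on object, so the fallback above never sees it.
    def __eq__(self, other):
        return MessageExpression(self, "=", (other,))

    __hash__ = object.__hash__  # keep instances hashable after overriding __eq__


class FieldExpression(QueryExpression):
    def __init__(self, name):
        self.name = name


class MessageExpression(QueryExpression):
    def __init__(self, receiver, selector, arguments):
        self.receiver = receiver
        self.selector = selector
        self.arguments = arguments


hours = FieldExpression("hours")
total = hours.sum()  # unary message captured as a node
in_range = FieldExpression("date").between("1/1/97", "1/1/98")
matches = FieldExpression("employee_id") == "12345"
nested = hours.sum().rounded()  # messages to a message node nest naturally
```

Running ordinary-looking code thus builds the parse tree as a side effect, without a separate parser.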

Another protocol is responsible for converting the expression into SQL code. Although we could generate our own SQL code, we rely on the VisualWorks Lens framework to generate it for us. Since we use the Lens framework, this protocol consists of only one message: valueUsingMapping:. This message returns the Lens object that is equivalent to the expression. Since the blocks used in Lens queries are similar to our QueryExpressions, the valueUsingMapping: method simply evaluates the QueryExpression in the context needed by the Lens blocks.

The final protocol supports the Observer pattern. Since QueryExpressions can contain ValueModels, they need to update their dependents when those ValueModels change. These dependents can be either other QueryExpressions or QueryObjects. Whenever an expression changes, the query that contains it must be recomputed.
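The propagation chain can be sketched in Python. All names are hypothetical, and marking the query with a `needs_recompute` flag is an assumption about how recomputation might be triggered; the text only states that the containing query must be recomputed.

```python
class Observable:
    def __init__(self):
        self._dependents = []

    def add_dependent(self, dependent):
        self._dependents.append(dependent)

    def changed(self):
        for dependent in self._dependents:
            dependent.update()


class Holder(Observable):
    """Stand-in for a ValueModel."""

    def __init__(self, value):
        super().__init__()
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        self.changed()


class ConstantExpression(Observable):
    """An expression node whose constant lives in a Holder."""

    def __init__(self, holder):
        super().__init__()
        self.holder = holder
        holder.add_dependent(self)

    def update(self):
        # Forward the change to whatever contains this expression.
        self.changed()


class Query(Observable):
    def __init__(self, expression):
        super().__init__()
        self.expression = expression
        self.needs_recompute = False
        expression.add_dependent(self)

    def update(self):
        self.needs_recompute = True
        self.changed()  # queries built on this one are notified too


order_number = Holder(1001)
query = Query(ConstantExpression(order_number))
stale_before = query.needs_recompute
order_number.set(1002)
stale_after = query.needs_recompute
```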