We usually think of values as being the attributes of objects, or sometimes we think of them as being special classes. For example, an Employee object will have attributes like salary and hiring date, and these values will be instances of classes like Money or Date, which we might consider value classes.
But there are other ways that objects can represent values. VisualWorks has a class called ValueModel that represents a single value. Not only can clients of a ValueModel read and (usually) write its value, they can become its dependents and be notified when it changes. GUI widgets usually depend on a single ValueModel. So, if an Employee object stores its salary in a ValueModel then a text widget can depend on the salary and be notified when it changes.
The most common ValueModel is a ValueHolder, which is just a container of a value. Instead of storing its attribute in an instance variable, an object can store its attribute in a ValueHolder, which can be stored in an instance variable. A read or a write to the instance variable must then get converted to a message to the ValueHolder. This lets clients depend on the attribute and be notified when it changes.
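The idea of storing an attribute in a holder can be sketched outside Smalltalk. The following is a minimal Python analogue, not the VisualWorks implementation: the names add_dependent, salary_holder, and widget_text are assumptions made for illustration, and a plain list stands in for a text widget.

```python
class ValueHolder:
    """A container for one value that tells its dependents about changes."""

    def __init__(self, value=None):
        self._value = value
        self._dependents = []          # callables that take the new value

    def add_dependent(self, callback):
        self._dependents.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for notify in list(self._dependents):
            notify(new_value)


class Employee:
    """Stores its salary in a holder instead of a bare instance variable."""

    def __init__(self, salary):
        self.salary_holder = ValueHolder(salary)

    @property
    def salary(self):                  # a read becomes a message to the holder
        return self.salary_holder.value

    @salary.setter
    def salary(self, amount):          # and so does a write
        self.salary_holder.value = amount


employee = Employee(50000)
widget_text = []                       # stands in for a text widget
employee.salary_holder.add_dependent(lambda v: widget_text.append(str(v)))
employee.salary = 52000                # the "widget" is notified of the change
```

Clients of Employee still read and write salary as before; only code that wants notification needs to know about the holder.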
Most other ValueModels are adapters. For example, a date adapter converts a value that is a date to a value that is a string. Thus, a date adapter might translate between a text widget, which expects a string, and a ValueHolder containing a date.
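A date adapter of this kind might look roughly as follows in Python. This is only a sketch: the class name DateStringAdapter and the display format are assumptions, and the real VisualWorks adapters have a richer protocol.

```python
from datetime import date, datetime


class ValueHolder:
    def __init__(self, value):
        self.value = value


class DateStringAdapter:
    """Presents a holder of dates as a value that is a string."""

    FORMAT = "%m/%d/%Y"                # hypothetical display format

    def __init__(self, subject):
        self._subject = subject

    @property
    def value(self):
        # Reading converts the underlying date to a string.
        return self._subject.value.strftime(self.FORMAT)

    @value.setter
    def value(self, text):
        # Writing parses the string and stores a date in the subject.
        self._subject.value = datetime.strptime(text, self.FORMAT).date()


hiring_date = ValueHolder(date(1997, 1, 1))
adapter = DateStringAdapter(hiring_date)
shown = adapter.value                  # what a text widget would display
adapter.value = "06/15/1997"           # what a user might type
```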
The most interesting ValueModels are ComputedValues, which define a value in terms of others. ComputedValues are often defined as a function of other ValueModels, and as such they use the dependents feature of ValueModels to stay up to date with those ValueModels. Many items can be based on other values. For example, the value PROFIT is really a function of two other values, SALES minus COSTS: whenever SALES or COSTS changes, so does PROFIT. Such a function is represented by a ComputedValue, and this particular function is represented by a BlockValue (a subclass of ComputedValue). BlockValues are special ComputedValues that use a Smalltalk block as their function. Whenever the value is needed, the block is evaluated.
In addition to BlockValues, which compute their functions through Smalltalk blocks, there are also QueryValues, which compute their values from queries. This allows the model to be hooked directly into the database without writing specific code to transfer values from queries to ValueModels.
The inheritance diagram for the ValueModels is shown in Figure 1. Although the actual VisualWorks® ValueModel hierarchy has several additional classes, only the important ones for the framework are shown.
The default protocol for ValueModel is very limited. It contains only the value message, which returns the model's value. But since many ValueModels are also used in formulas by ComputedValues, it should be easy to combine them into formulas. We added methods for arithmetic functions like + and - to ValueModel that automatically create a ComputedValue.
The result is that instead of creating a BlockValue for PROFIT with code such as:
    BlockValue
        block: [:sales :costs | sales - costs]
        arguments: (Array with: salesHolder with: costsHolder)
we can define the profit as "salesHolder - costsHolder". The definition of - in ValueModel is:
    - aValue
        ^BlockValue
            block: [:x :y | x - y]
            arguments: (Array with: self with: aValue)
We added to ValueModel all of the basic arithmetic operations and a few operations for string and date manipulation. The result is that it is easy to define new ValueModels from old ones.
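To make the mechanism concrete, here is a small Python analogue of arithmetic on ValueModels. It is a sketch under assumptions, not the framework's code: Python operator overloading stands in for the Smalltalk binary messages, and add_dependent and changed are illustrative names.

```python
class ValueModel:
    def __init__(self):
        self._dependents = []

    def add_dependent(self, callback):
        self._dependents.append(callback)

    def changed(self):
        for notify in list(self._dependents):
            notify()

    # Arithmetic on models yields a new computed model, not a number.
    def __add__(self, other):
        return BlockValue(lambda x, y: x + y, [self, other])

    def __sub__(self, other):
        return BlockValue(lambda x, y: x - y, [self, other])


class ValueHolder(ValueModel):
    def __init__(self, value):
        super().__init__()
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        self.changed()


class BlockValue(ValueModel):
    def __init__(self, block, arguments):
        super().__init__()
        self._block = block
        self._arguments = arguments
        for argument in arguments:
            argument.add_dependent(self.changed)   # pass changes onward

    @property
    def value(self):
        # Evaluate the block with the current values of the arguments.
        return self._block(*(a.value for a in self._arguments))


sales = ValueHolder(1000)
costs = ValueHolder(400)
profit = sales - costs                 # analogous to "salesHolder - costsHolder"
notifications = []
profit.add_dependent(lambda: notifications.append(profit.value))
costs.value = 450                      # profit's dependents hear about it
```

Because the result of the subtraction is itself a ValueModel, it can be combined again (for example, profit + bonus) without any extra wiring.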
ValueHolders have a direct reference to their value, and as a result have a value: method to set this value. This message is mainly used by interface widgets, but it can also be used programmatically to change the value. ComputedValues also add a few messages to the basic ValueModel protocol that determine how they compute their values. The value can be computed eagerly or on demand, as selected by sending the eagerEvaluation: message. Values that are associated with queries turn off eager evaluation, since queries can take seconds to compute. Both BlockValue and QueryValue extend ComputedValue's interface only by adding messages to initialize their blocks and queries.
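The difference between eager and on-demand evaluation can be illustrated with a short Python sketch. The eager flag below plays the role of eagerEvaluation:, and the evaluations counter exists only to make the behavior visible; both are assumptions of this example rather than the framework's protocol.

```python
class ValueHolder:
    def __init__(self, value):
        self._value = value
        self._dependents = []

    def add_dependent(self, callback):
        self._dependents.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for notify in list(self._dependents):
            notify()


class ComputedValue:
    def __init__(self, block, arguments, eager=False):
        self._block = block
        self._arguments = arguments
        self.eager = eager
        self.evaluations = 0           # how often the block has run
        self._cached = None
        self._stale = True
        for argument in arguments:
            argument.add_dependent(self._argument_changed)
        if eager:
            self._recompute()

    def _recompute(self):
        self._cached = self._block(*(a.value for a in self._arguments))
        self._stale = False
        self.evaluations += 1

    def _argument_changed(self):
        self._stale = True
        if self.eager:                 # eager: recompute on every change
            self._recompute()

    @property
    def value(self):
        if self._stale:                # on demand: recompute only when read
            self._recompute()
        return self._cached


hours = ValueHolder(8)
rate = ValueHolder(10)
on_demand = ComputedValue(lambda h, r: h * r, [hours, rate])
eager = ComputedValue(lambda h, r: h * r, [hours, rate], eager=True)
hours.value = 9
rate.value = 11
```

After the two changes the eager value has run its block three times (once at creation, once per change), while the on-demand value has not run it at all. For a block that issues a database query, that difference is the reason to turn eager evaluation off.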
Not all of an application's information will be stored in memory; some of it is stored externally in a database. Whenever such a value is needed, the database is queried and the value returned. For example, consider a payroll system. There might be one function that lists the number of hours worked by an employee during a time interval. For such a system, there would be an interface that allows the user to enter the employee id and date range. Once they are entered, the database is queried for the number of hours worked.
Relational databases have their own language for specifying queries, most often SQL. For our application to query a SQL database, it must send its commands as SQL statements. A typical SQL statement has the form:
SELECT fields FROM tables WHERE conditions GROUP BY fields ORDER BY fields
The WHERE, GROUP BY, and ORDER BY parts of the statement are optional. For the hours-worked example above, we would have a SQL statement such as:
SELECT SUM(hours) FROM time_cards WHERE employee_id = '12345' AND date < '1/1/98' AND date >= '1/1/97'
While we could model each query as a string, many queries have similar parts, and these parts might change over time. For example, we might have several queries that share a WHERE clause specifying that the records returned should fall within a date range. If we needed to extend that clause with another condition, we would have to change every string in the code that contains it. Clearly, this is undesirable.
Another way to model the queries might be to make an object that holds each query part as a string and concatenates the parts when the query is executed, but we would like to include other Smalltalk objects besides strings in our expressions. Instead of constructing a string from code like "'orderNumber = ', orderNumberHolder value printString", we would rather construct the expression from code like "salesTable orderNumber = orderNumberHolder". When the query is evaluated, the expression is turned into the appropriate SQL statement string using the holder's current value. With the string approach, by contrast, we must update the string whenever the orderNumberHolder ValueModel changes.
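The benefit of keeping the holder itself in the expression, rather than its printed value, can be shown with a small Python sketch. The classes Field and Condition and the method to_sql are hypothetical names for this illustration; only numeric operands are handled, and no SQL quoting or escaping is attempted.

```python
class ValueHolder:
    def __init__(self, value):
        self.value = value


class Field:
    def __init__(self, table, name):
        self.table, self.name = table, name

    def __eq__(self, other):           # builds a condition, not a boolean
        return Condition(self, "=", other)

    __hash__ = object.__hash__         # keep fields hashable despite __eq__


class Condition:
    def __init__(self, field, operator, operand):
        self.field, self.operator, self.operand = field, operator, operand

    def to_sql(self):
        operand = self.operand
        if isinstance(operand, ValueHolder):
            operand = operand.value    # read the holder only at evaluation time
        return f"{self.field.table}.{self.field.name} {self.operator} {operand}"


order_number_holder = ValueHolder(1001)
condition = Field("sales", "order_number") == order_number_holder
before = condition.to_sql()
order_number_holder.value = 2002       # no string needs to be rebuilt by hand
after = condition.to_sql()
```

The same condition object produces different SQL text before and after the holder changes, which is exactly what a pre-built string cannot do.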
Instead of using strings to model the queries, we chose QueryObjects to model queries and QueryExpressions to model the individual expressions (e.g., WHERE clauses, GROUP BY clauses, etc.). QueryObjects then construct their SQL statements at runtime using the QueryExpressions.
The results returned by a query are inherently table-like: they have rows, which are separated into fields. Many databases even allow you to create database views from a query. These views can then be used in other queries as if they were real tables. We want a similar feature in our QueryObjects, but instead of having to create a database view for each query, we want to be able to use QueryObjects inside other QueryObjects.
The fundamental type of QueryObject is a TableQuery. TableQueries represent tables in the database and are the basic building blocks for all other queries. They correspond to the tables listed in the FROM clause of a SQL statement. Evaluating a TableQuery by itself just returns all the records in the table. Whenever multiple tables are listed in the FROM clause of a SQL statement, they are "joined" together. This operation is represented by a JoinQuery. JoinQueries join two QueryObjects to form one QueryObject.
We also need QueryObjects to specify the other clauses of a SQL query. For the SELECT part, we use a ProjectionQuery, since the SELECT part tells the database which fields to project in the result. The WHERE clause is modeled by a SelectionQuery, since it tells the database which rows to select. The ORDER BY and GROUP BY clauses are represented by OrderQuery and GroupQuery, respectively. Since all of these queries deal with expressions, they hold QueryExpressions that create the SQL code for their respective clauses.
In addition to the SQL syntax described above, there is an additional keyword that can appear in a SELECT statement. You can use the "DISTINCT" keyword to specify that all records returned by the query should be unique. This is modeled in our system by a DistinctQuery which wraps another query and returns only the unique rows.
In addition to the QueryObjects defined above, there are a couple of additional QueryObjects that don't have a direct SQL mapping. A RenamingQuery renames the fields of another query. This is useful for keeping field names consistent. The final type of query is an ImmediateQuery. This is a special type of query decorator that evaluates and caches the results of its wrapped query. Instead of letting the database compute the values for the overall query, ImmediateQueries signify that part of the calculation should be performed in Smalltalk. They can be used for performance optimizations and also when values from one database must be merged with values from another database. Because that part of the calculation is performed in Smalltalk, we can present a unified query model that straddles several databases without the developer needing to write special code.
Figure 2: Structural diagram for QueryObject
Figure 2 shows the structural diagram for QueryObjects. The query operations have been split out under a WrapperQuery, which defines some common behavior for all operational queries. The queries that need QueryExpressions have been further split out under ExpressionWrapperQuery.
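The way wrapper queries stack on top of a TableQuery can be sketched in Python as follows. This is a simplified illustration and not the framework's design: clauses are passed as plain strings here (the framework uses QueryExpressions), the parts and extend methods are hypothetical, joins, renaming, and immediate queries are left out, and all wrappers are flattened into a single SELECT statement.

```python
class QueryObject:
    def to_sql(self):
        p = self.parts()
        sql = "SELECT " + ("DISTINCT " if p["distinct"] else "")
        sql += ", ".join(p["select"]) or "*"
        sql += " FROM " + ", ".join(p["from"])
        if p["where"]:
            sql += " WHERE " + " AND ".join(f"({w})" for w in p["where"])
        if p["group_by"]:
            sql += " GROUP BY " + ", ".join(p["group_by"])
        if p["order_by"]:
            sql += " ORDER BY " + ", ".join(p["order_by"])
        return sql


class TableQuery(QueryObject):
    def __init__(self, name):
        self.name = name

    def parts(self):
        # A fresh description each time, so wrappers can modify it freely.
        return {"select": [], "from": [self.name], "where": [],
                "group_by": [], "order_by": [], "distinct": False}


class WrapperQuery(QueryObject):
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def parts(self):
        p = self.wrapped.parts()
        self.extend(p)                 # each wrapper adds its own clause
        return p


class DistinctQuery(WrapperQuery):
    def extend(self, p):
        p["distinct"] = True


class ExpressionWrapperQuery(WrapperQuery):
    def __init__(self, wrapped, expressions):
        super().__init__(wrapped)
        self.expressions = list(expressions)


class ProjectionQuery(ExpressionWrapperQuery):
    def extend(self, p):
        p["select"] = list(self.expressions)


class SelectionQuery(ExpressionWrapperQuery):
    def extend(self, p):
        p["where"].extend(self.expressions)


class GroupQuery(ExpressionWrapperQuery):
    def extend(self, p):
        p["group_by"].extend(self.expressions)


class OrderQuery(ExpressionWrapperQuery):
    def extend(self, p):
        p["order_by"].extend(self.expressions)


time_cards = TableQuery("time_cards")
in_1997 = SelectionQuery(time_cards, ["date < '1/1/98'", "date >= '1/1/97'"])
totals = ProjectionQuery(in_1997, ["employee_id", "SUM(hours)"])
report = OrderQuery(GroupQuery(totals, ["employee_id"]), ["employee_id"])
report_sql = report.to_sql()
```

Because every wrapper is itself a QueryObject, an intermediate query such as in_1997 can be reused as the starting point for several different reports.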
QueryObjects support a protocol to retrieve values from the database through the value, valueIfAbsent:, values, and valuesAsObjects messages. Both the value and valueIfAbsent: messages expect the query to return zero or one row from the database, whereas values and valuesAsObjects can return zero or more rows. If a query evaluated with value or valueIfAbsent: returns more than one row, an error is raised. The valuesAsObjects message is used when you wish to return the values from the query as Smalltalk data model objects rather than as arrays.
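The row-count rules of this protocol can be illustrated with Python's standard sqlite3 module. The class SqlQuery below is a hypothetical stand-in that wraps a SQL string directly; returning None from value when no row is found is an assumption of this sketch, and valuesAsObjects is omitted.

```python
import sqlite3


class SqlQuery:
    def __init__(self, connection, sql, parameters=()):
        self._connection = connection
        self._sql = sql
        self._parameters = parameters

    def values(self):
        """Zero or more rows, each returned as a tuple."""
        return self._connection.execute(self._sql, self._parameters).fetchall()

    def value_if_absent(self, absent_block):
        """Zero or one row; more than one row is an error."""
        rows = self.values()
        if len(rows) > 1:
            raise ValueError("query returned more than one row")
        return rows[0] if rows else absent_block()

    def value(self):
        return self.value_if_absent(lambda: None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE time_cards (employee_id TEXT, hours REAL)")
conn.executemany("INSERT INTO time_cards VALUES (?, ?)",
                 [("12345", 8.0), ("12345", 7.5), ("67890", 6.0)])

total = SqlQuery(conn, "SELECT SUM(hours) FROM time_cards WHERE employee_id = ?",
                 ("12345",)).value()
every_row = SqlQuery(conn, "SELECT employee_id, hours FROM time_cards").values()
missing = SqlQuery(conn, "SELECT hours FROM time_cards WHERE employee_id = ?",
                   ("00000",)).value_if_absent(lambda: (0.0,))
```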
In addition to the value retrieval protocol, there are also methods that return fields from the query so that they can be used to create QueryExpressions. Two main methods provide this support: @@ and fieldNames. The @@ message returns a QueryExpression that represents the field named by its argument, and the fieldNames message returns the list of field names that are available to the query.
There are also several "helper" methods defined by QueryObjects. These methods allow you to create new QueryObjects based on the receiver. For example, you can join two query objects together by using the join: message.
In addition to the public protocol for retrieving values, creating QueryExpressions, and creating new QueryObjects, there is also a private protocol for converting QueryObjects into SQL by interpreting them. The main method that builds the query is the buildQuery Template Method in QueryObject. It uses the answerBlock, selectBlock, orderByBlock, and groupByBlock methods, each of which builds a specific part of the query.
As mentioned in the previous section, QueryExpressions specify the expressions for the different parts of the SQL query. A query such as
SELECT employee_id, SUM(hours) FROM time_cards WHERE (date < '1/1/98') AND (date >= '1/1/97') GROUP BY employee_id ORDER BY employee_id
has four expressions (the FROM line is formed from JoinQueries, not expressions). Looking at each expression closely, we see that they are almost in Smalltalk syntax. By renaming a few of the logical operators to their Smalltalk equivalents (e.g., AND becomes &), we can convert everything except the functions. Functions with one argument are easily converted to unary messages, and functions with more arguments are converted to keyword messages. Once we convert the functions into messages, we see that each expression is a series of message sends to a field in a table. Therefore, we can represent each expression as a parse tree of message and field nodes. Since queries can also refer to constant values, we also need parse nodes for constant values. These nodes can either hold a constant such as 100 or hold a ValueModel that holds the constant. If the value node holds a ValueModel, then the current value of the ValueModel is used when the query is evaluated.
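The three kinds of parse nodes can be sketched in Python as shown below. The class names FieldNode, ConstantNode, and MessageNode and the to_sql method are assumptions for illustration (the framework hands SQL generation to the Lens framework instead). Keyword messages are left out, and string constants are quoted without any escaping.

```python
class ValueHolder:
    def __init__(self, value):
        self.value = value


class FieldNode:
    def __init__(self, name):
        self.name = name

    def to_sql(self):
        return self.name


class ConstantNode:
    """Holds either a plain constant or a ValueHolder containing one."""

    def __init__(self, constant):
        self.constant = constant

    def to_sql(self):
        current = self.constant
        if isinstance(current, ValueHolder):
            current = current.value    # use the holder's value at evaluation time
        return f"'{current}'" if isinstance(current, str) else str(current)


class MessageNode:
    # Smalltalk-style binary selectors and their SQL spellings.
    BINARY = {"&": "AND", "|": "OR", "=": "=", "<": "<", ">=": ">="}

    def __init__(self, receiver, selector, arguments=()):
        self.receiver = receiver
        self.selector = selector
        self.arguments = list(arguments)

    def to_sql(self):
        receiver = self.receiver.to_sql()
        if self.selector in self.BINARY:
            operator = self.BINARY[self.selector]
            return f"({receiver} {operator} {self.arguments[0].to_sql()})"
        # A unary message maps to a one-argument SQL function.
        return f"{self.selector.upper()}({receiver})"


start = ValueHolder("1/1/97")
where = MessageNode(
    MessageNode(FieldNode("date"), "<", [ConstantNode("1/1/98")]),
    "&",
    [MessageNode(FieldNode("date"), ">=", [ConstantNode(start)])],
)
where_sql = where.to_sql()
total_hours_sql = MessageNode(FieldNode("hours"), "sum").to_sql()
```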
Figure 3: QueryExpression's structural diagram
Figure 3 shows the structural diagram for QueryExpressions. In addition to the three types of parse nodes, there is also a RenamedFieldQueryExpression class that is used together with the RenamingQuery. Since each field of the answer of a RenamingQuery can refer to many fields of its wrapped query, we need a reference to the expression that created that field. A RenamedFieldQueryExpression holds onto the original expression for the renamed field. For example, given the query above, we might want to rename the fields of the answer to "employee_id" and "hours_worked". For such a query we would need two RenamedFieldQueryExpressions: one for the "employee_id" expression and one for the "SUM(hours)" expression.
There are three different protocols for QueryExpressions. One is used for easily forming the expressions. It consists mainly of a redefinition of the doesNotUnderstand: message, which makes it easy to construct the parse trees simply by executing Smalltalk code. Whenever the doesNotUnderstand: message is received, the QueryExpression constructs a MessageQueryExpression with itself as the receiver. Although the doesNotUnderstand: mechanism can handle most messages, a few must be overridden explicitly because they are already defined by Object (e.g., isNil).
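Python's __getattr__ hook is a rough analogue of doesNotUnderstand: and can be used to illustrate the idea. The sketch below is not the framework's code; the describe method and the class names are assumptions. Note that it runs into the same wrinkle the text describes: operators such as == are already defined by Python's base object, so they never reach __getattr__ and must be overridden explicitly.

```python
class QueryExpression:
    def __getattr__(self, selector):
        # Called only when normal lookup fails, like doesNotUnderstand:.
        if selector.startswith("__"):
            raise AttributeError(selector)

        def send(*arguments):
            return MessageExpression(self, selector, arguments)

        return send

    # Already defined by object, so it has to be overridden by hand.
    def __eq__(self, other):
        return MessageExpression(self, "=", (other,))

    __hash__ = object.__hash__


class FieldExpression(QueryExpression):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name


class MessageExpression(QueryExpression):
    def __init__(self, receiver, selector, arguments):
        self.receiver = receiver
        self.selector = selector
        self.arguments = tuple(arguments)

    def describe(self):
        """Show the tree as selector(receiver, arguments...)."""
        parts = [self.receiver.describe()]
        for argument in self.arguments:
            if isinstance(argument, QueryExpression):
                parts.append(argument.describe())
            else:
                parts.append(repr(argument))
        return f"{self.selector}({', '.join(parts)})"


hours = FieldExpression("hours")
total = hours.sum()                    # no sum method exists; a node is built
comparison = hours.sum() == 40
```

Ordinary code such as hours.sum() therefore builds a parse tree as a side effect of being executed, which is the effect the redefined doesNotUnderstand: achieves in Smalltalk.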
Another protocol is responsible for converting the expression into SQL code. Although we could generate our own SQL code, we rely on the VisualWorks Lens framework to generate it for us. Since we use the Lens framework, this protocol consists of only one message: valueUsingMapping:. This returns the Lens object that is equivalent to the expression. Because the blocks used in Lens queries are similar to our QueryExpressions, the valueUsingMapping: method simply evaluates the QueryExpression in the context needed by the Lens blocks.
The final protocol supports the Observer pattern. Since QueryExpressions can contain ValueModels, they need to update their dependents when those ValueModels change. These dependents can be either other QueryExpressions or QueryObjects. Whenever an expression changes, the query that contains it must be recomputed.
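The chain of notifications from a ValueModel through an expression to its query can be sketched in Python as follows. The names HolderExpression, CachedQuery, and runs are hypothetical, and a plain function stands in for the database call; the sketch shows only that a change invalidates the cached result so the next read recomputes it.

```python
class ValueHolder:
    def __init__(self, value):
        self._value = value
        self._dependents = []

    def add_dependent(self, callback):
        self._dependents.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for notify in list(self._dependents):
            notify()


class HolderExpression:
    """An expression node whose constant lives in a ValueHolder."""

    def __init__(self, holder):
        self.holder = holder
        self._dependents = []
        holder.add_dependent(self.changed)

    def add_dependent(self, callback):
        self._dependents.append(callback)

    def changed(self):
        for notify in list(self._dependents):
            notify()                   # pass the change up to the query


class CachedQuery:
    def __init__(self, expression, run):
        self._expression = expression
        self._run = run                # stand-in for the database call
        self._cache = None
        self._valid = False
        self.runs = 0
        expression.add_dependent(self.invalidate)

    def invalidate(self):
        self._valid = False

    @property
    def value(self):
        if not self._valid:
            self._cache = self._run(self._expression.holder.value)
            self._valid = True
            self.runs += 1
        return self._cache


order_number = ValueHolder(1001)
query = CachedQuery(HolderExpression(order_number),
                    lambda number: f"rows for order {number}")
first = query.value
again = query.value                    # served from the cache
order_number.value = 2002              # the expression notifies the query
second = query.value
```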