Tuesday, 26 June 2007

In Genome, you can use ToObject() to return a single object from a query that has zero or one result element. The Genome documentation gives the following explanation about the restriction on the number of result elements:

The method should not be called for sets that may contain more than one element. Calling the method for these kinds of sets results in different behaviours based on the database platform and the calling context. If you need to retrieve the first element of a set, the combination of Set.GetRange and Set.ToObject has to be used, as in the following example.

But when can you be sure that a query returns only zero or one element? And what happens if you do not follow that advice?

A common case that would require using ToObject() is when you need to map an inverse object reference of a 1:1 relationship in the domain model.

Imagine the following example, where a company car can be assigned to one or no employee: In this case, the company car has an object reference to the employee it can be assigned to, represented by a foreign key in the database. Vice-versa, the employee has an object reference to the assigned company car, implemented through a lookup query that returns the car that is being assigned to this employee:

public abstract class CompanyCar
{
	...
	public abstract Employee AssignedTo { get; set; }
	...
}

<Type name="CompanyCar">
	...
	<Member name="AssignedTo"><NearObjectReference/></Member>
</Type>

public abstract class Employee
{
	...
	public abstract CompanyCar AssignedCar { get; }
	...
}

<Type name="Employee">
	...
	<Member name="AssignedCar" Oql="extentof(CompanyCar)[ccar: ccar.AssignedTo==this].ToObject()" />
	...
</Type>

Note the following details:

  1. Employee.AssignedCar is readonly, while CompanyCar.AssignedTo is read/writeable. This is logical, since Employee.AssignedCar is only mapped to a query, where you cannot “set” the result. Of course, you can implement a more sophisticated property on Employee which would allow the car to be set directly for an employee, but I leave this out for simplicity’s sake.
  2. The lookup query mapped to Employee.AssignedCar retrieves a single car instance by using ToObject(). This assumes that the lookup query returns only zero or one result element, which is the point I wanted to discuss in this article.

Having mapped this, you can freely navigate from CompanyCar to Employee and vice-versa, as shown below.

Navigating from CompanyCar to Employee executes a lookup query for the foreign key against the database:

SELECT ... FROM Employee WHERE Id = {CompanyCar.AssignedTo}

Navigating from Employee to CompanyCar executes a lookup query for the primary key of the Employee instance in the AssignedTo fields of the CompanyCar table:

SELECT TOP 1 ... FROM CompanyCar WHERE AssignedTo = {Employee.Id}

The beauty of this mapping is that the domain model’s user does not need to be aware in which direction the relationship is mapped in the database. You can even build more complex queries using the property . For example, finding all Employees that have a CompanyCar assigned to them is easy in OQL:

extentof(Employee)[AssignedCar != null]

This translates to the following SQL:

SELECT ... FROM Employee
  LEFT OUTER JOIN CompanyCar ON CompanyCar.AssignedToId=Employee.Id
  WHERE NOT CompanyCar.Id IS NULL

If you change the database schema to point the foreign key in the other direction, the same OQL is translated to the following SQL:

SELECT ... FROM Employee WHERE NOT AssignedCar.Id IS NULL

The important point that I want to make is that ToObject() works fine as long as you can be sure it will return only zero or one result. In my example, if there were more than one car assigned to an employee, then the SQL query would return duplicate employee entries for those employees with more than one car assigned:

SELECT ... FROM Employee 
  LEFT OUTER JOIN CompanyCar ON CompanyCar.AssignedToId=Employee.Id
  WHERE NOT CompanyCar.Id IS NULL

A wrong approach to fixing this problem is to use a distinct projection, eliminating the duplicate employee entries from the result:

 [this distinct]extentof(Employee)[AssignedCar != null]

This translates to:

SELECT DISTINCT ... FROM Employee 
  LEFT OUTER JOIN CompanyCar ON CompanyCar.AssignedToId=Employee.Id
  WHERE NOT CompanyCar.Id IS NULL

There are several reasons why using a distinct projection is not a good solution to the problem in this case. First of all, you do not want to change the semantics of your query in OQL, just to work around this problem. There might be many other places where you run into similar problems which you would have to fix with additional projections or other hacks one by one. Second, DISTINCT limits the query in some cases, e.g. you cannot sort by a field not contained in the selector anymore when using DISTINCT.

The right approach to solving this problem reflects the fact that there can be more than one CompanyCar assigned to an employee in the mapping of the relationship. There are two ways of doing this.

The simplest is to tell Genome to expect more than one result in the lookup, and return only the first. This can make sense if you want to return “any” of the assigned cars. If you combine it with an order criterion that defines which cars to return first, this can be even more meaningful. The following mapping would return the most expensive car assigned to an employee:

<Type name="Employee">
	...
	<Member name="AssignedCar" Oql="extentof(CompanyCar)[ccar: ccar.AssignedTo==this].OrderBy([Price descending]).GetRange(0,1).ToObject()" />
	...
</Type>

.OrderBy([Price descending]) ensures that the most expensive car is returned first.

GetRange(0,1) tells Genome to make sure only one result is returned. Depending on how AssignedCar is used in another OQL, Genome shapes the resulting query accordingly.

After mapping AssignedCar as above, the original query

extentof(Employee)[AssignedCar != null]

is now translated to the following SQL:

SELECT ... FROM Employee
  WHERE NOT(
    SELECT TOP 1 FROM CompanyCar WHERE CompanyCar.AssignedToId = Employee.Id
    ORDER BY CompanyCar.Price DESC
  ) IS NULL

Note that, because GetRange(0,1) is used in the mapping of Employee.AssignedCar, Genome implements the same OQL to search for employees without a car, using a sub-query instead of using LEFT OUTER JOIN.

Depending on your business use case, you might choose to do more complex refactoring of your business model to reflect that more than one car can be assigned to an employee. For example, you could introduce an additional property to Employee that returns Set with all assigned cars. You would still need to define how queries that retrieve only a single instance of a car should work, no matter how many cars are assigned to an employee.

Summary

When using ToObject() alone to retrieve a single element from a set, make sure that the set can only contain zero or one element. Otherwise, you will end up with unwanted side effects when building more complex queries based on this query.

When you know a query can return more than one element, but you just want to fetch the top element, use GetRange(0,1) in combination with ToObject(). For example, if you want to retrieve the most expensive car from the car pool, use

extentof(CompanyCar).OrderBy([Price descending]).GetRange(0,1).ToObject()

It is also possible that your data is inconsistent; more than one CompanyCar may be assigned to an Employee, although business rules forbid this. In this case, this is a bug in your software which you should resolve otherwise (e.g. through proper business rule checks or database constraints). Using GetRange(0,1) does not really help, as it does not represent the business intent you want to implement.

It may also be tempting to use GetRange(0,1) “just to be sure”, but note that this adds unnecessary performance overhead. For example, joining in a TOP query is far more complex and slow than joining in the same without GetRange(0,1). In fact, this is the reason why we have not included GetRange(0,1) in ToObject().

Posted by Chris

Technorati Tags: object relational, getting started

Genome | OQL