Thursday, 28 June 2007

Mats Helander, whom I have had the pleasure of meeting in person several times, wrote about an O/RM challenge on his blog.

While it is always fun to participate in challenges, I want to criticize the problem Mats describes first, before I show how you can solve it with Genome.

The challenge only concerns how efficiently an O/RM can load a set of whole tables from the database. This does not make sense for two reasons:

  1. Usually, you don’t want to load all the data from a database into memory (that’s one of the reasons why you use a database).
  2. In the special cases where you do cache whole tables from a database (e.g. some lookup data), the load happens rarely (e.g. once a day), so the efficiency of loading the data matters little.

Mats phrases the challenge so that the O/RM may not JOIN the related objects when loading from the database to find out about their relationships. Instead, the O/RM should load all objects at once and “discover” the relations between the objects afterwards on its own (without using a JOIN in the database). This results in three SELECT statements (SELECT * FROM Customers; SELECT * FROM Orders; SELECT * FROM OrderLines).

An O/RM usually maintains only an identity map to cache object lookup queries. This helps object references mapped through foreign key fields in the database to be followed without extra database roundtrips (given that the related objects are already loaded into memory). This means that following an Order to its related Customer works in memory, if all data is loaded up. To discover the Orders belonging to the Customer, however, the O/RM needs to perform a lookup query.

Some O/RMs, including Genome, allow collections to be preloaded in order to avoid unnecessary roundtrips when traversing object graphs deeply. So you can tell the O/RM to preload all the Orders of the Customers retrieved, and to preload all the OrderLines of the Orders retrieved. In this case, the O/RM builds a map for relating the objects in memory while loading up the data.
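The difference between an identity map and a relation map can be sketched in a few lines of plain C#. This is only an illustration under stated assumptions: the `Customer` and `Order` classes, the `Linker` helper and the `CustomerId` field are hypothetical stand-ins and not Genome types or Genome's implementation.

```csharp
using System.Collections.Generic;

namespace RelationMapSketch
{
    // Hypothetical plain classes for illustration; these are not Genome types.
    public sealed class Customer
    {
        public int Id;
        public List<Order> Orders = new List<Order>();
    }

    public sealed class Order
    {
        public int Id;
        public int CustomerId;     // foreign key value as loaded from the Orders table
        public Customer Customer;  // resolved in memory through the identity map
    }

    public static class Linker
    {
        // Relates two independently loaded lists without any JOIN.
        public static Dictionary<int, Customer> Link(
            IEnumerable<Customer> customers, IEnumerable<Order> orders)
        {
            // Identity map: primary key -> the single in-memory instance.
            var identityMap = new Dictionary<int, Customer>();
            foreach (var c in customers)
                identityMap[c.Id] = c;

            foreach (var o in orders)
            {
                // Order -> Customer is a dictionary lookup, not a database roundtrip.
                if (identityMap.TryGetValue(o.CustomerId, out var parent))
                {
                    o.Customer = parent;
                    // Customer -> Orders needs this extra relation map; the identity
                    // map alone cannot answer "which orders belong to this customer".
                    parent.Orders.Add(o);
                }
            }
            return identityMap;
        }
    }
}
```

Calling `Linker.Link(customers, orders)` after loading both lists wires up both directions in a single pass over the orders.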

Usually you only want to load up the related children of the parent table. It doesn’t make sense to load up all orders from the database only to fulfill the orders of three specific customers. To ensure this, an O/RM typically JOINs the related data to the filtered parent table.

Not filtering the parent table is a very special case. Introducing an optimisation for this case is possible, but would make little sense (for the reasons given at the beginning). Besides that, I wonder how large the loaded tables have to be before the additional JOIN makes a significant difference and the optimisation pays off at all. I guess that with tables of that size it is out of the question to cache the results in memory anyway, which is the premise of the scenario. Another drawback of this optimisation is that it quickly becomes less efficient when the parent reference is nullable, as unnecessary data is then loaded as well.

Still, this is a challenge and a lot of people interested in O/RM read it; so, let’s solve it with Genome.

Genome provides two infrastructures for retrieving and caching relations: collections and indexing.

The collection infrastructure provides rich support for handling specialised relation types such as 1:n and n:m relations. Usually, I would recommend using Genome’s collection mapping feature to support Mats’ scenario, except that Genome uses a JOIN to limit the related objects loaded up from the database.

Indexing is a Genome infrastructure that automatically detects even complex relationships, based on the loaded data. It is more complex to configure, use and maintain, but it can support Mats’ exotic scenario. Having mapped the business layer with Genome, the following three lines of code will do the trick:

using (Context.Push(LocalContext.Create()))
{
    // Load all OrderDetails and fill the index for Order->OrderDetail
    IndexManager.FillIndex(Context.Current, dd.Extent<OrderDetail>(), 
                           IndexManager.GetIndex(dd.Schema, typeof(OrderDetail), "IdxOrder"));

    // Load all Orders and fill the index for Customer->Order
    IndexManager.FillIndex(Context.Current, dd.Extent<Order>(),
                           IndexManager.GetIndex(dd.Schema, typeof(Order), "IdxCustomer"));

    // Load all Customers
    Set<Customer> customers = dd.Extent<Customer>().ToArray();
}

Inside the using block, the first two lines of code load up all OrderDetails and all Orders. Additionally, they saturate the indexes for the relationships Order->OrderDetail and Customer->Order. The third line of code loads up all customers. When Dump(customers) traverses through the object graph, all relationships are served from memory.

Note that this feature is not limited to simple 1:n and n:m relationships. It works for more complex relationships as well, such as retrieving pending orders of a customer etc.

Posted by Chris

Technorati Tags: object relational, challenge

Thursday, 28 June 2007 14:17:52 (W. Europe Standard Time, UTC+01:00)
 Tuesday, 26 June 2007

In Genome, you can use ToObject() to return a single object from a query that has zero or one result element. The Genome documentation gives the following explanation about the restriction on the number of result elements:

The method should not be called for sets that may contain more than one element. Calling the method for these kinds of sets results in different behaviours based on the database platform and the calling context. If you need to retrieve the first element of a set, the combination of Set.GetRange and Set.ToObject has to be used, as in the following example.

But when can you be sure that a query returns only zero or one element? And what happens if you do not follow that advice?

A common case that would require using ToObject() is when you need to map an inverse object reference of a 1:1 relationship in the domain model.

Imagine the following example, where a company car can be assigned to one or no employee: In this case, the company car has an object reference to the employee it can be assigned to, represented by a foreign key in the database. Vice-versa, the employee has an object reference to the assigned company car, implemented through a lookup query that returns the car that is being assigned to this employee:

public abstract class CompanyCar
{
	public abstract Employee AssignedTo { get; set; }
}

<Type name="CompanyCar">
	<Member name="AssignedTo"><NearObjectReference/></Member>
</Type>

public abstract class Employee
{
	public abstract CompanyCar AssignedCar { get; }
}

<Type name="Employee">
	<Member name="AssignedCar" Oql="extentof(CompanyCar)[ccar: ccar.AssignedTo==this].ToObject()" />
</Type>

Note the following details:

  1. Employee.AssignedCar is readonly, while CompanyCar.AssignedTo is read/writeable. This is logical, since Employee.AssignedCar is only mapped to a query, where you cannot “set” the result. Of course, you can implement a more sophisticated property on Employee which would allow the car to be set directly for an employee, but I leave this out for simplicity’s sake.
  2. The lookup query mapped to Employee.AssignedCar retrieves a single car instance by using ToObject(). This assumes that the lookup query returns only zero or one result element, which is the point I wanted to discuss in this article.

Having mapped this, you can freely navigate from CompanyCar to Employee and vice-versa, as shown below.

Navigating from CompanyCar to Employee executes a lookup query for the foreign key against the database:

SELECT ... FROM Employee WHERE Id = {CompanyCar.AssignedTo}

Navigating from Employee to CompanyCar executes a lookup query for the primary key of the Employee instance in the AssignedTo column of the CompanyCar table:

SELECT TOP 1 ... FROM CompanyCar WHERE AssignedTo = {Employee.Id}

The beauty of this mapping is that the domain model’s user does not need to be aware in which direction the relationship is mapped in the database. You can even build more complex queries using this property. For example, finding all Employees that have a CompanyCar assigned to them is easy in OQL:

extentof(Employee)[AssignedCar != null]

This translates to the following SQL:

SELECT ... FROM Employee
  LEFT OUTER JOIN CompanyCar ON CompanyCar.AssignedToId=Employee.Id

If you change the database schema to point the foreign key in the other direction, the same OQL is translated to the following SQL:

SELECT ... FROM Employee WHERE NOT AssignedCar.Id IS NULL

The important point that I want to make is that ToObject() works fine as long as you can be sure it will return only zero or one result. In my example, if there were more than one car assigned to an employee, then the SQL query would return duplicate employee entries for those employees with more than one car assigned:

SELECT ... FROM Employee 
  LEFT OUTER JOIN CompanyCar ON CompanyCar.AssignedToId=Employee.Id
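The duplication is easy to reproduce in memory. The following is a minimal sketch using LINQ to Objects as a stand-in for the generated SQL; the `Employee` and `CompanyCar` classes and the `Queries` helper are hypothetical plain classes, not the Genome-mapped ones, and the query only mimics a LEFT OUTER JOIN followed by a null check.

```csharp
using System.Collections.Generic;
using System.Linq;

namespace ToObjectSketch
{
    // Hypothetical in-memory rows; not Genome types.
    public sealed class Employee { public int Id; public string Name; }
    public sealed class CompanyCar { public int Id; public int? AssignedToId; }

    public static class Queries
    {
        // In-memory analogue of Employee LEFT OUTER JOIN CompanyCar,
        // keeping only the rows where a car was found.
        public static List<string> EmployeesWithCarViaJoin(
            IEnumerable<Employee> employees, IEnumerable<CompanyCar> cars)
        {
            var rows =
                from e in employees
                join c in cars on (int?)e.Id equals c.AssignedToId into matches
                from car in matches.DefaultIfEmpty()   // left outer join
                where car != null
                select e.Name;
            return rows.ToList();
        }
    }
}
```

With one employee who has two cars assigned, that employee's name appears twice in the result, because the join produces one row per matching car rather than one row per employee.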

A wrong approach to fixing this problem is to use a distinct projection, eliminating the duplicate employee entries from the result:

 [this distinct]extentof(Employee)[AssignedCar != null]

This translates to:

SELECT DISTINCT ... FROM Employee
  LEFT OUTER JOIN CompanyCar ON CompanyCar.AssignedToId=Employee.Id

There are several reasons why using a distinct projection is not a good solution to the problem in this case. First of all, you do not want to change the semantics of your query in OQL, just to work around this problem. There might be many other places where you run into similar problems which you would have to fix with additional projections or other hacks one by one. Second, DISTINCT limits the query in some cases, e.g. you cannot sort by a field not contained in the selector anymore when using DISTINCT.

The right approach to solving this problem is to reflect, in the mapping of the relationship, the fact that more than one CompanyCar can be assigned to an employee. There are two ways of doing this.

The simplest is to tell Genome to expect more than one result in the lookup and to return only the first. This can make sense if you want to return “any” of the assigned cars. If you combine it with an order criterion that defines which car to return first, this can be even more meaningful. The following mapping would return the most expensive car assigned to an employee:

<Type name="Employee">
	<Member name="AssignedCar" Oql="extentof(CompanyCar)[ccar: ccar.AssignedTo==this].OrderBy([Price descending]).GetRange(0,1).ToObject()" />
</Type>

.OrderBy([Price descending]) ensures that the most expensive car is returned first.

GetRange(0,1) tells Genome to make sure only one result is returned. Depending on how AssignedCar is used in another OQL, Genome shapes the resulting query accordingly.

After mapping AssignedCar as above, the original query

extentof(Employee)[AssignedCar != null]

is now translated to the following SQL:

SELECT ... FROM Employee
    SELECT TOP 1 FROM CompanyCar WHERE CompanyCar.AssignedToId = Employee.Id
    ORDER BY CompanyCar.Price DESC

Note that, because GetRange(0,1) is used in the mapping of Employee.AssignedCar, Genome implements the same OQL to search for employees with a car using a sub-query instead of a LEFT OUTER JOIN.
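The effect of evaluating the lookup once per employee, rather than joining, can again be sketched with LINQ to Objects. This is an analogy only: `Take(1)` stands in for GetRange(0,1) and `SingleOrDefault()` for ToObject(), and the `Employee`, `CompanyCar` and `Queries` names are hypothetical plain classes, not Genome types.

```csharp
using System.Collections.Generic;
using System.Linq;

namespace GetRangeSketch
{
    // Hypothetical in-memory rows; not Genome types.
    public sealed class Employee { public int Id; public string Name; }
    public sealed class CompanyCar { public int Id; public int? AssignedToId; public decimal Price; }

    public static class Queries
    {
        // Per-employee lookup: order by price, keep at most one row.
        public static CompanyCar MostExpensiveCarOf(Employee e, IReadOnlyList<CompanyCar> cars)
        {
            return cars.Where(c => c.AssignedToId == e.Id)
                       .OrderByDescending(c => c.Price)
                       .Take(1)             // analogue of GetRange(0,1)
                       .SingleOrDefault();  // analogue of ToObject(): zero or one element
        }

        // The lookup runs as a sub-query per employee, so no employee is duplicated.
        public static List<Employee> EmployeesWithCar(
            IEnumerable<Employee> employees, IReadOnlyList<CompanyCar> cars)
        {
            return employees.Where(e => MostExpensiveCarOf(e, cars) != null).ToList();
        }
    }
}
```

Because the filter is evaluated per employee, an employee with several cars still appears exactly once, and the ordering decides which car the property yields.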

Depending on your business use case, you might choose to do a more thorough refactoring of your business model to reflect that more than one car can be assigned to an employee. For example, you could add a property to Employee that returns a Set<CompanyCar> with all assigned cars. You would still need to define how queries that retrieve only a single instance of a car should work, no matter how many cars are assigned to an employee.


When using ToObject() alone to retrieve a single element from a set, make sure that the set can only contain zero or one element. Otherwise, you will end up with unwanted side effects when building more complex queries based on this query.

When you know a query can return more than one element, but you just want to fetch the top element, use GetRange(0,1) in combination with ToObject(). For example, if you want to retrieve the most expensive car from the car pool, use

extentof(CompanyCar).OrderBy([Price descending]).GetRange(0,1).ToObject()

It is also possible that your data is inconsistent: more than one CompanyCar may be assigned to an Employee although business rules forbid this. This is a bug in your software, which you should resolve by other means (e.g. through proper business rule checks or database constraints). Using GetRange(0,1) does not really help here, as it does not represent the business intent you want to implement.

It may also be tempting to use GetRange(0,1) “just to be sure”, but note that this adds unnecessary performance overhead. For example, joining a TOP query is far more complex and slower than joining the same query without GetRange(0,1). In fact, this is the reason why we have not built GetRange(0,1) into ToObject().

Posted by Chris

Technorati Tags: object relational, getting started

Genome | OQL
Tuesday, 26 June 2007 09:46:19 (W. Europe Standard Time, UTC+01:00)
 Friday, 22 June 2007

In my previous installment, I wrote about using the Database Reverse Engineering Wizard to create classes and mappings for an existing database.  In today's post, I talk about using the Genome Web Application Wizard to create a simple, yet fully functional ASP.NET Site to view and edit entries in your database.

I'm basically continuing where I left off last time.  I have a Visual Studio Solution with two projects: one with the DataDomain classes, and one with the mapping files.  Now I add a new project using File -> Add -> New Project ...  I select Visual C# (the Wizard seems to be a C#-only affair), and then Genome Web Application.

I give the project a name, and click OK.  I click Next on the Wizard's Welcome page, and come to the page where I select the DataDomain Schema (or Mapping) project, and the Business Project.  The Wizard is smart enough to automatically detect the appropriate projects in the solution.

I click Next once more, enter the connection string again, click Finish and then Finish again.  I now have a third project in my solution, with appropriate references, CSS adapters, a default page, a master page and a couple of helper classes.

At this point, I made the mistake of thinking that the wizard had malfunctioned because there are no pages for my business objects.  There's only a default.aspx page, and that one is empty.  One of my colleagues kindly explained to me that I need to add those pages manually.  That way, I can specify exactly what such a page should contain.

So I right-click the web application project, point to Add and select New Item.  There are two Genome-related items to choose from, the Genome Details Page and the Genome List Page. I select Genome List Page and call it Orders.aspx.

I click on Add, and another part of the wizard appears.  I click Next to get past the welcome screen, choose my business object class (Order) and click Next again. A screen with several settings appears.  I leave the default values, except that I enable in-place editing.  I click Finish twice and the wizard adds Orders.aspx to the web application.

I set Orders.aspx as the project's start page and hit F5.  A browser window opens and I see a page with filter options and the results of the search.  Go ahead and play around with the filter options.  The Edit and Delete links are fully functional as well, so try those too.  Show Details leads to an HTTP 404 error, however, since we haven't yet defined a details page for the Order business entity.  I close the browser and return to Visual Studio.

I add another item to the web application, but this time I select a Genome Details Page.  I click through the wizard, select Order for my business class and basically accept all the default values.  The wizard adds an OrderDetails.aspx page to the project. I hit F5 again, and the Orders.aspx page opens again.  I click on Show Detail for one of the items, and this time OrderDetails.aspx opens and shows the details of the Order object.

You can repeat this process for all business entities that you want to view and edit in the web application.  Using the Genome Web Application wizard, you can quickly generate a small but fully functional web site for editing your business data.

In the next installment, I will show how to create a Windows Forms application that uses Genome.

Posted by Dirk

Technorati Tags: object relational, getting started

Friday, 22 June 2007 09:50:56 (W. Europe Standard Time, UTC+01:00)
 Friday, 15 June 2007

Most software projects start with an existing database.  It's not often that you get the chance to start from scratch and design the database the way you like it.  Now, creating Genome mappings for an existing database is a tedious job, but thankfully we have the Database Reverse Engineering Wizard for that.  So let's start Visual Studio 2005, and create a new Database Reverse Engineering project.

Genome will start a wizard.  Connect to Northwind, select all tables, accept the names for the projects or choose your own, change any options if you like (I opted to have Genome create default constructors) and click finish.  Genome will now generate two projects for you, one with classes for your domain model, and one for the mappings of those classes to database tables.

Let's have a look at one of the class files that the wizard generated for you. Open Order.cs (or Order.vb if you selected VB.NET as your language of choice).

using System;
using TechTalk.Genome;
using TechTalk.Genome.Mapping;
using System.ComponentModel;

namespace Genome.DataDomain
{
    [TypeConverter(typeof(ExpandableObjectConverter))]
    public abstract class Order : Persistent
    {
        #region Primary Keys

        public abstract int OrderId { get; }

        protected Order()
        {
        }

        #endregion

        #region Scalar Fields

        public abstract Nullable<DateTime> OrderDate { get; set; }
        public abstract Nullable<DateTime> RequiredDate { get; set; }
        public abstract Nullable<DateTime> ShippedDate { get; set; }
        public abstract Nullable<decimal> Freight { get; set; }
        public abstract string ShipName { get; set; }
        public abstract string ShipAddress { get; set; }
        public abstract string ShipCity { get; set; }
        public abstract string ShipRegion { get; set; }
        public abstract string ShipPostalCode { get; set; }
        public abstract string ShipCountry { get; set; }

        #endregion

        #region Reference Fields

        public abstract Customer Customer { get; set; }
        public abstract Employee Employee { get; set; }
        public abstract Shipper ShipVia { get; set; }

        #endregion

        #region One To Many Associations

        public abstract Collection<OrderDetail> OrderDetails
        {
            get;
        }

        #endregion

        #region Many To Many Associations

        public abstract Collection<Product> Products
        {
            get;
        }

        #endregion
    }
}

The file starts out with a couple of using directives that reference the Genome namespaces. The Order class is annotated with the TypeConverter attribute, which is useful for displaying Genome objects in a property grid.  The first member we encounter is a property for the primary key of the Order entity, in this case a simple integer.  There's also the default constructor, which is protected because we are dealing with an abstract class. Then come a number of simple properties, nothing exciting.  After that it gets more interesting.

The Orders table has several foreign key constraints.  Those foreign keys are represented by references to related objects.  For example, the foreign key to the Employees table is mapped by a property of type Employee.  We have the other direction as well: the OrderDetails table has a foreign key to the Orders table.  This results in the Order class having a collection of OrderDetail objects.  If you take a look at the OrderDetail class, you will find it has a property of type Order for its side of the relation.

Lastly, we have a many-to-many relation between the Orders and the Products tables, with the OrderDetails table as the connection table.  Genome creates a collection of Product objects in the Order class, and a collection of Order objects in the Product class.

By this time, you are probably wondering why the classes and all those properties are abstract. The reason becomes clear when we use Reflector to have a look at the assembly compiled by Genome.

For every class generated by the Database Reverse Engineering wizard, there are two classes in this assembly: one in a namespace starting with GenomeContextBoundProxy and one in a namespace starting with GenomeContextUnboundProxy.  The Genome runtime will use one or the other depending on the Genome Context. The distinction between the two kinds of proxies isn't that important, though. What is important is the fact that Genome uses the proxy design pattern. This all happens behind the scenes however, so as an application developer you will only deal directly with the abstract classes.
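To make the proxy idea concrete, here is a minimal hand-written sketch. Everything in it is an assumption for illustration: the `OrderProxy` class, the `loadShipCity` delegate and the `LoadCount` counter are hypothetical, and the lazy loading shown is just one thing a proxy can do; it is not a description of what Genome's generated proxies actually contain.

```csharp
using System;

namespace ProxySketch
{
    // Application code is written against the abstract class only.
    public abstract class Order
    {
        public abstract int OrderId { get; }
        public abstract string ShipCity { get; set; }
    }

    // Hypothetical stand-in for a generated proxy: it implements the abstract
    // members and intercepts property access behind the scenes.
    public sealed class OrderProxy : Order
    {
        private readonly int id;
        private readonly Func<int, string> loadShipCity; // stands in for a database read
        private string shipCity;
        private bool loaded;

        public int LoadCount { get; private set; }

        public OrderProxy(int id, Func<int, string> loadShipCity)
        {
            this.id = id;
            this.loadShipCity = loadShipCity;
        }

        public override int OrderId => id;

        public override string ShipCity
        {
            get
            {
                // Load the value on first access only, then serve it from memory.
                if (!loaded)
                {
                    shipCity = loadShipCity(id);
                    loaded = true;
                    LoadCount++;
                }
                return shipCity;
            }
            set
            {
                shipCity = value;
                loaded = true;
            }
        }
    }
}
```

A caller holding an `Order` reference reads `ShipCity` without knowing whether the value came from memory or from the loader, which is the point of keeping the domain classes abstract.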

In the next installment, I'll use the Genome starter kit to create a web application that uses the two projects we just generated.

Posted by Dirk

Technorati Tags: object relational, getting started

Friday, 15 June 2007 10:16:50 (W. Europe Standard Time, UTC+01:00)
 Sunday, 27 May 2007
I understand that PDC doesn’t deal with current or soon-to-be-released technology, and frankly, I didn’t expect to see anything about Orcas there. I find their explanation weird, especially on two points.
Sunday, 27 May 2007 11:43:07 (W. Europe Standard Time, UTC+01:00)
 Thursday, 15 March 2007

George Lawton contacted me at the end of February to ask some questions about O/RM as he was working on a story. The story has now been published, and I think George Lawton and Jack Vaughn did a good job of providing an accurate analysis of the current situation of the O/RM market for .NET.

When I received George’s email, I was quite surprised that he was asking about the alleged complaint that O/RMs deliver quick wins in the beginning which you pay dearly for at a later stage. In his article, you can read how strongly I disagree with this myth, and I was pleased to see that other people quoted in the article feel the same.

George asked us the following three questions, which I found very interesting to discuss:

  • What specific features of Genome make it simpler to use, both initially and over time, than other O/RM tools?
  • What have been some of the major challenges in the use of O/RM tools, and what are the ways you have gone about addressing these?
  • What specific tips do you have to offer developers in getting the most out of using O/RM tools as part of the software development process?

Intrigued by his questions, I put together quite extensive replies – replies that may be of interest to others, too. Based on my answers to George, I have put together this article to outline our thoughts on the issues above and give some advice to developers who are evaluating O/RMs.

Thursday, 15 March 2007 15:59:23 (W. Europe Standard Time, UTC+01:00)
 Monday, 12 March 2007

The Singleton pattern represents a very common way of using lazy initialization with resources. Textbooks usually describe the Singleton as follows:

public class SingletonClass
{
  private static SingletonClass instance;
  public static SingletonClass Instance
  {
    get {
      if (instance == null)
      {
        instance = new SingletonClass(…);
        // do some initialization logic
      }
      return instance;
    }
  }
}

However, as any developer quickly realizes, implementing it this way in a multi-threaded scenario is highly error-prone: if two threads attempt to access Instance at the same time (when it hasn't been created yet), both may enter the if block, and one of the threads will overwrite the instance created by the other. To defend ourselves against this scenario, we changed the implementation as follows:
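The revised implementation is not included in this excerpt. For orientation only, one widely used thread-safe variant in current .NET relies on `Lazy<T>`; the sketch below assumes that approach and is not necessarily what the post went on to show. The `CreatedCount` counter is a hypothetical addition that exists only to make the single construction observable.

```csharp
using System;
using System.Threading;

namespace SingletonSketch
{
    public sealed class SingletonClass
    {
        private static int createdCount;

        // Lazy<T> in ExecutionAndPublication mode runs the factory at most once,
        // even when several threads request Value at the same time.
        private static readonly Lazy<SingletonClass> lazy =
            new Lazy<SingletonClass>(
                () => new SingletonClass(),
                LazyThreadSafetyMode.ExecutionAndPublication);

        // Hypothetical counter, present only so the behaviour can be checked.
        public static int CreatedCount => Volatile.Read(ref createdCount);

        public static SingletonClass Instance => lazy.Value;

        private SingletonClass()
        {
            // do some initialization logic
            Interlocked.Increment(ref createdCount);
        }
    }
}
```

Accessing `SingletonClass.Instance` from many threads at once returns the same object every time, and the constructor runs once.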

Monday, 12 March 2007 17:57:28 (W. Europe Standard Time, UTC+01:00)
Let’s be honest about it: getting to know Genome is a non-trivial undertaking. It may seem downright daunting. I want to share with you some of the things I did, hoping that someone might benefit from my experiences.
Monday, 12 March 2007 12:02:26 (W. Europe Standard Time, UTC+01:00)
 Monday, 26 February 2007

In my previous post, I discussed how to mock test data for persistent objects mapped with Genome. Now I want to discuss how to provide this sample data in Blend, so that the designer building a front end with WPF actually sees what the UI would look like with data.

Genome | WPF
Monday, 26 February 2007 15:29:32 (W. Europe Standard Time, UTC+01:00)
 Saturday, 24 February 2007

We’ve been working on a little research project recently with WPF and created a small (and hopefully handy) application where you can list your contacts synchronized from Microsoft CRM. We store the offline data in a SQL 2005 Compact Edition database and we access the data using Genome.

To make the experiment more exciting we involved a designer in the project, not only to design a cool UI using the features of WPF, but mainly to see how the collaboration between designer and developer works in reality.

Genome | WPF
Saturday, 24 February 2007 15:15:35 (W. Europe Standard Time, UTC+01:00)