Friday, 05 January 2007

In recent months there has been a lot of confusion in the community about what LINQ is and what it is not. If you discuss this topic with others and read through the blogs, you will find many different perspectives and opinions on LINQ.

Most of the questions and misconceptions about LINQ that I have encountered come from mixing up LINQ with an O/RM system and from not understanding the impact of LINQ on .NET-based O/RMs.

Below I want to give a brief summary about LINQ and how it relates to O/RMs, using Genome as a concrete example.

Pre-LINQ era in O/RMs

Before LINQ every O/RM had to come up with its own proprietary solution for expressing queries. Likewise, Genome provides its own unique Object Query Language (OQL) for querying data from underlying databases.

Genome’s OQL is designed from the ground up to fulfill the requirements of expressing queries in object-oriented domain models. OQL thus strongly supports key OO programming constructs such as decomposition, encapsulation, method calls, object graph navigation and set operations. OQL’s syntax is rather similar to that of C-based languages, such as C# or Java. This similarity is intentional: the typical authors of OQL queries are assumed to be middle-tier developers who already have extensive experience in C#, Java or VB.NET, so OQL syntax is immediately familiar to them.

Introduction of LINQ

In fall 2005, Microsoft introduced a set of language extensions for the C# and VB.NET compilers, named LINQ (Language INtegrated Query). LINQ adds language features such as lambda expressions, extension methods and anonymous types, together with a set of standard query operators, that allow you to express queries directly in C# or VB.NET.

The main advantage of LINQ is that query expressions can now be checked by the compiler during build time, eliminating hard-to-detect runtime errors that result from ill-formed queries formulated as strings in code.
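As a minimal sketch of what this looks like in practice, the following C# program runs a query over an in-memory list. The Customer class, the NamesInCity method and the sample data are hypothetical names made up for illustration; they are not part of Genome or of any Microsoft API. The sketch assumes a compiler with the LINQ extensions and the System.Linq namespace of the released framework (the early CTPs used different namespaces).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain class, used only for this illustration.
public class Customer
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class CompileTimeCheckDemo
{
    // Returns the names of all customers in the given city, sorted by name.
    public static List<string> NamesInCity(IEnumerable<Customer> customers, string city)
    {
        // Query expression: the compiler translates this into calls to the
        // Where, OrderBy and Select extension methods, passing lambdas.
        var query = from c in customers
                    where c.City == city
                    orderby c.Name
                    select new { c.Name, c.City }; // anonymous type

        // A typo such as "c.Citty" in the query above is rejected by the
        // compiler. The same typo inside a query string would only surface
        // when the query is executed.
        return query.Select(row => row.Name).ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Berta", City = "Vienna" },
            new Customer { Name = "Anton", City = "Vienna" },
            new Customer { Name = "Carl", City = "Graz" }
        };

        foreach (var name in NamesInCity(customers, "Vienna"))
        {
            Console.WriteLine(name); // prints Anton, then Berta
        }
    }
}
```

The query here runs against objects in memory; an O/RM would accept the same query syntax but translate it for its own data store.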

LINQ versus O/RMs

Much of the confusion about what LINQ is stems from naming. Microsoft has added to it considerably by calling APIs that use LINQ to express queries “*LINQ” (or nowadays: LINQ to *). Take XLINQ (or LINQ to XML) for example, which provides much more than the possibility of querying XML documents using LINQ: it is a fully fledged alternative API to the MSXML DOM, including creating, transforming and saving XML documents. Similarly, DLINQ (or LINQ to SQL) is just another upcoming O/RM implementation provided by Microsoft that uses LINQ to express queries. All other O/RM functionality, such as mapping, caching, lazy loading and update tracking, is part of the DLINQ implementation and has nothing to do with the LINQ compiler extensions.

Having said this, all that LINQ does is check and compile queries expressed in C# or VB.NET to abstract syntax trees (represented as Expression<T>). It is the responsibility of the corresponding API utilising LINQ (such as an O/RM tool or an XML API) to translate and execute the queries against the data represented with the API (such as a relational database or an XML document).
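The split of responsibilities can be sketched in a few lines of C#. When a lambda is assigned to an Expression<...> type, the compiler emits a tree describing the code rather than executable code, and a consuming API can then walk that tree. The IsCheap predicate and the Describe method below are hypothetical and only hint at what a real translator does; they do not show how Genome or DLINQ are implemented.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Because the target type is Expression<Func<...>>, the compiler builds
    // a tree describing "price < 100" instead of compiling it to a delegate.
    public static readonly Expression<Func<int, bool>> IsCheap = price => price < 100;

    // Hypothetical helper: reads the tree the way a query translator might,
    // here only for a simple "parameter <operator> constant" comparison.
    public static string Describe(Expression<Func<int, bool>> predicate)
    {
        var body = (BinaryExpression)predicate.Body;
        var left = (ParameterExpression)body.Left;
        var right = (ConstantExpression)body.Right;
        return left.Name + " " + body.NodeType + " " + right.Value;
    }

    public static void Main()
    {
        // The tree can be inspected and translated ...
        Console.WriteLine(Describe(IsCheap)); // prints "price LessThan 100"

        // ... or compiled and executed inside the CLR.
        Func<int, bool> compiled = IsCheap.Compile();
        Console.WriteLine(compiled(42)); // prints "True"
    }
}
```

An O/RM would map the LessThan node to the corresponding SQL operator and the parameter to a mapped column, which is exactly the part LINQ itself leaves to the API.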

LINQ versus proprietary query languages

While LINQ is not a replacement for an O/RM, it is definitely a replacement for an O/RM's proprietary query language, such as Genome’s OQL. OQL set out several years ago to achieve what LINQ will provide in the near future: expressing queries in a language based on C# rather than in a data-specific language such as SQL or XPath.

Advantages of this approach:

  • The developer does not need to deal with different languages for different use cases.
  • Expressed logic can be executed both in the .NET CLR and against the data storage.
  • Concepts like inheritance, polymorphism and encapsulation used in C# can also be used in query expressions.
  • No need to repeat mapping between the domain model and the data storage representation in each query.
  • Query logic is no longer alien to the compiler and can be checked and compiled at build time.

While OQL can already offer most of these advantages, it certainly cannot modify the C# compiler to directly understand OQL expressions written in C# code. Genome’s OQL therefore only partially solves the problem of runtime errors caused by query expressions that cannot be verified at compile time. Genome allows mapping OQL to members of any CLR class, and these OQL expressions are verified by the Genome mapping compiler at build time. OQL in .NET source code, however, has to be written as string literals, which cannot be verified at build time.

Using LINQ with Genome eliminates this final problem and allows all query logic to be checked during build time.

Using LINQ in O/RMs like Genome

Genome's OQL is very similar to LINQ because it also stems from C#. There are minor syntax differences between OQL and LINQ for operations not specified in C# so far, such as specifying implicit functions (lambdas) for e.g. filtering or projection. In the absence of any other standard, OQL leaned towards the XPath syntax in this regard.

Because of the similarities between OQL and LINQ, Genome already supports using LINQ on an equal footing with OQL. This means that LINQ can be used in code as well as in the mapping file as a fully functional alternative to OQL. It does not matter to Genome in which form a query is expressed; LINQ and OQL can even be used side by side in projects. To use LINQ with Genome, the C# compiler of course needs to be updated with the LINQ extensions. Genome 3.x releases support the latest corresponding CTPs and beta versions of LINQ available. Of course Genome can be used without installing and using LINQ as well.

The future of O/RMs in the LINQ era

LINQ is an opportunity for O/RM solutions as it unifies the way in which developers build query logic, reflecting the object model rather than the physical database structure. Differentiation between O/RMs in the future will be based on how well they implement translation of LINQ to SQL and what other features they offer besides querying.

Genome is the only O/RM that allows you to decompose and reuse query logic in LINQ (or OQL), thereby reducing complexity and improving maintainability of query logic. Genome provides powerful LINQ to SQL translation including sub-queries, aggregates and compiling of query expressions to stored functions. Apart from its LINQ capabilities, Genome offers advanced O/RM features for mapping, caching, lazy loading, update tracking and extensibility.

Availability of LINQ and transitioning from a proprietary query language to LINQ

LINQ is not to be confused with the .NET Framework 3.0, which was released at the end of 2006. This confusion occurs frequently because the LINQ CTPs are also referred to as C# 3.0. The release of LINQ is currently expected with the next release of Visual Studio, code named “Orcas”. Microsoft has not yet announced an official release date or time frame for “Orcas”.

Whether you should switch to LINQ before its release depends on whether you trust the new compiler and whether you are satisfied with the IDE support provided for LINQ (e.g. IntelliSense for the new C# language features). Note that LINQ only modifies the compiler. The compiled result only requires the already released .NET 2.0 CLR. While it is probably not a good idea to base most projects on a CTP, it may also not be the best choice to start a larger development project right now with query logic expressed in a way that is not aligned with already foreseeable innovations of the platform.

Genome allows you to postpone the decision of when to switch to LINQ as OQL and LINQ are already treated equally. You can start your project today without the LINQ extensions, expressing your query logic in Genome’s OQL. Whenever your project is ready to switch to LINQ, existing query logic expressed in OQL can be used together with LINQ. LINQ queries can even be composed out of expressions mapped with OQL. There is no need to throw out or migrate existing OQL logic when switching to LINQ. OQL code in the project can be migrated gradually as the project evolves further.

Posted by Chris

Genome | Linq