
Integrating Enterprise Library Validation Application Block with LINQ to SQL and Entity Framework Part 3: The complexity of custom validators.

This article shows the complexity a developer faces when implementing a custom validator on top of an O/RM tool. While the validation logic here is written using the Validation Application Block and LINQ to SQL, the problems encountered in this article are very general.

In previous posts (here and here), I’ve shown how to integrate the Enterprise Library Validation Application Block (VAB) in LINQ to SQL and Entity Framework projects. I showed how to obtain access to a data context from within custom validators, but I also stated that, in the context of O/RM frameworks, custom validators that communicate with the database tend to get complicated pretty quickly.

In this post I will try to prove this statement by defining a simple business rule and trying to create a validator for it. The rule is written in seven iterations. Each of these iterations has a defect, which I will try to solve in the version following it, improving the validator toward a correct solution. I’ll also write some supporting integration tests to check its correctness while doing so.

The code in this article is based on the Northwind database, and I will use the Customers, Orders and Order_Details tables and their associated entities. I’ve shown code that works with both LINQ to SQL and Entity Framework in my previous posts, but the code in this post is based on LINQ to SQL only. While the discussed concepts remain the same, Entity Framework has different deficiencies, which could bite you when trying this out with Entity Framework.

The validator built here is based on the following (imaginary) business rule, named the ‘unshipped orders rule’. Its definition is as follows:

A customer is not allowed, at a given moment in time, to have more than two unshipped orders.

Version 1.0

Writing a validator for such a simple business rule couldn’t be that hard, could it? Below is the first definition of the UnshippedOrdersRuleValidator class:

[ConfigurationElementType(typeof(CustomValidatorData))]
public sealed class UnshippedOrdersRuleValidator : Validator
{
    private const int MaximumNumberOfUnshippedOrders = 2;

    private const string RuleViolationMessage = "Customer with " +
        "ID {0} has currently {1} unshipped orders, while {2} " +
        "is the maximum permitted number of unshipped orders.";

    public UnshippedOrdersRuleValidator(NameValueCollection attributes)
        : base(string.Empty, string.Empty)
    {
    }

    protected override string DefaultMessageTemplate
    {
        get { throw new NotImplementedException(); }
    }

    protected override void DoValidate(object objectToValidate,
        object currentTarget, string key, ValidationResults results)
    {
        Customer customer = (Customer)currentTarget;

        int totalUnshippedOrders = (
            from order in customer.Orders
            where order.ShippedDate == null
            select order).Count();

        if (totalUnshippedOrders > MaximumNumberOfUnshippedOrders)
        {
            string message = string.Format(RuleViolationMessage,
                customer.CustomerID, totalUnshippedOrders,
                MaximumNumberOfUnshippedOrders);

            this.LogValidationResult(results, message,
                currentTarget, key);
        }
    }
}

The class inherits from Validator, which is part of the VAB. The implementation has the required plumbing: it’s decorated with the ConfigurationElementTypeAttribute and has a constructor with a NameValueCollection argument. These are needed for it to be configurable using VAB. Please note that I’m using hard-coded strings for simplicity; you’d probably want to use resource-based strings in production code. I wanted to show a full working implementation, but from now on the focus will be on the DoValidate method.

The code is rather straightforward. When a customer is validated, it is supplied to the DoValidate method as the currentTarget argument. The argument is cast from object back to Customer, and the customer’s Orders collection is queried for orders with no ShippedDate. The total number of unshipped orders is compared to MaximumNumberOfUnshippedOrders. If the total number of unshipped orders for the customer is greater than the defined maximum, the customer is invalid and a validation result is logged using the LogValidationResult method, which is part of the Validator base class. For the sake of simplicity I’ve decided to define the maximum number of unshipped orders as a const instead of making it configurable.
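
If you do want the maximum to be configurable, the NameValueCollection passed to the constructor is the natural place to get it from. The sketch below is an assumption on my part, using a hypothetical maximumUnshippedOrders attribute on the validator’s configuration element:

private readonly int maximumNumberOfUnshippedOrders;

public UnshippedOrdersRuleValidator(NameValueCollection attributes)
    : base(string.Empty, string.Empty)
{
    // Hypothetical attribute, e.g.:
    // <validator ... maximumUnshippedOrders="2" />
    string value = attributes["maximumUnshippedOrders"];

    this.maximumNumberOfUnshippedOrders =
        string.IsNullOrEmpty(value) ? 2 : int.Parse(value);
}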

To be complete, here is the VAB configuration used to get this working (I stripped the namespaces off the type names for readability):

<validation>
  <type name="Customer" defaultRuleset="Default"
        assemblyName="Domain">
    <ruleset name="Default">
      <validator type="UnshippedOrdersRuleValidator, Validators"
                 name="Unshipped Orders Rule Validator" />
    </ruleset>
  </type>
</validation>

As I wrote, each version would have a defect. But the code looks so innocent; what could be wrong with it?

The trouble is that this validator is too simplistic: the possible deletion of orders within that same context is not handled correctly. Here is an integration test that proves this bug:

[TestMethod]
public void UnshippedOrdersRuleTest01()
{
    using (new TransactionScope())
    {
        using (var db = new NorthwindDataContext())
        {
            // Customer ERNSH has 2 unshipped orders in the
            // database.
            Customer ernsh =
                db.Customers.Single(c => c.CustomerID == "ERNSH");

            ernsh.ContactName = "Steven";

            // Adding one new unshipped order.
            Order orderToInsert = new Order();
            orderToInsert.Customer = ernsh;
            db.Orders.InsertOnSubmit(orderToInsert);

            // Get an existing unshipped order to delete.
            Order orderToDelete = (
                from order in db.Orders
                where order.CustomerID == ernsh.CustomerID
                where order.ShippedDate == null
                select order).First();

            // Delete the order; we must also delete its details.
            db.Orders.DeleteOnSubmit(orderToDelete);
            db.OrderDetails.DeleteAllOnSubmit(
                orderToDelete.OrderDetails);

            // SubmitChanges should succeed, because the customer
            // should now still have 2 unshipped orders.
            db.SubmitChanges();
        }
    }
}

Here are some notes on this test. While it’s an automated test, it differs from normal unit tests in that it doesn’t test a single unit of code, but a single feature or requirement. The test does this by touching the different tiers in the application; it even accesses the database, which is an absolute no-go in unit testing. The test expects the Northwind database with its default data (therefore, it’s wise to have a dedicated test database when you’re running integration tests). By wrapping all code in a TransactionScope, we ensure that all database changes are rolled back after the test finishes. This way we can be sure that the test can run successfully again and again. Also note that I only show tests relevant to this post; in reality, more tests are needed to prove the correctness of this business rule.

The test retrieves a specific customer from the database, inserts a new unshipped order for that customer and deletes another (already existing) unshipped order from the database. The specified customer currently has two unshipped orders in the database (a precondition for this test to run). After deleting one of those and adding a new unshipped order, the customer should again have two unshipped orders. The test fails, however, which proves this first version of the unshipped orders business rule to be flawed.

Version 2.0

Here is a new implementation of the DoValidate method that fixes that flaw:

protected override void DoValidate(object objectToValidate,
    object currentTarget, string key, ValidationResults results)
{
    Customer customer = (Customer)currentTarget;

    NorthwindDataContext context =
        (NorthwindDataContext)ContextScope.CurrentContext;

    var ordersToDelete =
        context.GetChangeSet().Deletes.OfType<Order>();

    int totalUnshippedOrders = (
        from order in customer.Orders
        where order.ShippedDate == null
        where !ordersToDelete.Contains(order)
        select order).Count();

    if (totalUnshippedOrders > MaximumNumberOfUnshippedOrders)
    {
        ...
    }
}

This new version uses the ContextScope.CurrentContext construct to get the context, as discussed in part 2. It uses LINQ to SQL’s ChangeSet to determine all orders that are currently being deleted (within that context, of course). This list is used to filter those orders out of the total number of unshipped orders, because counting them is what caused our test to fail. The rest of the code remains the same.

The previously defined integration test will now succeed. However, there is another problem: the validator is only triggered for customers, while changes to orders also have to trigger validation. Let’s write a second test to detect this bug:

[TestMethod]
[ExpectedException(typeof(ValidationException))]
public void UnshippedOrdersRuleTest02()
{
    using (new TransactionScope())
    {
        using (var db = new NorthwindDataContext())
        {
            // Customer ERNSH has two unshipped orders in the
            // database.
            Customer ernsh =
                db.Customers.Single(c => c.CustomerID == "ERNSH");

            // Adding a third unshipped order.
            Order orderToInsert = new Order();
            orderToInsert.Customer = ernsh;
            db.Orders.InsertOnSubmit(orderToInsert);

            // SubmitChanges should throw a ValidationException.
            db.SubmitChanges();
        }
    }
}

This test fails, because SubmitChanges didn’t throw the expected exception. This happens because the validator is never called. The first test (deliberately) changed the ContactName of the customer; this second test didn’t. Therefore the customer didn’t change and, because of that, the validation wasn’t triggered.

In part 1, I built the validation infrastructure in such a way that it only validates changed objects (see the GetChangedEntities method; a hedged sketch of such a filter follows below). That infrastructure could be changed to validate all objects, but this would come with an enormous performance penalty: thousands of entities could be loaded by the context, while maybe just a single entity was changed. The real event that should trigger the validation of a customer is not the creation or mutation of the customer itself, but rather that of an order. Let’s try it once more.
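
For reference, this is my assumption of what such a changed-entities filter could look like on top of LINQ to SQL’s ChangeSet; it is not the literal part 1 code:

// Assumed sketch: collect only the entities the context is about to
// insert or update, so only those get validated.
private static IEnumerable<object> GetChangedEntities(
    NorthwindDataContext context)
{
    ChangeSet changeSet = context.GetChangeSet();

    return changeSet.Inserts.Concat(changeSet.Updates);
}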

Version 3.0

Below, the DoValidate implementation is changed in such a way that not the customer, but the customer’s order is the target object:

protected override void DoValidate(object objectToValidate,
    object currentTarget, string key, ValidationResults results)
{
    // We cast the currentTarget to the order and get its customer.
    Customer customer = ((Order)currentTarget).Customer;

    NorthwindDataContext context =
        (NorthwindDataContext)ContextScope.CurrentContext;

    var ordersToDelete =
        context.GetChangeSet().Deletes.OfType<Order>();

    int totalUnshippedOrders = (
        from order in customer.Orders
        where order.ShippedDate == null
        where !ordersToDelete.Contains(order)
        select order).Count();

    if (totalUnshippedOrders > MaximumNumberOfUnshippedOrders)
    {
        ...
    }
}

Within the validator, the customer is still validated, but VAB is now supplying an order. Changing an order should now trigger the validation and for this the VAB configuration must be changed:

<validation>
  <type name="Order" defaultRuleset="Default"
        assemblyName="Domain">
    <ruleset name="Default">
      <validator type="UnshippedOrdersRuleValidator, Validators"
                 name="Unshipped Orders Rule Validator" />
    </ruleset>
  </type>
</validation>

While the new code and configuration satisfy both previously defined tests, this change introduces a new problem. When a customer is invalid, we’d expect a single message that describes this error. Let’s write a test for this:

[TestMethod]
public void UnshippedOrdersRuleTest03()
{
    using (new TransactionScope())
    {
        using (var db = new NorthwindDataContext())
        {
            Customer junkie = new Customer();

            junkie.CustomerID = "junki";
            junkie.ContactName = "S. van Deursen";
            junkie.CompanyName = "Cutting Edge";

            // Adding three new unshipped orders.
            // Yes, this magically works in L2S :-)
            new Order { Customer = junkie };
            new Order { Customer = junkie };
            new Order { Customer = junkie };

            db.Customers.InsertOnSubmit(junkie);

            try
            {
                db.SubmitChanges();
                Assert.Fail("Exception was expected.");
            }
            catch (ValidationException vex)
            {
                int actualCount = vex.Results.Count;

                // Should contain one message.
                Assert.AreEqual(1, actualCount);
            }
        }
    }
}

This third test fails, because the list of errors will contain three results instead of the expected single result: one result for each validated order. But all results describe the same message:

Customer with ID junki has currently 3 unshipped orders, while 2 is the maximum permitted number of unshipped orders.

These duplicate error messages will clutter the user interface, and while it’s possible to filter those duplicate messages out (a sketch of this follows below), this isn’t desirable. Besides that, duplicate validations can have a big impact on the performance of the system. Imagine changing a couple of hundred orders of a single customer within the same context.
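
For completeness, filtering duplicates on the consuming side could look like the sketch below, but note that this only hides the symptom; the duplicate (and possibly expensive) validations still run:

// Hedged sketch: collapse duplicate validation messages on the
// consuming side. This does not prevent the duplicate validations
// from running.
private static IEnumerable<string> GetDistinctMessages(
    ValidationResults results)
{
    return results
        .Select(result => result.Message)
        .Distinct();
}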

Version 4.0

How can these duplicate validations be prevented? A validated customer must be registered in such a way that the UnshippedOrdersRuleValidator doesn’t validate that customer twice within the same context. This should also work correctly in a multi-threaded / multi-user environment such as ASP.NET. To solve this, let’s use the infrastructure presented in part 2: some sort of caching mechanism must be implemented that validators can use. The only feasible place to define such a mechanism is within the ContextScope class (see part 2 for the previous definition). Here is a new implementation:

public sealed class ContextScope : IDisposable
{
    private const string ScopeAlreadyActiveExceptionMessage =
        "A ContextScope has already been declared within " +
        "the current scope. Only one scope can be active.";

    private const string NoScopeActiveExceptionMessage =
        "A ContextScope hasn't been declared within the " +
        "current scope. This property can only be called " +
        "within the context of a ContextScope.";

    [ThreadStatic]
    private static object currentContext;

    // This static field has been added.
    [ThreadStatic]
    private static Dictionary<Type, object> currentItems;

    public ContextScope(object context)
    {
        context.Requires("context").IsNotNull();

        if (currentContext != null)
        {
            throw new InvalidOperationException(
                ScopeAlreadyActiveExceptionMessage);
        }

        currentContext = context;
        currentItems = new Dictionary<Type, object>();
    }

    public static object CurrentContext
    {
        get
        {
            object context = currentContext;

            if (context == null)
            {
                throw new InvalidOperationException(
                    NoScopeActiveExceptionMessage);
            }

            return context;
        }
    }

    // This static property has been added.
    public static Dictionary<Type, object> CurrentItems
    {
        get
        {
            Dictionary<Type, object> items = currentItems;

            if (items == null)
            {
                throw new InvalidOperationException(
                    NoScopeActiveExceptionMessage);
            }

            return items;
        }
    }

    public void Dispose()
    {
        currentContext = null;
        currentItems = null;
    }
}

This extended ContextScope contains a new CurrentItems property that returns a thread-local dictionary, just like the thread-local context returned from the CurrentContext property. This is a generic solution for all our custom validators that need a context-aware cache. Now let’s fix the implementation of the UnshippedOrdersRuleValidator class, using this new feature:

protected override void DoValidate(object objectToValidate,
    object currentTarget, string key, ValidationResults results)
{
    Customer customer = ((Order)currentTarget).Customer;

    // Prevent duplicate customer validations.
    if (ValidatedCustomers.Contains(customer))
    {
        return;
    }

    // Add the customer to the list of validated customers.
    ValidatedCustomers.Add(customer);

    NorthwindDataContext context =
        (NorthwindDataContext)ContextScope.CurrentContext;

    var ordersToDelete =
        context.GetChangeSet().Deletes.OfType<Order>();

    ...
}

// Returns a list of customers, validated in the current context.
private static HashSet<Customer> ValidatedCustomers
{
    get
    {
        Dictionary<Type, object> items = ContextScope.CurrentItems;

        Type key = typeof(UnshippedOrdersRuleValidator);

        object validatedCustomers;

        // Get the list of validated customers from the cache.
        if (!items.TryGetValue(key, out validatedCustomers))
        {
            // When there's no list yet, we create a new one and
            // register it in the item cache.
            validatedCustomers = new HashSet<Customer>();
            items.Add(key, validatedCustomers);
        }

        return (HashSet<Customer>)validatedCustomers;
    }
}

This version uses the new ContextScope.CurrentItems property to retrieve its list of validated customers. Because I’ve implemented CurrentItems as a dictionary with a Type key, the validator can supply its own type as the key. This way each validator can have its own private cache, without interfering with others.

The next problem we will tackle has to do with the Orders property on our LINQ to SQL Customer entity. The Orders property returns a collection of orders; to be more precise, it returns an instance of type EntitySet<Order>. LINQ to SQL’s EntitySet<TEntity>, however, is an in-memory store of all orders related to that customer. It doesn’t implement IQueryable<T>, only IEnumerable<T>. You can only iterate over all items, and that means all the orders in the database related to that specific customer. In other words, the ‘from order in customer.Orders’ query in the DoValidate method retrieves *all* orders for that customer from the database. Imagine a customer with a few thousand orders and you’ll get the picture: every time this validator runs, it loads all those orders into memory. Now imagine this validator running on your production web server with thousands of users.

Of course we know that premature optimization is the root of all evil, but let’s say we know during design that certain customers will have over a thousand orders, now or in the near future. In that case we aren’t talking about premature optimization anymore. With that condition in mind, we’ll have to make a conscious design decision.
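
To make the difference concrete, here is a small sketch contrasting the two ways of counting (the names are the ones used in this post; the queries themselves are my illustration):

// Counting through EntitySet<Order> first loads every order of the
// customer into memory and then counts in memory:
int inMemoryCount = customer.Orders
    .Count(o => o.ShippedDate == null);

// Counting through the IQueryable<Order> on the context is deferred
// and translates to a single SELECT COUNT(*) in the database:
int databaseCount = context.Orders
    .Count(o => o.CustomerID == customer.CustomerID
        && o.ShippedDate == null);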

Version 5.0

It’s not easy to write an integration test that checks whether the validator loads too many orders, because the validator is functionally correct and LINQ to SQL doesn’t allow us to hook into its infrastructure to check this. Writing such a test is outside the scope of this post, so we move directly on to the new definition of the UnshippedOrdersRuleValidator class:

protected override void DoValidate(object objectToValidate,
    object currentTarget, string key, ValidationResults results)
{
    Customer customer = ((Order)currentTarget).Customer;

    // Prevent duplicate customer validations.
    if (ValidatedCustomers.Contains(customer))
    {
        return;
    }

    ValidatedCustomers.Add(customer);

    NorthwindDataContext context =
        (NorthwindDataContext)ContextScope.CurrentContext;

    IQueryable<Order> unshippedOrdersInDatabase =
        GetUnshippedOrdersFromDatabase(context, customer);

    IEnumerable<Order> unshippedOrdersInMemory =
        GetUnshippedOrdersFromContext(context, customer);

    int totalUnshippedOrders =
        unshippedOrdersInDatabase.Count() +
        unshippedOrdersInMemory.Count();

    if (totalUnshippedOrders > MaximumNumberOfUnshippedOrders)
    {
        ...
    }
}

private static IQueryable<Order> GetUnshippedOrdersFromDatabase(
    NorthwindDataContext context, Customer customer)
{
    ChangeSet changeSet = context.GetChangeSet();

    var updatesAndDeletesInContext =
        changeSet.Updates.Concat(changeSet.Deletes);

    // Get a list of all the customer's orders that are updated or
    // deleted.
    var excludedOrderIds =
        from order in updatesAndDeletesInContext.OfType<Order>()
        where order.CustomerID == customer.CustomerID
        select order.OrderID;

    // Get a list of all the customer's orders in the database
    // that have no shipped date, excluding the updated and
    // deleted orders in the current context (this prevents
    // double counts).

    // Note that we use context.Orders instead of customer.Orders.
    return
        from order in context.Orders
        where order.CustomerID == customer.CustomerID
        where order.ShippedDate == null
        where !excludedOrderIds.Contains(order.OrderID)
        select order;
}

private static IEnumerable<Order> GetUnshippedOrdersFromContext(
    NorthwindDataContext context, Customer customer)
{
    ChangeSet changeSet = context.GetChangeSet();

    var insertsAndUpdatesInContext =
        changeSet.Inserts.Concat(changeSet.Updates);

    // Get a list of all the customer's orders in the data context
    // (in memory) that are being inserted or updated and don't
    // have a shipped date (we don't count the deletes).
    return
        from order in insertsAndUpdatesInContext.OfType<Order>()
        where order.CustomerID == customer.CustomerID
        where order.ShippedDate == null
        select order;
}

As you can see, there’s already a lot of code needed to get this business rule correct. Although this version is functionally equivalent to the previous one, the performance is much better. The IQueryable<Order> returned from the GetUnshippedOrdersFromDatabase method is deferred and will result in the following (very efficient) database query during the call to Count():

SELECT  COUNT(*) AS [value]
FROM [dbo].[Orders] AS [t0]
WHERE (NOT ([t0].[OrderID] IN (@p0)))
AND ([t0].[ShippedDate] IS NULL)
AND ([t0].[CustomerID] = @p1)

But still, even now, there’s a glitch. This time, however, the problem is less obvious: it is one of concurrency. While this code is functionally correct and even performs well, a system that uses the rule might get into a situation where two different users / processes manage to simultaneously insert new unshipped orders for a certain customer, thereby bypassing the validation mechanism. This will happen when the validators for both processes run before each other’s order insertions. The optimistic locking mechanism most O/RM tools use will not help in this case; optimistic concurrency mechanisms are simply unable to handle this specific scenario. To solve this race condition, these are the alternatives one could consider:

  • Do nothing and just let it happen.
  • Use pessimistic offline locking.
  • Implement a constraint in the database.
  • Use database transactions (a.k.a. pessimistic online locking).

Doing nothing and just letting it happen can be a perfectly valid solution to this problem. If the likelihood of failure, multiplied by the cost of correcting the corrupted data in case of such a failure, is less than the cost of implementing a foolproof solution, that solution is commercially your best pick (but don’t forget to document that decision). Of course this depends on a wide variety of variables, both technical and non-technical, such as the chosen architecture, the complexity of the system, the experience of the development team, the business strategy, the cost of repairing customer trust, etc.

The second alternative is using a pessimistic offline lock. Depending on the requirements and the system architecture, a locking mechanism could be used that’s not based on database transactions (pessimistic online locking) or optimistic locking. Martin Fowler describes a pessimistic offline lock as follows:

It forces a business transaction to acquire a lock on a piece of data before it starts to use it, so that, most of the time, once you begin a business transaction you can be pretty sure you'll complete it without being bounced by concurrency control.

Acquiring a lock on a piece of data could be done in several ways. You could, for instance, implement this by storing extra information at row level about who locked the data and at which moment. Using pessimistic offline locking all over the place, however, is a terrible idea; where and how to use this mechanism has to be thought out very carefully against the business requirements and system architecture. For instance, when the given business rule runs within automated processes, what should be done in case of a lock? Should one of the processes fail, wait until the lock can be acquired, or should it be prevented from running in the first place?
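
A row-level variant could look something like the sketch below. The LockedBy and LockedAtUtc columns are hypothetical (they are not part of Northwind), and customerId and currentUser are placeholders; the fragment merely illustrates the idea:

// Hedged sketch of claiming a pessimistic offline lock, assuming
// hypothetical LockedBy / LockedAtUtc columns on the Customers table.
Customer customer = db.Customers
    .Single(c => c.CustomerID == customerId);

if (customer.LockedBy != null && customer.LockedBy != currentUser)
{
    throw new InvalidOperationException(
        "Customer is locked by " + customer.LockedBy + ".");
}

// Claim the lock and persist it before the business transaction
// starts. LINQ to SQL's default optimistic concurrency check on this
// update guards the claim itself.
customer.LockedBy = currentUser;
customer.LockedAtUtc = DateTime.UtcNow;
db.SubmitChanges();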

Version 6.0

Let’s try to solve the problem using a database constraint. A database constraint is only a fallback mechanism to prevent state corruption in the database: it prevents corruption, but it’s not always suitable in the context of an O/RM technology, as we will see shortly. I call constraints a fallback mechanism because it’s preferable to define all business rules in one place, and of course I prefer defining them in code, as I’m showing you in this post by using the VAB. The problem with constraints is that a failing database constraint will result in an exception being thrown within .NET, and while it’s possible to do some smart exception filtering to see which constraint failed, that’s all there is. There is no (simple) way to see which entity caused the failure, and because of this, there is no way to communicate back to the user what exactly went wrong and what must be done to fix the problem.
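
Such exception filtering could look like the sketch below. The error number and the constraint-name check are my assumptions of how far you can reasonably get; note that none of this tells you which entity violated the rule:

try
{
    db.SubmitChanges();
}
catch (SqlException ex)
{
    // SQL Server reports constraint conflicts with error number 547;
    // the constraint name is only available inside the message text.
    bool unshippedOrdersRuleViolated = ex.Number == 547 &&
        ex.Message.Contains("CK_Orders_MaxUnshippedOrders");

    if (!unshippedOrdersRuleViolated)
    {
        throw;
    }

    // We know which constraint failed, but not which customer or
    // order caused it, so a generic error message is all we can show.
}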

The problem we’re dealing with here is one of concurrency. It might not show up that often (if ever), and therefore we might be fine with the few times that it fails hard; at least our database remains consistent. Let’s assume we are fine with it failing ugly, and create a constraint in the database for our made-up business rule:

CREATE FUNCTION dbo.GetNumberOfUnshippedOrders
(
    @CustomerID nchar(5)
)
RETURNS int
AS
BEGIN
    DECLARE @Result int

    SELECT @Result = COUNT(*)
    FROM Orders
    WHERE CustomerID = @CustomerID
        AND ShippedDate IS NULL

    RETURN @Result
END
GO

ALTER TABLE dbo.Orders
ADD CONSTRAINT CK_Orders_MaxUnshippedOrders
CHECK (
    dbo.GetNumberOfUnshippedOrders(CustomerID) <= 2
)

After adding the constraint to the database we can rerun our integration tests to see if everything still works fine, but Houston, we have a problem! The first test fails:

Test method TestProject1.UnitTest1.UnshippedOrdersRuleTest01 threw exception:  System.Data.SqlClient.SqlException: The INSERT statement conflicted with the CHECK constraint "CK_Orders_MaxUnshippedOrders". The conflict occurred in database "Northwind", table "dbo.Orders", column 'CustomerID'.
The statement has been terminated.

Why is this constraint making our test fail? Look at the test and notice how the insertion of the new order happens before the deletion of the old one. Swapping the order in which those statements are executed, however, has no effect. We’re screwed here, because LINQ to SQL reorders the mutations and decides that deletes should be executed after inserts; there is nothing we can do about that. This is why O/RM technologies such as LINQ to SQL can sometimes make it hard to define database constraints. This version failed, so let’s remove the constraint from the database again and move on to the last alternative:

ALTER TABLE dbo.Orders 
DROP CONSTRAINT CK_Orders_MaxUnshippedOrders

Adding a constraint didn’t work in this scenario. The last alternative is using database transactions: we could wrap the use of the context class in a transaction. Using the SqlTransactionWrapper I wrote a long time ago, this would look like this:

using (var tran = new SqlTransactionWrapper("connstr."))
{
    using (var db = new NorthwindDataContext(tran.Connection))
    {
        // We'll need to set the Transaction in L2S.
        db.Transaction = tran.Transaction;

        // TODO: Business logic

        db.SubmitChanges();
    }

    tran.Commit();
}

However, when trying to use transactions to prevent concurrency conflicts, one could have a rude awakening. In this case the transactional logic should prevent other processes from creating, updating or deleting orders for the validated customer. Normally this would mean creating custom queries with locking hints, because just using a transaction, even a serializable one, will not protect you from this race condition. When using O/RM technology, writing custom queries is something we’d like to avoid: O/RM tools take away the need to write SQL code, and hand-written SQL statements are hard to integrate with O/RM tools. Besides, I find reasoning about transactional correctness pretty difficult.
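
For illustration, such a hand-written query with locking hints could be pushed through LINQ to SQL’s DataContext.ExecuteQuery<T>. The hints used here are my assumption of what would be needed, and this is precisely the kind of SQL we’d rather not maintain next to an O/RM:

// Hedged sketch: lock the customer's order rows for the duration of
// the transaction using explicit locking hints.
var lockedOrders = context.ExecuteQuery<Order>(
    @"SELECT * FROM dbo.Orders WITH (UPDLOCK, HOLDLOCK)
      WHERE CustomerID = {0}",
    customer.CustomerID).ToList();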

Note: Although I like using a TransactionScope in integration tests, I never seem to find a place for it in my software architecture. Within the scope of a TransactionScope there is only one transaction. This means that, in case of a failure, all database operations will be reverted, so logging information written to the database, such as exception logs, will also be rolled back. Of course one could make sure those logs are written in their own TransactionScope, but this makes designing and writing applications much harder.
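
For completeness, writing a log entry outside the ambient transaction could be done by suppressing it; WriteLogEntryToDatabase below is a hypothetical helper:

// Hedged sketch: suppress the ambient transaction so that a later
// rollback doesn't undo the log entry.
using (var suppressed = new TransactionScope(
    TransactionScopeOption.Suppress))
{
    WriteLogEntryToDatabase(exception); // hypothetical helper

    suppressed.Complete();
}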

I think the simplest way to write correct transactional code in this situation is to use a serializable transaction in conjunction with a count operation over the Orders table for all orders of the customer being validated. The count operation over the customer’s orders, together with the serializable transaction, prevents other transactions from inserting new orders for that customer during that transaction. Here is the updated UnshippedOrdersRuleValidator class:

protected override void DoValidate(object objectToValidate,
    object currentTarget, string key, ValidationResults results)
{
    ...

    NorthwindDataContext context =
        (NorthwindDataContext)ContextScope.CurrentContext;

    CheckIfContextRunsInSerializableTransaction(context);

    // This SELECT COUNT(*) will, in conjunction with the
    // serializable transaction, prevent other transactions from
    // inserting, updating and deleting orders for this customer.
    context.Orders
        .Where(o => o.CustomerID == customer.CustomerID)
        .Count();

    ...
}

private void CheckIfContextRunsInSerializableTransaction(
    NorthwindDataContext context)
{
    DbTransaction transaction = context.Transaction;

    if (transaction == null || context.Connection == null ||
        transaction.IsolationLevel != IsolationLevel.Serializable)
    {
        throw new InvalidOperationException("The context " +
            "should be part of a serializable transaction");
    }
}

Running the context in a serializable transaction should not be forgotten:

using (var tran = new SqlTransactionWrapper("connstr.",
    IsolationLevel.Serializable))
{
    using (var db = new NorthwindDataContext(tran.Connection))
    {
        db.Transaction = tran.Transaction;

        // TODO: Business logic

        db.SubmitChanges();
    }

    tran.Commit();
}

This effectively solves the concurrency problem. Still, using a serializable transaction has problems of its own. It increases the possibility of deadlocks, especially since we have no control over the order in which O/RM tools write data and the order in which our validators are executed. And while it is possible to detect a deadlock exception within the SubmitChanges method and retry the operation, it’s better not to do this while the context is part of a bigger transaction (as shown in the last example). A deadlock detected by the database will effectively roll back that transaction in the database, while the connection is kept open in .NET. Retrying the operation then means it will be executed in a transactionless context, and this could lead to data corruption. I think it’s best to consider the complete connection doomed in case of a failure caused by SQL Server. Retrying the operation can only be done at the level where the transaction is defined (by recreating that transaction and its connection), but this is typically a place where you don’t want to have such logic. I advise you to just let it fail and let the infrastructure log that exception.
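
Detecting the deadlock itself is straightforward; what to do next is the hard part, as argued above. The error number check below is my assumption of how you’d recognize a deadlock victim on SQL Server:

try
{
    db.SubmitChanges();
}
catch (SqlException ex)
{
    // SQL Server reports "chosen as deadlock victim" as error 1205.
    bool isDeadlockVictim = ex.Number == 1205;

    if (isDeadlockVictim)
    {
        // The database has already rolled the transaction back, but
        // the connection is still open in .NET, so don't retry at
        // this level; rethrow and let the infrastructure log it.
    }

    throw;
}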

Note: While serializable transactions might increase the number of deadlocks in the system, ask yourself whether you know for sure that lowering the isolation level of your transactions couldn’t lead to a database in a (functionally) corrupted state. This is very hard to reason about. Also remember that such a deadlock might actually have prevented state corruption.

Please note that the reason using locks works in this scenario is that the database server holds locks on certain indexes in the Orders table. The effectiveness of this solution therefore depends on the indexes in that table; changing the indexes in the database could change the effectiveness and correctness of this solution. This has to be taken into account when choosing such a solution. When using SQL Server, you can use the sp_lock or sp_lock2 procedures to check whether the code acquired the correct locks, and of course you should check your assumptions manually when building such a solution and every time you change the indexes in the database. This makes it a time-consuming and error-prone solution.

Conclusion

We must conclude that there is no technically pleasant solution to our business rule problem. While serializable transactions do the trick, they come with performance penalties and an increased chance of deadlocks. Although your solution works today, it could behave differently after a colleague adds or removes an index. Because of these difficulties, I would say that using serializable transactions is an architectural decision. Depending on your application, it may be better to go for pessimistic offline locking, or perhaps doing nothing and letting it happen is even more feasible. But, as I said, this totally depends on your business, architecture, software, team, etc. What remains important is to understand the difficulties you are facing. Software engineering is all about tradeoffs, but if you don’t know there is a difficulty, there is nothing to consider in the first place. I hope I succeeded in warning you.

In the next part of this series I'll show how to use the metadata, generated by LINQ to SQL, to automate validations.

Good luck!
