bling.github.io

This blog has relocated to bling.github.io.

Monday, November 23, 2009

Real World Performance Comparison: Enterprise Library Logging Block vs log4net

Any search on the web will consistently show log4net outperforming the Enterprise Library (EL) by a factor of around 4:1.  Most of these tests are tight for loops, and the comments end up saying things like “oh, that’s not a realistic test” or “well, you shouldn’t be logging that much anyway”, etc.  So let’s see how much your choice of logging infrastructure matters to overall system performance in a multithreaded application.

One of my recent tasks was hunting down a memory leak.  So I downloaded the trial of the ANTS .NET Bundle which contains a memory profiler and performance profiler.  Hunting the memory leak was a pain since I wasn’t the one who wrote the code, so it took a while for me to realize which objects were “leaked.”  I eventually found it, and then I turned my attention to using the performance profiler to see if any obvious places could be optimized.

At this point I decided to swap in the Enterprise Library Logging Block to see how it would fare.  The logging migration project was still in progress and we hadn’t made the full switch yet, but I figured I should try it out anyway because I’m the curious type.

The software already uses Trace diagnostics extensively.  Practically every message, error, you name it, gets Trace.WriteLined…and then a LogFileWriter picks it up and dumps it to a file.  In my testing, I simply added an extra line of code here, namely Logger.Write(LogEntry).  Thus, anything that used to get logged also gets logged by the EL.
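
Concretely, the instrumentation looked roughly like this (a sketch; the helper method name is hypothetical, but Logger.Write and LogEntry are the standard Enterprise Library calls):

private static void LogMessage(string message)
{
  Trace.WriteLine(message);                          // existing path, picked up by the LogFileWriter
  Logger.Write(new LogEntry { Message = message });  // the extra line added for this test
}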

Then it was basically a matter of running the application through a similar set of scenarios and seeing what the profiler says.  Obviously each run will not be exactly the same as the previous one, but overall I think if we look at percentages we can begin to see where the CPU is spending the most time.

Here is what the application does without any modifications.

Here is the application after adding Logger.Write(LogEntry).


Yeah, I was pretty surprised too.  That one method ended up being the most expensive method in the entire application.

A quick look at the source revealed quite a bit of code dedicated to thread safety.  So digging deeper, I decided to avoid the static Logger.Write method altogether, create instance LogWriters directly, and have them all go to a singleton FileTraceListener.

That’s a noticeable improvement…it’s not #1 anymore.  The CPU time also dropped from ~15% to ~10%, which means that ~5% of CPU time was spent in the static Logger.Write(LogEntry) method alone.  If you look at the graphs you’ll notice a common pattern among them, because I wanted to test with the same scenario to get results as comparable as possible…but to illustrate how bad the EL’s performance can get, here’s one:



I literally fell off my chair when I first discovered this, as I didn’t think it was possible that something that’s supposed to be non-invasive could end up taking this much CPU.  This is one of the earlier tests I did, before I had settled on a fixed scenario.

And of course, here are the results with log4net.


So where does that leave us?

log4net:  5%
EL static:  15%
EL instance:  10%

So what should you get out of this?  Be very careful of how you invoke Logger.Write()…it may not be as cheap as you think it is.  Even log4net took much more CPU time than I expected.

I redid the test using Castle’s logging framework, which is a bare-bones dump-messages-to-a-file implementation.  The total CPU time for this was 0.016%.  Yep!  I had to scroll down to even find it!

I guess this shows that even log4net, which is touted for its high performance, comes at a cost.

Saturday, November 7, 2009

Fixing ReSharper 5 EAP Missing Colors

I think anyone who has used ReSharper for an extended period of time has experienced colors suddenly disappearing from Fonts & Colors.  This is excruciatingly annoying when I have a dark color scheme set up and the colors disappear on me.

In the case of RS5, it seems even more flaky than RS4: installing/uninstalling other add-ins has a pretty high probability of losing the colors, and this time repairing the installation doesn’t fix the problem (nor does the registry restore trick).

For me, to fix the problem I had to:

a) Uninstall RS5.
b) Install RS4.5
c) Open VS!!!  Go into settings, and verify all the colors are in Fonts & Colors.
d) Upgrade to RS5.

If you skip (c), it will not restore the colors…at least it didn’t when I tried it.  Following the above has fixed it for me on 2 occasions already.  If JetBrains sees this blog post, PLEASE fix this extremely annoying bug.

Friday, November 6, 2009

Multiple Inheritance in C# with Castle DynamicProxy Mixins

Having significant experience with C++, every once in a while I’ll really miss features available in C++ that just aren’t possible in C#.  Sure, linking takes forever, and if you could calculate the percentage of a C++ developer’s time spent waiting on the linker I’m sure it’d be around 50%.  Nonetheless, there’s a certain seductive pull to stepping through the debugger and the sheer speed of a C++ program (arguably the debugger is being made obsolete by TDD).  If you’ve only used C# for a long time, you probably didn’t even know that C++ was THAT much faster until you try it out.  Even with a simple hello world program you can tell the difference.
But that’s not the point of this post.  This post is about mixins!
So what is a mixin?  You can Wikipedia it, but basically it’s a way to add functionality to an existing object.  In a way it’s like multiple inheritance, but not exactly.  In CS101 terms, this is not an “is-a” relationship but more of a “can-do” relationship.  Both approaches promote code reuse, and of course, both can be abused.
In a way, C# extension methods are a form of mixin, but really they’re just syntactic sugar: a static method that looks like an instance method.  True mixins are actual instances.
I’ve been playing around with Castle.DynamicProxy lately, and some of it is going into production code soon.  This is honestly kind of scary, because I really don’t want to mess up production code which generates daily profit for the company.  However, if NHibernate’s using dynamic proxy, and that has been proven to work in countless production environments, I’m sure the complexity of the proxies I’m doing pales in comparison to what NHibernate does.
Most of the examples available on the web show you how a mixin works.  Basically, you create a target, and then you mix in extra stuff.  Basic stuff…but there’s a really cool trick you can do to make consuming code a lot easier to use by removing the need for it to cast to the mixin interface.  In my case, the system is a very complex piece of legacy code with threads all over the place locking things all over the place.  I was given the task of keeping track of who owned each lock.
Originally I started with an extension method, since along with IDisposable it let me do something like this:
using (anyInstance.Lock()) {
  anyInstance.DoStuff();
}

Here Lock() is an extension method which returns an IDisposable that calls Monitor.Exit when Dispose() is invoked.  This is a nice and dandy use of the ‘using’ syntax, but as you can see it’s all just syntactic sugar.  At the end of the day, this is still a public static method that is accessible by thousands of other objects.  And once you need to keep track of all those owners, you need a dictionary keyed by object hash codes, and locking on that dictionary for every incoming lock on any instance is…well…very, very bad.

Anyways, so with DynamicProxy2, I decided to mix in this lock tracking logic.  Now, the interface is simple:
public interface ILockMixin {
  IDisposable Lock(string tag);
  IDisposable TryLock(string tag);
}
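
The Locker mixin registered further down isn’t shown in this post; a minimal sketch of what it might look like (my assumption, with the actual owner-tracking logic omitted) is:

public class Locker : ILockMixin
{
  private readonly object _sync = new object();

  public IDisposable Lock(string tag)
  {
    Monitor.Enter(_sync);
    // 'tag' is where the lock-owner tracking would hook in
    return new Releaser(_sync);
  }

  public IDisposable TryLock(string tag)
  {
    return Monitor.TryEnter(_sync) ? (IDisposable)new Releaser(_sync) : null;
  }

  private sealed class Releaser : IDisposable
  {
    private readonly object _sync;
    public Releaser(object sync) { _sync = sync; }
    public void Dispose() { Monitor.Exit(_sync); }
  }
}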

Now, the particular thing I wanted to target happened to be a hashtable, so with a little interface trick, you could do this:
public interface ILockableHashtable : IDictionary, ILockMixin { }

Finally, any code that references Hashtable needs to reference ILockableHashtable instead, but the beauty of this is that none of the existing code had to change, and now all proxied hashtables have the Lock() method as well.  And with Windsor, this is brain-dead simple to wire up:
kernel.Register(
  Component.For<IDictionary, ILockableHashtable>()
    .ImplementedBy<Hashtable>()
    .Proxy.AdditionalInterfaces(typeof(ILockableHashtable))
    .Proxy.MixIns(new Locker()));

And that’s it!  BTW, I realize I could have achieved the same effect with a lot less effort by just inheriting from Hashtable directly (and avoiding the negligible proxy performance cost), but if I needed this locking mechanism for any other code, I would have to resort to either copy/pasting a lot of code, or a lot of copy/paste/redirecting to a static class.

Edit: There is a bug in this last bit, check out my latest entry about this topic.

Monday, October 19, 2009

Improving Your Unit Testing Skills

Unit testing is hard!  I came to this sad realization when my code, which had a large test suite with near 100% code coverage and every thought-of requirement unit tested, failed during integration testing.

How is that possible?  I thought to myself.  Easy…my unit tests were incomplete.

Writing software is complex stuff.  This is evident in how difficult it is to determine how good a software developer is.  Do you measure them based on lines of code?  On time to completion of tasks?  Or maybe the ratio of features delivered to bugs coming back?  If writing features alone is this complicated, surely unit testing is just as (or more) complicated.

First, you must be in a place where you can even start testing your code.  Are you using dependency injection?  Are your components decoupled?  Do you understand mocking and can you use mocking frameworks?  You need an understanding of all these concepts before you can even begin to unit test effectively.  Sure, you can unit test without these prerequisites, but the result will likely be pain and more pain, because you’ll be spending 90% of your time setting up a test rather than running it.

But back to the point of this post.  Unit testing is hard because it’s one of those things where you’ll never really know how to do it properly until you’ve tried it.  And that assumes you’ve even made the step to do testing at all.

I’m at the point where my experience with writing unit tests allows me to set them up quickly with relative ease, and perform very concise tests where anyone reading the unit test would say “oh, he’s testing this requirement.”  However, there are still 2 problems that are hard to fix:
a) Is the test actually correct?
b) Do the tests cover all scenarios?

(a) is obvious.  If the test is wrong then you might as well not run it at all.  You must make it an absolute goal to be able to trust your unit tests.  If something fails, it should mean there’s a serious problem going on.  You can’t have “oh that’s OK, it’ll fail half the time, just ignore it.”

(b) is also obvious.  If you don’t cover all scenarios, then your unit tests aren’t complete.

In my case, I actually made both errors.  The unit test in question was calculating the time until minute X.  So in my unit test, I set the current time to 45 minutes.  The target time was 50 minutes.  5 minutes until minute 50, test passed.  Simple, right?  Any veteran developer will probably know what I did wrong.  Yep…if the current time was 51 minutes, I ended up with a negative result.  The test was wrong in that it only tested half of the problem; it never accounted for time wrapping.  The test was also incomplete, since it didn’t test all scenarios.
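
For the record, the corrected calculation is just modular arithmetic, something like this (a sketch with hypothetical names):

static int MinutesUntil(int currentMinute, int targetMinute)
{
  // wrap around the hour so a current minute past the target still yields a positive wait
  return ((targetMinute - currentMinute) + 60) % 60;
}
// MinutesUntil(45, 50) == 5; MinutesUntil(51, 50) == 59 -- no more negative results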

Fixing the bug was obviously very simple, but it was still ego-shattering to know that all the confidence I had previously was false.  I went back to check other scenarios, and was able to come up with some obscure scenarios where my code failed.  And this is probably another mistake novice coders will make that experienced coders will not: I only coded tests per requirement.  What happens when 2 requirements collide with each other?  Or when 2 or more requirements must happen at the same time?  With this thinking I was able to come up with a scenario which led to 4 discrete requirements occurring at the same time.  Yeah…not fun.

Basically, in summary:
a) Verify that your tests are correct.
b) Strive to test for less than, equal, and greater than when applicable.
c) Cross reference all requirements against each other, and chain them if necessary.

Wednesday, October 7, 2009

TDD is Design…Unit Testing Legacy Code

I’ve been listening to a bunch of podcasts lately and I came across a gem that was pretty darn useful.  It’s pretty old by technology standards, I suppose, since it was recorded January 12, 2009, but it’s still worth a listen.  Scott Hanselman talks with Scott Bellware about TDD and design.  Find it here.
I’m merely reiterating what Scott has already said in the podcast, but it’s different when I can personally say that I’m joining the ranks of countless others who have seen the benefits of TDD.

Now I can’t say that I’m a 100% TDD practitioner, since I’m still learning how to write tests before code effectively, but I’ve already seen the improvements in design in my code many times over merely by writing unit tests.  At this point I'd say half of my tests are written first, and the other half after the fact.

I’ve been through many phases of writing unit tests, and it took me a long time to get it right (and I’m sure I have tons more to go).  It takes some experience to figure out how to properly write unit tests, and as a by-product how to make classes easily testable.  Code quality and testability go hand in hand, so if you find something difficult to test, it’s probably because it was badly designed to start with.

The first time I was introduced to unit testing in a work setting was when I was writing C++…and man it was ugly, not because it was C++, but because everything was tightly coupled and everything was a singleton.  The fixture setup was outrageous, and it was common to ::Sleep(5000) to make tests pass.  Needless to say, my first experience with unit testing was extremely painful.

After a job switch back to the C# world, I started reading more blogs, listening to more podcasts, and experimenting more.  I was given a project to prototype, and for phase 1 it was an entirely DDD experience where I got the opportunity to experiment with writing unit tests the “proper” way, with dependency injection and mocking frameworks.
Unit tests are easy to write when you have highly decoupled components.
The prototype was finished, and I was back on the main project and given a critical task to complete in 2 weeks.  I had to modify a class which was 10,000 lines long and had mega dependencies on everything else in the project.  Anything I touched could potentially break something else.  And of course…no unit tests whatsoever.  I like challenges and responsibility – but this was close to overkill.  This thing produces revenue for the company daily, so I really didn’t want to mess anything up.

First thing I realized was that there’s no way I could possibly write any unit test for something like this.  If the class file was 10,000 lines long, you can imagine the average line count for methods.

And of course, the business team didn’t make things easy on me by asking that this feature be turned off in production, but turned on for QA.  So the best option was to refactor the existing logic out to a separate class, extract an interface, write the new implementation, and swap between the 2 implementations dynamically.

After 4 days of analyzing and reading code to make sure I had a very detailed battle plan, I started extracting the feature to a new class.  The first iteration of the class was UGLY.  I extracted the feature I was changing into a separate class, but the method names were arcane and didn’t have any good “flow” to them.  I felt that I had to comment my unit tests just so whoever was reading them could understand what was happening, which brings up a point:
If you need to comment your unit tests, you need to redesign the API
It took many iterations and refactorings of the interface to get it to a point I found acceptable.  If you compared the 1st batch of unit tests to the last batch, it is like night and day.  The end result was a set of unit tests which typically followed a 3-line pattern of setup/execute/verify.  Brain-dead simple.

The last thing to do was to reintegrate the class into the original class.  For this I was back to compile/run/debug/cross-fingers, but I had full confidence that whenever I called anything in the extracted class it would work.

An added benefit is that I didn’t add another 500 lines of code to the already gigantic class.  Basically, in summary:

  • Get super mega long legacy code class files
  • Extract feature into separate file as close to original implementation as possible (initially will be copy-paste)
  • Run and make sure things don’t die
  • Write unit tests for extracted class (forces you to have a better understanding of the code modifications, and the requirements)
  • Make sure unit tests cover all possible scenarios of invocation from original class
  • Start refactoring and redesigning to better testability, while maintaining all previous tests
  • Done!  Repeat as necessary!

Sunday, September 27, 2009

Dark Color Refresh

There you have it!  I started with Visual Studio’s default settings and configured each and every color one by one.  3 hours later I was finished and satisfied with what I had…I think it was time well spent considering this is what I’m looking at for >= 8 hours a day.

This was a complete redo of my previous dark theme.  This one is even more colorful than the previous one, mainly because ReSharper does an amazing job of detecting additional code elements like mutable vs. immutable variables, or unused code, in addition to many other things.

By the way, you should definitely spend the time to configure your own colors rather than using someone else’s template.  The reason is that each and every person is different and will obviously have preferences for different colors.  I go through different moods, and I’ll tweak the theme to be “blue” or “red” for a period of time.

Another reason to configure it manually is that it drills into your brain what each color represents.  If you just used my template you would probably never notice that orange means input parameter and green means local variable.

However, one thing you can and should take from this blog post is not to use pure black or pure white as your background (did you notice my current line indicator is darker than my background?).  Set your background to (10,10,10) or (245,245,245).  Your eyes will thank me.

Saturday, September 26, 2009

Autofac’s extremely powerful and flexible ContainerScope

I need to show some love and support for my favorite IoC tool, because its most powerful feature needs more explaining.  It’s not that the main site doesn’t provide a good explanation, because it does, but most people don’t really understand what problem the solution is solving.
The following is on Autofac’s front page:
var container = // ...
using (var inner = container.CreateInnerContainer())
{
  var controller = inner.Resolve<IController>();
  controller.Execute(); // use controller..
}
The first time I saw this I basically glanced right past it.  Honestly, I didn’t think anything of it; the reason I initially tried out Autofac was its very slick lambda registrations.  I didn’t know I was in for a world of surprises until I finally realized the power and flexibility of Autofac’s container scope.

If benchmarks and feature comparisons are not enough to show off Autofac’s features, this blog post hopes to show how it solves “complex” problems elegantly.

Let’s start with the typical NHibernate use case: create 1 singleton session factory, and create many sessions, one per request.  Here’s a solution with Ninject (not that I’m picking on Ninject, because I love its very slick contextual binding, but most other IoC containers have a similar solution, like per-session with WCF & Windsor).

Basically, the solutions mentioned above follow an approach like this:

a) Hook into the beginning of a request, and create the ISession from the ISessionFactory.
b) Set it in the HttpContext.Current or OperationContext.Current’s dictionaries.
c) Get this property in all the dependencies that need it.
d) At the end of the request, Dispose() the ISession.

OK, a pretty simple and straightforward solution, but one key thing that really bugs me is that by doing this we have introduced a dependency…namely HttpContext.Current[].  Or you could wrap it in a class like SessionManager, which is basically coupling to the same dependency under a different name.  With Autofac, we can bypass steps (b) and (c) entirely and only worry about the beginning and end of a request.

To start off, here's the basic wiring needed:
1:   var cb = new ContainerBuilder(); 
2:   cb.Register(x => CreateSessionFactory())
       .As<ISessionFactory>()
       .SingletonScoped(); 
3:   cb.Register(x => x.Resolve<ISessionFactory>().OpenSession())
       .As<ISession>()
       .ContainerScoped(); 
4:   IContainer c = cb.Build(); 
5:   
6:   Assert.AreSame(c.Resolve<ISessionFactory>(), c.Resolve<ISessionFactory>()); 
7:   Assert.AreSame(c.Resolve<ISession>(), c.Resolve<ISession>()); 
8:   
9:   var inner1 = c.CreateInnerContainer(); 
10:  Assert.AreSame(c.Resolve<ISessionFactory>(), inner1.Resolve<ISessionFactory>()); 
11:  Assert.AreNotSame(c.Resolve<ISession>(), inner1.Resolve<ISession>());
That’s the configuration.  And that’s it!  Nothing more.  No additional SessionManager class.  No need to use HttpContext.Current to store the session.  Just pass ISession in with regular constructor/property injection.

Here’s how it works:

Line 2: ISessionFactory is created from CreateSessionFactory().  This is a singleton so there will always be one and only one instance of it within the container (and all child containers).

Line 3: This is where it gets interesting.  We’re saying “whenever I need an ISession, resolve ISessionFactory and call OpenSession() on it”.  Also, by specifying ContainerScope, we only get 1 instance per container.

And this is where the terminology gets a bit confusing.  You can think of Autofac as a tree of containers.  The root container (variable c in this case) can create child containers (inner1 in this case, and inner1 could create an inner2, and so on).  So when something is singleton scoped, the root container and any child containers (and children’s children) share exactly 1 instance of the service.  With ContainerScope, each node (container) in the tree gets its own instance.

So back to the unit test above: in line 6 we verify that there is only 1 instance of ISessionFactory.  In line 7 we resolve ISession twice as well, which shows that we get the same instance.

At line 9 we create an inner container, and in lines 10 and 11 we see that ISessionFactory is the same for both the container c and the inner container inner1, while the ISession resolved is different between the two.
Thus, by specifying ContainerScope, you can very easily group multiple services and dependencies together as one unit.  Implementing the Unit of Work pattern is insanely easy with Autofac.  Create service A, which depends on B, which depends on C, and have all of them depend on D.  Resolve A within a new inner container, and B, C, and D will always be the same instances.  Resolve A in another inner container and you will get a new set of instances.

Last but not least, Autofac will automatically call Dispose() on all resolved services once the container is disposed.  So for the above, once inner1.Dispose() is called, ISession.Dispose() is automatically called.  If you need to, you can very easily hook into this mechanism and implement things like transactions and rollbacks.
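
To make that concrete, a typical unit of work ends up looking something like this (a sketch; IUserService is a hypothetical consumer that takes an ISession through its constructor):

using (var request = c.CreateInnerContainer())
{
  var service = request.Resolve<IUserService>();  // gets the ISession scoped to this inner container
  service.DoWork();
}  // disposing the inner container disposes the ISession as well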

I hope this blog post clears things up a little bit about Autofac’s ContainerScope!

Sunday, September 13, 2009

Getting code coverage working on TeamCity 5 with something other than NUnit

I've been experimenting lately with the TeamCity 5 EAP and so far it's been a pretty awesome experience. I was up and running within minutes and was swarmed with beautiful graphs and statistics, with specifics even per test. Getting something like that up with CC.NET is not a trivial task.

Anywho, with TC5 code coverage is one of the cool new features added for .NET, but unfortunately only NUnit is supported. Not that that's a bad thing, but some people prefer to use other testing tools. Two notable contenders are xUnit.net and MbUnit.

I like the fact (pun intended) that xUnit.net makes it a point to prevent you from doing bad practices (like [ExpectedException]), and I like how MbUnit is so bleeding edge with useful features (like [Parallelizable] and a vast availability of assertions).


And with that I set out to figure out how to get TC working with Gallio, but the following should work with any test runner.

It certainly was a pain to set up and took a lot of trial and error, but eventually I figured it out.  I analyzed the build logs provided in each build report and noticed something interesting...specifically:


[13:51:23]: ##teamcity[importData type='dotNetCoverage' tool='ncover' file='C:\TeamCity5\buildAgent\temp\buildTmp\tmp1C93.tmp']
and

[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageL' value='94.85067']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageM' value='97.32143']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageC' value='98.68421']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageAbsLCovered' value='921.0']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageAbsMCovered' value='218.0']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageAbsCCovered' value='75.0']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageAbsLTotal' value='971.0']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageAbsMTotal' value='224.0']
[13:51:30]: ##teamcity[buildStatisticValue key='CodeCoverageAbsCTotal' value='76.0']

The first message appears after NCover.Console has done its thing.  After NCoverExplorer has done its thing, the statistics are published.  I set out to mimic this functionality with Gallio, but again, what's described here should work with any test runner.

1) Disable code coverage in TC.  We're doing it manually instead.
2) In your build script, run your unit tests with NCover and generate a coverage.xml report.
3) Run NCoverExplorer on coverage.xml and generate reports ncoverexplorer.xml and index.html.
4) Create a zip file of index.html and name it coverage.zip.
5) Configure coverage.zip to be an artifact in your TC configuration (this is to enable the tab).
6) Parse out ncoverexplorer.xml with XPath and output the statistics.
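
For step 6, the idea is roughly this (a sketch; the XPath and attribute names are illustrative assumptions, so adjust them to whatever your ncoverexplorer.xml actually contains):

var doc = new System.Xml.XmlDocument();
doc.Load("ncoverexplorer.xml");
// hypothetical node/attribute names -- check the actual report schema
var coverage = doc.SelectSingleNode("/coverageReport/project").Attributes["coverage"].Value;
Console.WriteLine("##teamcity[buildStatisticValue key='CodeCoverageL' value='{0}']", coverage);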

Certainly a lot of things to do just for the sake of pretty statistics reporting....but it was the weekend and I was bored.  With the help of MSBuildCommunityTasks, the zip file and XML parsing were made a lot easier.

After that, voila!  Code coverage + Gallio on TeamCity 5!!!

Unfortunately, NCoverExplorer's report only gives the total # of classes and nothing about unvisited or covered classes, so I set those values to 0/0/0 (BTW, you need all values present for the statistics to show).  A task for next weekend!!!

(Edit: I suppose I should also mention that you could technically avoid all the trouble above and hack it with this:
#if NUNIT
using TestFixtureAttribute = NUnit.Framework.TestFixtureAttribute;
using TestAttribute = NUnit.Framework.TestAttribute;
#endif
And it'll work just fine as well)

Monday, September 7, 2009

Member injection module for Autofac

Inevitably, as a project gets more complicated, you will need to start using more features of your IoC container. Autofac has built-in support for property injection by hooking into the OnActivating or OnActivated events, which basically sets all public properties (or only those which are unset). However, I didn't really like this, because once you start using properties it's not as clear-cut as with constructors that you are using injection. It becomes hard to manage later on when you have many properties, some of which should be injected and others set manually in code. Autofac's inject-all or inject-only-null approach doesn't fit the bill when a class gets a little more complicated.

I set out to fix this by writing a custom module, and it turned out to be very, very simple. With attributes, we can mimic the functionality that's used in Ninject. Here's the module code:
public class MemberInjectionModule : IModule
{
  public void Configure(IContainer container)
  {
    container.ComponentRegistered += (oo, ee) =>
      ee.ComponentRegistration.Activated += (o, e) =>
    {
      const BindingFlags flags = BindingFlags.Instance |
                                 BindingFlags.NonPublic |
                                 BindingFlags.Public;
      var type = e.Instance.GetType();
      foreach (var field in type.GetFields(flags))
      {
        if (field.GetCustomAttributes(typeof(InjectedAttribute), true).Length > 0)
          field.SetValue(e.Instance, e.Context.Resolve(field.FieldType));
      }
      foreach (var prop in type.GetProperties(flags))
      {
        if (prop.GetCustomAttributes(typeof(InjectedAttribute), true).Length > 0)
        {
          if (prop.GetIndexParameters().Length > 0)
            throw new InvalidOperationException("Indexed properties cannot be injected.");

          prop.SetValue(e.Instance, e.Context.Resolve(prop.PropertyType), null);
        }
      }
    };
  }
}
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public class InjectedAttribute : Attribute { }
And that's it! Pretty basic reflection stuff here, but as you can see...iterate through all fields and properties (including private ones) marked with the attribute, and resolve them. Now you can just RegisterModule(new MemberInjectionModule()), and inside your services you simply annotate your properties/fields with the [Injected] attribute, like so:
public class SomeService : ISomeService
{
  [Injected]
  protected ISomeOtherService SomeOtherService { get; set; }
}
And now there's a very clear-cut way of explicitly defining which properties (and fields) should be injected. Also, specifying a named service is just a simple string property on the attribute away, which I'll leave to the reader to implement :)

Friday, August 21, 2009

Session management with NHibernate and Autofac: Corollary

Oops, I forgot to mention that it's easily mockable as well.
public class UserService : IUserService {
  public UserService(ISessionFactory factory, UsersRepositoryFactory usersFactory) { ... }
}
So that's the basic setup. Now, with Moq, we can do this:
var mockUserRepo = new Mock<IUsersRepository>();
mockUserRepo.Setup(x => x.GetUser(It.IsAny<string>())).Returns(() => someUser);

var userService = new UserService(..., x => mockUserRepo.Object);
So what I've basically done is pass a lambda expression into the constructor as the delegate factory, one which always returns my mock repository. Pretty nifty, methinks!

Thursday, August 20, 2009

Session management with NHibernate and Autofac

NHibernate is a beast, let me tell you! I think I already mentioned this in the previous post, but NHibernate definitely has a VERY high learning curve. If it wasn't for FluentNHibernate I'd probably still be struggling with XML mapping files right now.

But anywho, it didn't take long for me to run into very many problems with NHibernate. It's not really NHibernate's fault; maybe I'm just trying to use the wrong tool for the job.

I'm using DDD for my project, and there's a clear separation between the client and the domain. I got lazy and didn't make any DTOs for my entities, and just used [DataContract] and [DataMember] to have WCF automatically generate "DTO"s for me (with the option to reuse assemblies turned OFF).

(Disclaimer: I am *not* by any means knowledgeable about NHibernate, but maybe, just maybe what I say here might point other people in the right direction)

OK, all is good. I can store my entities in the database, I can read them back. So I fire up my client and read some users, and it blows up. NHibernate by default lazy loads everything. Here's what I originally had in my UsersRepository:
public IEnumerable<User> GetAll() {
  using (var session = _factory.OpenSession())
    return session.CreateCriteria<User>().List<User>();
}

It looks pretty innocent. But it blows up. Why? If User has a collection of other entities (which mine did), the collection is not loaded by the above code. It is loaded on demand, i.e. lazy loaded. So when my WCF service finally returns the objects and serializes them, it pukes and throws an exception because the session has long since been closed.

The easy solution was to Not.LazyLoad() all my collections. Now, here's where I'm thinking I might not be using the right tool for the job, because I am purposely disabling one of the key features of NHibernate. By default, lazy loading is always enabled, and I couldn't find anywhere how to globally turn it off; you must go class by class.
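
In Fluent NHibernate terms that ends up looking something like this (a sketch; the User/Products mapping is an assumption based on my entities):

public class UserMap : ClassMap<User>
{
  public UserMap()
  {
    Id(x => x.Id);
    Map(x => x.UserName);
    HasMany(x => x.Products)
      .Not.LazyLoad();  // load eagerly so WCF serialization doesn't hit a closed session
  }
}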

OK, so on to my next problem. I soon ran into issues with entities not being unique. I would have code like this in my UserService:
public void AddProduct(string username, Guid productId) {
  var user = _usersRepository.GetUser(username);
  var product = _productRepository.GetProduct(productId);
  user.Products.Add(product);
  _usersRepository.Store(user);
}

Again, this puked. The problem was that inside my repository, just like in my first example, the GetUser call had a using(session) which closed when the method ended. Shortly after, I was trying to update the same user, but with a *different* session. NHibernate doesn't like this, an NHibernate.NonUniqueObjectException is thrown, and I had a new problem to deal with.

It became clear I had to actively manage the sessions somehow, and the best place to do that was in my services, where each method is typically a "unit of work".

So the goal I wanted to accomplish was to initiate a new session at the beginning of a service method, call a repository multiple times, close the session, and then exit the method.

So how can we achieve that?

I thought of a couple of things, and the first thing I actually did was to go AOP-style and use Autofac with Castle.DynamicProxy. I created an interceptor for my service: before invocation it opened a session and manually set a CurrentSession property, and after invocation it closed the session.

It did the job, but had some problems:
a) It did a little too much black magic for me. All methods got intercepted.
b) I still had to manually set the CurrentSession property of my repositories. Sooner or later I was sure to run into threading problems.

After I got that half working I scrapped it and tried to come up with something better. This is what I came up with:
public delegate IUsersRepository UsersRepositoryFactory(ISession session);

public class UserService : IUserService {
  public UserService(ISessionFactory factory, UsersRepositoryFactory usersFactory) { ... }
}

public class UsersRepository : IUsersRepository {
  public UsersRepository(ISession session) { ... }
}

Now, when I call a method on my UserService, I do this:
public void AddProduct(string username, Guid productId) {
  using (var session = _factory.OpenSession()) {
    var userRepo = _userRepoFactory(session);
    var productRepo = _productFactory(session);
    ...
  }
}

And, of course, the Autofac code:
b.Register<UsersRepository>().As<IUsersRepository>().FactoryScoped();
b.RegisterGeneratedFactory<UsersRepositoryFactory>();

More details on the delegate factory here. But basically, I have told Autofac to inject delegates into my UserService, and those delegates can create new instances of my user repository by passing in the current session.

You'll also notice I have a products repository as well. With these delegate factories I can create multiple repositories on the fly with the same session (and all the good NHibernate stuff that comes with that).

Sure, there's a whole bunch of using(var session) boilerplate now, but it's explicit, clear, and allows for a lot of fine-grained control.

EDIT: I have since dropped this approach and I'm currently injecting ISessions into my constructors by using Autofac's container-scope (something that took me a while to understand).
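
For the curious, the container-scoped version boils down to registrations like this (a sketch using the Autofac 1.x syntax from the ContainerScope post above):

var builder = new ContainerBuilder();
builder.Register(x => CreateSessionFactory()).As<ISessionFactory>().SingletonScoped();
builder.Register(x => x.Resolve<ISessionFactory>().OpenSession()).As<ISession>().ContainerScoped();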

Tuesday, August 18, 2009

Goodbye db4o, Hello ORM

It's unfortunate that I have to remove db4o from my project, but I can't say this was unexpected. The CTO finally came to me today to voice his concerns about using an object database rather than a relational database. As is typical of many organizations, we have a database team dedicated to maintaining the integrity and performance of our main product. We are also stored-procedure happy, and every write and read operation goes through a stored procedure....so you can say I was pretty gutsy to even consider using an object database ;-) It was fun nonetheless...

Sooooo....which ORM to use? I've already read all the things about the Entity Framework, so I didn't even bother considering it. The choices came down to just a few:
a) Castle.ActiveRecord
b) Fluent NHibernate
c) SubSonic

I initially started with Fluent NHibernate. The main reason is that it's got really good documentation for beginners like me who've never used NHibernate, and it has more configuration options via code. Even so, it didn't take long for me to run into gazillions of mapping errors though :-[ I'm still undecided whether I prefer ClassMap<> or ActiveRecord attributes though...

I started porting all my repositories to NHibernate, and for the most part it wasn't too bad. I've learned some things along the way, like lazy loading and cascading, but the conclusion is that having Fluent NHibernate with pretty code completion isn't going to prevent the need to actually learn NHibernate the old-fashioned way, although it definitely got me up and running a lot faster. I'll probably write up a quick repository implementation with SubSonic as well as part of my prototype to see which I like. SubSonic definitely looks more 'hip' with its very flashy and web 2.0ey looking website :)

Saturday, August 15, 2009

DDD, TDD, Enterprise Library, Autofac, and Lokad.Shared libs!

My project has been chugging along quite nicely, and I gave my CTO a quick look over its design, with which he was very impressed...I'm glad I spent all that time reading up on DDD, BDD, and TDD, and listening to all the ALT.NET podcasts. I've learned SOOOOO much in the past month it is ridiculous, mostly thanks to coming across ALT.NET and then reading everything linked from that source.

I can honestly say I'm 100% sold on the TDD approach. Initially, after reading about it, I was like, hmmm, yeah, I can see that working, but writing tests first kinda seems dumb and redundant. After all, obviously you know the tests will pass because you wrote them!! Then...I saw the light!!!

I'm currently prototyping my project, but since I live in the real world I can reasonably assume the prototype will become the real thing. That's not really an issue, having designed everything with DDD and DI. Once the prototype is done, we can optionally replace the implementation, change the IoC wiring, and voila, it just works (because we have all those tests written).

Another design choice I made for the prototype was to use DB4O rather than SQL Server. DDD places a lot of emphasis on storing aggregate roots in repositories, and let's be honest, object databases work ridiculously well for storing domain entities. So not only can I speed up development in my prototype by having all repositories go to a DB4O backing store, but if it's decided we need to change to a relational database, boom, I just need to reimplement all my repositories with NHibernate. And just to drive home how easy DB4O is, this is literally what you need to do to store a User in the database:
using (IObjectContainer c = Db4oFactory.OpenFile("database.db"))
{
  c.Store(user);
}
Seriously!!! It IS that easy! But back to why I'm sold on TDD. Originally, I had IUsersRepository modeled something like this:
User GetUserByName(string userName);
User GetUserById(Guid id);
User GetUserByFirstName(string firstName);
And then I realized it's a total pain in the butt to be adding all these methods, because once the method call reaches the DB4O container, I need to pass in a predicate for the query. So I thought, I'll just pass the predicate directly to DB4O. I changed all my repositories (I had 4 at the time) to do this instead:
User GetUser(Predicate<User> condition);
So now I can simply do this:
var user = usersRepository.GetUser(x => x.UserName == userName);
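
Under the hood, the repository just forwards the predicate to db4o; a minimal sketch (assuming the database.db file from the earlier snippet) might be:

public User GetUser(Predicate<User> condition)
{
  using (IObjectContainer c = Db4oFactory.OpenFile("database.db"))
  {
    IList<User> matches = c.Query<User>(condition);
    return matches.Count > 0 ? matches[0] : null;
  }
}
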
I've converted 3 methods (and potentially more) into a single method that is very expressive and easy to read (thank you, lambdas!). And then I reran all my tests and they passed....which leads to the following conclusion: TDD lets you refactor with confidence!

Oh wait, I mentioned Autofac and Lokad.Shared libs in this blog post, didn't I....

To start...my project has removed all dependencies on the Enterprise Library. In my previous post I mentioned how Unity wasn't as mature as other frameworks. Well, it appears that the Logging Block and/or the Exception Handling Block has a dependency on Unity. I definitely did not like having 2 IoC containers in my bin directory, so I decided maybe I should just ditch the entire thing. Honestly, I didn't really like EntLib anyway....the logging block requires the use of a static Logger class (tight coupling), and the exception block needs a try/catch/handle for any exception you want to handle, again with a static ExceptionPolicy class (more tight coupling!!!).

So now it was a matter of finding a replacement. There really were only 2 choices: the Castle stack, or Autofac+Lokad.Shared. I was not very impressed with the lack of documentation on the Castle Validator component (or even where to download a binary of it). Also, I'm really not a big fan of attributes on parameters. So what's left is Autofac and Lokad.Shared. I chose Autofac mainly because I love its lambda syntax (and the performance benefit that comes with it). Of more concern is Lokad.Shared:
a) It is basically written by 1 guy
b) It doesn't have community inertia
But after reading the examples here and here I didn't have a choice! It is just too damn nice! Comparing code I'm writing today with code I wrote 3 months ago is like night and day....on different planets.

And since I'm very link happy today, check out this awesome add-on called Studio Tools. It's free, only 1.9 megs, and has the most impressive and blazing fast file/class/symbol search I've ever seen.

Saturday, August 8, 2009

Dynamic proxies

Life has sure been a roller coaster ever since I started working on my new project. I've learned a lot about WCF in the past week; there's really nothing that can teach you faster than trying something out yourself. I've joined the ranks of countless others who have had to deal with implementing service behaviors to handle WCF exceptions (because by default thrown exceptions fault the communication channel), realized that WCF proxy clients break the 'using' keyword because someone thought it was a good idea to throw exceptions in Dispose(), and even hit Unity's InterfaceInterceptor not supporting more than 1 interface!

But now that we're talking about proxies, I've been thinking for a while about switching out Unity for a more mature IoC container like Windsor or StructureMap. There are little touches that other containers have that I miss in Unity, for example auto-wiring. Then again, Unity's integration with the rest of the Enterprise Library is very nice (I'm using the Logging and Exception Policy blocks), so it made sense in a way that everything in my bin folder was just Microsoft.Practices.*.dll. But now I'm seriously reconsidering.
public interface ITestService {
  int Count { get; set; }
  void DoWork();
  bool ShouldDoWork();
}
public class TestService : ITestService {
  public int Count { get; set; }
  public void DoWork() { ++Count; }
  public bool ShouldDoWork() { return true; }
}
public void HowDoTheyCompare() {
  var unity = (ITestService)new Microsoft.Practices.Unity.InterceptionExtension.InterfaceInterceptor()
                   .CreateProxy(typeof(ITestService), new TestService());
  var castle = new Castle.DynamicProxy.ProxyGenerator()
                   .CreateInterfaceProxyWithTarget<ITestService>(new TestService());

  Thread.Sleep(1000);

  Stopwatch sw = new Stopwatch();
  sw.Start();
  for (int i = 0; i < 1000000; ++i) { if (unity.ShouldDoWork()) unity.DoWork(); }
  sw.Stop();
  Console.WriteLine("unity: " + sw.ElapsedMilliseconds);
  sw.Reset();
  sw.Start();
  for (int i = 0; i < 1000000; ++i) { if (castle.ShouldDoWork()) castle.DoWork(); }
  sw.Stop();
  Console.WriteLine("castle: " + sw.ElapsedMilliseconds);
}
Results?

unity: 1787
castle: 136

From this test it looks like the Unity interception is 13x slower....but wait! I mentioned before that Unity has a bug where it can only proxy one interface....so to resolve that we would need to use the TransparentProxyInterceptor. Let's change it and test again...

unity: 30462
castle: 142

Hmmmm....214x slower. I guess we can try the VirtualMethodInterceptor for completeness. After making all methods virtual, here are the results:

unity: 3843
castle: 132

Still 29x slower. Whatever DynamicProxy is doing...it's orders of magnitude faster than what Unity's doing.

Thursday, August 6, 2009

Who woulda thought, IList<> kills WCF

It can't be that hard!!! I keep telling myself this. I'm writing a WCF application right now and I've been running into loads of problems trying to configure things. ABC, right? Set my address, contract, and binding....how hard can it be?? So why is a horrible ExecutionEngineException being thrown and taking down IIS with it???

http://connect.microsoft.com/wcf/feedback/ViewFeedback.aspx?FeedbackID=433569

Turns out you can't have an IList<> of DataContracts.
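
The shape that triggered it looked roughly like this (a sketch with made-up type names, not my actual contracts); a concrete collection type such as List<> doesn't have this problem.

[DataContract]
public class Order
{
  [DataMember]
  public IList<OrderItem> Items { get; set; }  // an IList<> of [DataContract] types -- this is what blew up
}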

Wednesday, July 29, 2009

Assimilated by ALT.NET

It feels good to be revitalized again! A couple of months back I was very unhappy. I was at a job I didn't particularly like. The team I worked with was very rigid and used waterfall practices (even though it was a small 5-man team), and they were very resistant to change (I struggled to move them off SourceSafe, and failed). It was time to jump ship and try something else, but then the market crash hit and suddenly having money to put food on the table became a priority. Life got boring and dull. I was writing C++ code (which didn't even use boost because it wasn't allowed), and .NET 2 code when 3.5 had been out for 3 years already. I lost all drive to learn and grow, and after 8 hours of work I really didn't want to have anything to do with coding when I was home.

Then my life changed... I found a new job, got hired, and it has been a HUGE turnaround. I haven't been there long (almost done with my 3-month probation), but I'm finding myself working at home simply because I love what I'm doing. The difference? My new team actually wants to learn new things and continually improve. It has been just under 3 months, and I have already helped my team implement branching conventions (rather than 2-week code freezes), unit testing conventions (rather than no conventions), and maintain a CruiseControl.NET build server (rather than no build server). It sure makes a HUGE difference when what you're doing is valued by your coworkers.

In my 2nd month I was sent to my first conference, DevTeach in Vancouver, and I'd probably consider this the biggest trigger of change in my way of thinking. I was only away from the .NET community for about a year, but sitting in on the sessions with a panel of experts really drove home exactly how obsolete I already was. I've heard of Test Driven Development. But what's this Behavior Driven Development all about? What's Domain Driven Design? What is Inversion of Control and why should I care?

And then I came across ALT.NET. I first heard about it in James Kovacs' Vancouver DevTeach session on IoC containers, where he told everyone about an ALT.NET conference happening the weekend after the DevTeach conference. I had next to no knowledge about what ALT.NET was, so I skipped out on the conference because I had other obligations that weekend. I wish I had known more about ALT.NET earlier, because I would have rescheduled my obligation so I could attend!! It didn't take long for me to be fully immersed in everything the ALT.NET community has to offer, and I was reading up on anything and everything I could. I practically had years of catching up to do.

And then I was given my big break. A completely new project with absolutely no dependencies on anything else was assigned to me. It was going to be a massive n-tier project with web clients, windows clients, business logic, database, and anything else you could think of. I got the chance to design something big from scratch. So now I'm reading everything I can about domain-driven design and making sure I don't make the classic Gang of Four mistake of overdesigning after reading ;)

Saturday, July 25, 2009

Best tool for the job...

So I recently moved, and my new place doesn't make wired internet all that accessible...so now I have a big honkin' desktop computer with no internet, but I have a little netbook which has wireless. Hmmmm...my netbook's main operating system is Archlinux...and Linux is strong in networking...so it should be easy to set it up, right? Right I was!

- give my ethernet an IP address and start it
# ifconfig eth1 192.168.0.1 netmask 255.255.255.0
# ifconfig eth1 up
- enable packet forwarding and reboot (or modify /proc)
# set /etc/sysctl.conf to set net.ipv4.ip_forward=1
- set up iptables
# iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Done! Now on my desktop I just set a static IP of 192.168.0.2 with 192.168.0.1 as the gateway, and boom, I have internet! Now I can wait patiently for a wireless adapter to go on sale...

Wednesday, June 24, 2009

CodeRush, ReSharper, Visual AssistX

Hmmmmm...CodeRush and ReSharper...It seems like these are the only 2 products people think about when they want that all-powerful addon that's supposed to quadruple their productivity. I think people just look at the feature chart and say wow, ReSharper and CodeRush have hundreds of features!!! I'm going to be so much more productive now!

I've tried both, and there are features I like from both products. For example, ReSharper's find references is extremely useful because you can filter by read-only or writable. I'm also a big fan of all the visual cues that CodeRush adds to the editor. But...neither of them has the 1 feature of Visual AssistX that I like most: Partial Names IntelliSense. What's that? Let's say you have SomeMethodWithAReallyLongName. If you type "really" with VAX's IntelliSense, that method appears in the list.

Originally I thought that was a nice feature, but that it couldn't be that useful. But then I was fortunate (or unfortunate is probably more accurate) to get to work with MSHTML. Now I have 1 humongous namespace with around 100 different types. Sometimes it's called TextRange and other times it's called TxtRange. Some have the HTML prefix, and others don't. When was the last time you could only remember part of a method name?

Tuesday, May 5, 2009

MetaTrader Indicators

One of my hobbies is forex trading, and a sub-hobby of that is writing custom indicators. MetaTrader is a trading platform that is very popular in the forex community because it allows traders to write custom indicators. It's pretty easy to get started if you have programming experience because the language is very similar to C. And yes, you can also write your own bot if you want, called an Expert Advisor. Personally I don't spend too much time trying to write a winning bot, simply because there are way too many variables to take into account, and eventually the market's going to yell "in your face" and your winning bot will end up losing. I just write simple scripts that help me identify entry points and filter out all the non-interesting price movements.

Recently I've been looking at breakout strategies because they are simple, yet effective. The result is a custom indicator I wrote which, simply put, calculates the high and low for a time period and then draws pretty rectangles. Like so:

The above is simply the high/low of the 2 hours before the opening of the London session, and the opening of the US session. Isn't it puurrrrrrty?

I'm doing the VI challenge!

One of my time-wasting hobbies is running Linux on virtual machines. I play games from time to time, so Windows is still my main OS. But otherwise, I do use my Linux virtual machines for many things; for example, my personal Mercurial repository is running on VirtualBox. For the curious, my distro of choice is Archlinux.

Anyone who's needed to tinker with a Linux distro knows that you spend a lot of time in a text editor changing configuration files. Typically, the choices are pico/nano for their simplicity, or vi because it's on any POSIX-compliant operating system. I decided, ah what the heck, I'm gonna learn vi and use that for editing files instead of nano. I got decent with vi. I could move around (albeit I still relied on home/end and arrow keys), save, search, etc., but on Windows I'd still use Notepad2.

In an attempt to improve my productivity in text editing, I'm doing the VIM challenge! For all my text editing I will be using WinVI (I even replaced notepad.exe with it), and I even got the demo of ViEmu running in Visual Studio. For the next 2 weeks I will be using vi for anything text related. I've been going through tutorials and trying to remember all the new things I'm learning. Most importantly, I'll be trying to avoid the home/end/arrow keys as well.

BTW, '*' is an awesome feature in vi. It highlights all instances of the current word in the document, and subsequently you can use n or N to go to the next or previous instance. Take a look at the following screenshot:

Boom! All instances of "total" highlighted with just a Shift+8. Let's see you do the same thing with Ctrl+F and mouse-clicking "Find next" (or worse, reversing your search).

Tuesday, April 28, 2009

A Trick Question With Closures

Given the following code, what do you expect the output to be?
for (int i = 0; i < 100; ++i)
{
    ThreadPool.QueueUserWorkItem(delegate
    {
        Console.WriteLine(i);
    });
}
Keep that answer in your head! Now...what do you expect the output of the following?
int i;
for (i = 0; i < 100; ++i)
{
    ThreadPool.QueueUserWorkItem(delegate
    {
        Console.WriteLine(i);
    });
}
Were your answers the same? Different? Why?

Monday, April 27, 2009

A Simple Thread Pool Implementation (Part 2)

I said last time I'd compare the performance of my super duper simple implementation against the .NET ThreadPool, so here it is! Here's the test code:
ManualResetEvent evt = new ManualResetEvent(false);
int total = 100000;
int count = 0;
DateTime start = DateTime.Now;
for (int i = 0; i < total; ++i)
{
//  ThreadQueue.QueueUserWorkItem(delegate(object obj)
    ThreadPool.QueueUserWorkItem(delegate(object obj)
    {
        Console.WriteLine(obj);
        if (Interlocked.Increment(ref count) == total)
            evt.Set();
    }, i);
}
evt.WaitOne();

Console.WriteLine("Time: {0}ms", (DateTime.Now - start).TotalMilliseconds);
Console.ReadLine();
Here are the initial tests. My CPU is a Core2Duo clocked at 3.6GHz. I also ran a release build outside of the debugger. I set my ThreadQueue to have 25 worker threads. The results were pretty surprising:
ThreadPool: 1515.6444ms
ThreadQueue: 5093.8152ms

Well, that's interesting. The .NET ThreadPool whooped my butt! How can that be? Let's dig deeper and find out how many threads actually got used. I used ThreadPool.GetMinThreads and ThreadPool.GetMaxThreads, and I got 2-500 for worker threads, and 2-1000 for completion IO threads. Then, I tracked how many threads were actually used, like this:
Dictionary<int, int> threads = new Dictionary<int, int>();
// ...snip
    ThreadPool.QueueUserWorkItem(delegate(object obj)
    {
        Console.WriteLine(obj);
        threads[Thread.CurrentThread.ManagedThreadId] = 0;
        if (Interlocked.Increment(ref count) == total)
            evt.Set();
    }, i);
// ...snip
Console.WriteLine("Threads used: {0}", threads.Count);
Surprisingly, only 2 threads were used. So, I set the number of worker threads of my ThreadQueue to 2. Voilà! 1437.5184ms, which is just a little under 100ms faster than the ThreadPool. I guess this shows that more threads doesn't necessarily mean better or faster! For fun I set the number of worker threads to 200 and it took 27906.6072ms! There was clearly a lot of locking overhead here...
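For reference, the pool limits mentioned above can be queried like this (a quick sketch with my own variable names, not the exact code from the original test):
int minWorker, minIo, maxWorker, maxIo;
ThreadPool.GetMinThreads(out minWorker, out minIo);   // threads the pool keeps ready
ThreadPool.GetMaxThreads(out maxWorker, out maxIo);   // upper bound before work items queue up
Console.WriteLine("Worker threads: {0}-{1}", minWorker, maxWorker);
Console.WriteLine("Completion port threads: {0}-{1}", minIo, maxIo);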

A Simple Thread Pool Implementation

I've always wanted to do this, and finally I've gotten around to doing it. Here's a super duper simple implementation of a working thread pool. I named it ThreadQueue just so it's clearly discernible from the one provided by the framework.
public static class ThreadQueue
{
    static Queue<WorkItem> _queue = new Queue<WorkItem>();

    struct WorkItem
    {
        public WaitCallback Worker;
        public object State;
    }

    static ThreadQueue()
    {
        for (int i = 0; i < 25; ++i)
        {
            Thread t = new Thread(ThreadWorker);
            t.IsBackground = true;
            t.Start();
        }
    }

    static void ThreadWorker()
    {
        while (true)
        {
            WorkItem wi;
            lock (_queue)
            {
                while (_queue.Count == 0)
                {
                    Monitor.Wait(_queue);
                }
                wi = _queue.Dequeue();
            }
            wi.Worker(wi.State);
        }
    }

    public static void QueueUserWorkItem(WaitCallback callBack, object state)
    {
        WorkItem wi = new WorkItem();
        wi.Worker = callBack;
        wi.State = state;
        lock (_queue)
        {
            _queue.Enqueue(wi);
            Monitor.Pulse(_queue);
        }
    }
}
As you can see, it's very short and very simple. Basically, if it's not used, no threads are started, so in a sense it is lazy. When you use it for the first time, the static initializer will start up 25 threads and put them all into a waiting state (because the queue will initially be empty). When something needs to be done, it is added to the queue, which then pulses a waiting thread to perform the work. And just to warn you, if the worker delegate throws an exception it will crash the pool...so if you want to avoid that you will need to wrap the wi.Worker(wi.State) call with a try/catch (sketched below). I guess at this point you may wonder why one should even bother writing a thread pool. For one, it's a great exercise and will probably strengthen your understanding of how to write multithreaded applications. The above thread pool is probably one of the simplest use cases for Monitor.Pulse and Monitor.Wait, which are crucial for writing high-performance threaded applications. Another reason is that the .NET ThreadPool is optimized for many short-lived tasks. All asynchronous operations are done with the ThreadPool (think BeginInvoke, EndInvoke, BeginRead, EndRead, etc.). It is not well suited for operations that take a long time to complete; MSDN recommends that you use a full-blown thread for those. Unfortunately, there's a relatively big cost to creating threads, which is why we have thread pools in the first place! Hence, to solve this problem, we can write our own thread pool which keeps a few threads alive and waiting to perform longer operations, without clogging the .NET ThreadPool. In my next post I'll compare the above implementation's performance against the built-in ThreadPool.
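A minimal sketch of that exception-guarded worker loop (my addition, not part of the original class) could look like this:
    static void ThreadWorker()
    {
        while (true)
        {
            WorkItem wi;
            lock (_queue)
            {
                while (_queue.Count == 0)
                {
                    Monitor.Wait(_queue);
                }
                wi = _queue.Dequeue();
            }
            try
            {
                wi.Worker(wi.State);
            }
            catch (Exception ex)
            {
                // log and keep going so one bad work item doesn't take down the worker thread
                Console.Error.WriteLine(ex);
            }
        }
    }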

Saturday, April 25, 2009

SourceGear Vault, Part 4, Conclusion

I suppose I should give a disclaimer since I am *not* an expert with Vault, and my opinions may be due entirely to my lack of understanding of the system. With that in mind, here's what my experience of using Vault has been so far. In general, it is not as fast as TortoiseSVN. There are many operations in SVN that are instant, where the comparable operation in Vault is met with 10 seconds of "beginning transaction" and "ending transaction", sometimes more. Feature-wise, Vault has much more to offer than Subversion does. Basically, it can do most of what Subversion can do, plus features from SourceSafe (like sharing, pinning, labeling, etc.). However, like I mentioned before, Vault has no offline support whatsoever, and you cannot generate patches, so it effectively cuts off any kind of outside collaboration. You could say that this is fine because SourceGear's target audience is small organizations where everyone will have access to the network anyway, but that doesn't mean you won't be sent off to the middle of nowhere with no internet access when you need to fix bugs right now! Not that Subversion is much better in that scenario, but at least you can still revert to the last updated version.

Friday, April 24, 2009

SourceGear Vault, Part 3 (Performance vs Subversion)

I downloaded kernel 2.6.29.1 and extracted it to the working folder. I figured this was an easy way to have a real-world scenario of changes (albeit a little big, since it is all the changes that happened between 2.6.29 and 2.6.29.1). Anywho, I was pretty surprised to find out that Vault does not have any feature whatsoever to allow collaboration between programmers other than via the client. You cannot create a .patch file and email it to someone. Everyone is assumed to have access to the server. This thwarted my plans to test the performance of diffing the changeset, because I had simply planned on comparing how long it would take to create the patch. Ah well, I guess I'll just have to compare the performance of committing the changes between 2.6.29 and 2.6.29.1, which is a common use case as well, so I don't mind.
Time it took to a) start up the Vault client, b) bring up the Vault commit dialog, and c) bring up the TortoiseSVN commit dialog:
Vault startup: ~1m02s
Vault commit dialog: ~7m55s
Subversion commit dialog: ~4m33s
Time it took to commit:
Vault: ~13m45s (and at 14m34s the HD stopped spinning)
Subversion: ~1m42s
Hmmmmm, it doesn't look like Vault did too well in this comparison. Getting a status of all changed files took almost twice as long compared with Subversion, and what's worse, committing with Vault took 12 minutes longer than it did with Subversion. The extra minute of the hard drive spinning was attributed to SQL Server completing the transaction, which is why I separated that part; as far as the client was concerned, the operation was complete after 13m45s. One more quick test...branching. Subversion is famous for its O(1) time to branch anything, from the smallest repository to the biggest. SourceGear's page comparing Vault vs Subversion mentions that both offer cheap branches. Let's see that in action! I used TortoiseSVN's repo-browser and branched the kernel. It was practically instant, with no wait time for the operation to finish. Vault, on the other hand, took a total of 47 seconds from the time I committed the branch to when the status said "ready" again.

Thursday, April 23, 2009

SourceGear Vault, Part 2 (Performance vs Subversion)

Obviously you can't compare something without performance numbers! So in this post I'll post some numbers for the most common operations. I'll be using the Linux kernel for no reason other than that it's readily available and everyone knows about it. The size of the code base is also relatively large (depending on who you ask). In general, I "felt" that Vault is slower than Subversion. This is probably due to the .NET runtime startup time for each operation, which is negligible for anything but trivial operations.
Test environment:
Core2Duo clocked at 3.6GHz
4GB of RAM
Vista 64-bit / IIS7 / SQL Server 2008 Express
Vault 5.0 Beta 1 (I'm hoping the beta isn't going to affect performance numbers drastically)
TortoiseSVN 1.6.1
Linux kernel 2.6.29 (roughly 300MB of source code and more than 26000 files)
Both server and client are on the same computer, so none of these scenarios will really test network performance. Anyway, on with the tests.
Time it took to recursively add all folders and files and just bring up the confirmation dialog:
Vault: ~14s
Subversion: ~11s
Time it took to mark all files to add (create a changeset):
Vault: 0s (it was pretty much instant)
Subversion: ~6m52s
Time it took to commit:
Vault: ~21m10s
Subversion: ~28m50s
Total:
Vault: ~28m16s
Subversion: ~35m53s
As you can see, Vault was slightly slower at a trivial operation like bringing up the add dialog, but once there was some real work to do, it ended up being faster overall than Subversion. A significant part of the Subversion time was devoted to creating .svn folders. After this, it was apparent that Subversion has a pretty major advantage over Vault...working offline. Vault does not appear to keep any offline information. This was confirmed when I went into offline mode in Visual Studio and pretty much all functionality was disabled except for "go online." Basically, when you're working offline in Vault you can no longer diff against the original version, revert changes, or do much of anything. Once you go online it scans your working copy for changes and the "pending changes" list updates. I don't like super long posts, so there will most definitely be a part 3 where I'll continue with some common operations.

Wednesday, April 22, 2009

SourceGear Vault, Part 1

I'll be starting a new job soon, and the company is using SourceGear Vault, so I went ahead and downloaded the latest beta versions off the site (since it's free for single users) to see what it was like. I've hated SourceSafe from the first time I used it. The nail in the coffin was when my data got corrupted and some of my work was lost. At the very least, no source control management system should ever lose any work. Unfortunately, my current workplace is still using SourceSafe and I was unable to convince them otherwise. I've coped with the situation by using Mercurial as a quick local repository, and checking changes back into VSS for major changesets. The choice to go with hg instead of git or bzr was mainly because I'm a Windows developer and hg is currently way ahead of the other two on Windows. Anyways, since Vault is frequently advertised as SourceSafe done right, I was curious as to how it would affect my opinion of how things "should" have worked. I'm a long-term user of Subversion, and most of my experience of actually using SCM is with svn. I used hg at work just to see what all this DVCS hype is about. With that in mind, setting up Vault was certainly more eventful than I expected, since it involved setting up SQL Server and IIS. After all the prerequisites were taken care of (the most time-consuming part), installing Vault was pretty quick. The administration interface is simple and straightforward, and had a repository created by default. Then, the real fun began! Stay tuned for part 2!

Thursday, April 9, 2009

No more cross thread violations!

“Cross thread operation not valid: Control ### accessed from a thread other than the thread it was created on.” Seriously, I don’t think there’s a single person who’s written UI programs for .NET and has not encountered this error. Simply put, there is only one thread that does the drawing on the screen, and if you try to change UI elements from a thread that’s not the UI thread, this exception is thrown. A very common example is a progress bar. Let’s say you’re loading a file which takes a long time to process, so you want to notify the user with a progress bar.
  public event EventHandler<IntEventArgs> ValueProcessed;
  private void StartProcessing() {
    ValueProcessed += ValueProcessedHandler;
    Thread t = new Thread(delegate() {
      for (int i = 0; i < 1000; ++i) {
        // do stuff
        ValueProcessed(this, new IntEventArgs(i));
      }
    });
    t.Start();
  }
  private void ValueProcessedHandler(object sender, IntEventArgs e) {
    _progressBar.Value = e.Value;
  }
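For completeness, a minimal IntEventArgs (my guess at the detail the snippet omits) might look like this:
  public class IntEventArgs : EventArgs {
    public IntEventArgs(int value) { Value = value; }
    public int Value { get; private set; }
  }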
I left some other implementation details out, but you can pretty much infer the rest. Once you try to set Value, it’ll throw a cross-thread exception. A common pattern to solve this problem is this:
  private void ValueProcessedHandler(object sender, IntEventArgs e) {
    if (this.InvokeRequired) {
      this.BeginInvoke(new EventHandler<IntEventArgs>(ValueProcessedHandler), sender, e);
    } else {
      _progressBar.Value = e.Value;
    }
  }
It gets the job done…but frankly I’m too lazy to do that for every single GUI event handler I write. Taking advantage of anonymous delegates, we can write a simple wrapper delegate to do this automatically for us.
static class Program {
  // Assumes this is set to the application's main form at startup (see below).
  public static Control MainForm { get; set; }

  public static EventHandler<T> AutoInvoke<T>(EventHandler<T> handler) where T : EventArgs {
    return delegate(object sender, T e) {
      if (Program.MainForm.InvokeRequired) {
        Program.MainForm.BeginInvoke(handler, sender, e);
      } else {
        handler(sender, e);
      }
    };
  }
}
This assumes that you set Program.MainForm to an instance of a Control, typically, as the name implies, the main form of your application. Now, whenever you assign your event handlers, you can simply do this:
    ValueProcessed += Program.AutoInvoke(ValueProcessedHandler);
Pretty neat! BTW, just a word of warning: if you plan on using the same thing to unsubscribe an event with -=, it's not going to work. Unfortunately, calling the AutoInvoke method twice on the same input delegate returns 2 different wrappers for which .Equals and == return false. To get around this you can use a Dictionary to cache the wrapper for each input delegate.
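Here's a minimal sketch of that caching idea (my addition, not from the original post). It would replace the AutoInvoke body above, keying the cache by the input delegate so that -= removes the exact wrapper instance that += added:
  static readonly Dictionary<Delegate, Delegate> _wrappers = new Dictionary<Delegate, Delegate>();

  public static EventHandler<T> AutoInvoke<T>(EventHandler<T> handler) where T : EventArgs {
    lock (_wrappers) {
      Delegate wrapper;
      if (!_wrappers.TryGetValue(handler, out wrapper)) {
        wrapper = new EventHandler<T>(delegate(object sender, T e) {
          if (Program.MainForm.InvokeRequired) {
            Program.MainForm.BeginInvoke(handler, sender, e);
          } else {
            handler(sender, e);
          }
        });
        // delegate equality compares target + method, so the same handler always hits the same entry
        _wrappers[handler] = wrapper;
      }
      return (EventHandler<T>)wrapper;
    }
  }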

Digsby

it's pretty sweet. you can connect to every IM service known to man, and it even checks all your email, facebook, and twitter too, with nice popup notifications! it only works for windows at the moment, but the guys over there are working hard and pushing out updates very frequently. i'm alpha testing and even so i haven't seen any crashes or any major problems. there have been minor quirks here and there, but that's to be expected when you're alpha testing. check it out at digsby.com. also, you can embed a chat client into your website...like this: