Buddhike's Weblog

{binary mind}


Kissing goodbye

"Flames to dust... lovers to friends.... why do all good things come to an end... come to an end..." I just walked into my office humming one of my favorite Nelly Furtado songs, and yes, one good (or rather, the best) thing in my life is about to end today. Although this is no secret among my best friends, for all the others who have been reading this blog: today I'm retiring from the thinktecture family.

It has been a wonderful ride throughout the past three years or so, and I'm truly privileged to have worked with a fantastic team like this. At thinktecture, for the first time in my professional life, I found things I would like to live with.

However, being a part of this rapidly changing world, I lately found that it's time for me to say goodbye. So I'm leaving this virtual desk with a lot of remarkable memories (and I only have good memories at tt ;)).

I wish all the best to thinktecture and I'm certainly looking forward to seeing them in our future encounters very soon ;).

Last but definitely not least, during the past few years my family and I made really good family friends, and that will remain unchanged for the rest of our lives.

thinktecture Rocks!

 

 

 

PS: About the next stop in my professional career: I really hope that it deserves a dedicated post in my new blog (http://geeksdiary.com) as it's also going to be full of escapades!!!

posted Friday, November 30, 2007 11:21 AM with 0 Comments

Sniffing WCF applications in localhost

WCF comes with a handful of tracing and logging options. We can enable them with a few lines in the config and we are good to go. Furthermore, the SDK comes with a handy tool, svctraceviewer.exe (for wimps ;)).

However, the out-of-the-box trace output gives us access to the data/activities only in the WCF world. For example, at some point we might want to look at the HTTP headers sent/received by the application, or we might want to check that the transport-level frames are written properly in a custom transport. Although this is quite easy to do with a tool like Wireshark or Ethereal, it still requires us to deploy the client and the service applications on two different machines (either virtual or physical).

I'm not done yet. In the end, under the covers, WCF also uses the well-known System.Net APIs at the transport level. Therefore we can just use the System.Net tracing settings to capture the wire-level traffic right from the dev box.

For example, to capture the HTTP traffic we can use the following config settings.

<system.diagnostics>
  <trace autoflush="true"/>
  <sources>
    <source name="System.Net.HttpListener">
      <listeners>
        <add name="FooNetTraceListener"/>
      </listeners>
    </source>
  </sources>
  <sharedListeners>
    <add name="FooNetTraceListener" 
    type="System.Diagnostics.TextWriterTraceListener" 
    initializeData="C:\dev\src\lab\wcf\WireLevelTracing\Logs\FooNetTrace.log" 
    traceOutputOptions="None"/>
  </sharedListeners>
  <switches>
    <add name="System.Net.HttpListener" value="Verbose"/>
  </switches>
</system.diagnostics>

If you want to capture the traffic for a built-in TCP transport or a custom transport using the System.Net.Sockets API, change the trace source (and the matching switch name) to System.Net.Sockets.
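As a sketch of that change, assuming the same shared listener and log path as in the HTTP example above, the Sockets variant would look roughly like this. Only the source name and the switch name differ; everything else is carried over unchanged.

```xml
<system.diagnostics>
  <trace autoflush="true"/>
  <sources>
    <!-- Socket-level tracing instead of System.Net.HttpListener -->
    <source name="System.Net.Sockets">
      <listeners>
        <add name="FooNetTraceListener"/>
      </listeners>
    </source>
  </sources>
  <sharedListeners>
    <add name="FooNetTraceListener"
         type="System.Diagnostics.TextWriterTraceListener"
         initializeData="C:\dev\src\lab\wcf\WireLevelTracing\Logs\FooNetTrace.log"
         traceOutputOptions="None"/>
  </sharedListeners>
  <switches>
    <!-- The switch name has to match the source name -->
    <add name="System.Net.Sockets" value="Verbose"/>
  </switches>
</system.diagnostics>
```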

posted Tuesday, November 20, 2007 12:39 PM with 1 Comments

Releasing streams in message contracts

In a lot of WCF streaming applications it's common to see a message contract like this.


[MessageContract]
public class DownloadFileRequest
{
    [MessageHeader]
    private string filename;

    [MessageHeader]
    private int length;

    [MessageBodyMember]
    private Stream fileStream;

    public DownloadFileRequest()
    {
    }
}

 

This often brings up the question "How can I release the stream when the streaming is completed?". I just noticed a nice attempt using a simple polling mechanism in this article: http://www.codeproject.com/WCF/WCF_FileTransfer_Progress.asp. But there is a much simpler and more elegant way to do this. You simply have to implement the IDisposable interface in your message contract and clean up the stream in the Dispose method.


[MessageContract]
public class DownloadFileRequest : IDisposable
{
    [MessageHeader]
    private string filename;
   
    [MessageHeader]
    private int length;
   
    [MessageBodyMember]
    private Stream fileStream;
   
    public DownloadFileRequest()
    {
    }   
   
    public void Dispose()
    {
        if(fileStream != null)
        {
           fileStream.Dispose();
           fileStream = null;
        }
    } 
}


Once the message has been fully streamed, the dispatcher runtime will call Dispose on all input/output parameter objects used to construct the message.


Into a little bit of internals, as always ;). I looked up where exactly this happens in Reflector.

As far as I can tell, this work is done in the MessageRpc.DisposeParameterList method.

 

private void DisposeParameterList(object[] parameters)
{
    IDisposable disposable = null;
    if (parameters != null)
    {
        foreach (object obj2 in parameters)
        {
            disposable = obj2 as IDisposable;
            if (disposable != null)
            {
                try
                {
                    disposable.Dispose();
                }
                catch (Exception exception)
                {
                    if (DiagnosticUtility.IsFatal(exception))
                    {
                        throw;
                    }
                    this.channelHandler.HandleError(exception);
                }
            }
        }
    }
}


So obviously, if our message contract implements IDisposable, it will be disposed by this function.


Have fun!

posted Thursday, September 06, 2007 8:18 PM with 0 Comments

Some insights on calling sync proxy methods from multiple threads

Update: Scott Seely explained that this is by design.  So I'm changing the title in my post. Also if you want to do this, use async methods as he pointed out.


Last night my friend Michele pointed out to me an interesting thing about the WCF client-side runtime. Basically, she had a service method like the following.

 

public void SendMessage(string message)
{
    Console.WriteLine("SendMessage: {0}", message);
    MessageBox.Show(String.Format("Received message: '{0}'", message));
}

 

(We know, we know, one should not display message boxes from a service method, but here the idea was basically to block the thread that executes the service method interactively.)

 

Furthermore, the service was configured as PerCall/Multiple (instancing/concurrency) and was running on netTcp.

Then we tried to call this service with a single instance of a svcutil-generated proxy from several threads simultaneously (see below).

 

private void button1_Click(object sender, EventArgs e)
{
    MethodInvoker m = new MethodInvoker(CallService);
    m.BeginInvoke(null, null);
}

private void CallService()
{
    proxy.SendMessage(string.Format("Message {0}", ++counter));
}

 

When we ran the client and the service, we expected to see multiple message boxes appearing in the service as we clicked button1 in the client. In theory this should work because, although we use a single TCP session, our service is configured for PerCall and Multiple. So the message pump in the service-side ChannelHandler can use a new thread to pump the next message from the same TCP session and dispatch it to a new service instance object while the previous message is being processed.

 

However, things did not work out the way we wanted. When we clicked the button, the first message box appeared in the service. But subsequent clicks did not produce one until the first message box was closed (i.e. the thread was released). However, once all pending messages had been displayed, everything started to work as expected for the subsequent requests.

 

Let the fun begin!  :)

 

Out of curiosity I took a snapshot of all threads while the client was blocked after the very first few calls. The call stack of one blocked service call revealed quite a lot about what was going on. The stack looked like this (irrelevant frames and parameters are removed for simplicity's sake).

 

[In a sleep, wait, or join]
mscorlib.dll!System.Threading.WaitHandle.WaitOne()
mscorlib.dll!System.Threading.WaitHandle.WaitOne()
System.ServiceModel.dll!System.ServiceModel.TimeoutHelper.WaitOne()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannel.CallOnceManager.SyncWaiter.Wait()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannel.EnsureOpened()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannel.Call()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannel.Call()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannelProxy.InvokeService()
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannelProxy.Invoke()
mscorlib.dll!System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke()

 

Aha! Interesting! On the client side WCF internally uses ServiceChannel.Call to send/receive messages to and from the service (regardless of whether you use a svcutil-generated proxy or a manual ChannelFactory and IClientChannel). So it seems the call is not getting out of ServiceChannel.Call at all. It's blocking at a call to the EnsureOpened method. Then I opened up Reflector hoping to find out what's going on in this method (Lutz, I thank you every time I use it :)).

 

Reflector demystified some of the interesting stuff WCF does on the client side. When we call service methods, we simply call them with just a single line (we hardly ever call Open explicitly). Therefore WCF has to make sure that the underlying channel is open before sending the request. This opening is a one-time action. So internally WCF uses a thing called CallOnceManager to make sure that actions like this are performed only once.

The ServiceChannel.EnsureOpened method calls the CallOnceManager, which is responsible for calling Channel.Open once to open the underlying channel (look at the following code I extracted from Reflector).

 

internal void CallOnce(TimeSpan timeout, ServiceChannel.CallOnceManager cascade)
{
    SyncWaiter item = null;
    bool flag = false;
    if (this.queue != null)
    {
        lock (this.ThisLock)
        {
            if (this.queue != null)
            {
                if (this.isFirst)
                {
                    flag = true;
                    this.isFirst = false;
                }
                else
                {
                    item = new SyncWaiter(this);
                    this.queue.Enqueue(item);
                }
            }
        }
    }
    SignalNextIfNonNull(cascade);
    if (flag)
    {
        bool flag2 = true;
        try
        {
            this.callOnce.Call(this.channel, timeout);
            flag2 = false;
        }
        finally
        {
            if (flag2)
            {
                this.SignalNext();
            }
        }
    }
    else if (item != null)
    {
        item.Wait(timeout);
    }
}

 

CallOnceManager services the simultaneous requests trying to use it for the first time in a FIFO fashion using a queue. This queue is initialized in the constructor. When CallOnce is invoked, it checks whether this queue is available and, if so, whether this call is the first one. If it is NOT the first, a waitable object is created and placed in the queue, and CallOnce then blocks on this newly created waitable object.

 

So out of several simultaneous *first* calls to CallOnce, the one that wins opens the channel and returns to the ServiceChannel.Call frame. After doing some more work it finally uses this newly opened channel to send the message. When the reply is returned, it calls the CallOnceManager.SignalNextIfNonNull method, which in turn calls the CallOnceManager.SignalNext method to signal the next waitable object in the queue. When that object is signaled, the CallOnce call that was waiting on it gets released and the next request is sent to the service.

 

In our case the first request did not return until we closed the message box. So the subsequent service method calls were hanging in the CallOnceManager.CallOnce method, because the first call has to return and call CallOnceManager.SignalNextIfNonNull before one of the waiting calls can be released.

 

IMO, this is too bad. It should essentially check whether Open has already been called and, if it has, just flow through without blocking.

 

Now for the second part of the question. I was wondering why it worked after all the pending calls had ended. I opened up the SignalNext method and the answer was there.

 

internal void SignalNext()
{
    if (this.queue != null)
    {
        IWaiter state = null;
        lock (this.ThisLock)
        {
            if (this.queue != null)
            {
                if (this.queue.Count > 0)
                {
                    state = this.queue.Dequeue();
                }
                else
                {
                    this.queue = null;
                }
            }
        }
        if (state != null)
        {
            IOThreadScheduler.ScheduleCallback(signalWaiter, state);
        }
    }
}

When SignalNext is invoked it dequeues the next waitable object from the queue and signals it. If the queue is empty (i.e. no more items striving to be first), it sets the queue to null. Therefore the next call to CallOnceManager.CallOnce just exits without doing anything because the queue is null.

 

IMO, this is a little bug we have there ;). Dear team, please correct me if I'm missing anything here, or if this is by design please explain to us why.

 

Meanwhile, if someone is trying to get around the problem, I've also found a nice solution. The CallOnceManager is used for automatic opening of channels. But if we call Open explicitly in our code we can bypass it (look at how ServiceChannel.EnsureOpened is called in the ServiceChannel.Call method).

 

if (!this.explicitlyOpened)
{
    this.EnsureDisplayUI();
    this.EnsureOpened(rpc.TimeoutHelper.RemainingTime());
}

 

So we can turn on this explicitlyOpened flag by calling Open *once* in our proxy or the channel before invoking the method.   
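A minimal sketch of that workaround, under stated assumptions: the contract IMessageService, the service class, the address and the port are all made up for illustration, the service is self-hosted in the same process, and Thread.Sleep stands in for the blocking message box. The only line that matters is the explicit Open on the channel before it is shared between threads.

```csharp
using System;
using System.ServiceModel;
using System.Threading;

// Hypothetical contract and service, defined only for this illustration.
[ServiceContract]
public interface IMessageService
{
    [OperationContract]
    void SendMessage(string message);
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class MessageService : IMessageService
{
    public void SendMessage(string message)
    {
        Console.WriteLine("Started:  {0}", message);
        Thread.Sleep(2000); // stands in for the blocking message box
        Console.WriteLine("Finished: {0}", message);
    }
}

public static class ExplicitOpenDemo
{
    // Hosts the service, calls it from three threads through one proxy,
    // and returns the number of calls that completed.
    public static int Run()
    {
        const string address = "net.tcp://localhost:9000/messages"; // assumed free port
        int completed = 0;

        using (ServiceHost host = new ServiceHost(typeof(MessageService)))
        {
            host.AddServiceEndpoint(typeof(IMessageService), new NetTcpBinding(), address);
            host.Open();

            ChannelFactory<IMessageService> factory = new ChannelFactory<IMessageService>(
                new NetTcpBinding(), new EndpointAddress(address));
            IMessageService proxy = factory.CreateChannel();

            // Open explicitly, once, before the proxy is shared between threads.
            ((IClientChannel)proxy).Open();

            Thread[] threads = new Thread[3];
            for (int i = 0; i < threads.Length; i++)
            {
                int n = i + 1; // copy so each lambda captures its own value
                threads[i] = new Thread(() =>
                {
                    proxy.SendMessage("Message " + n);
                    Interlocked.Increment(ref completed);
                });
                threads[i].Start();
            }
            foreach (Thread t in threads) t.Join();

            ((IClientChannel)proxy).Close();
            factory.Close();
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine("Completed calls: {0}", Run());
    }
}
```

If the behaviour described above holds, the "Started" lines should show up close together instead of one per finished call. Commenting out the explicit Open is an easy way to compare the two cases on your own box.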

 

After all this I started to wonder why I had not spent a little more time exploring the client-side runtime in Reflector. There is a lot of fun out there as well!!! :)

posted Saturday, September 01, 2007 7:15 AM with 1 Comments

IO Worker Threads - Performance Uncovered!

Writing my last post about WCF threading internals compelled me to reveal some of the tests I did some time back. So today we are going to see how we can optimize CPU utilization by using IO worker threads in the CLR thread pool.
We are talking about the CPU. Therefore I wrote this simple method for my tests, which does some CPU-intensive operations. Note that I don't do any IO at all, not even a Console.WriteLine(), and this is *very* important to ensure the accuracy of the results.
In my CpuIntensiveMethod I calculate the largest Fibonacci number that is below 100 and I run this calculation int.MaxValue times (I will tell you why this number is preferred).

static void CpuIntensiveMethod(int testNumber)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
                                    
    // Let's perform some math to keep our CPU busy here.
    for (int i = 0; i < iterations; i++)
    {
        // Here we calculate the maximum Fibonacci number less than 100.
        int fn = 0;
        int preFn = 0;

        while (fn <= 100)
        {
            if (fn < 2)
            {
                preFn = fn;
                fn++;
                continue;
            }
            int next = fn + preFn;
            preFn = fn;
            fn = next;
        }
    }

    sw.Stop();
    timings[testNumber, 0] = Thread.CurrentThread.ManagedThreadId;
    timings[testNumber, 1] = sw.ElapsedMilliseconds;
    timings[testNumber, 2] = sw.ElapsedTicks;
}

In addition to the Fibonacci number calculation I also measure the time taken for the computation using a Stopwatch (a utility newly available since .NET 2.0) and record the results in a multi-dimensional array. The reason is, again, that I did not want to do any IO in this method and thus stayed away from even using Console.WriteLine(). BTW: since there is no interactive way to see whether the method has completed, I monitored the CPU usage using ProcessExplorer (or TaskManager) and then printed the values in the multi-dimensional array once the CPU usage spike had settled.

In the first test, I simply ran CpuIntensiveMethod() on a single thread by calling this method from my Main(). And the result was something like this.

Test 0 ran in managed thread id 1: Milliseconds 32942 Ticks 117920861

During the run I could see my CPU was at 100%, which was quite expected. So at this point someone might say that this is the optimal result that I should ever expect. I would definitely agree if I just wanted to use a single processor for my entire life. When I ran this test on my dual core box, the process was consuming only 50% of my total processing power. This is the reason why this model does not fit into the server applications world. In fact, to put this story into a server applications perspective, let's assume that CpuIntensiveMethod is a function in a server application which is invoked by multiple clients. If we had only one thread, we would take ~33 seconds to service one client. And everyone else would not be taken care of until we are done with the current one.

So how can we solve this? We talked about the one-thread-per-client scenario in my last post (i.e. distributing work among multiple threads). Would it work? Sit back, we are about to try it now. To simulate this scenario I did my next test by running CpuIntensiveMethod from several thread pool threads.

But before talking about the test, let me tell you a little bit about some important CLR thread pool internals. The CLR thread pool has worker threads (IO and non-IO) and a special thread called the gate thread. The gate thread is responsible for spinning up new worker threads. The algorithm it uses to determine when to spin up a new thread is based on CPU utilization, GC frequency and worker queue size. Currently (to be more precise, as of the last time I looked at win32threadpool.cpp back in 2006) the gate thread runs this check every 500 ms. So if CpuIntensiveMethod does not take long enough, several calls might get executed on the same thread pool thread and we would not be able to simulate concurrent clients as we originally intended. This is actually why I wanted to run my test int.MaxValue times (so, dear dev fellows, if you are running this code on a beefy box, please forgive me now :)).

Getting back to our second test, I could successfully run my method on 10 thread pool threads concurrently, and here are the numbers that I ended up with.

Test 0 ran in managed thread id 3: Milliseconds 116489 Ticks 416980652
Test 1 ran in managed thread id 4: Milliseconds 141677 Ticks 507140436
Test 2 ran in managed thread id 5: Milliseconds 149974 Ticks 536840900
Test 3 ran in managed thread id 6: Milliseconds 132707 Ticks 475034210
Test 4 ran in managed thread id 7: Milliseconds 156416 Ticks 559901108
Test 5 ran in managed thread id 8: Milliseconds 137661 Ticks 492765340
Test 6 ran in managed thread id 9: Milliseconds 149318 Ticks 534491916
Test 7 ran in managed thread id 10: Milliseconds 132840 Ticks 475507986
Test 8 ran in managed thread id 11: Milliseconds 136866 Ticks 489920458
Test 9 ran in managed thread id 12: Milliseconds 119222 Ticks 426763067

Looks horrible! Doesn't it??? Each thread took more than 110 seconds for the computation. This is more than 3 times the time taken by the first test. But don't panic. In fact I was happy that it produced the results I wanted to see. The reason is simple. In this test I had 10 threads, each striving to take 100% CPU. This means more work for the scheduler. Switching contexts, swapping pages etc. cause the delay we are seeing in the above table. Consequently this shows that having one thread per client does not help us achieve the best CPU utilization.

In my final test I tried to service the same 10 concurrent calls that I had in the previous test, but this time using the thread pool's IOCP as shown below.

for (int i = 0; i < 10; i++)
{
    unsafe
    {
        Overlapped overlapped = new Overlapped(0, 0, IntPtr.Zero, null);
        NativeOverlapped* noverlapped = overlapped.Pack(new IOCompletionCallback
            (IoThreadProc));
        noverlapped->OffsetHigh = i;
        ThreadPool.UnsafeQueueNativeOverlapped(noverlapped);
    }
}

Before I mention my results I must point out some nits in the above code snippet. Creating a NativeOverlapped structure is an expensive operation. Therefore you should try to create only one and reuse it if that fits the bill (wanna be happy? Then listen. This is what the WCF IOThreadScheduler does). However, in my case I wanted to have a way to pass the test number to CpuIntensiveMethod. Therefore I used a simple hack by passing it in the OffsetHigh field of the NativeOverlapped instance. Also you must make sure that you release the memory block held by the NativeOverlapped structure by calling the Overlapped.Free() function. Otherwise you will notice a rapidly growing working set and eventually your program will end up with an out-of-memory exception (but here that's fine as this is a demo and I have only 10 instances of the NativeOverlapped structure). OK, moving to the test results, this is what I got after running test 3.
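The callback itself is not shown in the post, so here is a sketch of what an IoThreadProc along those lines could look like. It is an assumption, not the original test code: the class name, the four-item loop and the bookkeeping arrays are made up, and it is meant for the .NET Framework on Windows, compiled with /unsafe. It reads the test number from OffsetHigh and frees the NativeOverlapped in a finally block.

```csharp
using System;
using System.Threading;

public static class IocpFreeDemo
{
    static readonly int[] seenOnThread = new int[4];
    static int pending;
    static readonly ManualResetEvent done = new ManualResetEvent(false);

    // Runs on an IO worker thread. Reads the test number passed through
    // OffsetHigh, then releases the native memory allocated by Pack().
    static unsafe void IoThreadProc(uint errorCode, uint numBytes, NativeOverlapped* pOverlapped)
    {
        int testNumber = pOverlapped->OffsetHigh;
        try
        {
            // The real test would call CpuIntensiveMethod(testNumber) here.
            seenOnThread[testNumber] = Thread.CurrentThread.ManagedThreadId;
        }
        finally
        {
            Overlapped.Free(pOverlapped);
            if (Interlocked.Decrement(ref pending) == 0)
            {
                done.Set();
            }
        }
    }

    // Queues one packet per slot and returns how many callbacks ran.
    public static unsafe int Run()
    {
        Array.Clear(seenOnThread, 0, seenOnThread.Length);
        pending = seenOnThread.Length;
        done.Reset();

        for (int i = 0; i < seenOnThread.Length; i++)
        {
            Overlapped overlapped = new Overlapped(0, 0, IntPtr.Zero, null);
            NativeOverlapped* p = overlapped.Pack(new IOCompletionCallback(IoThreadProc), null);
            p->OffsetHigh = i; // same hack as above: smuggle the test number
            ThreadPool.UnsafeQueueNativeOverlapped(p);
        }

        done.WaitOne();

        int ran = 0;
        foreach (int threadId in seenOnThread)
        {
            if (threadId != 0) ran++;
        }
        return ran;
    }

    public static void Main()
    {
        Console.WriteLine("Callbacks that ran: {0}", Run());
    }
}
```

Freeing in a finally block keeps the native allocation from leaking even if the work inside the callback throws.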

Test 0 ran in managed thread id 3: Milliseconds 33614 Ticks 120325281
Test 1 ran in managed thread id 4: Milliseconds 35301 Ticks 126363625
Test 2 ran in managed thread id 3: Milliseconds 32928 Ticks 117867839
Test 3 ran in managed thread id 4: Milliseconds 34606 Ticks 123877189
Test 4 ran in managed thread id 3: Milliseconds 33359 Ticks 119411572
Test 5 ran in managed thread id 4: Milliseconds 34528 Ticks 123596836
Test 6 ran in managed thread id 3: Milliseconds 33502 Ticks 119922606
Test 7 ran in managed thread id 4: Milliseconds 34434 Ticks 123260605
Test 8 ran in managed thread id 3: Milliseconds 34454 Ticks 123331465
Test 9 ran in managed thread id 4: Milliseconds 34645 Ticks 124013520

Pretty impressive! Isn't it??? Now each method took roughly 33 seconds, which is the same number we saw during test 1. So we were able to serve 10 concurrent calls while using our optimal processing power. Also notice that each method is executed on a thread with managed thread id either 3 or 4. What does this mean? When the CLR thread pool creates its IOCP, it determines the number of CPUs in the box and passes that value as the NumberOfConcurrentThreads parameter of the CreateIoCompletionPort function. In my case, I ran these tests on a dual core box and thus the IOCP allowed only two threads to be active concurrently. That's why we see only managed thread ids 3 and 4.

Concluding this post, I hope that these tests will help you to understand the optimizations that could be achieved by using IO worker threads.
Wanna try out the tests yourself? Here are the sources.

Have fun!

posted Sunday, August 05, 2007 9:04 PM with 0 Comments

WCF Threading Internals (Updated)

Apologies! Last night I accidentally pressed the publish button without ticking the "publish as a draft" option. So to those who have read this post already, sorry about the incomplete, poorly formatted text :(

Along with its bunch of different features, WCF carries a lot of performance optimizations as well (you think I'm kidding? Then see for yourself: http://msdn2.microsoft.com/en-us/library/bb310550.aspx). As a part of this, a lot of thought has gone into the threading model that WCF uses behind the scenes.

At this point you are probably thinking:

"Well… I know it uses the thread pool API. There is not much more to it. I just know that it performs well." Well, you are correct, but not entirely. I hope you want to go down to the metal, so please read on… :)

WCF uses CLR thread pool threads to do things asynchronously. However, interestingly, it uses the IO worker threads in the thread pool instead of the regular thread pool worker threads (don't fuss if you are not aware of these two kinds of threads). The theory behind IO worker threads reveals a lot about why WCF uses them. Therefore I thought I would dedicate this post to talking a little bit about it.

So before we actually dig in, let me ask you a simple question. When do you consider that you are getting the most out of your CPU? Is it when a single thread is trying to take up 100%, or when multiple threads are trying to take up 100%? I know, you said the single thread 100% case, which is correct. When you have multiple CPU-bound threads, the additional costs of things like context switching slow things down. Wanna see it yourself? Write a small but lengthy loop. Measure the time it takes when you run it on a single thread. Then delegate it to several threads, measure the time that each of them takes to finish it, and compare the values.
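Here is one way that little experiment could be written, as a rough sketch. The workload (Spin), the iteration count and the choice of four threads per processor are arbitrary assumptions for illustration; the absolute numbers will differ from box to box.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class LoopTimingDemo
{
    // Hypothetical CPU-bound workload: a long loop with no IO at all.
    public static long Spin(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += i % 7;
        }
        return sum;
    }

    // Runs Spin on threadCount threads at once and returns the elapsed
    // milliseconds measured inside each thread.
    public static long[] TimeOnThreads(int threadCount, int iterations)
    {
        long[] elapsed = new long[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int slot = t; // copy so each lambda captures its own index
            threads[t] = new Thread(() =>
            {
                Stopwatch sw = Stopwatch.StartNew();
                Spin(iterations);
                sw.Stop();
                elapsed[slot] = sw.ElapsedMilliseconds;
            });
        }
        foreach (Thread th in threads) th.Start();
        foreach (Thread th in threads) th.Join();
        return elapsed;
    }

    public static void Main()
    {
        const int iterations = 200000000; // arbitrary; pick something that takes a few seconds

        Console.WriteLine("1 thread: {0} ms", TimeOnThreads(1, iterations)[0]);

        long[] many = TimeOnThreads(Environment.ProcessorCount * 4, iterations);
        for (int i = 0; i < many.Length; i++)
        {
            Console.WriteLine("Thread {0} of {1}: {2} ms", i, many.Length, many[i]);
        }
    }
}
```

Comparing the single-thread figure with the per-thread figures from the oversubscribed run is the comparison the paragraph above asks for.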

So… technically speaking, we can achieve the best CPU utilization only by sticking to the "One thread per CPU per execution quantum" invariant.

I/O completion ports (IOCP) were introduced to the Windows NT kernel to achieve this goal. Although it's a complex technology, the fundamental theory behind the scenes is fairly straightforward.

Before the invention of IOCP there were two major IO programming techniques: one thread for all IO, and one thread per IO. In the one-thread-for-all-IO model, a thread that was IO bound had to wait doing nothing. Also, all other IO operations were blocked until the one in progress was done. Although this model was fair enough for single-threaded client apps, server applications did not fit in at all. For example, a simple server written in this way would serve only one client at a time. The one-thread-per-IO model, on the other hand, spawned a new thread for each client. But this eventually ended up with too many threads striving for the CPU. So we either had too few threads or too many threads causing the trouble.

In order to solve this problem, an IOCP works like a controller hub between two parties. On one end it receives IO completion packets saying that some work is available. On the other end it has some IO worker threads waiting for work. When an IOCP receives an I/O completion packet, it makes one of the waiting threads active and delegates the available work to it (the thread is picked in LIFO order to avoid potential context switching). So what? How does this model help to solve the problem we discussed earlier? The secret lies in the NumberOfConcurrentThreads parameter value we pass when we create the IOCP using the CreateIoCompletionPort function. This parameter tells the IOCP how many concurrent threads we actually want to have active to process the incoming work. The preferred value for this number equals the number of CPUs you have in the box. So no matter how fast the work is being queued, we only have the desired number of threads concurrently processing it.

The other cool thing about this is that when an active IO worker thread goes into a wait state (maybe it's doing some more IO work), the Windows scheduler tells the pertaining IOCP that one of its active threads is inactive now, so that the IOCP can make another thread active to perform some other work (Smart! Isn't it? ;)). Consequently the number of active threads in an IOCP at a given time can be a little bit higher than the provided NumberOfConcurrentThreads value.

OK. Now we know why IOCPs are so elegant. But how does WCF actually use them? Well... the CLR thread pool has an IOCP associated with it. When the thread pool creates its IOCP for the first time, it also creates IO worker threads which wait for work. So essentially, we can get an IO worker thread to do some work by sending an IO completion packet to this IOCP. To do that we can use the ThreadPool.UnsafeQueueNativeOverlapped function (this method internally invokes the PostQueuedCompletionStatus function). Here is a little program to demonstrate how you could do that.

static void Main(string[] args)
{
  unsafe
  {
    // Create an Overlapped structure and pack it with a pointer to the function
    // that we want to invoke from the IO worker thread.
    Overlapped overlapped = new Overlapped(0, 0, IntPtr.Zero, null);
    NativeOverlapped* pOverlapped = overlapped.UnsafePack(new IOCompletionCallback(OnIoCompletion), null);

    // Send an IO completion packet to the thread pool's IOCP.
    ThreadPool.UnsafeQueueNativeOverlapped(pOverlapped);
  }

  // Keep the process alive until the callback has had a chance to run.
  Console.ReadLine();
}

static unsafe void OnIoCompletion(uint errorCode, uint numBytes, NativeOverlapped* pOverlapped)
{
  Console.WriteLine("This is from an IO worker thread");

  // Release the native memory allocated by UnsafePack.
  Overlapped.Free(pOverlapped);
}

WCF also basically follows the same concept. But it has an elegantly designed queue-based IO thread scheduler. I would like to dedicate a separate post to talking about how exactly it works. But if you are a Reflector fan like me, take a look at the System.ServiceModel.IOThreadScheduler class and you will see it with your own eyes.

So what does all this tell us? WCF uses this IOThreadScheduler to queue work items for IO worker threads. This way it preserves the "One thread per CPU per execution quantum" invariant and achieves the best CPU utilization. It never (maybe I should say I've never seen it, but I have a lot of faith in the WCF team) uses the ThreadPool.QueueUserWorkItem API and thus refrains from using the regular thread pool worker threads (I'm sure now you see why WCF performs a lot better than the ASMX runtime ;)).

OK, are you still not certain that WCF works this way? Cool! I guess you don't have too much faith in me. Well.. then get ready for a little exercise. Create a little service with a single operation. Make this operation do some lengthy CPU-intensive work (perhaps a loop doing some math). And then try to invoke this operation from multiple clients simultaneously (or you can call it from multiple threads in the same client). How many requests can your server service concurrently? Looking forward to hearing your results :)

Cheers

posted Thursday, August 02, 2007 10:49 PM with 0 Comments

SOAP Routing – What has it gotta do with “To” header ???

During the past few days I came across some SOAP routing intermediary implementations (of course running on WCF ;)) and each of them was trying to route the WCF messages by changing the “To” WS-Addressing header in the message. In some cases this even required completely reconstructing the message by reading its body contents. This actually made me think about the basics once again. What actually happens in real routers? For example, does the Windows TDI driver for TCP/IP change the destination IP to your router’s IP after looking at the routing table? No! Instead, the transport transmits the traffic to the appropriate IP as specified in the routing table. Then the router reads the destination IP in the incoming TCP segments and forwards the traffic to the next network hop according to the routing table in the router itself. So essentially, the router is a device that simply forwards messages without tweaking them. Likewise, this should be the theory behind SOAP routers as well (of course SOAP works at a much higher level and you can adapt this approach to meet your custom needs. But I’m talking about routing in general).

So how exactly can you do this in WCF? The answer is hidden in an attribute that you might not pay too much attention to in your everyday WCF adventures ;). When you configure an endpoint you can actually specify the service address as well as the actual listening address, as follows.

[service endpoint]

<endpoint address="http://localhost:8000/service/dummyendpoint.svc"
  listenUri="http://localhost:8000/service/actualendpoint.svc"
  contract=""
  binding=""
  bindingConfiguration="" />

When you start your service, the underlying transport uses the address specified in listenUri to listen for incoming traffic (if it is not specified, the transport listens on the endpoint address by default). The endpoint address, on the other hand, is the one that goes in the WS-Addressing “To” header. This address is validated by the service model layer to make sure that the messages arriving at the endpoint are truly intended for this service (otherwise you’ll get the address filter mismatch error… remember that? ;)). So with these two attributes in our hands we can model the aforementioned routing at the SOAP level as well. You do it by having your router service listen on the endpoint address specified for the service (see below).

[router endpoint]

<endpoint address="http://localhost:8000/service/dummyendpoint.svc"
  contract=""
  binding=""
  bindingConfiguration="" />

Then you can simply forward the messages to the actual service endpoints according to your routing rules. For example, you can identify a message meant for the above service endpoint by looking at the action header of the incoming message, and forward it to the service by creating a channel to the http://localhost:8000/service/actualendpoint.svc endpoint (which is the actual listening address of our service). Also note that this way, no matter how many intermediaries the message passes through, the WS-Addressing “To” header remains unchanged.

When I started writing this post I intended to provide a very rough sample that I created a couple of months ago (in fact I gave up re-inventing the wheel after seeing Shy Cohen’s wonderful lossy router ;)). But then I realized that the SDK routing sample demonstrates this perfectly. So take a look at it to get a better picture.
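
To make the forwarding step a bit more concrete, here is a rough sketch of how a router could send a message on while keeping the “To” header intact. It relies on the ChannelFactory<TChannel>.CreateChannel(EndpointAddress, Uri) overload, where the first argument is the logical address and the second is the "via" (the physical address to connect to). The Forwarder class, the choice of WSHttpBinding without security and the request/reply channel shape are my assumptions for illustration; this is not the SDK sample, and it ignores details such as message properties, security and one-way operations.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

// Hypothetical helper, not part of any sample mentioned in this post.
public class Forwarder
{
    // Logical address: what ends up in the WS-Addressing "To" header.
    private readonly EndpointAddress to =
        new EndpointAddress("http://localhost:8000/service/dummyendpoint.svc");

    // Physical address: where the service really listens (its listenUri).
    private readonly Uri via =
        new Uri("http://localhost:8000/service/actualendpoint.svc");

    private readonly ChannelFactory<IRequestChannel> factory;

    public Forwarder()
    {
        // Assumption: a WS-Addressing capable binding with security switched off.
        factory = new ChannelFactory<IRequestChannel>(
            new WSHttpBinding(SecurityMode.None));
        factory.Open();
    }

    public Message Forward(Message incoming)
    {
        // Connect to the "via" address while addressing the message to "to".
        IRequestChannel channel = factory.CreateChannel(to, via);
        channel.Open();
        try
        {
            return channel.Request(incoming);
        }
        finally
        {
            // A real router would Abort() a faulted channel instead of closing it.
            channel.Close();
        }
    }
}
```

If the client side is driven by configuration instead of code, the clientVia endpoint behavior plays the same role as the second argument here.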

Have fun!

posted Friday, May 25, 2007 6:00 PM with 1 Comments

WCF - POX Streaming

Have you ever thought about returning a plain old XML document, some well-formed HTML body snippet (for some crazy reason ;)) or your RSS feed from a WCF service? Well… I did :). In fact, instead of returning the XML document itself, I wanted to stream it, because the data source I was anticipating was not fast enough to provide the complete document at once (i.e. it takes considerably longer to receive the portions of the document from the source than to actually transmit them).

So my long (well… it's not really long) journey towards a solution started with the contract (oh! Nah! I'm not going to play that famous record once again ;)).

[OperationContract(Action = "*", ReplyAction = "*")]
Message GetWeather();

My contract has only one operation. By default WCF uses the SOAP action headers in incoming/outgoing messages to dispatch them to the right operation on the service/client. But in this case I have nothing SOAPish in my payload. Therefore, by setting Action="*" in my OperationContract I'm telling WCF that anything that comes into the configured endpoint of this service must be dispatched to this method.

Moving on to my operation implementation, I have a single line of code that simply constructs a Message and returns it to the runtime, which takes care of transmitting it back to the client.

public Message GetWeather()
{
   Message msg = Message.CreateMessage(MessageVersion.None, "*", new WeatherReport());
   return msg;
}

Nevertheless, much of my solution lies within this one line. Let me brief you: Message is the fundamental unit of data transfer in WCF (if you have some socket background, think of it as the byte arrays in the world of sockets). You can create a message by calling one of the CreateMessage overloads in the Message class. These overloads support both push and pull mode data transfers. In my case, I'm going for a push mode transfer, and I'm doing it using a BodyWriter. You create one by inheriting from the BodyWriter abstract class and overriding OnWriteBodyContents, which the WCF runtime invokes when it wants to serialize the message body. The runtime hands us an XmlDictionaryWriter, which we can use to push the body contents. So I implemented my body writer in the WeatherReport class and wrote the XML document I wanted to send to the client in its OnWriteBodyContents override.

protected override void OnWriteBodyContents(XmlDictionaryWriter writer)
{
   writer.WriteStartElement("weatherReport");

   Console.WriteLine("Sending weather report for Colombo");
   writer.WriteStartElement("Colombo");
   writer.WriteAttributeString("temp", "26");
   writer.WriteAttributeString("wind", "SW");
   writer.WriteAttributeString("humidity", "79");
   writer.WriteEndElement(); // Close <Colombo> so each city is a complete element
   writer.Flush();           // Push this city's report to the transport
   Thread.Sleep(3000); // Simulate an I/O delay in the data source

   Console.WriteLine("Sending weather report for Munich");
   writer.WriteStartElement("Munich");
   writer.WriteAttributeString("temp", "25");
   writer.WriteAttributeString("wind", "NE");
   writer.WriteAttributeString("humidity", "37");
   writer.WriteEndElement(); // Close <Munich>
   writer.Flush();
   Thread.Sleep(3000);

   Console.WriteLine("Sending weather report for Seattle");
   writer.WriteStartElement("Seattle");
   writer.WriteAttributeString("temp", "15");
   writer.WriteAttributeString("wind", "SW");
   writer.WriteAttributeString("humidity", "80");
   writer.WriteEndElement(); // Close <Seattle>

   writer.WriteEndElement(); // Close <weatherReport>
   writer.Flush();
}
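
For completeness, here is a sketch of what a whole body writer class could look like when the cities come from a data source instead of being hard-coded. The CityReading type and the StreamingWeatherReport class are hypothetical names of mine, not the ones from my sample. The one detail worth pointing out is the BodyWriter constructor argument: passing false says the body is not buffered, i.e. it is produced once, on demand.

```csharp
using System.Collections.Generic;
using System.ServiceModel.Channels;
using System.Xml;

// Hypothetical record holding one city's readings (illustration only).
public class CityReading
{
    public string City;
    public string Temp;
    public string Wind;
    public string Humidity;
}

// Hypothetical data-driven variant of the WeatherReport body writer.
public class StreamingWeatherReport : BodyWriter
{
    private readonly IEnumerable<CityReading> readings;

    // isBuffered = false: the body is written once and cannot be replayed.
    public StreamingWeatherReport(IEnumerable<CityReading> readings)
        : base(false)
    {
        this.readings = readings;
    }

    protected override void OnWriteBodyContents(XmlDictionaryWriter writer)
    {
        writer.WriteStartElement("weatherReport");

        // The enumerable may block between items while the source produces data.
        foreach (CityReading reading in readings)
        {
            writer.WriteStartElement(reading.City);
            writer.WriteAttributeString("temp", reading.Temp);
            writer.WriteAttributeString("wind", reading.Wind);
            writer.WriteAttributeString("humidity", reading.Humidity);
            writer.WriteEndElement();

            // Hand the finished city element over to the transport.
            writer.Flush();
        }

        writer.WriteEndElement();
    }
}
```

An instance of such a class can be passed to Message.CreateMessage(MessageVersion.None, "*", bodyWriter) exactly like the WeatherReport instance in GetWeather.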

You might have already noticed that in the above code I call writer.Flush() several times. I do this whenever I've written enough data for the client to understand (the weather report for one city in this case), so that it is transmitted to the client immediately. However, for the data to really be sent right away, we have to make sure that we are in streaming mode. This has to be specified in our binding. I'm setting up my A (address), B (binding) and C (contract) imperatively in the code as follows.

CustomBinding binding = new CustomBinding();

// Encoder
TextMessageEncodingBindingElement encoder = new TextMessageEncodingBindingElement();
encoder.MessageVersion = MessageVersion.None;
binding.Elements.Add(encoder);

// Transport
HttpTransportBindingElement transport = new HttpTransportBindingElement();
transport.TransferMode = TransferMode.StreamedResponse;
transport.MaxBufferSize = 256;
binding.Elements.Add(transport);

// We will take about 10 minutes for our transmission.
binding.SendTimeout = TimeSpan.FromMinutes(10);

ServiceHost host = new ServiceHost(typeof(MyService));
host.AddServiceEndpoint(typeof(IMyService), binding,
   "http://localhost:8011/myservice");
host.Open();

In this case, my binding contains only the most critical elements we need to host a service: the encoder and the transport. While setting up my encoder I set its MessageVersion property to MessageVersion.None. By doing this I'm telling the encoder that I want to get rid of all the SOAPish stuff in the serialized message (Tip: this is your key if you want to do non-SOAP transfers). In the transport I set the transfer mode to StreamedResponse to stream the responses from my service (when we enable streaming in the HTTP transport, it streams the content using the chunked transfer coding defined in the HTTP spec). Furthermore, I set MaxBufferSize to 256 bytes since we are only sending a very small chunk at a time. This way you can optimize the memory consumption of the read/write buffers used for streaming (the default is 64K). Finally I create the ServiceHost and call its Open method to start the service.

On the client side, I set up my binding in almost the same way as I did in the service. Then I create a channel to communicate with my service endpoint and invoke the GetWeather operation. When the client-side runtime hands me an instance of the Message class, I get an XmlDictionaryReader positioned at the body contents, which I can use to read the underlying XML stream.

Message reply = myservice.GetWeather();
XmlDictionaryReader reader = reply.GetReaderAtBodyContents();

// Each Read() call returns as soon as the next node has arrived.
while (reader.Read())
{
   switch (reader.NodeType)
   {
     case XmlNodeType.Element:
       Console.WriteLine("{0} Temp:{1} Wind:{2} humidity:{3}",
       reader.Name,
       reader.GetAttribute("temp"),
       reader.GetAttribute("wind"),
       reader.GetAttribute("humidity"));
       break;
     case XmlNodeType.Text:
       break;
     case XmlNodeType.EndElement:
       break;
   }
}

reader.Close();

Now, it is important to note that I've also set the MaxBytesPerRead quota to 64 bytes in the ReaderQuotas property of my encoder.

encoder.ReaderQuotas.MaxBytesPerRead = 64;

This value indicates how many bytes the XmlDictionaryReader should read when reading an element start tag and its attributes. Therefore, it must be large enough to hold that information. If you set an unnecessarily large value here, the XmlDictionaryReader.Read() method will not return until it has received enough bytes from the underlying transport (this can be problematic if you receive very small data chunks with a considerable delay between them, as demonstrated in my code). Consequently you would not be able to read the streamed data in a timely fashion (this might even make you think that your data is not being streamed at all ;)).

You can download my sample code here and take a good look at it. Questions, ideas and corrections are welcome!
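
Putting the client-side pieces together, the setup could be sketched roughly as follows. The binding values, the contract and the address mirror what I showed for the service; the WeatherClient class name and the order in which things are closed are just illustrative assumptions, and the reading loop is the one shown earlier.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

// Same contract as on the service side.
[ServiceContract]
public interface IMyService
{
    [OperationContract(Action = "*", ReplyAction = "*")]
    Message GetWeather();
}

// Hypothetical client entry point (illustration only).
public static class WeatherClient
{
    public static void Main()
    {
        // Encoder: no SOAP envelope, and the small per-read quota discussed above.
        TextMessageEncodingBindingElement encoder = new TextMessageEncodingBindingElement();
        encoder.MessageVersion = MessageVersion.None;
        encoder.ReaderQuotas.MaxBytesPerRead = 64;

        // Transport: stream the response with a small buffer.
        HttpTransportBindingElement transport = new HttpTransportBindingElement();
        transport.TransferMode = TransferMode.StreamedResponse;
        transport.MaxBufferSize = 256;

        CustomBinding binding = new CustomBinding(encoder, transport);

        ChannelFactory<IMyService> factory = new ChannelFactory<IMyService>(
            binding, new EndpointAddress("http://localhost:8011/myservice"));
        IMyService myservice = factory.CreateChannel();

        Message reply = myservice.GetWeather();
        // ... read reply.GetReaderAtBodyContents() with the loop shown earlier ...
        reply.Close();

        ((IClientChannel)myservice).Close();
        factory.Close();
    }
}
```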

Cheers,

posted Wednesday, May 23, 2007 2:08 PM with 4 Comments

thinktecture

Yesterday I was organizing my 7000+ photo gallery and suddenly noticed something which none of us has posted about yet. It’s been a while since we made some cool modifications to the thinktecture family ;). Although Ingo and Christian posted about the addition of our friends Neno and Do