Caution required with BizTalk’s SB-Messaging adapter

BizTalk 2013 ships with a native adapter for the Windows Azure Service Bus – the SB-Messaging adapter.

The adapter provides a very easy-to-use approach to receiving and sending messages to and from a Service Bus queue, topic or subscription but, as it turns out, it may be a little simplistic or, at the very least, requires some caution, as I learnt with a customer yesterday.

The customer uses BizTalk to read a message from a Service Bus queue using the adapter. In the pipeline they ‘debatch’ the incoming message into multiple messages that get published to BizTalk’s message box.

This all worked just fine during testing, but when the customer moved to performance testing they hit a point after which they got a lot of errors such as the following:

The adapter "SB-Messaging" raised an error message. Details "System.ObjectDisposedException: Cannot access a disposed object.

Object name: ‘Microsoft.ServiceBus.Messaging.Sbmp.RedirectBindingElement+RedirectContainerChannelFactory`1[System.ServiceModel.Channels.IRequestSessionChannel]‘.

The adapter "SB-Messaging" raised an error message. Details "Microsoft.ServiceBus.Messaging.MessageLockLostException: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue.

When reading from the Service Bus the client can employ one of two strategies: a destructive read, whereby the message is removed from the queue as soon as it is read, or a ‘peek-lock’ strategy, where the read message is locked and effectively hidden on the queue for a duration, allowing the client to process the message and then come back and either confirm that it has successfully processed the message or abandon it, effectively putting it back on the queue.
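The two strategies can be sketched with the Service Bus client SDK. This is a minimal illustration only, not the adapter's actual implementation; the connection string and the queue name `samplequeue` are hypothetical:

```csharp
using System;
using Microsoft.ServiceBus.Messaging;

class ReceiveModesSketch
{
    static void Main()
    {
        // hypothetical connection string and queue name
        string connectionString = "Endpoint=sb://...";

        // Destructive read: the message is deleted from the queue
        // the moment it is received
        QueueClient destructiveClient = QueueClient.CreateFromConnectionString(
            connectionString, "samplequeue", ReceiveMode.ReceiveAndDelete);

        // Peek-lock (the default): the message is locked and hidden
        // for the lock duration while the client processes it
        QueueClient peekLockClient = QueueClient.CreateFromConnectionString(
            connectionString, "samplequeue", ReceiveMode.PeekLock);

        BrokeredMessage message = peekLockClient.Receive();
        try
        {
            // ...process the message...
            message.Complete();   // confirm processing; removes the message
        }
        catch (Exception)
        {
            message.Abandon();    // release the lock; the message becomes
                                  // visible on the queue again
            throw;
        }
    }
}
```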

The peek-lock strategy is very useful for ensuring no message loss, and BizTalk’s SB-Messaging adapter makes good use of it: until the received message is persisted in the message box it is not removed from the queue, and hence zero message loss can be guaranteed.

On the flip side, it means that a message may be delivered more than once and this has to be catered for.
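Catering for duplicate delivery usually means making the processing idempotent. A very simple, purely illustrative sketch (the `DuplicateFilter` class is hypothetical) is to track processed message ids and skip any message seen before; a real implementation would use durable storage and expire old entries:

```csharp
using System;
using System.Collections.Generic;

// Illustrative de-duplication guard: an in-memory set of seen message ids.
class DuplicateFilter
{
    private readonly HashSet<string> seenMessageIds = new HashSet<string>();

    // Returns true if the message should be processed,
    // false if it has been seen before (a redelivery)
    public bool TryAccept(string messageId)
    {
        return seenMessageIds.Add(messageId);
    }
}

class Program
{
    static void Main()
    {
        var filter = new DuplicateFilter();
        Console.WriteLine(filter.TryAccept("msg-1")); // True  - first delivery
        Console.WriteLine(filter.TryAccept("msg-1")); // False - redelivery, skip
    }
}
```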

In this customer’s case, the debatching they performed meant that, under load, it took BizTalk too long to complete the debatching and persist all the messages to the message box. That meant it did not go back to the queue in time to complete the message before the lock expired, resulting in the errors above. Unfortunately, this also caused a snowball effect: another BizTalk instance picked up the same message and the issue repeated itself, with the large number of messages being persisted to the message box increasing the load on the server, and with it the duration of the receive pipeline processing.

Once the issue was identified they found a quick workaround, at least in the first instance: remove the debatching from the Service Bus receive port and move it to a later stage in their BizTalk processing. This, I think, is a good approach, as it lets them ‘take ownership’ of the message in BizTalk, removing it from the Service Bus, before starting the heavy processing.

Another approach is to change the lock duration property on the queue/subscription in question; the default is 60 seconds. This could buy time and, with careful load testing, a reasonable value may be found, but of course it does not eliminate the risk, it just pushes it out further.
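Changing the lock duration can be done through the Service Bus management API; a minimal sketch, again assuming a hypothetical connection string and queue name:

```csharp
using System;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class LockDurationSketch
{
    static void Main()
    {
        // hypothetical connection string
        NamespaceManager namespaceManager =
            NamespaceManager.CreateFromConnectionString("Endpoint=sb://...");

        // fetch the current queue definition, raise the lock duration
        // and push the change back (the Service Bus maximum is 5 minutes)
        QueueDescription queue = namespaceManager.GetQueue("samplequeue");
        queue.LockDuration = TimeSpan.FromMinutes(2);
        namespaceManager.UpdateQueue(queue);
    }
}
```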

Asking in the virtual corridors of Solidsoft, it appears we have seen this issue with at least two other customers. I think that the SB-Messaging adapter should support both strategies, allowing the customer the choice, and should perhaps flag the need to set the lock duration value on the queue/subscription as required.

 

Cross posted on the Solidsoft blog


Guest OS support lifecycle on Windows Azure

In the latest Windows Azure newsletter that landed in my mailbox, Microsoft included the following paragraph:

Guest OS family 4 availability and Guest OS family 1 retirement
Effective June 2, 2014, Windows Azure will stop supporting guest operating system (Guest OS) family 1 for new and existing Cloud Services deployments. Developers are advised to move to the latest/supported Guest OS families before this deadline to avoid potential disruption of their cloud services.

Although based on a sample of exactly one, this provides an answer to a question I get asked regularly and never had a concrete answer for: how long will Azure support an operating system for?

Given that, so far, Azure had supported every operating system it had ever supported, there was no clear indication of when Microsoft would start to retire Guest OSes from Azure. Until now.

My expectation was always that Guest OSes would be supported on Azure for as long as their on-premises equivalents were in mainstream support, but it looks like this is not the case, as Windows Server 2008 SP2’s mainstream support runs until January 13th, 2015, a year away.

Four years is a long time in computing these days, and the industry is moving at an ever faster pace, so I can see why Microsoft would want to keep the platform moving forward so that they can offer the latest and greatest. I have, for a while now, encouraged customers to consider Windows Server 2008 R2 the minimum supported OS to avoid this issue, but it would have been good to align with the OSes’ mainstream support cycle.

Another answer was given to the question of how much notice Microsoft will provide when they retire an OS: around four months.

Cross-posted on the Solidsoft Blog


Role-based authorisation with Windows Azure Active Directory

Using Windows Azure Active Directory (WaaD) to provide authentication for web applications hosted anywhere is dead simple. Indeed, in Visual Studio 2013 it is a couple of steps in the project creation wizard.

This provides a means of authentication but not authorisation, so what does it take to support authorisation using the [Authorize(Roles="xxx")] approach?

Conceptually, what’s needed is a means to convert information from claims into role information about the logged-on user.

WaaD supports creating groups and assigning users to them, which is what’s needed to drive role-based authorisation. The problem is that, somewhat surprisingly perhaps, group membership is not reflected in the claim set delivered to the application as part of the WS-Federation authentication.

Fortunately, getting around this is very straightforward. Microsoft actually published a good example here, but I found it a little confusing and wanted to simplify the steps somewhat to explain the process more clearly; hopefully I’ve succeeded.

The process involves extracting the claims principal from the request and, from the provided claims, finding the WaaD tenant.

With that, and with prior knowledge of the clientId and key for the tenant (exchanged out of band and kept securely, of course), the WaaD Graph API can be used to query the group membership of the user.

Finally, the groups can be used to add role claims to the claim set, which WIF will automatically populate as roles, allowing the program to use IsInRole and the [Authorize] attribute as it would normally.

So, how is all of this done?

The key is to add a ClaimsAuthenticationManager, which will get invoked when an authentication response is detected, and in it perform the steps described.

A slightly simplified (as opposed to better!) version of the sample code is as follows:

public override ClaimsPrincipal Authenticate(string resourceName, ClaimsPrincipal incomingPrincipal)
{
    // only act if a principal exists and is authenticated
    if (incomingPrincipal != null && incomingPrincipal.Identity.IsAuthenticated)
    {
        // get the Windows Azure Active Directory tenantId
        string tenantId = incomingPrincipal.FindFirst("http://schemas.microsoft.com/identity/claims/tenantid").Value;

        // Use the DirectoryDataServiceAuthorizationHelper graph helper API
        // to get a token to access the Windows Azure AD Graph
        string clientId = ConfigurationManager.AppSettings["ida:ClientID"];
        string password = ConfigurationManager.AppSettings["ida:Password"];

        // get a JWT authorisation token for the application from the directory
        AADJWTToken token = DirectoryDataServiceAuthorizationHelper.GetAuthorizationToken(tenantId, clientId, password);

        // initialise a graphService instance using the JWT token acquired in the previous step
        DirectoryDataService graphService = new DirectoryDataService(tenantId, token);

        // get the user's ObjectId
        string currentUserObjectId = incomingPrincipal.FindFirst("http://schemas.microsoft.com/identity/claims/objectidentifier").Value;

        // get the User object by querying the Windows Azure AD Graph
        User currentUser = graphService.directoryObjects.OfType<User>().Where(it => (it.objectId == currentUserObjectId)).SingleOrDefault();

        // load the memberOf property of the current user
        graphService.LoadProperty(currentUser, "memberOf");

        // read the values of the memberOf property
        List<Group> currentRoles = currentUser.memberOf.OfType<Group>().ToList();

        // take each group the user is a member of and add it as a role claim
        foreach (Group role in currentRoles)
        {
            ((ClaimsIdentity)incomingPrincipal.Identity).AddClaim(new Claim(ClaimTypes.Role, role.displayName, ClaimValueTypes.String, "SampleApplication"));
        }
    }
    return base.Authenticate(resourceName, incomingPrincipal);
}

You can follow the comments to pick up the actions in the code. In broad terms: the identity and the tenant id are extracted from the token; the clientId and key are read from web.config (Visual Studio 2013 puts them there automatically, which is very handy!); an authorisation token is retrieved to support calls to the Graph API; and the graph service is then used to query the user and its group membership from WaaD before converting, in this case, all groups to role claims.

To use the Graph API I used the Graph API helper source code as pointed out here. In Visual Studio 2013 I updated the references to Microsoft.Data.Services.Client and Microsoft.Data.OData to 5.6.0.0.

Finally, to plug my ClaimsAuthenticationManager into the WIF pipeline, I added this bit of configuration:

  <system.identityModel>
    <identityConfiguration>
      <claimsAuthenticationManager
        type="WebApplication5.GraphClaimsAuthenticationManager,WebApplication5" />
    </identityConfiguration>
  </system.identityModel>

With this done, the ClaimsAuthenticationManager kicks in after the authentication and injects the role claims; WIF’s default behaviour then does its magic, and in my controller I can use, for example:

        [Authorize(Roles="Readers")]
        public ActionResult About()
        {
            ViewBag.Message = "Your application description page.";

            return View();
        }

Cross posted on the Solidsoft blog

Windows Azure Private Network behaviour change

I’ve learnt today that IP routing on Windows Azure when a private network (VPN) is configured has changed recently (I’m not quite sure exactly when, but in the last few weeks I suspect) in a way that can be quite dramatic for many.

Previously, as soon as a site-to-site VPN was configured on a virtual network on Windows Azure, all outbound traffic from the network got routed through the VPN.

This surprised me at the time. I assumed that, as the range of IP addresses exposed via the VPN is known, only traffic directed at this range would get routed via the VPN and all other traffic would go directly to the internet. This assumption was proven wrong, as we learnt at the time.

This, I was told by several people, is more secure given that VMs on Azure only have one NIC. It also provided organisations an opportunity to more tightly control traffic, as all traffic got routed through the organisation’s firewall. For example, organisations who needed to present a consistent public IP when calling remote systems could control that via their on-premises network configuration, as I previously blogged about here, a post I have now had to correct, see below.

The main downside of this was always that not all traffic was sensitive, and routing everything through the VPN and the on-premises network added latency to the requests and load on the network. For example, a virtual machine on a private network with a VPN, calling other Azure services such as SQL Database, the Service Bus or the Caching Service, would see all its requests routed to the internal network before going back out to the internet and, potentially, to the same data centre the request originated from.

In a conversation today I learnt (and have since confirmed, of course!) that this behaviour has changed and that Azure now behaves as I originally expected it to: only outbound traffic directed at the IP range exposed via the VPN is routed via the VPN, and all other traffic goes straight out through the internet.

This makes perfect sense, but is quite a big change and I’m surprised this wasn’t communicated clearly.

There are downsides. Some customers, as I alluded to earlier, enjoyed the extra level of control that routing all traffic via their network provided, be it firewall configuration or control over the IP they presented to external services. This is not possible at the moment, but I’m hoping that in the not too distant future Windows Azure will offer the choice to customers.

Cross posted on the Solidsoft blog

Decisions

In an event on Windows Azure we ran earlier in the year, my colleague Charles Young discussed some practical points to consider when designing solutions to run on Windows Azure, one of which was “be cost aware, not cost fanatical”, or “don’t distort your design into cost-driven design”.

I thought this point was very true, and the clarity and visibility that the cloud brings with regards to cost, which rarely exists for on-premises solutions, does risk muddying the water somewhat. Pragmatic teams will weigh the additional costs associated with better design against the long-term benefits, but I can see how this can go wrong for some.

Recently I spent a day with a customer in an exercise that highlighted how real this is.

The short story is that ‘the business’ needed a little data-driven web application built, and so ‘the business’ contacted a reputable digital marketing agency to build it for them.

‘The business’ was in no position to suggest to the agency where the web site would be hosted, and so the agency went about their business as they saw fit, not necessarily considering set-up and operational costs.

‘The business’ then went to IT and asked them to arrange for the solution to be hosted. IT went to their default hosting solution, which quoted them around £9,000 per month in hosting costs.

Thankfully, somebody at this point thought to look into hosting this in Azure, and during my recent visit we calculated that this would cost around £650 per month to run on Windows Azure, a fair bit cheaper than the default position, enough to allow for quite a few ‘eventualities’ even beyond my initial conversation.

What I found really interesting is that the web site, being data-driven, needed a facility to allow the business to import new data regularly, and the agency, not being aware of where and how the solution would be hosted, made the very reasonable decision to implement that import logic using SQL Server Integration Services (SSIS).

Unfortunately, on Windows Azure, this seemingly small decision meant having to add a SQL Server virtual machine to a mix which so far included a couple of web roles and SQL Databases, and that VM alone, at around £350 per month, accounts for around half of the entire running cost of the project.

Had this been known at the outset, would the design decision have been different? Hard to tell; so many different personas were involved: the business, who drive the requirements but are also responsible for the bills; the agency, who came up with the design; and IT, who are responsible for the day-to-day running and operations. But I can certainly see that weighing up the options could easily suggest that spending a bit more time developing the import logic as part of the web site, rather than using SSIS, could save a fair bit in running costs in the long run.

So, no, you don’t have to design with cost in mind, and you don’t even have to design with the cloud in mind, but you will miss some benefits if you ignore your hosting platform.

Cross posted on the Solidsoft Blog
