Please form a queue, for poison.

In the previous post about using MSMQ to facilitate one-way calls I covered some of the basics of setting up an MSMQ binding. In that scenario, if a consumer of the endpoint sent a malformed message, or the message became corrupted in transmission, the service would either discard it without any feedback on the problem or, far worse, attempt to process it again and again and get stuck in a loop. A message that cannot be processed in this fashion is referred to as a poison message. Such a processing loop prevents the service from continuing normal operation and handling subsequent messages. MSMQ 3.0 under Windows XP and Windows Server 2003 can only retry a failed message once before it either drops the message or faults the channel. The wise choice is therefore to use MSMQ 4.0 under Windows Server 2008, Vista or Windows 7, which offers more suitable alternatives.

One alternative I would like to discuss here is using a poison queue. The basic concept is this: the primary (MSMQ-based) service, which has an “important function” (unlike our unimportant notification service in the previous MSMQ post), has an associated poison queue. Messages are moved into this queue once they are determined to be poison. These messages can then be dealt with by a separate service or simply monitored by a human.

A poison queue is set up by exposing a second endpoint with an identical address plus a ;poison suffix (and a different bindingConfiguration; here the poison endpoint simply uses the binding defaults):

<services>
   <service name = "MyService">
      <endpoint
         address  = "net.msmq://localhost/private/ImportantQueue" 
         binding  = "netMsmqBinding"
         bindingConfiguration = "importantMsgHandling"
         contract = "IImportantService" 
      />
      <endpoint
         address  = "net.msmq://localhost/private/ImportantQueue;poison"
         binding  = "netMsmqBinding"
         contract = "IImportantService"
      />
   </service>
</services>

<bindings>
   <netMsmqBinding>
      <binding name = "importantMsgHandling"
         maxRetryCycles = "2"
         receiveRetryCount = "3"
         receiveErrorHandling = "Move"
         retryCycleDelay = "00:00:10">
      </binding>
   </netMsmqBinding>
</bindings>

I hit a few hiccups in this run-through, so I’ve added a troubleshooting section at the bottom of the post.

The other key settings to note in this configuration (illustrated below) are:

  • receiveErrorHandling – one of Fault, Drop, Reject or Move. We chose Move, meaning that once the retry process is exhausted the message is moved to our poison queue.
  • receiveRetryCount – number of immediate retries after the first failed attempt to process the message.
  • maxRetryCycles – number of retry cycles; each cycle repeats the batch of immediate attempts.
  • retryCycleDelay – time to wait before each retry cycle starts.
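
As a rough sketch of how these four settings combine, here is the same binding again with the retry timeline spelled out in a comment. The arithmetic assumes MSMQ 4.0 and is my reading of the documented rule that a message is tried at most (receiveRetryCount + 1) * (maxRetryCycles + 1) times, so treat the numbers as illustrative rather than measured:

```xml
<bindings>
   <netMsmqBinding>
      <!--
         Retry timeline for one poison message with the values below
         (assumption: MSMQ 4.0 retry semantics):
           first batch : 1 initial attempt + 3 immediate retries = 4 attempts
           wait 10 s   : message is parked in the retry sub-queue
           cycle 1     : 4 more attempts
           wait 10 s
           cycle 2     : 4 more attempts
           then        : all 12 attempts have failed, so the message
                         is moved to the poison sub-queue
      -->
      <binding name = "importantMsgHandling"
         receiveRetryCount    = "3"
         maxRetryCycles       = "2"
         retryCycleDelay      = "00:00:10"
         receiveErrorHandling = "Move">
      </binding>
   </netMsmqBinding>
</bindings>
```

With these values a poison message holds up the queue for roughly 20 seconds of delay plus processing time before it is moved aside, which is worth keeping in mind when tuning the counts.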

Poison Queue Message Processing

Once our poison message has failed to be processed, it is shifted to the “poison” sub-queue, as shown in this Computer Management screenshot:

Computer Management - Poison Queue

At this point we can do a few things, from the simplest option of having an admin user review the messages in this poison queue, to having a separate service attempt to process these poison messages with some smarter logic. Juval Lowy has (a while ago now) published an MSDN Magazine article on an even more sophisticated error handling approach for dealing with poison messages, dubbed a “Response Service”. In essence a (potentially) disconnected client is given the ability to receive feedback via a separate queue. Message responses are generated based on the original flawed message. I plan to implement this approach if at some point the medical demo application I started a few weeks ago warrants it.
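
For the separate-service option, a minimal configuration sketch might look like the following. The service name MyPoisonHandler is hypothetical (it is not part of the sample above); the assumption is that it is a second class implementing IImportantService whose only endpoint listens on the poison sub-queue:

```xml
<services>
   <!-- Primary service: reads the main queue and moves poison messages aside -->
   <service name = "MyService">
      <endpoint
         address  = "net.msmq://localhost/private/ImportantQueue"
         binding  = "netMsmqBinding"
         bindingConfiguration = "importantMsgHandling"
         contract = "IImportantService"
      />
   </service>

   <!-- Hypothetical handler service: only listens on the poison sub-queue -->
   <service name = "MyPoisonHandler">
      <endpoint
         address  = "net.msmq://localhost/private/ImportantQueue;poison"
         binding  = "netMsmqBinding"
         contract = "IImportantService"
      />
   </service>
</services>
```

Because the handler shares the contract, it receives the failed messages through the same operations as the primary service and can log them, alert someone, or attempt a more forgiving form of processing.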

Troubleshooting:
On a side note, under Windows 7 and Vista the user you are logged in as will not, by default, have permission to listen for messages on a given HTTP port. You’ll quickly see an AddressAccessDeniedException:
HTTP could not register URL http://+:8000/. Your process does not have access rights to this namespace (see http://go.microsoft.com/fwlink/?LinkId=70353 for details)

AddressAccessDeniedException

To resolve this, simply grant yourself permission to access the port by calling netsh.exe from an administrator command prompt:

netsh http add urlacl url=http://+:8000/ user="DOMAIN\User Name"

Note: “DOMAIN” can be the machine name if you’re not on a domain. Add any other ports you might be using too, e.g. 8005 for the ServiceModelEx logbook service. For more info refer to this blog post, which has a detailed explanation; its comments also discuss issues on other systems such as Windows Server 2003.