Sunday, May 30, 2010

Application hangs caused by calling System.Diagnostics.PerformanceCounter

Summary

Recently we had several IIS application hangs caused by blocked threads.  The threads were blocked in calls to System.Diagnostics.PerformanceCounter.NextValue.  What we found was that under certain circumstances calls into this routine can block a thread for up to 21 minutes.  The circumstances all involve failures to read the registry due to contention or IO failures.

The root cause of our problem (why are certain performance registry keys unavailable for read/update) is still unknown.  So as a temporary measure we removed calls to this method from our IIS application.  

Analyzing Thread Blocks

We were really fortunate to have people with the technical skills in house that were comfortable using tools like IISDiag and WinDbg.  This saved us a call to Microsoft Technical support. 

We were able to capture some process dumps using IISDiag.  We got lucky here because we were doing some work on a server when the problem occurred.  This allowed us to manually dump the W3WP processes. 

Once we had the processes dumped we could use WinDbg to analyze the threads.  We found the following pattern with a number of our threads as highlighted by the stack dump below.

00000000037a7768 0000000077d70616 ntdll!NtDelayExecution+0xa
00000000037a7770 000006427f877b4d kernel32!SleepEx+0xaf
00000000037a7810 000006427f540cc1 mscorwks!EESleepEx+0x2d
00000000037a7890 000006427fa2f399 mscorwks!Thread::UserSleep+0x71
00000000037a78f0 00000642751b67a0 mscorwks!ThreadNative::Sleep+0xf9
00000000037a7aa0 000006427f50dfb0 System_ni!System.Diagnostics.PerformanceMonitor.GetData(System.String)+0x1b28f0
00000000037a7af0 000006427f4360c2 mscorwks!ExceptionTracker::CallHandler+0x158
00000000037a7bf0 000006427f48e18a mscorwks!ExceptionTracker::CallCatchHandler+0x9e
00000000037a7c80 0000000077ee461d mscorwks!ProcessCLRException+0x25e
00000000037a7d20 0000000077ee650c ntdll!RtlpExecuteHandlerForUnwind+0xd
00000000037a7d50 000006427f550eea ntdll!RtlUnwindEx+0x238
00000000037a83d0 000006427f48e139 mscorwks!ClrUnwindEx+0x36
00000000037a88e0 0000000077ee459d mscorwks!ProcessCLRException+0x20d
00000000037a8980 0000000077ee60a7 ntdll!RtlpExecuteHandlerForException+0xd
00000000037a89b0 0000000077ef31ed ntdll!RtlDispatchException+0x1b4
00000000037a9060 0000000077d4dd50 ntdll!KiUserExceptionDispatch+0x2d
00000000037a9600 000006427f44e6e3 kernel32!RaiseException+0x5c
00000000037a96d0 000006427fa42e50 mscorwks!RaiseTheExceptionInternalOnly+0x2ff
00000000037a97c0 0000064278d7e0a6 mscorwks!JIT_Throw+0x130
00000000037a9970 0000064278cec5d0 mscorlib_ni!Microsoft.Win32.RegistryKey.Win32Error(Int32, System.String)+0x3a14a6
00000000037a99c0 00000642782f511d mscorlib_ni!Microsoft.Win32.RegistryKey.InternalGetValue(System.String, System.Object, Boolean, Boolean)+0x9f0130
00000000037a9a60 0000064275003f46 mscorlib_ni!Microsoft.Win32.RegistryKey.GetValue(System.String)+0x2d
00000000037a9ab0 0000064275003d25 System_ni!System.Diagnostics.PerformanceMonitor.GetData(System.String)+0x96




To the untrained eye this looks like a bunch of garbage (at least it did to me until I took the time to learn about WinDbg and stack dumps).  What it is saying is that the thread has called the Win32 function Sleep from a try/catch exception handler in the System.Diagnostics.PerformanceMonitor.GetData routine.  The exception originally occurred in the Microsoft.Win32.RegistryKey.InternalGetValue routine. 



Putting the Thread to Sleep



So now we could see that System.Diagnostics.PerformanceMonitor.GetData was putting the thread to sleep.  With a little help from Red-Gate Reflector (this is one of those tools you must have in your bag as a .Net developer) I could see what was happening in that routine.

Below is a snippet from that routine as decompiled by Red-Gate Reflector.




    catch (IOException exception)
    {
        error = Marshal.GetHRForException(exception);
        switch (error)
        {
            case 6:
            case 0x6ba:
            case 0x6be:
                this.Init();
                break;

            case 0x15:
            case 0xa7:
            case 170:
            case 0x102:
                break;

            default:
                throw SharedUtils.CreateSafeWin32Exception(error);
        }
        num--;
        if (millisecondsTimeout == 0)
        {
            millisecondsTimeout = 10;
        }
        else
        {
            Thread.Sleep(millisecondsTimeout);
            millisecondsTimeout *= 2;
        }
        continue;
    }




So if one of the following Win32 Errors is the root cause for the IOException then the thread will go to sleep for some number of milliseconds.

Error   Hex     Description
6       0x6     The handle is invalid.
1726    0x6be   The remote procedure call failed.
1722    0x6ba   The RPC server is unavailable.
21      0x15    The device is not ready.
167     0xa7    Unable to lock a region of a file.
170     0xaa    The requested resource is in use.
258     0x102   The wait operation timed out.


The thread is being put to sleep because the operation will be tried again (and again and again).  That is because the routine wraps the registry read in a “while” loop.




    int num = 0x11;
    int millisecondsTimeout = 0;
    int error = 0;
    new RegistryPermission(PermissionState.Unrestricted).Assert();
    while (num > 0)




So what we discovered is that this routine will try up to 17 times (num starts at 0x11) before it finally gives up and throws an exception back up the call stack.



Well, 17 tries is not so bad, but what is bad is the bit of code that doubles the sleep timeout every time the while loop is executed.  The first sleep lasts only 10 milliseconds, not bad.  But if you execute all 17 iterations the doubling adds up, and by my count the thread will block for a little over 21 minutes.  Very bad.
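
The doubling is easy to check with a few lines of C#.  The sketch below is not framework code; the class and method names are made up for illustration, and it simply adds up the sleep times instead of sleeping.  It assumes every attempt fails with one of the retryable errors.  It counts two ways: following the decompiled snippet literally, where the first failure only sets the timeout to 10 milliseconds without sleeping, and counting a sleep on all 17 attempts.

```csharp
using System;

static class BackoffEstimate
{
    // Sums the time that would be spent in Thread.Sleep if every attempt fails.
    // sleepOnFirstFailure = false follows the decompiled snippet literally:
    // the first failure only sets the timeout to 10 ms and does not sleep.
    // sleepOnFirstFailure = true counts a sleep on every attempt.
    public static long TotalSleepMs(int attempts, bool sleepOnFirstFailure)
    {
        long total = 0;
        long timeout = sleepOnFirstFailure ? 10 : 0;
        for (int num = attempts; num > 0; num--)
        {
            if (timeout == 0)
            {
                timeout = 10;      // first failure: prime the timeout, no sleep
            }
            else
            {
                total += timeout;  // stands in for Thread.Sleep(timeout)
                timeout *= 2;      // double the wait for the next attempt
            }
        }
        return total;
    }
}
```

BackoffEstimate.TotalSleepMs(0x11, false) returns 655350 milliseconds, or roughly 10.9 minutes, and BackoffEstimate.TotalSleepMs(0x11, true) returns 1310710 milliseconds, or roughly 21.8 minutes.  The second number is the 21 minute figure used in this post; read literally, the snippet above sleeps 16 times rather than 17, which would put the worst case closer to 11 minutes.  Either way it is far too long to block a request thread.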



Final Word



Ultimately we will find some sort of issue with our server that causes the registry reads or RPC calls to fail… so it will be on us.  But whoever on the .Net Framework BCL team coded this routine to loop 17 times and wait up to 21 minutes needs to be publicly flogged with hurtful words (which is what I am attempting to do in a pathetic way with this post).  Hopefully this gets corrected in future versions.




Saturday, February 6, 2010

Unhandled Exceptions that can Crash your IIS Application Pool

Recently my team discovered some nastiness with unhandled exceptions inside our custom SharePoint code.  Specifically we found that unhandled exceptions inside of SharePoint.Publishing.LongRunningOperationJob can result in an IIS Application Pool crash.

The reason is that this class puts the delegate code onto a separate thread that, when aborted, can leave the Application Pool in an unstable state.  That can (and does) result in an Application Pool recycle, which is bad for very large SharePoint sites that take a few minutes to spin up.

So you need to make sure that your delegate code is wrapped in try/catch, and do NOT rethrow the error from inside your catch (that is the same as an unhandled exception).
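
As a rough illustration of that advice, here is a minimal sketch of a wrapper that runs a delegate and never lets an ordinary exception escape.  Nothing here is SharePoint API; SafeDelegate, Wrap and the log callback are hypothetical names, and the logging is whatever your application already uses.

```csharp
using System;

static class SafeDelegate
{
    // Returns a wrapper that runs 'work' and swallows any exception it throws
    // after handing it to 'log'.  Hypothetical helper, not part of SharePoint.
    public static Action Wrap(Action work, Action<Exception> log)
    {
        return () =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // Record the failure but do NOT rethrow; rethrowing here
                // is the same as leaving the exception unhandled.
                try
                {
                    log(ex);
                }
                catch
                {
                    // A failing logger must not escape either.
                }
            }
        };
    }
}
```

You would hand the wrapped delegate to the long running job instead of the raw one.  Keep in mind that on the .Net Framework a ThreadAbortException is re-raised by the runtime at the end of a catch block, so a wrapper like this only covers ordinary exceptions thrown by your own code.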


Sunday, January 24, 2010

Exciting year to be in IT

I’ve been in IT for a little while now and I have to say that this is the most excited I’ve been since 2000.  Why?  Well, because of all the great stuff Microsoft plans to ship this year.

Here is a short list of what has me so excited.

Silverlight 4

I know I’m late to the Silverlight party since a lot of people felt like version 3 was a good product for developers.  Well, I’m late on purpose.  I remember taking a look at Silverlight 1 and 2 and thinking, hmmmm I wonder where this will go.  Nothing there for me to go back to business and say we have to take a hard look at this now.  With Silverlight 3 I finally started to see some real potential for the business, but I wanted to see if the adoption rate would be good enough.

Now with Silverlight 4 getting ready to ship I finally feel comfortable standing up and saying let’s take a hard look at Silverlight for business application development.

.Net 4

I consider this to really be the 3rd major release of the .Net Framework stack.  I guess I’m most excited about the new parallel features that are coming with this version of the framework.  But, what is more important is the fact that the framework continues to grow and get better. 

MVC 2

I believe MVC has shown that it is here to stay.  The latest improvements in MVC 2 have really addressed some of the rough edges that were in the MVC 1 release. 

I see some debates raging about MVC vs. WebForms.  Frankly I think the debates are a little silly as each technology has its niche. It reminds me of the old VB vs. C++ debates for doing Windows Forms development.  Although MVC web development is nowhere near as complicated as building Windows Forms applications in C++ :D.

Just noticed that @scottgu published an article about this very subject as I am finishing up this blog posting. Isn’t it ironic. :D

Visual Studio 2010

Visual Studio 2010, all I can say is wow.  Some people will think I’m full of it because on the surface it does not look like Visual Studio 2010 has a lot of improvements.  I agree that a lot of the improvements are in specific areas (ex. SharePoint development), but the new Extension Manager model should not be overlooked.

I’ve seen some work coming out of the SharePoint camps that are taking advantage of the new Extension Manager.  One great example is the work being done by Waldek Mastykarz.

The other thing that really has me excited is the new enhancements inside of Visual Studio 2010 Team System.  Last week I watched a Channel 9 video about the new Test Lab Manager.  The more I learn about these “little” enhancements the more I can envision software development teams increasing productivity and quality.  Good stuff if you are a manager of a software development team.

SharePoint 2010

I sort of saved the best for last in this case.  While SharePoint 2010 will not be taking advantage of a lot of the new technology from Microsoft (MVC 2, .Net 4) there are some new features coming that make development a much better experience.

The new SharePoint Tools for SharePoint 2010 are great.  While there is still room for improvement these show that Microsoft got the message about development experience with SharePoint.

I’m also really excited about the new client object model.  This makes connecting to SharePoint data from AJAX, Javascript and Silverlight a palatable experience. 

I am also really excited about the Developer Dashboard technology.  I got to see this very early on and I almost made a mess in my pants.  The reason is because I had just finished up going through a painstaking process of “find the bottleneck” with SharePoint. 

Finally I’m pumped about the new Services architecture.  This is the only major architecture change I can see in SharePoint 2010 (perhaps I am being myopic).  This is a good thing as I think the upgrade from 2003 to 2007 was a lot to chew on.  Anyway the new Services architecture shows a lot of promise for building new extensions to SharePoint.  As soon as I saw the new model I thought of 2 new services that could add value to anyone running Publishing sites. 

Conclusion

The team at Microsoft is getting ready to ship a lot of products this year.  Big hats off to everyone involved. 

Sunday, January 17, 2010

Tapping into the hidden SharePoint API using Reflection

Summary

The SharePoint API provides a very rich experience for software developers as almost everything that can be done with the SharePoint user interfaces can be done through the API.  One could argue that the SharePoint API has contributed to SharePoint’s overall success.

Anyone that has used Reflector to view the inner workings of the SharePoint API knows that there is a lot of really interesting functionality that is locked away as non-public and not to be used by Joe Developer when customizing SharePoint.  Typically this is done by using the C# internal keyword.  Even though I’m sure Microsoft had our best interest at heart when they locked away some of the SharePoint API, sometimes you find something inside of it that you really want/need to use.

Well when you are in that situation (which should be very rarely) .Net Reflection becomes your friend and allows you to send messages to these classes even though they are supposed to be protected from your fingers.

Example: Field.SetFieldBoolValue

Okay, so a few months back I was trying to do some cleanup on some SharePoint Publishing sites.  One of the things I was cleaning was duplicate fields on the Pages list (how we got in this situation is subject for a different blog post).  Well the field I needed to delete was marked as Hidden so it could not be deleted.  Also since the Field definition marked the field as hidden I could NOT set the Hidden value to false. 

So I was looking at the SPField.Hidden property inside of Reflector and discovered that it uses something called SetFieldBoolValue to actually set the Hidden property.


A quick scan of that method revealed that it is an internal method of the SPField class so I could not access it directly (again this is a good thing since Microsoft wants to protect me from myself).


Even though I respect that someone at Microsoft does not want me to use this method I really needed to toggle the Hidden property so I could delete that field.

So I used a little Reflection kung-fu to give me access to SetFieldBoolValue.

    // Requires: using System.Reflection;
    Type type = field.GetType();
    // SetFieldBoolValue is internal, so look it up as a non-public instance method.
    MethodInfo mi = type.GetMethod("SetFieldBoolValue", BindingFlags.NonPublic | BindingFlags.Instance);
    mi.Invoke(field, new object[] { "Hidden", false });
    mi.Invoke(field, new object[] { "CanToggleHidden", true });
    field.Update();




So if you are new to .Net Reflection take a look at this introduction.  What makes this work is the BindingFlags on the GetMethod routine.  So now I can use the MethodInfo to send a message that tells the field object to call the SetFieldBoolValue method.  Great Success!
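
If you want to try the same trick without a SharePoint farm handy, here is a small self-contained sketch.  Widget and NonPublicInvoker are hypothetical stand-ins I made up for illustration (Widget plays the part of SPField); the only real API calls are Type.GetMethod with BindingFlags and MethodInfo.Invoke.  It also checks for a null MethodInfo, since a non-public method can be renamed or removed in any update.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a library class with an internal setter.
class Widget
{
    private bool hidden = true;

    public bool Hidden
    {
        get { return hidden; }
    }

    // Internal: callers outside the declaring assembly cannot call this directly.
    internal void SetBoolValue(string name, bool value)
    {
        if (name == "Hidden")
        {
            hidden = value;
        }
    }
}

static class NonPublicInvoker
{
    // Looks up a non-public instance method by name and invokes it.
    // Returns false when no such method exists on the target's type.
    public static bool TryInvoke(object target, string methodName, params object[] args)
    {
        MethodInfo mi = target.GetType().GetMethod(
            methodName, BindingFlags.NonPublic | BindingFlags.Instance);
        if (mi == null)
        {
            return false;
        }
        mi.Invoke(target, args);
        return true;
    }
}
```

Calling NonPublicInvoker.TryInvoke(widget, "SetBoolValue", "Hidden", false) on a new Widget returns true and flips widget.Hidden to false, while asking for a method name that does not exist returns false instead of throwing a NullReferenceException.  In this sketch everything lives in one assembly, so the internal method could be called directly; the reflection lookup only becomes necessary when the class lives in someone else's assembly, as SPField does.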



Conclusion



By using .Net Reflection you can gain access to places in the SharePoint API that are marked as Internal and not to be accessed.  Typically you do NOT want to do this, but sometimes you may need to crack open the engine.  One of the best examples I have seen of this was done recently by a developer on our team.  He used reflection to gain access to the underlying SharePoint list collection to see if a list item had a field value.  This allowed us to really cut down on the number of exceptions thrown by our application.



Tuesday, January 5, 2010

Internet Explorer Discussion Toolbar and SharePoint Publishing Sites.

Summary

The Internet Explorer Discussion Toolbar will probe your web site to see if it is using SharePoint (or FrontPage Server Extensions). If it finds that you are using SharePoint then it will enable the toolbar's discussion feature, which will most likely result in an Access Denied or some other error message to the user.

While this is a very minor thing you may want to consider blocking access to the URLs Internet Explorer Discussion Toolbar uses to determine if a site is using SharePoint. This can easily be done by using an ISAPI Filter and blocking traffic to /_vti_bin/owssvr.dll and /MSOffice/cltreq.asp.

Background

Recently my team launched some public facing SharePoint Publishing Sites and discovered a small issue with the Internet Explorer Discussion Toolbar. When we would browse our guest (anonymous) access URL we would be prompted for a login. We were only seeing it from certain test clients using Internet Explorer. By installing Fiddler on one of the test clients we could quickly see traffic going to /_vti_bin/owssvr.dll, which would return an HTTP 401 message indicating that the client was not authorized.

Below is some sample traffic I collected using Fiddler and http://sharepoint.microsoft.com. As you can see the 11th request (3rd line below) is a call to /_vti_bin/owssvr.dll.

Fiddler Traffic Capture

With a little trial and error we were able to quickly figure out that this toolbar was generating those requests to /_vti_bin/owssvr.dll. I’m not an Internet Explorer Discussion Toolbar expert, but it appears to send that request every time a request is made to the server.

If it receives a 200 then it enables discussions for the page. Below is a screenshot from http://sharepoint.microsoft.com. The Discussion Toolbar is enabled and ready to go.

discussion toolbar enabled

Just because the toolbar is enabled does not mean people will be able to attach comments to your web pages. I tried this and discovered that the toolbar will fail with an Access Denied error since it is trying to write to the SharePoint site collection.

How we stopped it.

During the testing I discovered that if the request to /_vti_bin/owssvr.dll fails then the toolbar will display a message stating that discussions are not allowed for this page. Below is a screenshot of the discussion toolbar disabled.

discussion toolbar disabled

To stop this activity we used ISAPI_Rewrite to deny all requests going to /_vti_bin/owssvr.dll and /MSOffice/cltreq.asp through the IIS sites that support browsing (we have a separate IIS site for content editing). We did NOT want to block traffic to /_vti_bin/owssvr.dll through our editing site because we were concerned it would break some of the Office Client integration features.
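
The ISAPI_Rewrite rules themselves are not shown here. Purely to illustrate the matching logic such a deny rule expresses, here is a sketch in C#. The class and method names are hypothetical, and it assumes you want to match the two URLs regardless of case, query string, or any site path in front of them.

```csharp
using System;

static class DiscussionToolbarFilter
{
    // The two probe URLs named in this post.
    private static readonly string[] BlockedPaths =
    {
        "/_vti_bin/owssvr.dll",
        "/MSOffice/cltreq.asp"
    };

    // Returns true when the request path ends with one of the probe URLs.
    // The query string is ignored and the comparison is case-insensitive.
    public static bool ShouldBlock(string rawUrl)
    {
        if (string.IsNullOrEmpty(rawUrl))
        {
            return false;
        }

        int queryStart = rawUrl.IndexOf('?');
        string path = queryStart >= 0 ? rawUrl.Substring(0, queryStart) : rawUrl;

        foreach (string blocked in BlockedPaths)
        {
            if (path.EndsWith(blocked, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

A check like this would only be applied on the browsing sites. On the editing site the requests need to pass through untouched so the Office Client integration keeps working.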

Sunday, June 21, 2009

SharePoint Saturday Charlotte

This past Saturday I had the privilege of presenting at the SharePoint Saturday Charlotte event alongside some top talent in the SharePoint community.  It was great to finally meet some of the people I follow on Twitter (too many to name).  Also, big kudos to Dan Lewis @danlewisnet, Brian Gough and all of the #SPSCLT Volunteers, you guys/girls rock!!!

As promised, here is the slide deck from my talk about Performance Testing with SharePoint.  I was hoping for a little bigger turnout, but when I saw I was in the same time slot as Becky Isserman (@MossLover) and Laura Rogers (@WonderLaura) I knew I would be lucky to get 7 people. :)

I really enjoyed the sessions I got to attend. 

The day started off with a GREAT presentation from Phil Wicklund. Phil had lots of pragmatic advice for managing your SharePoint investment. 

Next, I learned a LOT about how MS has implemented Windows Azure from Rick Taylor (@slkrck).  Rick is a GREAT speaker and had lots of war stories (which I LOVE hearing). 

Then, I got to hear Mike Watson (@mikewat) talk about SharePoint hosting architectures.  He has some really good insight into what it takes to make SharePoint purr. Also, I got to hear that our hosting architecture for SharePoint is in line with what he thinks is the RIGHT way to do it.

Next, I listened in on Dan Usher (@usher) talking about Taxonomies.  I felt much better about my Internet facing solution that will have about 30 Web Applications in one Farm after hearing what he is doing.

Finally, I listened to Dan Attis (@jdattis) talk about a solution he recently wrapped up that used SharePoint lists to store data for a Web Interface that was not SharePoint.  It sounds like a really cool solution the folks at B&R built.  Also, I learned about Object Initializers in .Net 3.5, really cool stuff.

Definitely will not be my last SharePoint Saturday event.

Thursday, June 4, 2009

Big Thanks to Office / SharePoint Teams for TAP Airlift

I had the privilege of attending the Office 14 TAP Air Lift this week in Seattle.  This is my second time coming out to a SharePoint Air Lift and I must say they never disappoint.  While I cannot share any information from the Air Lift I can say it is exciting times to be working with SharePoint.

I really want to extend a big thank you to Microsoft for hosting a great event.  The folks in Office / SharePoint development teams have some big deadlines in front of them.  For them to take time out of their busy schedules to spend some one on one time with customers says a LOT.