In-Depth

Investigating Exceptions Causing Web Site Havoc

A stepped approach using traditional tools in Visual Studio and third-party solutions to troubleshoot and debug Web application issues.

Sometimes you have a partially functional application: certain parts work fine, but others don’t behave as expected. Such issues aren’t too hard to troubleshoot in a development environment, but when they occur in production and affect users and clients, things can get tricky, especially when traditional troubleshooting techniques don’t provide enough useful information. This article walks you through one possible way to get to the bottom of such an issue.

In order to set the stage, let me first introduce a simplified version of an online store and the issue facing it. The application doesn’t have all the bells and whistles of a typical real-life eCommerce Web site, but it certainly has all the necessary ingredients to describe a real-life problem. It presents a typical use case where users can choose products, review orders and then check out to finish the transaction. As shown in Figure 1, a dropdown lists all the available products. Users can add items by clicking on the "Add Item to Cart" button. Once a user is done placing items in the cart, she can finish by clicking the Checkout button.

Figure 1. Simple Online eCommerce System

As a user adds more items, the Item Count increments accordingly (see Figure 2).

Figure 2. Items Added in the Cart

A user can view all the items in the cart by clicking the "Show Cart Items" button (see Figure 3).


Figure 3. View Items in the Cart

Problem
Sometimes a user clicks the Checkout button and gets the error message shown in Figure 4. Not all users face this problem; some can check out their items and finish a transaction without encountering any error. This makes the issue trickier to troubleshoot, because the problem isn’t always reproducible, even in the production environment. For the users who do encounter the error, other parts of the application still seem to work normally. For example, if a user adds another item to the cart after getting the error, the Item Count is incremented appropriately, as shown in Figure 5.

Figure 4. Error Message Upon Checkout
Figure 5. Adding Item Still Works, Even After Getting Error at Checkout
Figure 6. Show Cart Items Function Also Working Properly After Getting Error During Checkout

If a user clicks the Show Cart Items button at this point, that functionality also works as expected (see Figure 6).

The bottom line is that the Checkout function is broken under some unknown conditions. This causes a loss of revenue for the company, which is never a good thing for any business. Issues like this need to be resolved as soon as possible, and that puts extra pressure on the developers responsible for maintaining the application.

Analysis
The first logical action to troubleshoot is to look at the application log files, entries in the Event Viewer or any other type of logging that the target application may have been using. In this case, the following error is reported in the application log file:

ERROR 2017-01-25 17:12:35,970 38919ms CheckoutForm  btnCheckout_Click  - System.NullReferenceException: Object reference not set to an instance of an object.
   at OrderSystem.CheckoutForm.CalculateTotalPrice(Int32 customerId, Order order, ShipmentInfo shipmentInfo) in C:\ApplicationThrowingException\ProductOrderSystem\ProductOrderSystem\CheckoutForm.aspx.cs:line 78
   at OrderSystem.CheckoutForm.btnCheckout_Click(Object sender, EventArgs e) in C:\ApplicationThrowingException\ProductOrderSystem\ProductOrderSystem\CheckoutForm.aspx.cs:line 51

This log entry clearly indicates that a null object reference has been encountered. A NullReferenceException isn’t too complicated to understand. According to the MSDN documentation page, this exception is thrown during an attempt to dereference a null object reference. Developers encounter this error so often, and ask so many questions about it, that Stack Overflow has a separate tag for NullReferenceException. There are also many recommendations on how to resolve it (you can see one example on Stack Overflow).

There’s no question that developers can do a better job checking for a valid object before invoking any operation on it, and the framework can also do a much better job reporting exactly which object it’s complaining about. In fact, it's a popular suggestion on the UserVoice site. Teams at Microsoft are listening to this feedback and looking to provide new capabilities (like a new Exception Helper) that can help determine which object is null. Unfortunately, that’s not very helpful in this situation and you have to rely on conventional approaches to determine which object is null. Because logs are pointing to the CalculateTotalPrice method, let’s take a look at the code for it, as shown in Figure 7.

Figure 7. Different Points in CalculateTotalPrice Can Cause NullReferenceException

A quick review shows that it’s full of traps that can cause a NullReferenceException. Here’s a summary of each potential trap, marked with an arrow and a number in the code in Figure 7. Each of these should have a null check before any operation is invoked on the object:

  1. The Products property of the Order instance should be checked for null.
  2. The Price property of each Product instance should be checked for null.
  3. Multiple object instances and properties on this line aren’t checked for null.
  4. The Duty property of the object returned by the ImportDutyCalculator.GetImportDuty method is accessed without a null check.
  5. The Rate property of the object returned by the ShipmentRateCalculator.GetShipmentRates method is accessed without a null check.
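
The checks in this list could be sketched as guard clauses that fail with a message naming the offending object, rather than a bare NullReferenceException. This is only a minimal sketch, not the code from Figure 7: the class shapes, the nullable decimal types, and the PriceGuards helper with its RequireObject and RequireValue methods are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shapes; the real classes behind Figure 7 are not reproduced here.
public class Product { public string Name { get; set; } public decimal? Price { get; set; } }
public class Order { public List<Product> Products { get; set; } }
public class ImportDuty { public decimal? Duty { get; set; } }
public class ShipmentRate { public decimal? Rate { get; set; } }

public static class PriceGuards
{
    // Returns the reference if it is set; otherwise throws an exception naming it.
    public static T RequireObject<T>(T value, string name) where T : class
    {
        if (value == null)
            throw new InvalidOperationException(name + " is null");
        return value;
    }

    // Same idea for nullable value types such as decimal?.
    public static decimal RequireValue(decimal? value, string name)
    {
        if (!value.HasValue)
            throw new InvalidOperationException(name + " is null");
        return value.Value;
    }

    // Illustrative total: sum of product prices plus duty plus shipment rate.
    // The real calculation in CalculateTotalPrice may differ.
    public static decimal CalculateTotal(Order order, ImportDuty duty, ShipmentRate rate)
    {
        decimal total = 0m;
        List<Product> products =
            RequireObject(RequireObject(order, "order").Products, "order.Products");
        foreach (Product product in products)
        {
            RequireObject(product, "product");
            total += RequireValue(product.Price, "product.Price");
        }
        total += RequireValue(RequireObject(duty, "import duty result").Duty, "Duty");
        total += RequireValue(RequireObject(rate, "shipment rate result").Rate, "Rate");
        return total;
    }
}
```

With guards like these, the log entry would name the null object directly (for example, "order.Products is null"), at the cost of a code change and a redeployment.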

As shown in Figure 8, the CalculateTotalPrice method is invoked from the Checkout button-click event handler. The call is wrapped in a try-catch block, and the catch block contains the message that’s observed in the UI. As noted in the problem description earlier, when an error is encountered during a call to CalculateTotalPrice, the checkout process cannot complete, causing revenue loss for the business.

Figure 8. Call to CalculateTotalPrice Is Wrapped in Try-Catch Block

The information in the log file, combined with the code review, suggests a few possible causes of the error, but you still can’t pinpoint the root cause. The traditional approach would be to add more logging to the CalculateTotalPrice method; however, this is far from optimal, especially in enterprise environments where changing code and deploying it to production datacenters isn’t always possible. Let’s look at some other approaches.

IntelliTrace
IntelliTrace is a powerful Visual Studio feature that can be used to record and play back (forward and backward) a sequence of certain types of events happening during the lifetime of an application. (If you’re new to IntelliTrace, check out this series of articles.) IntelliTrace logs can also be collected from a production application using the IntelliTrace Standalone Collector. The page for that tool also provides instructions on how to collect logs for a Web application, which involves executing some Windows PowerShell scripts. Alternatively, you can use IntelliTrace Data Collector to collect these logs. It’s an open source tool that I’ve authored, and it’s available for download on GitHub.

The output of this log-collection process will be an iTrace file that can be opened with Visual Studio 2015 Enterprise Edition. Figure 9 shows the collected iTrace file with a few key areas marked in red rectangles, but what’s important to notice is that it shows a NullReferenceException with the same stack trace that appeared in the application log file.

Figure 9. IntelliTrace Logs in Visual Studio

In order to analyze IntelliTrace logs, it’s important to set up the symbol (PDB) files appropriately. Assuming all of that has been done, clicking the "Debug Newest Exception in Group" button will open the source code and point to the last recorded event, as shown in Figure 10. The highlighted line of code is item No. 3 from the analysis of Figure 7; it’s there that you see quite a few possibilities that can cause the NullReferenceException. The Autos window in Figure 10 shows the same exception.

Figure 10. Log Shows Code Causing NullReferenceException

Let’s step back and analyze the different objects in this line of code that could be null. The line is part of a foreach loop that iterates over the Products in the Order instance. It also accesses a field of the Customer object passed as a method argument. The following list covers all the objects that should have been checked for a null reference. See if you can extract any useful information from the IntelliTrace log and determine the specific object causing all of this trouble:

  1. The Product class has a Detail property, as shown in Figure 11. This property is accessed in that line of code without any null check.
    Figure 11. Product Class's Detail Property Should Have Been Checked for a Null Object
  2. The ProductDetail class has an InventoryInfo property, as shown in Figure 12. This property is accessed in that line of code without a null check.
    Figure 12. ProductDetail.InventoryInfo Should Be Checked for Null Before Accessing It
  3. The InventoryInfo class has an AvailableAtLocations property, as shown in Figure 13. Even though this object is initialized in the constructor, it’s a public property, so there’s nothing stopping anyone from setting it to null in code.
    Figure 13. AvailableAtLocations Should Have Been Checked for a Null Object
  4. The Customer class has an Address property, as shown in Figure 14. This property is accessed in that line of code without a null check.
    Figure 14. Address Property Should Have Been Checked for a Null Object
  5. The Address class has a CityName property, as shown in Figure 15. This property is of string type, but because it’s a public property, it shouldn’t be accessed without a null check.
    Figure 15. CityName Should Have Been Checked for a Null Object
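
To make the five candidates concrete, the following sketch walks the same property chain and reports the first link that is null. It assumes minimal stand-ins for the classes in Figures 11 through 15: only the property names come from the article, while the property types, the NullChainInspector helper, and the IsAvailableInCustomerCity method (an invented placeholder for whatever the real line computes) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal versions of the classes shown in Figures 11 through 15.
public class InventoryInfo
{
    // Initialized by default, but the public setter still allows null.
    public List<string> AvailableAtLocations { get; set; } = new List<string>();
}
public class ProductDetail { public InventoryInfo InventoryInfo { get; set; } }
public class Product { public ProductDetail Detail { get; set; } }
public class Address { public string CityName { get; set; } }
public class Customer { public Address Address { get; set; } }

public static class NullChainInspector
{
    // Returns the name of the first null link, or null if every link is set.
    public static string FirstNullLink(Product product, Customer customer)
    {
        if (product == null) return "product";
        if (product.Detail == null) return "product.Detail";
        if (product.Detail.InventoryInfo == null) return "product.Detail.InventoryInfo";
        if (product.Detail.InventoryInfo.AvailableAtLocations == null)
            return "product.Detail.InventoryInfo.AvailableAtLocations";
        if (customer == null) return "customer";
        if (customer.Address == null) return "customer.Address";
        if (customer.Address.CityName == null) return "customer.Address.CityName";
        return null;
    }

    // Null-conditional operators (C# 6 and later) keep the access safe:
    // the result is false whenever any link in either chain is null.
    public static bool IsAvailableInCustomerCity(Product product, Customer customer)
    {
        string city = customer?.Address?.CityName;
        List<string> locations = product?.Detail?.InventoryInfo?.AvailableAtLocations;
        return city != null && locations != null && locations.Contains(city);
    }
}
```

A helper like FirstNullLink is useful mainly while diagnosing; in the production code path, the null-conditional form avoids the exception but also hides which object was null, so it is worth logging that case explicitly.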
