In-Depth

Parsing the BSON Beast

BSON is like JSON, only binary, and it's now easier to parse thanks to the built-in media type formatters included with the ASP.NET Web API 2.2 Client Libraries. Here's how.

Binary JSON, or BSON, is a format similar to JSON but, as the name suggests, encoded in binary. Developers like BSON because it's lightweight, with minimal space overhead; it's easy to parse; and it encodes and decodes efficiently. With the release of the Microsoft ASP.NET Web API 2.2 Client Libraries, BSON can be parsed using the built-in media type formatters.

Before demonstrating BSON in use, I'll first discuss the format itself. A BSON object consists of an ordered list of elements. Each element contains a field type, a name and a value.

Field names are strings. Types can be any of the following:

  • string
  • integer (32-bit)
  • integer (64-bit)
  • double (64-bit)
  • date (64-bit integer number of milliseconds since the Unix epoch)
  • byte array
  • boolean
  • null
  • BSON object
  • BSON array
  • Regular expression
  • JavaScript code

Because BSON stores each value in a typed binary form, a parser doesn't have to convert strings into numbers, dates or other types. This speeds up parsing and data retrieval in comparison with JSON or other text-based formats.
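To make the element layout concrete, here's a minimal sketch that hand-encodes the single-field document { "Id": 1 } using only the .NET base class library. The class and method names (BsonElementSketch, EncodeIdDocument) are hypothetical and not part of Web API; in practice the media type formatter does this work for you.

```csharp
using System;
using System.IO;
using System.Text;

public static class BsonElementSketch
{
    // Hand-encodes the document { "Id": <id> } as raw BSON bytes.
    public static byte[] EncodeIdDocument(int id)
    {
        byte[] elements;
        using (var body = new MemoryStream())
        using (var writer = new BinaryWriter(body))
        {
            writer.Write((byte)0x10);                    // field type: 32-bit integer
            writer.Write(Encoding.UTF8.GetBytes("Id"));  // field name as raw UTF-8 bytes...
            writer.Write((byte)0x00);                    // ...terminated by a null byte
            writer.Write(id);                            // field value, little-endian int32
            writer.Flush();
            elements = body.ToArray();
        }

        // Total size = 4-byte length prefix + elements + 1-byte document terminator.
        int total = 4 + elements.Length + 1;

        using (var doc = new MemoryStream())
        using (var writer = new BinaryWriter(doc))
        {
            writer.Write(total);
            writer.Write(elements);
            writer.Write((byte)0x00);
            writer.Flush();
            return doc.ToArray();
        }
    }

    public static void Main()
    {
        // Prints 0D-00-00-00-10-49-64-00-01-00-00-00-00
        Console.WriteLine(BitConverter.ToString(EncodeIdDocument(1)));
    }
}
```

The integer is written as four raw bytes rather than as the text "1", which is why a reader can pick it up without any string-to-number conversion.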

To demonstrate BSON, I'll use a previous article of mine as a starting point, "Implementing Binary JSON in ASP.NET Web API 2.1," which shows how to create a Web API service that renders BSON. Here, I'll modify that project to use more BSON data types and examine the data structure passed to the client application.

In the Visual Studio solution for the Web API service, I modified the Car.cs file to include two DateTime properties, DateServiced and TimeServiced. These will be used later to illustrate the BSON format of DateTime types. The modified file can be seen in Listing 1.

Listing 1: Complete Listing of Car.cs


using System;

namespace CarInventory21.Models
{
  public class Car
  {
    public Int32 Id { get; set; }
    public Int32 Year { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
    public string Color { get; set; }
    public DateTime DateServiced { get; set; }  // Newly added field
    public DateTime TimeServiced { get; set; }  // Newly added field
  }
}

In the CarController.cs file, I modified the instantiation of the Cars object to include the newly added fields, as seen here:

Car[] cars = new Car[]
{
  new Car { Id = 1, Year = 2012, Make = "Chevrolet", Model = "Corvette", Color = "Red",
    DateServiced = Convert.ToDateTime("07/21/2014"), TimeServiced = Convert.ToDateTime("08:32:00") },
  new Car { Id = 2, Year = 2011, Make = "Ford", Model = "Mustang GT", Color = "Silver",
    DateServiced = Convert.ToDateTime("08/16/2014"), TimeServiced = Convert.ToDateTime("08:33:00") },
  new Car { Id = 3, Year = 2008, Make = "Mercedes-Benz", Model = "C300", Color = "Black",
    DateServiced = Convert.ToDateTime("06/30/2014"), TimeServiced = Convert.ToDateTime("08:34:00") }
};
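One thing to keep in mind with the sample data: Convert.ToDateTime parses strings using the current culture, so "07/21/2014" can fail on a machine whose culture expects day/month order. The following is a minimal sketch of a culture-independent alternative; the helper names (ServiceDateSketch, ParseDate, ParseTime) are hypothetical and not part of the original project.

```csharp
using System;
using System.Globalization;

public static class ServiceDateSketch
{
    // Parses a month/day/year string regardless of the machine's culture.
    public static DateTime ParseDate(string text)
    {
        return DateTime.ParseExact(text, "MM/dd/yyyy", CultureInfo.InvariantCulture);
    }

    // Parses a 24-hour time string. Because the string carries no date,
    // the resulting DateTime takes the current date as its date part.
    public static DateTime ParseTime(string text)
    {
        return DateTime.ParseExact(text, "HH:mm:ss", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        DateTime serviced = ParseDate("07/21/2014");
        DateTime time = ParseTime("08:32:00");

        Console.WriteLine(serviced.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2014-07-21
        Console.WriteLine(time.TimeOfDay);                                                // 08:32:00
    }
}
```

The same caveat about time-only strings applies to Convert.ToDateTime("08:32:00") in the controller: the TimeServiced value will carry the date on which the code runs.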

Next, I'll ensure the Web API service is configured to send the BSON format. To do this, I'll make sure the BsonMediaTypeFormatter object is being added to the config object (HttpConfiguration type). The complete listing of the WebApiConfig.cs is shown in Listing 2.

Listing 2: Complete Listing of WebApiConfig.cs
using System.Net.Http.Formatting;
using System.Web.Http;

namespace CarInventory21
{
  public static class WebApiConfig
  {
    public static void Register(HttpConfiguration config)
    {
      // Web API configuration and services

      // Web API routes
      config.MapHttpAttributeRoutes();

      config.Routes.MapHttpRoute(
        name: "DefaultApi",
        routeTemplate: "api/{controller}/{id}",
        defaults: new { id = RouteParameter.Optional }
      );
            
      config.Formatters.Clear();                             // Remove all other formatters
      config.Formatters.Add(new BsonMediaTypeFormatter());   // Enable BSON in the Web service
    }
  }
}
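With the service returning only BSON, a .NET client can read the response using the same formatter type. The following is a minimal sketch, assuming the Microsoft.AspNet.WebApi.Client NuGet package is installed and the service is reachable at a base address you supply; the BsonClientSketch class and the command-line argument are hypothetical, and the Car class simply mirrors Listing 1.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Mirrors the Car model from Listing 1.
public class Car
{
    public int Id { get; set; }
    public int Year { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
    public string Color { get; set; }
    public DateTime DateServiced { get; set; }
    public DateTime TimeServiced { get; set; }
}

public static class BsonClientSketch
{
    public static async Task<Car[]> GetCarsAsync(Uri baseAddress)
    {
        using (var client = new HttpClient { BaseAddress = baseAddress })
        {
            // Ask the service for BSON explicitly.
            client.DefaultRequestHeaders.Accept.Clear();
            client.DefaultRequestHeaders.Accept.Add(
                new MediaTypeWithQualityHeaderValue("application/bson"));

            HttpResponseMessage response = await client.GetAsync("api/car");
            response.EnsureSuccessStatusCode();

            // Deserialize the BSON payload with the built-in formatter.
            var formatters = new MediaTypeFormatter[] { new BsonMediaTypeFormatter() };
            return await response.Content.ReadAsAsync<Car[]>(formatters);
        }
    }

    public static void Main(string[] args)
    {
        // Assumption: the base address (for example, a localhost URL) is passed as args[0].
        Car[] cars = GetCarsAsync(new Uri(args[0])).GetAwaiter().GetResult();
        foreach (Car car in cars)
        {
            Console.WriteLine("{0} {1} {2}", car.Year, car.Make, car.Model);
        }
    }
}
```

Because the service configuration above removes every other formatter, the Accept header isn't strictly required here, but sending it keeps the client correct if other formatters are re-enabled later.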

After all coding changes have been completed, I'll set index.html as the start page. As you may recall, this is done by right-clicking the page in Solution Explorer and selecting Set As Start Page. The completed CarInventory solution is shown in Figure 1.

Figure 1. Solution Explorer View of CarInventory21.sln

Now, when I run the application, index.html will be rendered, as seen in Figure 2.

Figure 2. View of Index.html

After I click the /api/car link, the Web API responds by returning a complete listing of the Car[] object in BSON format. The browser doesn't display the binary response; instead, it prompts me to save or open the file. I'll save the file as car.json and then view it in Visual Studio 2013 so I can examine the binary structure. Looking at the overall structure, you'll see that the records (called documents in the BSON specification) are automatically keyed with a 0-based index, as seen in Figure 3. This is how BSON represents an array: as a document whose field names are the element positions "0", "1", "2" and so on.

Figure 3. BSON Records Containing a 0-Based Index

In addition, each record index is preceded by \x03, indicating it's an embedded document. This can be seen in Table 1, where the field designations used in the BSON specification are outlined. The null byte (\x00) terminates each field name and string value, as well as each document, so it appears throughout the structure.

As mentioned previously, each field within a record carries its type, name and value. The first record, highlighted in Figure 4, shows that the first field in the record is "Id", represented by the hex bytes 49 64. Immediately before those bytes is a byte with the value \x10.

Figure 4. First Record in the Output

Looking at Table 1, the value \x10 designates a 32-bit integer, which is the data type of the Id field.
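The byte-level reading described above can be sketched as a small recursive walker that reports each field's name and type. This is an illustration only, not how the Web API formatter is implemented: the BsonWalkerSketch class is hypothetical, it handles just the type codes needed for the Car data (\x02, \x03, \x04, \x09 and \x10), and the sample bytes are hand-built rather than taken from the saved file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BsonWalkerSketch
{
    // Reads a null-terminated UTF-8 field name (a "cstring" in Table 1).
    private static string ReadCString(BinaryReader reader)
    {
        var bytes = new List<byte>();
        byte b;
        while ((b = reader.ReadByte()) != 0x00) bytes.Add(b);
        return Encoding.UTF8.GetString(bytes.ToArray());
    }

    // Walks one document and appends a description of each element to output.
    public static void Walk(BinaryReader reader, string path, List<string> output)
    {
        reader.ReadInt32(); // total document size in bytes (not needed for walking)
        byte type;
        while ((type = reader.ReadByte()) != 0x00) // \x00 closes the document
        {
            string name = path + ReadCString(reader);
            switch (type)
            {
                case 0x02: // string: int32 length (including terminator), bytes, \x00
                    int length = reader.ReadInt32();
                    string text = Encoding.UTF8.GetString(reader.ReadBytes(length - 1));
                    reader.ReadByte(); // skip the terminating \x00
                    output.Add(name + ": string = " + text);
                    break;
                case 0x03: // embedded document
                case 0x04: // array (a document keyed "0", "1", ...)
                    output.Add(name + (type == 0x03 ? ": document" : ": array"));
                    Walk(reader, name + ".", output);
                    break;
                case 0x09: // UTC datetime: int64 milliseconds since the Unix epoch
                    output.Add(name + ": date = " + reader.ReadInt64() + " ms");
                    break;
                case 0x10: // 32-bit integer
                    output.Add(name + ": int32 = " + reader.ReadInt32());
                    break;
                default:
                    throw new NotSupportedException(
                        "Type 0x" + type.ToString("X2") + " is not handled in this sketch.");
            }
        }
    }

    public static void Main()
    {
        // Hand-built sample: { "0": { "Id": 1 } }
        byte[] sample =
        {
            0x15, 0x00, 0x00, 0x00,                   // outer document: 21 bytes
            0x03, 0x30, 0x00,                         // embedded document named "0"
            0x0D, 0x00, 0x00, 0x00,                   // inner document: 13 bytes
            0x10, 0x49, 0x64, 0x00,                   // int32 field named "Id"
            0x01, 0x00, 0x00, 0x00,                   // value 1
            0x00,                                     // end of inner document
            0x00                                      // end of outer document
        };

        var lines = new List<string>();
        using (var reader = new BinaryReader(new MemoryStream(sample)))
        {
            Walk(reader, "", lines);
        }
        foreach (string line in lines) Console.WriteLine(line);
        // Prints:
        // 0: document
        // 0.Id: int32 = 1
    }
}
```

To try it against the saved response, the sample array could be replaced with File.ReadAllBytes pointed at the car.json file, assuming the payload contains only the field types handled here.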

Table 1: BSON Data Type Designation

Type Value (Hex)                        Type Description
document ::= int32 e_list "\x00"        BSON document; int32 is the total number of bytes in the document
e_list ::= element e_list | ""          Sequence of elements
\x01                                    Floating point
\x02                                    UTF-8 string
\x03                                    Embedded document
\x04                                    Array
\x05                                    Binary data
\x06                                    Deprecated
\x07 (byte*12)                          ObjectId
\x08 \x00                               Boolean "false"
\x08 \x01                               Boolean "true"
\x09 int64                              UTC datetime; milliseconds since the Unix epoch
\x0A                                    Null value
\x0B                                    Regular expression
\x0C                                    Deprecated
\x0D                                    JavaScript code
\x0E                                    Symbol
\x0F                                    JavaScript code w/ scope
\x10                                    32-bit integer
\x11                                    Timestamp
\x12                                    64-bit integer
\xFF                                    Min key
\x7F                                    Max key
e_name ::= cstring                      Key name
string ::= int32 (byte*) "\x00"         String
cstring ::= (byte*) "\x00"              CString
binary ::= int32 subtype (byte*)        Binary
subtype ::= "\x00"                      Binary / Generic
subtype ::= "\x01"                      Function
subtype ::= "\x02"                      Old generic subtype
subtype ::= "\x03"                      UUID
subtype ::= "\x05"                      MD5
subtype ::= "\x80"                      User defined
code_w_s ::= int32 string document      Code w/ scope
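
The \x09 entry in Table 1 is the one that applies to the DateServiced and TimeServiced fields: the value is a 64-bit integer counting milliseconds since the Unix epoch. As a minimal sketch of how such a value maps back to a .NET DateTime (the BsonDateSketch class and its method are hypothetical, and the sample value is computed for illustration rather than read from the saved file):

```csharp
using System;

public static class BsonDateSketch
{
    private static readonly DateTime UnixEpoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Converts a BSON \x09 value (milliseconds since the Unix epoch) to a UTC DateTime.
    public static DateTime FromBsonDate(long millisecondsSinceEpoch)
    {
        return UnixEpoch.AddMilliseconds(millisecondsSinceEpoch);
    }

    public static void Main()
    {
        // 1405900800000 ms after the epoch is midnight UTC on July 21, 2014.
        DateTime serviced = FromBsonDate(1405900800000L);
        Console.WriteLine(serviced.ToString("yyyy-MM-dd HH:mm:ss")); // 2014-07-21 00:00:00
    }
}
```

When reading the value from raw bytes, BinaryReader.ReadInt64 can be used, since it reads the eight bytes in the little-endian order that BSON uses.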
