What is Pivot – from Microsoft Live Labs?

Pivot a research product currently available from GetPivot.com from Microsoft Live Labs. It uses technology from Seadragon – Deep Zoom technology also of MS Live Labs.


A quote from the Pivot team:

“We tried to step back and design an interaction model that accommodates the complexity and scale of information rather than the traditional structure of the Web.”

Which translates into the ability to aggregate and interpolate large data sets.

If I’m losing you, this is a must see 5 minute TED 2010 video presented by Gary Flake.

Pivot User Interface

Pivot itself is an advanced type of web browser that facilitates this, but…

It is many things:

  • “The ability to slice and dice data.”
  • “Allows for the whole to be greater than the sum of the parts.”
  • “Lets you move from the tiniest detail, to the full scope of the entire set of the information.”
  • “Build a connection to the data, and be immersed.”
  • “Making large sets of data more approachable.”
  • “Empowering people to create new types of collections, and new types of experience from the interaction.”

That last point struck a chord with me, and I will attempt to create a simple collection. A collection in this case is a set of items with common attributes, ranging from hundreds to thousands to millions of items.

Collection Types

Pivot Collection Types

Pivot Collection Types

Steps to building a collection:

  1. Select your content; determine your linking level (diagram above).
  2. Expose your data in a way you can build the collection. OData service is an option.
  3. Build the cXML (Collection XML file).
  4. Navigate to the CXML file.
  5. Or navigate to the hosted version of the data set. (See Pivot API for .NET)

Pivot Navigation Bar

Resources and sources of quotes/images:

  • Gary Flake and team Pivot walk through on YouTube.
  • John Bristowe‘s post about using Pivot on the StackOverflow API
  • John Bristowe’s video on Pivot operations on the StackOverflow data.
  • Pivot API for .NET
  • Collection types examples diagram.
    • Advertisements

Working with the OData URI Conventions

Following on from my resource output format options post in my OData series. I thought I would briefly cover the topic of URI Conventions I had previously eluded to.

URI Conventions for OData allow for a simple standard way to control the resource that is returned. The documentation for these conventions can be found here on the OData.org site.

The first important choice being the result format. In my previous post I discussed the options of AtomPub and JSON, the way you go about making your format choice for the returned resources is via the $format option. If your choice is JSON then simply suffix query option $format=JSON, for example on the NetFlix OData service:


As a quick interjection here for another convention; $callback is a handy feature to specify the name of a JavaScript function that will be called when the data returns, it’s even easy to use for an anonymous callback method by using “?” so $callback=?. I’ll go more into detail about this when covering my implementation in a future post.

In my experience with a few services the AtomPub choice is the default and the $format tag is not required, and in some cases (NetFlix in particular) $format=atom is not a valid input query. I’ll need to investigate this more, it seems like an oversight on at least the NetFlix service, I don’t see an issue with being more verbose even for a default value.

I also briefly mentioned deferred content and the use of $expand to force eager-loading of the result tree structure. This currently doesn’t seem to work on the NetFlix service, so this is yet another thing that will require some more investigation. It’s quite possible that the feature is disabled to prevent a type of attack (DoS) or just to prevent general abuse and waste of bandwidth/processing.

A simple convention is the $orderby option. It’s use is simple too and allows for chaining of various ascending and descending choices simply separated by commas.


and combined:

http://odata.netflix.com/Catalog/Titles?$orderby=ReleaseYear,Runtime desc,Name asc

I believe the browser will replace the spaces with the ‘%20‘ representation. If you make a mistake in the query (i.e. supplying an invalid order field) the error will look like:

No property ‘Title’ exists in type ‘System.Nullable`1

The last three I want to cover are; $filter, $top and $skip. They are related as they simply filter the returned resource. Skip and Top are quite simple and operate as you would expect (just like in LINQ).

Filter on the other hand is more advanced; it has all the logic operator options (==, !=, >=, and so on), all the arithmetic operators (add, sub, div, mult, mod) and precedence grouping (brackets). For the exact syntax and all the options see the specification.

I’ll just give a simple example that can be clicked through:

http://odata.netflix.com/Catalog/Titles?$filter=Runtime gt 3 and Runtime lt 90

I did say last, but as a bonus there’s a nice data reduction option to return only the selected called $select. Simply suffix an &$select=Name,Runtime set of params on any of the above queries and see the returned resource simplified to only the “data you want”.

For the lazy to modify here’s a click-able example:

http://odata.netflix.com/Catalog/Titles?$filter=Runtime gt 3 and Runtime lt 90&$select=Runtime,Name

NOTE: OData queries are case sensitive.

OData, AtomPub and JSON

Continuing my mini-series of looking into OData I thought I would cover off the basic structure of AtomPub and JSON. They are both formats that OData can deliver the requested resources (a collection of entities; e.g. products or customers).

For the most part there isn’t much difference in terms of data volume returned by AtomPub vs JSON, tho AtomPub being XML, is slightly more verbose (tags and closing tags) and referencing namespaces via xmlns. A plus for AtomPub for your OData service is ability to define the datatype as you’ll see below via m:type the example being an integer Edm.Int32. Whereas the lack of such features is a plus in a different way for JSON – it’s simpler, and a language such as JavaScript interprets the values of basic types (string, int, bool, array, etc).

I’m not attempting to promote one over the other, just saying that each can serve a purpose. If you’re after posts that discuss this is a more critical fashion, have a look at this post by Joe Gregorio.

What I do aim to show is that comparing the two side by side there’s only a slight difference, and based on what you’re intending to accomplish with processing said data the choice for format is up to you. If you’re just re-purposing some data on a web interface JSON would be a suitable choice. If you’re processing the data within another service first, making use of XDocument (C#.NET) would seem suitable.

There’s also a concept of ‘Deferred Content’ for both formats and it is achieved in a similar way through links. The objective being to conserve resources in processing and transmission by not transmitting the entire element tree on a request. In the comparisons below where there is a link to another URI that is content that has not been returned, the most obvious example is image data i.e. links to jpeg resrouces. OData has a URI command option called $expand that can force the inline return of the element data (this concept is called eager-loading). Have a look at my introductory post about the OData query options.

NOTE: In the examples that follow the returned result data is from the NetFlix OData service, I have stripped out some of the xmlns, and shortened/modified the urls in particular omitting http:// just so it fits better (less line wrapping).

So let us compare…

Yes that stuff that’s makes up web feeds.

Example from the NetFlix OData feed access via URL http://odata.netflix.com/Catalog/Titles

Atom Feed

<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<feed allThatOtherNameSpaceStuff="">
  <title type="text">Titles</title>
  <entry m:etag="abced">
    <title type="text">FullMovieTitle</title>
    <summary type="html">Your everyday regular movie</summary>
    <allTheOtherTags type="text">...</allTheOtherTags>
    <m:properties xmlns:m="severalNameSpaces">
      <d:Synopsis>Your everyday regular movie</d:Synopsis>
      <d:Runtime m:type="Edm.Int32">3600</d:Runtime>
      <d:BoxArt m:type="NetflixModel.BoxArt">

Yes that simple text used in JavaScript.

Example from the NetFlix OData feed access via URL http://odata.netflix.com/Catalog/Titles?$format=JSON

Javascript Object Notation

  "d" : 
    "results": [ { 
      "__metadata": { 
        "uri": "o.ntf.lx/Ctlog/Titles('movieName')", 
        "etag": "abcdef", 
        "type": "NetflixModel.Title", 
        "edit_media": "o.ntf.lx/Ctlog/Titles('mvName')/$value", 
        "media_src": "c.dn/boxshots/large/mnbx.jpg", 
        "content_type": "image/jpeg", 
      "Id": "movieName", 
      "Synopsis": "Your everyday regular movie"
      "Runtime": 3600
      "BoxArt": { 
        "__metadata": { 
            "type": "NetflixModel.BoxArt" }, 
            "SmallUrl": "http://c.dn/boxshots/m1bx.jpg"
    } ]

So What is OData?

I started my blog with the intent of having in depth series on WCF, well my focus shifted as other interesting Microsoft technology came my way at work and out of work. It’s not feasible for me at the moment to be able to dedicate a substantial amount of time to a specific topic. So I won’t be committing too much on OData, but I will be presenting an introductory level session of OData at the beginning of May so would like to get my thoughts and notes out here on my blog.

So let’s get started with OData.


Copy-pasting directly off the FAQ on the OData.org site is:

The Open Data Protocol (OData) is an open protocol for sharing data.

Hmm I’ve seen this kind, and this kind of idea before, but let’s keep reading…

It provides a way to break down data silos and increase the shared value of data by creating an ecosystem in which data consumers can interoperate with data producers in a way that is far more powerful than currently possible, enabling more applications to make sense of a broader set of data.

Ok, that sounds like a plan.

Every producer and consumer of data that participates in this ecosystem increases its overall value.

That last point has the marketing twist, which to me translates into; “unless there’s enough people using this protocol, it will only exist in obscurity.”

Only Joking!

Not to make this sound like an attack OData, it’s early days so let’s keep investigating.

One of the initial aspects of OData I find both interesting and commendable is the decision for the protocol to work through the standard HTTP request methods (verbs); GET, POST, PUT, DELETE. But my attention then quickly turned to access policies, security and authentication so I jumped ahead in my research and discovered the concept of Query Interceptors as part of implement an OData services that helps deal with this, I’ll go into detail on this in a future post.

Microsoft is putting some attention behind into OData, with it being featured in the day 2, MIX10 keynote presented by Doug Purdy. There’s also a mid May 2010 “roadshow” (awareness / day conference) happening across 8 locations (3 US, 2 EU, 2 Asia). I am looking forward to seeing the content that comes out of that.

Right now I’m in the process of dissecting the MIX10 Keynote OData segment, and the two other MIX10 sessions on OData. My objective is not only to learn a little bit about OData but also determine if it will assist in making some data as part of my day job more accessible.

open book

My attack plan for understanding the protocol concepts will be:

  • To first dig a little into the specification and the use of existing tech (ATOM and JSON), a summary post is here.
  • Create an OData service for part of complex system and large set of data.
  • Then hopefully being able to determine if it was worth it (so long as I get it working).

Along the way presenting findings here and at Melbourne user groups, so stay tuned.