Script and LINQ?

C# offers a nice syntax for LINQ. While it is not quite possible to use the same syntax in script, the concepts do carry over, and the interesting constructs can be implemented. This post provides a couple of examples, and a quick reference for the relevant script APIs that come into play...

One of the things I really like about LINQ is how well the concepts work against in-memory collections. It is not just a data API. Some time back, I got an email about simulating these concepts in script. Sure enough, barring actual syntax (which would require a language and script engine update), it turns out that a number of LINQ constructs translate almost directly into script... In fact, some of the constructs are available natively in Mozilla. These can be conditionally added so they are available in IE as well. In addition, a few more constructs need to be added for all browsers to get equivalents for a pretty good subset of LINQ features. I've prototyped these in the context of Script# by extending Array... check it out if you want to play with them some more.
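
As a rough sketch of what "conditionally added" can look like, the fallback below is installed only when the browser does not already supply a filter method on Array. The name filterFallback is made up for this illustration and this is not Script#'s code; it also ignores edge cases (sparse arrays, for instance) that a complete implementation would handle.

```javascript
// Hypothetical fallback, used only where the browser lacks a native filter.
function filterFallback(callback, thisArg) {
    var result = [];
    for (var i = 0; i < this.length; i++) {
        var item = this[i];
        // The callback gets the same arguments as the native method:
        // the element, its index, and the array being traversed.
        if (callback.call(thisArg, item, i, this)) {
            result.push(item);
        }
    }
    return result;
}

// Leave the native implementation alone when there is one.
if (!Array.prototype.filter) {
    Array.prototype.filter = filterFallback;
}

var longNames = ["Nice", "Helsinki", "Boston"].filter(function(city) {
    return city.length > 6;
});
// longNames is ["Helsinki"]
```

The same pattern would apply to map, forEach, every and some.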

I'll base my examples and sample data on one of ScottGu's first LINQ posts from way back. To set up the sample, imagine that I have a Location class to represent a City, Country, and Distance from Seattle, and an array of those called allLocations.

public class Location {
    public string City;
    public string Country;
    public int Distance;
    public Location(string city, string country, int distance) { ... }
}
Location[] allLocations =
    new Location[] {
        new Location("London", "UK", 4789),
        new Location("Amsterdam", "Netherlands", 4869),
        new Location("Boston", "USA", 2488),
        new Location("San Francisco", "USA", 684),
        new Location("Nice", "France", 5428),
        new Location("Las Vegas", "USA", 872),
        new Location("Raleigh", "USA", 2363),
        new Location("Chicago", "USA", 1733),
        new Location("Helsinki", "Finland", 4771),
        new Location("Dublin", "Ireland", 4527),
        new Location("Charleston", "USA", 2421)
    };

I can now use a LINQ statement to extract some data (name and country) for cities whose names are longer than 6 characters.

var someLocations =
    from location in allLocations
    where location.City.Length > 6
    select new {
        City = location.City,
        Country = location.Country
    };

And I'd get a collection containing the desired cities. So now, let's look at the script equivalent. I basically have a JSON array of objects, and then I can use APIs on the script Array object to author the equivalent of the from, where and select clauses of the LINQ statement above.

var allLocations = [
    { City: "London", Country: "UK", Distance: 4789 },
    { City: "Amsterdam", Country: "Netherlands", Distance: 4869 },
    { City: "Boston", Country: "USA", Distance: 2488 },
    { City: "San Francisco", Country: "USA", Distance: 684 },
    { City: "Nice", Country: "France", Distance: 5428 },
    { City: "Las Vegas", Country: "USA", Distance: 872 },
    { City: "Raleigh", Country: "USA", Distance: 2363 },
    { City: "Chicago", Country: "USA", Distance: 1733 },
    { City: "Helsinki", Country: "Finland", Distance: 4771 },
    { City: "Dublin", Country: "Ireland", Distance: 4527 },
    { City: "Charleston", Country: "USA", Distance: 2421 }
];

var someLocations =
  allLocations.filter(function(location) { return location.City.length > 6; })
              .map(function(location) {
                     return { City: location.City, Country: location.Country };
                   });

Simple enough?

Here is a full list of interesting functions on Array along with a brief description:

filter
Creates a new array with all elements of the source array for which the provided filtering function returns true. This is equivalent to the where clause in LINQ.
map
Creates a new array with the results of calling a provided function on every element of the source array. This is equivalent to the select clause of LINQ.
groupBy
Creates a new array of tuples consisting of a key, and group of matching items, bucketed by calling the provided key generator function. This is equivalent to the groupby clause of LINQ.
sort
Sorts the elements of an array using the default sort function, or a custom compare callback function. This enables implementing the orderby clause of LINQ.
aggregate
Reduces an array to a single value using the provided aggregation function (for example, to compute the sum of all elements of the array). This is equivalent to the Fold method introduced by LINQ.
extract
Creates a new array of elements from the specified range within the source array. This is similar to the Take and Skip extensions provided by LINQ.
index
Creates a new dictionary mapping keys generated by a key generator function to the corresponding array element from the source array. This is similar to the ToDictionary extension method provided by LINQ.
forEach
Calls the specified callback once for each element of the array.
every
Checks if all elements of the array satisfy the test implemented by the specified filter function.
some
Checks if at least one element of the array satisfies the test implemented by the specified filter function.
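
To illustrate a few of these, here is a small sketch of an orderby built on sort with a compare callback, followed by every and some. One difference from LINQ worth keeping in mind is that sort reorders the array in place, so the sketch copies the array with slice first. The data is a trimmed-down stand-in for allLocations.

```javascript
var locations = [
    { City: "Boston", Distance: 2488 },
    { City: "San Francisco", Distance: 684 },
    { City: "Nice", Distance: 5428 }
];

// slice() makes a shallow copy, so the original order is preserved.
var byDistance = locations.slice().sort(function(a, b) {
    return a.Distance - b.Distance;   // ascending; swap a and b for descending
});

var cities = byDistance.map(function(location) { return location.City; });
// cities is ["San Francisco", "Boston", "Nice"]

// every and some answer yes/no questions about the same data.
var allUnder6000 = locations.every(function(l) { return l.Distance < 6000; }); // true
var anyUnder500 = locations.some(function(l) { return l.Distance < 500; });    // false
```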

Of all these functions, filter, forEach, map, every and some are natively provided by Mozilla. The rest are extensions provided by Script#.
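
The Script# sources are not reproduced in this post, so what follows is only a minimal sketch of how such extensions might look, with signatures inferred from the way they are used here. To keep the sketch self-contained it uses standalone functions that take the array as their first argument instead of extending Array. The function bodies, the (start, count) meaning of the extract arguments, and the idea of a group being an array with a key property are all assumptions.

```javascript
// Buckets items by key. Each group is an array that carries its key in
// a 'key' property, so the usual array methods still work on it.
function groupBy(items, keyGenerator) {
    var groups = [];
    var lookup = {};
    for (var i = 0; i < items.length; i++) {
        var key = keyGenerator(items[i]);
        var slot = "k:" + key;   // prefix avoids clashes with built-in property names
        var group = lookup[slot];
        if (!group) {
            group = [];
            group.key = key;
            lookup[slot] = group;
            groups.push(group);
        }
        group.push(items[i]);
    }
    return groups;
}

// Folds the array into one value, starting from the seed.
function aggregate(items, seed, callback) {
    var value = seed;
    for (var i = 0; i < items.length; i++) {
        value = callback(value, items[i]);
    }
    return value;
}

// Returns 'count' elements starting at 'start' (Skip followed by Take).
function extract(items, start, count) {
    return items.slice(start, start + count);
}

// Maps each generated key to its element; a later duplicate key wins.
function index(items, keyGenerator) {
    var result = {};
    for (var i = 0; i < items.length; i++) {
        result[keyGenerator(items[i])] = items[i];
    }
    return result;
}

var sample = [
    { City: "Boston", Country: "USA", Distance: 2488 },
    { City: "Nice", Country: "France", Distance: 5428 },
    { City: "Chicago", Country: "USA", Distance: 1733 }
];

var byCountry = groupBy(sample, function(l) { return l.Country; });
// byCountry[0].key is "USA" and byCountry[0] holds Boston and Chicago

var total = aggregate(sample, 0, function(sum, l) { return sum + l.Distance; });
// total is 9649
```

Because each group is still an array, it can be passed straight to map or aggregate, which is what makes the grouping step compose with the others.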

Let's try something more complex that composes many more of these building blocks together. Let's write a statement that groups the cities by country and returns the top 2 groups that are farthest away from Seattle, as measured by total distance. Each item in the final array contains the country, a list of cities, and the sum of the corresponding distances.

Here is the C# version along with the dump of the result to the console:

var data =
    (from location in allLocations
     group location by location.Country into locationGrouping
     select new {
         Country = locationGrouping.Key,
         Cities = from location in locationGrouping select location.City,
         TotalDistance = locationGrouping.Sum(location => location.Distance)
     })
    .OrderByDescending(locationGroup => locationGroup.TotalDistance)
    .Take(2);
ObjectDumper.Write(data, 2);

Country=USA     Cities=...      TotalDistance=10561
  Cities: Boston
  Cities: San Francisco
  Cities: Las Vegas
  Cities: Raleigh
  Cities: Chicago
  Cities: Charleston
Country=France  Cities=...      TotalDistance=5428
  Cities: Nice

Now let's do the same in script, along with a dump to the debug console:

var data =
  allLocations
    .groupBy(function(location) {
               return location.Country;
             })
    .map(function(locationGrouping) {
           return {
             Country: locationGrouping.key,
             Cities: locationGrouping.map(function(location) {
                                            return location.City;
                                          }),
             TotalDistance:
               locationGrouping.aggregate(0, function(sum, location) {
                                               return sum + location.Distance;
                                             })
           };
         })
    .sort(function(group1, group2) {
            return group2.TotalDistance - group1.TotalDistance;
          })
    .extract(0, 2);
Debug.dump(data);

Array: {Array}
  [0]: {Object}
    Country: USA
    Cities: {Array}
      [0]: Boston
      [1]: San Francisco
      [2]: Las Vegas
      [3]: Raleigh
      [4]: Chicago
      [5]: Charleston
    TotalDistance: 10561
  [1]: {Object}
    Country: France
    Cities: {Array}
      [0]: Nice
    TotalDistance: 5428

As you can see, the result is the same, which is goodness... and there's almost a one-to-one correspondence between the two versions.

What do you think? If you got back an array of objects, say from a JSON Web service call, would these be useful to shape the data for the purposes of data-binding? Admittedly this sample is a bit contrived, and hence somewhat complex, but it is definitely interesting how the individual APIs compose. A more realistic sample might use a smaller set of APIs at any given point.


Posted on Thursday, 12/21/2006 @ 6:55 PM | #Projects


Comments

12 comments have been posted.

vikram

Posted on 12/21/2006 @ 9:20 PM
Hi

This is really cool. I love your posts. Hopefully you will post more frequently. Great work.

Huw

Posted on 12/22/2006 @ 1:35 AM
Hi, Nikhil

I agree with Vikram that this is really cool. I think it would be awesome if Script# supported System.Query and its supporting namespaces to make LINQ to in-memory collections possible. LINQ is my favorite addition to the C# language; I use it all the time.

Keep up the good work.

张家福

Posted on 12/22/2006 @ 8:18 PM
I love Script#, you have my support....

Eddy Recio

Posted on 12/23/2006 @ 6:10 PM
Hey Nikhil,

Script# is great (fill in many more compliments here...). I downloaded v 0.2.1.0 and now I seem to receive the following compiler error "Index properties are not supported". The problem is that the errors are in the Core libraries of Script#. Even the Samples projects suffer from this error now. So I am not sure if I am experiencing an upgrade, setup or configuration issue. Perhaps this is a bug that slipped in with the last release of the compiler. Feel free to contact me if you would like to take this offline.

Eddy

PS. I noticed the documentation previously stated that “Index Properties” were not supported, yet some of your libraries had index properties. I assumed that they were supported in the “System Assemblies” i.e. write your own JavaScript to support it.

Nikhil Kothari

Posted on 12/24/2006 @ 10:08 AM
Eddy, yes, I noticed the index properties bug - I'll fix it asap - hopefully I'll put up a new build later today.

zproxy

Posted on 12/28/2006 @ 5:21 AM
Yes this looks nice.

But how about the yield return statement in your Script# project?

Any plans with that?

I am trying to decompile the IL of it, and some prototypes exist which do indicate it is possible to express the same functionality within javascript to a certain extent.

Note that LINQ makes heavy use of yield statements.

Ron Buckton

Posted on 12/28/2006 @ 1:15 PM
I have something similar at www.codeplex.com/AjaxEnhancements currently. The AjaxEnhancements project is a sister project to the Ajax Control Toolkit for some features I'd like to see in the core but weren't in line with the rest of the toolkit's purpose. I wrote an extensive Collections and Query library for JavaScript and ASP.NET AJAX that provides a lot of the same features on the client side.

Ron Buckton

Posted on 12/28/2006 @ 1:19 PM
Does your client-script implementation of LINQ support deferred execution? I did that with AjaxEnhancements but the Mozilla built-ins do immediate execution as map/etc return arrays. I had to create IEnumerable/IEnumerator "interfaces" using the ASP.NET AJAX model and a custom $foreach global method to sort out all of the requirements to get something like this to work.

Nikhil Kothari

Posted on 12/29/2006 @ 10:47 AM
Ron, thanks for sharing.

The approach I took does not defer query execution. My goal was to have it fit naturally with existing array methods, and use the native Mozilla methods when possible. Deferring helps if the mainline scenario is to compose several queries. It gets in the way when you just want to do one or two. It's possible the latter is more likely in script. Additionally, since you're going to fetch only the relevant data to begin with for performance reasons, it is probably quite reasonable not to defer execution. This helps reduce complexity of implementation and correspondingly script size as well, which is goodness. The primary goal wasn't to replicate LINQ in its entirety, but to provide the equivalents I thought had the most bang for the buck.

I also did not want to write my own List class and other collection types, as I wanted the LINQ-like APIs to work with any array, including those returned from various DOM APIs... hence defining my own collection classes was not an option. Incidentally, this too helps keep the amount of script and number of script types defined down to a minimum. This is key in my mind... this type of infrastructure functionality shouldn't become a huge part of an app in itself... you need all the bytes you can save for use in implementing app features.

Nikhil Kothari

Posted on 12/29/2006 @ 10:57 AM
zproxy, there's an approach to implementing yield by implementing an FSM in your generated script, but it also requires implementing something akin to an IL interpreter in script, which is possibly going to be slower, as well as require more script (see my comment above on script size philosophy around infrastructure code). Most yield usage can actually be implemented in alternative ways, often just as usable and readable, if not more. Personally, I am not sure if yield is a must-have... I tend to think it's a nice-to-have.

Dimitris Foukas

Posted on 1/15/2007 @ 6:54 AM
Hi Nikhil,

I think that the main attraction of the LINQ "style" (as opposed to LINQ the API) is this ability to delay the decision of translating the query expression to the appropriate api at binding time. For example, is it possible to express HTML DOM manipulation in terms of LINQ combinators in Script# today? Even better, since Javascript is already a HO FP language, can we do that with a library leveraging for example jQuery? That is the direction I would like this work to evolve to....

Regards,

Dimitris

Justin

Posted on 5/3/2007 @ 10:13 AM
I have been hearing a lot about LINQ lately. You got me interested enough to go check it out. Thanks.