7 LINQ Tricks to Simplify Your Programs

Ever since I learned about LINQ, I keep discovering new ways to use
it to improve my code. Every trick makes my code a little bit faster to
write, and a little bit easier to read.

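This post summarizes some of the tricks that I have come across. I will show you how to use LINQ to:

  • Initialize an array
  • Iterate over multiple arrays in a single loop
  • Generate a random sequence
  • Generate a string
  • Convert sequences or collections
  • Convert a value to a sequence of length 1
  • Iterate over all subsets of a sequence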

If you have your own bag of LINQ tricks, please share them in the comments!

1. Initialize an array 

Often, you need to initialize the elements of an array to either the
same value, or to an increasing sequence of values, or possibly to a
sequence increasing or decreasing by a step different from one. With
LINQ, you can do all of this within the array initializer – no for
loops necessary!

In the following code sample, the first line initializes a to an
array of length 10 with all elements set to -1, the second line
initializes b to (0,1,…,9), and the third line initializes c to
(100,110,…,190):

int[] a = Enumerable.Repeat(-1, 10).ToArray();
int[] b = Enumerable.Range(0, 10).ToArray();
int[] c = Enumerable.Range(0, 10).Select(i => 100 + 10 * i).ToArray();

A word of caution: if you are initializing large arrays, you may
want to forgo the elegance and use the old-fashioned for loop instead.
The LINQ solution may grow the array dynamically, so garbage arrays
will need to be collected by the runtime. That said, I use this trick
all the time when initializing small arrays, or in testing/debugging
code.
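
The decreasing and stepped sequences mentioned above are not shown in the sample, so here is a minimal sketch. The class and method names (ArrayInit, Stepped, Filled) are made up for illustration; Filled is the plain-loop alternative for large arrays.

```csharp
using System;
using System.Linq;

public static class ArrayInit
{
    // Hypothetical helper: 'count' values starting at 'start',
    // changing by 'step' each time (the step may be negative).
    public static int[] Stepped(int start, int step, int count)
    {
        return Enumerable.Range(0, count)
            .Select(i => start + step * i)
            .ToArray();
    }

    // Plain-loop alternative for large arrays: the array is allocated
    // once at its final size and then filled in place.
    public static int[] Filled(int value, int length)
    {
        var result = new int[length];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = value;
        }
        return result;
    }
}
```

For example, ArrayInit.Stepped(100, -10, 5) yields 100, 90, 80, 70, 60.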

2. Iterate over multiple arrays in a single loop 

A friend asked me a C# question: is there a way to iterate over
multiple collections with the same loop? His code looked something like
this:

foreach (var x in array1) {
DoSomething(x);
}
foreach (var x in array2) {
DoSomething(x);
}

In his case, the loop body was larger, and he did not like the
duplicated code. But, he also did not want to allocate a new array to
hold elements from both array1 and array2.

LINQ provides an elegant solution to this problem: the Concat
operator. You can rewrite the above two loops with a single loop as
follows:

foreach (var x in array1.Concat(array2)) {
DoSomething(x);
}

Note that since LINQ operates at the enumerator level, it will not
allocate a new array to hold elements of array1 and array2. So, on top
of being rather elegant, this solution is also space-efficient.

3. Generate a random sequence 

This is a simple trick to generate a random sequence of length N:

Random rand = new Random();
var randomSeq = Enumerable.Repeat(0, N).Select(i => rand.Next());

Thanks to the lazy nature of LINQ, the sequence is not pre-computed
and stored in an array, but instead random numbers are generated
on-demand, as you iterate over randomSeq.
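
One consequence of this laziness is easy to miss: every new enumeration of randomSeq draws fresh numbers. The sketch below, with made-up names (RandomSequences, Lazy, Snapshot), contrasts the lazy sequence with a materialized copy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RandomSequences
{
    // Lazy: each enumeration of the result calls rand.Next() again,
    // so two passes over the same sequence see different numbers.
    public static IEnumerable<int> Lazy(Random rand, int n)
    {
        return Enumerable.Repeat(0, n).Select(_ => rand.Next());
    }

    // Materialized: the values are drawn once and stored in an array.
    public static int[] Snapshot(Random rand, int n)
    {
        return Lazy(rand, n).ToArray();
    }
}
```

If you need to iterate over the same random values more than once, call ToArray() or ToList() first and iterate over the copy.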

4. Generate a string 

LINQ is also a nice tool to generate various kinds of strings. I
found this quite useful to generate strings for testing and debugging
purposes.

Let’s say that you want to generate a string with the repeating
pattern "ABCABCABC…" of length N. Using LINQ, the solution is quite
elegant:

string str = new string(
Enumerable.Range(0, N)
.Select(i => (char)('A' + i % 3))
.ToArray());

[EDIT] Petar Petrov suggested another interesting way to
generate strings with LINQ. His approach applies to different scenarios
than my solution above:

string values = string.Join(string.Empty, Enumerable.Repeat(pattern, N).ToArray());
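
The two approaches differ in what N means: in the first, N is the length of the resulting string; in the second, it is the number of times the pattern is repeated. A small sketch with hypothetical helper names (StringPatterns, RepeatPattern, Cycle) makes the difference explicit. It assumes .NET 4 or later, where string.Concat accepts an IEnumerable<string>.

```csharp
using System;
using System.Linq;

public static class StringPatterns
{
    // Repeats 'pattern' 'count' times: ("AB", 3) gives "ABABAB".
    public static string RepeatPattern(string pattern, int count)
    {
        return string.Concat(Enumerable.Repeat(pattern, count));
    }

    // Cycles through the characters of 'alphabet' until exactly
    // 'length' characters have been produced: ("ABC", 7) gives "ABCABCA".
    public static string Cycle(string alphabet, int length)
    {
        return new string(
            Enumerable.Range(0, length)
                .Select(i => alphabet[i % alphabet.Length])
                .ToArray());
    }
}
```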

5. Convert sequences or collections 

One thing you cannot do in C# or VB is to cast a sequence of type T
to a sequence of type U, even if T is a derived class of U. So, you
cannot simply cast List<string> to List<object>. (For
an explanation why, see Rick Byers’ posting).

But, if you are trying to convert IEnumerable<T> to
IEnumerable<U>, LINQ has a simple and efficient solution for you:

IEnumerable<string> strEnumerable = ...;
IEnumerable<object> objEnumerable = strEnumerable.Cast<object>();

If you need to convert List<T> to List<U>, there is also a simple LINQ solution, but it involves copying the list:

List<string> strList = ...;
List<object> objList = strList.Cast<object>().ToList();
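
Since C# 4.0, IEnumerable<T> is covariant for reference types, so the first conversion no longer needs Cast at all. The sketch below uses made-up names (Conversions, AsObjects, Boxed, CopyToObjectList) to show where Cast is still required.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Conversions
{
    // C# 4.0 and later: IEnumerable<out T> is covariant for reference
    // types, so this compiles without Cast and without copying.
    public static IEnumerable<object> AsObjects(IEnumerable<string> source)
    {
        return source;
    }

    // Variance does not apply to value types; Cast<object>() boxes
    // each element as it is enumerated.
    public static IEnumerable<object> Boxed(IEnumerable<int> source)
    {
        return source.Cast<object>();
    }

    // List<T> is invariant, so converting it still means copying.
    public static List<object> CopyToObjectList(List<string> source)
    {
        return source.Cast<object>().ToList();
    }
}
```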

6. Convert a value to a sequence of length 1 

When you need to convert a single value to a sequence of length 1,
what do you do? You could construct an array of length 1, but I prefer
the LINQ Repeat operator:

IEnumerable<int> seq = Enumerable.Repeat(myValue, 1);

7. Iterate over all subsets of a sequence 

Sometimes it is useful to iterate over all subsets of an array. This
situation arises quite frequently in brute-force solutions to hard
problems. For small inputs, subset sum, boolean satisfiability and the knapsack problem can all be solved easily by iterating over all subsets of some sequence.

In LINQ, we can generate all subsets of array arr as follows:

T[] arr = ...;
var subsets = from m in Enumerable.Range(0, 1 << arr.Length)
              select
                  from i in Enumerable.Range(0, arr.Length)
                  where (m & (1 << i)) != 0
                  select arr[i];

Note that if the number of subsets overflows an int, the above code
will not work. So, only use it if you know that the length of arr is at
most 30. If the length of arr is greater than 30, chances are that you
don’t want to iterate over all of its subsets anyway because it is
going to take minutes or more.
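
As an illustration of the brute-force use mentioned above, here is a sketch that wraps the subset query in a method and uses it for subset sum. The names Subsets, Of, and HasSubsetWithSum are hypothetical, and the length check is an addition that enforces the 30-element limit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Subsets
{
    // Enumerates all 2^n subsets of arr; bit i of the mask m decides
    // whether arr[i] is included in the current subset.
    public static IEnumerable<IEnumerable<T>> Of<T>(T[] arr)
    {
        if (arr.Length > 30)
            throw new ArgumentException("At most 30 elements are supported.", "arr");

        return from m in Enumerable.Range(0, 1 << arr.Length)
               select
                   from i in Enumerable.Range(0, arr.Length)
                   where (m & (1 << i)) != 0
                   select arr[i];
    }

    // Brute-force subset sum: is there a subset adding up to target?
    // The empty subset counts, so a target of 0 always succeeds.
    public static bool HasSubsetWithSum(int[] arr, int target)
    {
        return Of(arr).Any(s => s.Sum() == target);
    }
}
```

Because both levels of the query are lazy, Any stops as soon as the first matching subset is found.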

 


How To Query Data with Parallel LINQ

This post shows a simple way to write code that takes advantage of
multiple processors. You will see that LINQ queries can allow you to
side step the difficult tasks normally involved in writing
multi-threaded code. To get started, all you need is a little basic
knowledge of how to write simple LINQ queries.

The code shown
in this post uses a pre-release version of PLINQ called the Microsoft
Parallel Extensions to .NET Framework 3.5. When PLINQ finally ships, it
will run only on .NET 4.0 or later. The version I'm using that runs on
top of 3.5 is for evaluation purposes only. There will never be a
shipping version that runs on .NET 3.5.

This LINQ provider is
being created at Microsoft by the Parallel Computing team; it is not
the work of the C# team that created LINQ to Objects and LINQ to SQL.
Here is the website for the Parallel Computing team:

http://msdn.microsoft.com/en-us/concurrency/

At
the time of this writing, these extensions were available only in
pre-release form. You could download them either as Visual Studio 2008
compatible extensions to .NET 3.5, or as part of the pre-release
version of Visual Studio 2010. Since the download sites might change
over the coming months, I suggest that you find these resources by
going to the Parallel Computing site, or to the Visual Studio site:

http://msdn.microsoft.com/en-us/vs2008

Parallel
LINQ, or PLINQ, is only a small part of the Parallel Extensions to the
.NET Framework. It is, however, an important part. Since it is a simple
and natural extension of the LINQ syntax, I think developers familiar
with that technology will find it easy to use.

Consider this code:

var list = Enumerable.Range(1, 10000);
var q = from x in list.AsParallel()
        where x < 3300
        select x;
foreach (var x in q)
{
    Console.WriteLine(x);
}

These lines look nearly identical to the code you have seen in many
simple LINQ samples. The only significant difference is the call to AsParallel
at the end of the first line of the query. Though we have often used type inference
to hide the return type of a LINQ query, I'm going to pause and take a
second look at this instance. Rather than returning
IEnumerable<T>, this version of PLINQ returns IParallelEnumerable<int>:

IParallelEnumerable<int> q = from x in list.AsParallel() etc….

In the near future, PLINQ queries of this type will probably return ParallelQuery<int>.
Because this product is still evolving, it might be simplest to use
var, at least during the pre-release phase, and let the compiler choose
the type. That way you can save typing, avoid problems with anonymous
types, and you need not concern yourself about changes in the API as
the product develops. It is almost always appropriate to use var to designate the return type of a LINQ query, and there are only special circumstances when you would do otherwise.

Here are the results from this first PLINQ query:

2
1
3
4
6
512
5
7
513
8
12
514
9
13
515
10
14
516
11
15
517
16
72
518
17

The numbers shown here are in a relatively random order because they
are being returned from different threads. It is important to remember
that the sequence of values returned by LINQ is not always guaranteed
to be presented in a particular order. If order is important in your
code, you can add a call to AsOrdered to the query after the call to AsParallel. Alternatively, you could insert an orderby
clause to establish the desired ordering. Otherwise developers should
assume that the ordering from a PLINQ query is not guaranteed.
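
A minimal sketch of the AsOrdered option follows. It is written against the PLINQ API as it shipped in .NET 4 (System.Linq.ParallelEnumerable), which may differ from the pre-release build used in this post; the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

public static class OrderedPlinq
{
    // AsOrdered asks PLINQ to preserve the source order in the results,
    // at some cost in performance.
    public static int[] SmallValuesInOrder(int upperBound, int limit)
    {
        return Enumerable.Range(1, upperBound)
            .AsParallel()
            .AsOrdered()
            .Where(x => x < limit)
            .ToArray();
    }

    // Without AsOrdered the same elements come back, but their order
    // is not guaranteed and can change from run to run.
    public static int[] SmallValuesAnyOrder(int upperBound, int limit)
    {
        return Enumerable.Range(1, upperBound)
            .AsParallel()
            .Where(x => x < limit)
            .ToArray();
    }
}
```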

Now that you understand the basics of Parallel LINQ, let’s
move on to look at a more interesting example. Improved performance is
the main reason to write code that can run in parallel. The program
shown in this post uses a timer to demonstrate how PLINQ can improve
performance in a program.

Performance improvements become more evident when our code has
access to more processors. The code I show here runs faster on a two
processor machine, but it really starts to come into its own on a four
processor machine. Moving up to even more processors yields more
powerful results. Here, for instance, are the results showing an
improvement of 1.44 times when using two processors, and almost two
times when using four processors:

2 Processors = 1.44 x improvement:
Linear: 00:00:13.15
Parallels: 00:00:09.10
4 Processors = 1.96 x improvement:
Linear: 00:00:15.00
Parallel: 00:00:07.68

These tests are being run against pre-release software, so these
numbers are almost certain to change before release, and of course
different machines will yield different results. Furthermore, the
degree of improvement that you see is likely to change depending on the
type of algorithm you run, the number of cores on your machine, the
architecture of the machine, how many caches there are and how they’re
laid out, etc. Though it is rare, some queries show superlinear
performance enhancements. In other words, there is a greater than 4x
speedup on a 4-core box. An improvement of 2 times, such as the one
shown, or even a 3 times improvement, is common.

This sample program is called FakeWeatherData, and it is available for download from the LINQ Farm on Code Gallery.
It features a simple LINQ to XML query run against a file with 10,000
records in it. The data I'm querying is not real, but consists of
random dates and temperatures generated by a simple algorithm included
in the FakeWeatherData program.

The XML file is structured like this:

<?xml version="1.0" encoding="utf-8" ?>
<Samples>
<Sample>
<Year>1973</Year>
<Month>May</Month>
<Day>15</Day>
<Temperature>10</Temperature>
</Sample>
<Sample>
<Year>1970</Year>
<Month>Feb</Month>
<Day>10</Day>
<Temperature>14</Temperature>
</Sample>
<Sample>
<Year>1970</Year>
<Month>Jan</Month>
<Day>15</Day>
<Temperature>11</Temperature>
</Sample>
  ... Many lines of code omitted here
</Samples>

There is also a simple C# class used by the program to encapsulate the data from the XML file:

class WeatherData
{
public string Year { get; set; }
public string Month { get; set; }
public string Day { get; set; }
public string Temperature { get; set; }
}

The parallel version of the query in the program looks like this:

for (int i = 0; i < NUM_REPS; i++)
{
    var list = (from x in doc.Root.Elements("Sample").AsParallel()
                where x.Element("Year").Value == "1973" &&
                      x.Element("Month").Value == "Apr" &&
                      x.Element("Day").Value == "15"
                select new WeatherData
                {
                    Day = x.Element("Day").Value,
                    Month = x.Element("Month").Value,
                    Temperature = x.Element("Temperature").Value,
                    Year = x.Element("Year").Value
                }).ToList();
}

Accompanying this code is a similar LINQ query that does not use PLINQ:

for (int i = 0; i < NUM_REPS; i++)
{
    var list = (from x in doc.Root.Elements("Sample")
                where x.Element("Year").Value == "1973" &&
                      x.Element("Month").Value == "Apr" &&
                      x.Element("Day").Value == "15"
                select new WeatherData
                {
                    Day = x.Element("Day").Value,
                    Month = x.Element("Month").Value,
                    Temperature = x.Element("Temperature").Value,
                    Year = x.Element("Year").Value
                }).ToList();
}

The program queries the data in the XML file first using standard
LINQ, then using the parallel code. By comparing the time it takes
each block of code to execute you can get a sense of the relative
improvement available through PLINQ. I'll show you how to make such
comparisons in just a moment. I will also discuss some tools that will
become available to help profile code of this type.

You can see that the PLINQ query contains a call to AsParallel,
while the other query does not. Other than that the two queries are
identical. The fact that the two queries look so much alike points to a
primary strength of PLINQ: very little specialized knowledge is
necessary in order to begin using it. This does not mean that the
subject is trivial, but only that the barrier to entry is low. This is
not the case with most concurrent programming models.

LINQ queries are designed to be read-only, working with
immutable data. This is a good model for parallelism, because it makes
it unlikely that data will mutate, thereby setting up the potential for
a race condition. You should note, however, that PLINQ does nothing to
prevent this from happening, it is simply that LINQ is designed to make
it unlikely.
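
To make the risk concrete, here is a hedged sketch written against the PLINQ API as it shipped in .NET 4. The class and method names are hypothetical. The first method mutates a shared variable from inside the query and can lose updates; the second lets the query compute the result.

```csharp
using System;
using System.Linq;

public static class SharedState
{
    // Unsafe under PLINQ: 'total += x' is a read-modify-write on a
    // variable shared by several threads, so updates can be lost and
    // the result may vary between runs.
    public static long SumWithSharedVariable(int n)
    {
        long total = 0;
        Enumerable.Range(1, n).AsParallel().ForAll(x => total += x);
        return total;
    }

    // Safe: no shared state is mutated; PLINQ combines the partial
    // sums from each thread itself.
    public static long SumWithAggregate(int n)
    {
        return Enumerable.Range(1, n).AsParallel().Sum(x => (long)x);
    }
}
```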

Note also that the declarative LINQ programming style ensures
that developers specify what they want done, rather than how it should
be done. This leaves PLINQ free to ensure that concurrent LINQ queries
run in the safest manner possible. If LINQ had been defined more
strictly, such that it had to process each element in a certain order,
then the PLINQ team would have had a much more difficult task.

The code in both these queries pulls out only the records from
the XML file that have their date set to April 15, 1973. Because of
deferred execution, the query would not do anything if I did not call
ToList(). As a result, I added that call and converted the result into
a List<WeatherData>. Though hardly earthshaking in import, these
calls ensure that the code actually does something, and thus gives
PLINQ scope to take advantage of the multiple processors on your
system.

Simple timers are created to measure the difference between
the standard LINQ query and the PLINQ query. I've also used a method
used in many of Parallel LINQ team's samples for displaying the time
elapsed during a test run:

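// Requires: using System; using System.Diagnostics; using System.Xml.Linq;
private static void RunTest()
{
    XDocument doc = XDocument.Load("XMLFile1.xml");
    Stopwatch sw = new Stopwatch();

    // Time the standard LINQ query.
    sw.Start();
    LinqOrdinarie(doc);
    sw.Stop();
    ShowElapsedTime("Ordinaire", sw.Elapsed);

    // Reset the timer, then time the PLINQ query.
    sw.Reset();
    sw.Start();
    ParallelLinq(doc);
    sw.Stop();
    ShowElapsedTime("Parallels", sw.Elapsed);
}

private static TimeSpan ShowElapsedTime(string caption, TimeSpan ts)
{
    // Format as "caption: hh:mm:ss.cc", where cc is hundredths of a second.
    string elapsedTime = String.Format("{0}: {1:00}:{2:00}:{3:00}.{4:00}",
        caption, ts.Hours, ts.Minutes, ts.Seconds,
        ts.Milliseconds / 10);
    // Write the text directly so it is not treated as a format string.
    Console.WriteLine(elapsedTime);
    return ts;
}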

At least with the pre-release version of PLINQ that I've played
with, I've found it very useful to set up timers to confirm that PLINQ
is actually able to speed up an operation. My record at guessing which
code will benefit from running in parallel is not good, and so I find
that confirming the effectiveness of the code by explicitly measuring
it is worthwhile. You can either use the simple Stopwatch class from
the System.Diagnostics namespace, as shown here, or else you can use a
profiler. Note that a thread-aware profiler might ship with some
versions of Visual Studio 2010.

I've found that the advantages of concurrent LINQ become more
obvious the longer the operation I'm timing lasts. As a result, I've
placed the query inside a loop, and added a variable to the program
called NUM_REPS. By setting NUM_REPS to a large number, say 500, you
can clearly see the benefits that can be accrued when you run LINQ
queries in parallel on multiple processors. Note that the first time
PLINQ is used, its assembly will need to be loaded, the relevant types
will need to be JIT compiled, and new threads will need to be spun up,
etc. As a result, many developers see improved performance after they
get past the initial warm-up time.
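
One way to keep the warm-up cost out of the measurement is to run the code once untimed before starting the stopwatch. The helper below is a sketch under that assumption; Timing and Measure are made-up names, not part of the FakeWeatherData program.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs 'action' once untimed (warm-up), then times 'reps' further runs.
    public static TimeSpan Measure(Action action, int reps)
    {
        // Warm-up: pays for assembly loading, JIT compilation and
        // thread start-up outside the timed region.
        action();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < reps; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Calling Timing.Measure once with the sequential query and once with the parallel query, using the same number of repetitions, gives two times that can be compared directly.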

Though it is very easy to get started with PLINQ, there are
still complexities inherent in the subject that you need to consider.
For instance, PLINQ will sometimes develop a different partitioning
scheme for your data depending on whether you are working with an
Enumerable or an Array. To learn more about this subject, see the
following post from the Parallel Programming team:

http://blogs.msdn.com/pfxteam/archive/2007/12/02/6558579.aspx

The simple PLINQ examples shown in this post should help you get
started with this powerful and interesting technology. Parallel LINQ is
still in its infancy, but already it provides means of greatly
simplifying tasks that are not normally easy to perform.


How to Use Compiled Queries in Linq to Sql for High Demand ASP.NET Websites

If you are using Linq to SQL, instead of writing regular Linq Queries, you should be using Compiled Queries.
If you are building an ASP.NET web application that’s going to get
thousands of hits per hour, the execution overhead of Linq queries is
going to consume too much CPU and make your site slow. There’s a
runtime cost associated with each and every Linq Query you write. The
queries are parsed and converted to a nice SQL Statement on *every*
hit. It’s not done at compile time because there’s no way to figure out
what you might be sending as the parameters in the queries during
runtime. So, if you have common Linq to Sql statements like the
following one throughout your growing web application, you are soon
going to have scalability nightmares:

var query = from widget in dc.Widgets
where widget.ID == id && widget.PageID == pageId
select widget;
var widget = query.SingleOrDefault();

There’s a nice blog post by JD Conley that shows how evil Linq to Sql queries are:

[Image: screenshot from JD Conley's post]

You see how many times SqlVisitor.Visit is called to convert
a Linq Query to its SQL representation? The runtime cost to convert a
Linq query to its SQL Command representation is just way too high.

Rico Mariani has a very informative performance comparison of regular Linq queries vs Compiled Linq queries performance:

[Image: Rico Mariani's performance comparison]

Compiled Query wins on every case.

So, now you know about the benefits of compiled queries. If you are
building an ASP.NET web application that is going to get high traffic and
you have a lot of Linq to Sql queries throughout your project, you have
to go for compiled queries. Compiled Queries are built for this
specific scenario.

In this article, I will show you some steps to convert regular Linq
to Sql queries to their Compiled representation and how to avoid the
dreaded exception “Compiled queries across DataContexts with different LoadOptions not supported.”

Here are some step-by-step instructions for converting a Linq to Sql query to its compiled form:

First we need to find out all the external decision factors in a
query. It mostly means parameters in the WHERE clause. Say, we are
trying to get a user from aspnet_users table using Username and Application ID:

[Image: Query to get a user from aspnet_users table]

Here, we have two external decision factors: one is the Username and
the other is the Application ID. So, first think of it this way: if you were to
wrap this query in a function that just returns the query as it
is, what would you do? You would create a function that takes the DataContext (named dc here), then two parameters named userName and applicationID, right?

So, be it. We create one function that returns just this query:

[Image: Converting a Linq Query to a function]

The next step is to replace this function with a Func<> representation that returns the query. This is the hard part. If you haven’t dealt with Func<> and lambda expressions before, then I suggest you read this and this and then continue.

So, here’s the delegate representation of the above function:

[Image: Creating a delegate out of Linq Query]

A couple of things to note here. I have declared the delegate as static readonly
because a compiled query is declared only once and reused by all
threads. If you don’t declare Compiled Queries as static, then you
don’t get the performance gain, because compiling a query every time it is
needed is even worse than using regular Linq queries.

Then there’s the complex Func<DropthingsDataContext, string, Guid, IQueryable<aspnet_User>> thing. Basically the generic Func<> is declared to have three parameters from the GetQuery function and a return type of IQueryable<aspnet_User>. Here the parameter types are specified so that the delegate is created strongly typed. In .NET 3.5, Func<> allows up to 4 parameters and 1 return type.

Next comes the real business, compiling the query. Now that we have the query in delegate form, we can pass this to CompiledQuery.Compile
function which compiles the delegate and returns a handle to us.
Instead of directly assigning the lambda expression to the func, we
will pass the expression through the CompiledQuery.Compile function.

[Image: Converting a Linq Query to Compiled Query]

Here’s where your head starts to spin. This is so hard to read and
maintain. Bear with me. I just wrapped the lambda expression on the
right side inside the CompiledQuery.Compile function. Basically that’s the only change. Also, when you call CompiledQuery.Compile<>, the generic types must match and be in exactly the same order as the Func<> declaration.

Fortunately, calling a compiled query is as simple as calling a function:

[Image: Running Compiled Query]

There you have it, a lot faster Linq Query execution. The hard work
of converting all your queries into Compiled Query pays off when you
see the performance difference.
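
The steps above can be sketched roughly as follows. This is an illustration only: AspnetUser, SketchDataContext, and UserQueries are hypothetical stand-ins for the post's own aspnet_User and DropthingsDataContext types, the column names are assumptions, and the code needs a reference to System.Data.Linq (.NET Framework).

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical mapped entity; the column names are assumptions.
[Table(Name = "aspnet_Users")]
public class AspnetUser
{
    [Column(IsPrimaryKey = true)] public Guid UserId;
    [Column] public Guid ApplicationId;
    [Column] public string LoweredUserName;
}

// Hypothetical data context exposing the table above.
public class SketchDataContext : DataContext
{
    public SketchDataContext(string connection) : base(connection) { }

    public Table<AspnetUser> Users
    {
        get { return GetTable<AspnetUser>(); }
    }
}

public static class UserQueries
{
    // Declared static readonly so the query is compiled once and the
    // resulting delegate is shared by all callers.
    public static readonly Func<SketchDataContext, string, Guid, IQueryable<AspnetUser>> GetUser =
        CompiledQuery.Compile(
            (SketchDataContext dc, string userName, Guid applicationId) =>
                from u in dc.Users
                where u.LoweredUserName == userName
                   && u.ApplicationId == applicationId
                select u);

    // Calling the compiled query looks like calling any other function.
    public static AspnetUser FindUser(SketchDataContext dc, string userName, Guid applicationId)
    {
        return GetUser(dc, userName, applicationId).SingleOrDefault();
    }
}
```

The lambda's parameter types appear in the same order as the type arguments of the Func<> field, which is the matching requirement described above.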

Now, there are some challenges to Compiled Queries. The most common one
is, what do you do when you have more than 4 parameters to supply to a
Compiled Query? You can’t declare a Func<> with more than 4 types. The solution is to use a struct to encapsulate all the parameters. Here’s an example:

[Image: Using struct in compiled query as parameter]

Calling the query is quite simple:

[Image: Calling compiled query with struct parameter]

Now to the dreaded challenge of using LoadOptions with Compiled Query. You will notice that the following code results in an exception:

[Image: Using DataLoadOptions with Compiled Query]

 

The above DataLoadOptions code runs perfectly when you use regular
Linq Queries. But it does not work with compiled queries. When you run
this code and the query is hit the second time, it produces an exception:

Compiled queries across DataContexts with different LoadOptions not supported

A compiled query remembers the DataLoadOptions instance once it’s called. It does not allow executing the same compiled query with a different DataLoadOptions instance again. Although you are creating the same DataLoadOptions with the same LoadWith<>
calls, it still produces an exception because it remembers the exact
instance that was used when the compiled query was called for the first
time. Since the next call creates a new instance of DataLoadOptions, it does not match and the exception is thrown. You can read details about the problem in this forum post.

The solution is to use a static DataLoadOptions instance. You cannot create a local DataLoadOptions instance and use it in compiled queries. It needs to be static. Here’s how you can do it:

[Image]

 

Basically the idea is to construct a static instance of DataLoadOptions using a static function. As writing a function for every single DataLoadOptions
combination is painful, I created a static delegate here and executed
it right on the declaration line. This is an interesting way to declare
a variable that requires more than one statement to prepare it.
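
The idiom of executing a delegate on the declaration line can be sketched as follows. To keep the example runnable without LINQ to SQL, it builds a Dictionary instead of a DataLoadOptions instance; the class and field names are made up. With DataLoadOptions, the lambda body would create the options object, make its LoadWith<> calls, and return it.

```csharp
using System;
using System.Collections.Generic;

public static class Defaults
{
    // The lambda is wrapped in a Func<> and invoked immediately, so the
    // field is prepared once, when the type is initialized, even though
    // preparing it takes several statements.
    public static readonly Dictionary<string, int> Limits =
        new Func<Dictionary<string, int>>(() =>
        {
            var d = new Dictionary<string, int>();
            d.Add("small", 10);
            d.Add("large", 1000);
            return d;
        })();
}
```

Every caller then sees the same instance, which is the property the compiled query needs from its DataLoadOptions.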

Using this option is very simple:

[Image]

Now you can use DataLoadOptions with compiled queries.


How to Debug LINQ Queries

Debugging LINQ queries can be problematic.  One
of the reasons is that quite often, you write a large query as a single
expression, and you can’t set a breakpoint mid-expression.  Writing
large queries in expression context is particularly powerful when using
functional construction to form XML (or using the strongly typed DOM in
Open XML SDK V2).  This post presents a little
trick that makes it easier to use the debugger with LINQ queries that
are written using ‘method syntax’.

The gist of the technique is to insert a call to the Select extension method in the middle of your query.  You code the Select so that it projects exactly the same results as its source, but using a statement lambda expression.  If
you place the return statement of the lambda expression on its own
line, you can set a breakpoint and examine values as they make their
way through the query.

The following is a query to split a string into words, convert the words to
lower case, count the number of occurrences of each word, and return the
ten most used words:

var uniqueWords = text
    .Split(' ', '.', ',')
    .Where(i => i != "")
    .Select(i => i.ToLower())
    .GroupBy(i => i)
    .OrderByDescending(i => i.Count())
    .Select(i => new { Word = i.Key, Count = i.Count() })
    .Take(10);

 

But if you set a breakpoint anywhere on the query, you see this:

[Image: LINQ Debug]

Let’s say that you want to examine each group and see the group key and the count of items for each group.  You can insert a Select statement, as follows:

var uniqueWords = text
    .Split(' ', '.', ',')
    .Where(i => i != "")
    .Select(i => i.ToLower())
    .GroupBy(i => i)
    .Select(z => {
        return z;
    })
    .OrderByDescending(i => i.Count())
    .Select(i => new { Word = i.Key, Count = i.Count() })
    .Take(10);

 

You can now set a breakpoint on the ‘return z’ statement, and examine each
group in turn as they are yielded up by the GroupBy extension method:

[Image: Debug LINQ Statements]

You can see that the key is “on”, and that there are four items in the group.

After you are done debugging, you can remove the added call to Select.

Alternatively, you could convert any of the other lambda expressions to a statement
lambda expression, format the code so that a statement is on its own line,
and then set a breakpoint.
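
As a sketch of that alternative, the version below rewrites the Where lambda as a statement lambda. WordCounter and TopWords are hypothetical names, and the result is projected into KeyValuePair values rather than an anonymous type so that it can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    public static List<KeyValuePair<string, int>> TopWords(string text, int count)
    {
        return text
            .Split(' ', '.', ',')
            .Where(i =>
            {
                bool keep = i != "";
                return keep; // set a breakpoint here to watch each word
            })
            .Select(i => i.ToLower())
            .GroupBy(i => i)
            .OrderByDescending(g => g.Count())
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .Take(count)
            .ToList();
    }
}
```

The behavior of the query is unchanged; only the Where lambda now has a statement of its own that the debugger can stop on.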


Using LINQ to Query Excel Tables

Excel has a very cool feature where you can declare that a range of cells is a table.  It is a feature that allows you to use Excel very much like a database.  You can add new rows as necessary, sort the table by columns, do some simple filtering, calculate the sum of columns, and more.  Each table has a unique table name, and each column has a column name.  Because
these tables are stored in Open XML documents, we can implement some
simple extension methods and some classes so that we can query these
tables using LINQ in a manner that is similar to querying a SQL
database.  This post presents a bit of code to do this.  The code and sample documents are attached to this post (LtxOpenXml.zip, 169.21 kb).

Note: this code is presented as an example – a proof-of-concept.  This code could be further optimized, so that it performs better (although it performs quite well as is).  And
it may be interesting in the future to modify the code to use a
strongly-typed approach – as the code is currently implemented, if you
misspell a table or column name, the code throws an exception.  However, this code is useful as is for doing ad-hoc queries of Excel tables.  (I certainly will be using it!)

This code uses the Open XML SDK, either V1, or the CTP of V2.  You can download V1 of the SDK here.  You can download CTP1 of V2 of the SDK here.

Following is a screen clipping of an Excel spreadsheet that contains a table:

[Image: Excel spreadsheet containing the Inventory table]

You can see the four columns of this table: Item, Qty, Price, and Extension.  In addition, in the Design tab of the ribbon, in the far left box, you can see that this table has a table name of “Inventory”.  Using the code presented in this post, you can query this table as follows:

var query =
    from i in spreadsheet.Table("Inventory").TableRows()
    where (int)i["Qty"] > 2
    select i;

foreach (var r in query)
{
    Console.WriteLine(r["Item"]);
    Console.WriteLine(r["Qty"]);
    Console.WriteLine(r["Price"]);
    Console.WriteLine(r["Extension"]);
    Console.WriteLine();
}

 

When you run this code, it produces:

Book
44
2
88

Phone
4
10
40

 

As you can see from the above code, to access a particular column from a
table row, you can use a default indexed property, passing the name of
the column:

Console.WriteLine(r["Item"]);
Console.WriteLine(r["Qty"]);
Console.WriteLine(r["Price"]);
Console.WriteLine(r["Extension"]);

 

This allows us to write code that is easy to read.

The table class (returned by the Table method) has a TableColumns method that iterates the columns in the table:

// list all of the columns in the Inventory table
Console.WriteLine("Table: Inventory");
foreach (var c in spreadsheet.Table("Inventory").TableColumns())
    Console.WriteLine("  {0}", c.Name);

 

When you run this code, you see:

Table: Inventory
  Item
  Qty
  Price
  Extension

 

The LtxOpenXml Namespace

Some time ago, I wrote some code that enabled querying Open XML spreadsheets using LINQ to XML, presented in the blog post ‘Open XML SDK and LINQ to XML’.  I’ve added the code to query tables to the code presented in that post.  The extension methods that enable querying tables make use of that code.  The enhanced LtxOpenXml namespace now contains code for:

  • Querying word processing documents
  • Querying spreadsheets
  • Querying tables contained in spreadsheets

The code for querying word processing documents and spreadsheets is unmodified.  Refer to the above mentioned blog post for details on using those extension methods.

The code that enables querying of spreadsheet tables is, of course, written in the pure functional style.  No state is maintained, and all methods to query the document are lazy.

If you have questions about how to write functional code (like the code
that implements the extension methods and classes associated with this
post), go through this Functional Programming Tutorial.

I’ve provided a summary of the types and extension methods included in the LtxOpenXml namespace at the end of this post.

Use of Data Types

Here’s another example of a table that contains a few more columns with more data types:

[Image: Excel spreadsheet containing the Vehicles table]

Each row returned by the TableRows method is a collection of TableCell objects.  I’ve
defined explicit conversions between TableCell and some of the most
common .NET types, so that you can simply cast a TableCell to your
desired type.  Here’s a query to list all vehicles in the table:

// list all vehicles
var q = from c in spreadsheet.Table("Vehicles").TableRows()
        select new VehicleRecord()
        {
            Vehicle = (string)c["Vehicle"],
            Color = (string)c["Color"],
            Year = (int)c["Year"],
            HorsePower = (int)c["HorsePower"],
            Cost = (decimal)c["Cost"],
            AcquisitionDate = (DateTime)c["AcquisitionDate"],
            ExecutiveUseOnly = (bool)c["ExecutiveUseOnly"]
        };

Console.WriteLine("List of all vehicles");
PrintVehicles(q);
Console.WriteLine();
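
The explicit conversions that make these casts possible could look roughly like the sketch below. SketchCell is a hypothetical stand-in, not the actual TableCell class from LtxOpenXml, and it assumes that a cell exposes its raw stored value as a string, with booleans stored as "1" or "0" and dates stored as serial numbers.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for TableCell; the real LtxOpenXml type may differ.
public class SketchCell
{
    private readonly string _value;

    public SketchCell(string value)
    {
        _value = value;
    }

    public static explicit operator string(SketchCell cell)
    {
        return cell._value;
    }

    public static explicit operator int(SketchCell cell)
    {
        return int.Parse(cell._value, CultureInfo.InvariantCulture);
    }

    public static explicit operator decimal(SketchCell cell)
    {
        return decimal.Parse(cell._value, CultureInfo.InvariantCulture);
    }

    // Assumption: booleans are stored as "1" (true) or "0" (false).
    public static explicit operator bool(SketchCell cell)
    {
        return cell._value == "1";
    }

    // Assumption: dates are stored as serial day numbers that
    // DateTime.FromOADate can interpret.
    public static explicit operator DateTime(SketchCell cell)
    {
        return DateTime.FromOADate(
            double.Parse(cell._value, CultureInfo.InvariantCulture));
    }
}
```

With operators like these, a cast such as (int)c["Year"] is an ordinary C# explicit conversion, and a malformed cell value surfaces as a FormatException at the point of the cast.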

 

I’ve written a PrintVehicles method:

public static void PrintVehicles(IEnumerable<VehicleRecord> list)
{
    int[] tabs = new[] { 12, 10, 6, 6, 10, 14, 10 };
    foreach (var z in list)
        Console.WriteLine("{0}{1}{2}{3}{4}{5}{6}",
            z.Vehicle.PadRight(tabs[0]),
            z.Color.PadRight(tabs[1]),
            z.Year.ToString().PadRight(tabs[2]),
            z.HorsePower.ToString().PadRight(tabs[3]),
            z.Cost.ToString().PadRight(tabs[4]),
            ((DateTime)z.AcquisitionDate).ToShortDateString()
                .PadRight(tabs[5]),
            ((bool)z.ExecutiveUseOnly).ToString()
                .PadRight(tabs[6]));
}

 

When you run the above query, you see:

List of all vehicles
Pickup      White     2002  165   23000     2/22/2002     False
Pickup      Red       2004  185   32000     10/21/2004    False
Sports Car  Red       2003  165   23000     1/1/2004      True
Sedan       Blue      2005  200   21000     2/25/2005     False
Limo        Black     2008  440   72000     4/1/2008      True

 

You can query for all executive vehicles, like this:

// list all executive vehicles
q = from c in spreadsheet.Table("Vehicles").TableRows()
    where (bool)c["ExecutiveUseOnly"] == true
    select new VehicleRecord()
    {
        Vehicle = (string)c["Vehicle"],
        Color = (string)c["Color"],
        Year = (int)c["Year"],
        HorsePower = (int)c["HorsePower"],
        Cost = (decimal)c["Cost"],
        AcquisitionDate = (DateTime)c["AcquisitionDate"],
        ExecutiveUseOnly = (bool)c["ExecutiveUseOnly"]
    };

 

You can write queries that select on data types such as DateTime:

// list all vehicles acquired in 2004 or later
q = from c in spreadsheet.Table("Vehicles").TableRows()
    where (DateTime)c["AcquisitionDate"] >= new DateTime(2004, 1, 1)
    select new VehicleRecord()
    {
        Vehicle = (string)c["Vehicle"],
        Color = (string)c["Color"],
        Year = (int)c["Year"],
        HorsePower = (int)c["HorsePower"],
        Cost = (decimal)c["Cost"],
        AcquisitionDate = (DateTime)c["AcquisitionDate"],
        ExecutiveUseOnly = (bool)c["ExecutiveUseOnly"]
    };

 

And of course, you can use all of the grouping, ordering, and filtering capabilities of LINQ queries:

// vehicles grouped by user
var groups = from v in spreadsheet.Table("Vehicles").TableRows()
             group v by v["ExecutiveUseOnly"];

foreach (var g in groups)
{
    Console.WriteLine("Executive Use: {0}", (bool)g.Key);
    foreach (var v in g)
        Console.WriteLine("  Vehicle:{0}  Year:{1}",
            v["Vehicle"], v["Year"]);
    Console.WriteLine();
}
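
The grouping query above has ordering and filtering counterparts. Since the spreadsheet itself is not available here, the following sketch substitutes an in-memory list for TableRows(). The VehicleRow class and the OrderingSketch helper are hypothetical names introduced only for illustration; the sample values are copied from the vehicle listing shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Vehicles table (not part of LtxOpenXml).
public class VehicleRow
{
    public string Vehicle { get; set; }
    public int Year { get; set; }
    public int HorsePower { get; set; }
}

public static class OrderingSketch
{
    // Sample data taken from the output listing earlier in the post.
    public static List<VehicleRow> SampleRows()
    {
        return new List<VehicleRow>
        {
            new VehicleRow { Vehicle = "Pickup", Year = 2002, HorsePower = 165 },
            new VehicleRow { Vehicle = "Pickup", Year = 2004, HorsePower = 185 },
            new VehicleRow { Vehicle = "Sports Car", Year = 2003, HorsePower = 165 },
            new VehicleRow { Vehicle = "Sedan", Year = 2005, HorsePower = 200 },
            new VehicleRow { Vehicle = "Limo", Year = 2008, HorsePower = 440 }
        };
    }

    // Keep vehicles with more than 165 horsepower, most powerful first.
    public static List<string> PowerfulVehicles(IEnumerable<VehicleRow> rows)
    {
        var q = from r in rows
                where r.HorsePower > 165
                orderby r.HorsePower descending
                select r.Vehicle + " (" + r.Year + ")";
        return q.ToList();
    }

    public static void Main()
    {
        foreach (var line in PowerfulVehicles(SampleRows()))
            Console.WriteLine(line);
    }
}
```

Against the real table, the same where and orderby clauses would presumably operate on cast cells such as (int)c["HorsePower"].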

 

I’ve imported the Customers and Orders tables from the Northwind database into a
spreadsheet, where the Customers table is in one sheet and the Orders
table is in another sheet within the same workbook.  Here is the Customers table:

[Image: the Customers table in Excel]

And here is the Orders table:

[Image: the Orders table in Excel]

We can now write a query that joins the customers and orders tables:

using (SpreadsheetDocument spreadsheet =

    SpreadsheetDocument.Open(filename, false))

{

    // list all of the columns in the Customer table

    Console.WriteLine("Table: Customer");

    foreach (var c in spreadsheet.Table("Customer").TableColumns())

        Console.WriteLine("  {0}", c.Name);

    Console.WriteLine();

 

    // list all of the columns in the Order table

    Console.WriteLine("Table: Order");

    foreach (var o in spreadsheet.Table("Order").TableColumns())

        Console.WriteLine("  {0}", o.Name);

    Console.WriteLine();

 

    // query for all customers with city == London,

    // then select all orders for that customer

    var q = from c in spreadsheet.Table("Customer").TableRows()

            where (string)c["City"] == "London"

            select new

            {

                CustomerID = c["CustomerID"],

                CompanyName = c["CompanyName"],

                ContactName = c["ContactName"],

                Orders = from o in spreadsheet.Table("Order").TableRows()

                         where (string)o["CustomerID"] ==

                               (string)c["CustomerID"]

                         select new

                             {

                                 CustomerID = o["CustomerID"],

                                 OrderID = o["OrderID"]

                             }

            };

 

    // print the results of the query

    int[] tabs = new[] { 20, 25, 30 };

    Console.WriteLine("{0}{1}{2}",

        "CustomerID".PadRight(tabs[0]),

        "CompanyName".PadRight(tabs[1]),

        "ContactName".PadRight(tabs[2]));

    Console.WriteLine("{0} {1} {2} ", new string('-', tabs[0] - 1),

        new string('-', tabs[1] - 1), new string('-', tabs[2] - 1));

    foreach (var v in q)

    {

        Console.WriteLine("{0}{1}{2}",

            v.CustomerID.Value.PadRight(tabs[0]),

            v.CompanyName.Value.PadRight(tabs[1]),

            v.ContactName.Value.PadRight(tabs[2]));

        foreach (var v2 in v.Orders)

            Console.WriteLine("  CustomerID:{0}  OrderID:{1}",

                v2.CustomerID, v2.OrderID);

        Console.WriteLine();

    }

}

 

This code produces the following output:

Table: Customer

  CustomerID

  CompanyName

  ContactName

  ContactTitle

  Address

  City

  Region

  PostalCode

  Country

  Phone

  Fax

 

Table: Order

  OrderID

  CustomerID

  EmployeeID

  OrderDate

  RequiredDate

  ShipVia

  Freight

  ShipName

  ShipAddress

  ShipCity

  ShipRegion

  ShipPostalCode

  ShipCountry

 

CustomerID          CompanyName              ContactName

------------------- ------------------------ -----------------------------

AROUT               Around the Horn          Thomas Hardy

  CustomerID:AROUT  OrderID:10355

  CustomerID:AROUT  OrderID:10383

  CustomerID:AROUT  OrderID:10453

  CustomerID:AROUT  OrderID:10558

  CustomerID:AROUT  OrderID:10707

  CustomerID:AROUT  OrderID:10741

  CustomerID:AROUT  OrderID:10743

  CustomerID:AROUT  OrderID:10768

  CustomerID:AROUT  OrderID:10793

  CustomerID:AROUT  OrderID:10864

  CustomerID:AROUT  OrderID:10920

  CustomerID:AROUT  OrderID:10953

  CustomerID:AROUT  OrderID:11016

 

BSBEV               B's Beverages            Victoria Ashworth

  CustomerID:BSBEV  OrderID:10289

  CustomerID:BSBEV  OrderID:10471

  CustomerID:BSBEV  OrderID:10484

  CustomerID:BSBEV  OrderID:10538

  CustomerID:BSBEV  OrderID:10539

  CustomerID:BSBEV  OrderID:10578

  CustomerID:BSBEV  OrderID:10599

  CustomerID:BSBEV  OrderID:10943

  CustomerID:BSBEV  OrderID:10947

  CustomerID:BSBEV  OrderID:11023

 

CONSH               Consolidated Holdings    Elizabeth Brown

  CustomerID:CONSH  OrderID:10435

  CustomerID:CONSH  OrderID:10462

  CustomerID:CONSH  OrderID:10848

 

EASTC               Eastern Connection       Ann Devon

  CustomerID:EASTC  OrderID:10364

  CustomerID:EASTC  OrderID:10400

  CustomerID:EASTC  OrderID:10532

  CustomerID:EASTC  OrderID:10726

  CustomerID:EASTC  OrderID:10987

  CustomerID:EASTC  OrderID:11024

  CustomerID:EASTC  OrderID:11047

  CustomerID:EASTC  OrderID:11056

 

NORTS               North/South              Simon Crowther

  CustomerID:NORTS  OrderID:10517

  CustomerID:NORTS  OrderID:10752

  CustomerID:NORTS  OrderID:11057

 

SEVES               Seven Seas Imports       Hari Kumar

  CustomerID:SEVES  OrderID:10359

  CustomerID:SEVES  OrderID:10377

  CustomerID:SEVES  OrderID:10388

  CustomerID:SEVES  OrderID:10472

  CustomerID:SEVES  OrderID:10523

  CustomerID:SEVES  OrderID:10547

  CustomerID:SEVES  OrderID:10800

  CustomerID:SEVES  OrderID:10804

  CustomerID:SEVES  OrderID:10869

 

Summary of the LtxOpenXml Namespace

This section summarizes the LtxOpenXml extension methods and types that make it easy to work with Open XML SpreadsheetML tables.

For
details on the extension methods and types for word processing
documents and spreadsheets (other than Tables within spreadsheets), see
the post,
Open XML SDK and LINQ to XML.

Tables Extension Method

This method returns a collection of all tables in the spreadsheet.  Its signature:

public static IEnumerable<Table> Tables(this SpreadsheetDocument spreadsheet)

 

Table Extension Method

This method returns the Table object with the specified table name.  Its signature:

public static Table Table(this SpreadsheetDocument spreadsheet,

    string tableName)

 

Table Class

This class represents an Excel Table.  Its definition:

public class Table

{

    public int Id { get; set; }

    public string TableName { get; set; }

    public string DisplayName { get; set; }

    public string Ref { get; set; }

    public int? HeaderRowCount { get; set; }

    public int? TotalsRowCount { get; set; }

    public string TableType { get; set; }

    public TableDefinitionPart TableDefinitionPart { get; set; }

    public WorksheetPart Parent { get; set; }

    public Table(WorksheetPart parent) { Parent = parent; }

    public IEnumerable<TableColumn> TableColumns()

    {

       

    }

    public IEnumerable<TableRow> TableRows()

    {

       

    }

}

 

This class contains a number of properties about the table.  In
addition, it contains two methods, TableColumns, which returns a
collection of TableColumn objects (the columns of the table), and
TableRows, which returns a collection of TableRow objects (the rows of
the table).

TableColumn Class

This class represents a column of a table.  Its definition:

public class TableColumn

{

    public int Id { get; set; }

    public string Name { get; set; }

    public int? FormatId { get; set; }  // dataDxfId

    public int? QueryTableFieldId { get; set; }

    public string UniqueName { get; set; }

    public Table Parent { get; set; }

    public TableColumn(Table parent) { Parent = parent; }

}

 

The most important property of this class is the Name property.

TableRow Class

This class represents a row of a table.  Its definition:

public class TableRow

{

    public Row Row { get; set; }

    public Table Parent { get; set; }

    public TableRow(Table parent) { Parent = parent; }

    public TableCell this[string columnName]

    {

        get

        {

           

        }

    }

}

 

The
most important feature of this class is the default indexed property
that takes a column name and returns a TableCell object.  This is what allows us to write code like this:

Console.WriteLine(r["Item"]);

Console.WriteLine(r["Qty"]);

Console.WriteLine(r["Price"]);

Console.WriteLine(r["Extension"]);
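
The body of the indexer is elided in the class listing above. As a rough illustration of how a default indexed property keyed by column name can work, here is a minimal sketch. It is not the LtxOpenXml implementation: SketchRow is a hypothetical class that keeps its cells in a dictionary instead of reading them from the worksheet XML, and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified row type. The real TableRow reads its cells from
// the worksheet; this sketch stores them in a dictionary.
public class SketchRow
{
    private readonly Dictionary<string, string> cells;

    public SketchRow(Dictionary<string, string> cells)
    {
        this.cells = cells;
    }

    // Default indexed property: row["ColumnName"] returns that column's value.
    public string this[string columnName]
    {
        get
        {
            string value;
            if (!cells.TryGetValue(columnName, out value))
                throw new ArgumentException("Unknown column: " + columnName, "columnName");
            return value;
        }
    }
}

public static class IndexerSketch
{
    // Made-up sample row using the column names from the snippet above.
    public static SketchRow SampleRow()
    {
        return new SketchRow(new Dictionary<string, string>
        {
            { "Item", "Book" },
            { "Qty", "3" },
            { "Price", "12.50" }
        });
    }

    public static void Main()
    {
        SketchRow r = SampleRow();
        Console.WriteLine(r["Item"]);
        Console.WriteLine(r["Qty"]);
        Console.WriteLine(r["Price"]);
    }
}
```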

 

TableCell Class

This class represents a cell of a row of a table.  It implements IEquatable<T> so that you can do a value compare of two cells.  It also implements a number of explicit conversions to other data types so that it’s easy to deal with columns of various types.  Its definition:

public class TableCell : IEquatable<TableCell>

{

    public string Value { get; set; }

    public TableCell(string v)

    {

        Value = v;

    }

    public override string ToString()

    {

        return Value;

    }

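    public override bool Equals(object obj)

    {

        // a null or non-TableCell argument is simply not equal

        TableCell other = obj as TableCell;

        return (object)other != null && this.Value == other.Value;

    }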

    bool IEquatable<TableCell>.Equals(TableCell other)

    {

        return this.Value == other.Value;

    }

    public override int GetHashCode()

    {

        return this.Value.GetHashCode();

    }

    public static bool operator ==(TableCell left, TableCell right)

    {

        // same reference, or both null

        if (ReferenceEquals(left, right)) return true;

        // exactly one side is null

        if ((object)left == null || (object)right == null) return false;

        return left.Value == right.Value;

    }

    public static bool operator !=(TableCell left, TableCell right)

    {

        return !(left == right);

    }

    public static explicit operator string(TableCell cell)

    {

        if (cell == null) return null;

        return cell.Value;

    }

    public static explicit operator bool(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return cell.Value == "1";

    }

    public static explicit operator bool?(TableCell cell)

    {

        if (cell == null) return null;

        return cell.Value == "1";

    }

    public static explicit operator int(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return Int32.Parse(cell.Value);

    }

    public static explicit operator int?(TableCell cell)

    {

        if (cell == null) return null;

        return Int32.Parse(cell.Value);

    }

    public static explicit operator uint(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return UInt32.Parse(cell.Value);

    }

    public static explicit operator uint?(TableCell cell)

    {

        if (cell == null) return null;

        return UInt32.Parse(cell.Value);

    }

    public static explicit operator long(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return Int64.Parse(cell.Value);

    }

    public static explicit operator long?(TableCell cell)

    {

        if (cell == null) return null;

        return Int64.Parse(cell.Value);

    }

    public static explicit operator ulong(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return UInt64.Parse(cell.Value);

    }

    public static explicit operator ulong?(TableCell cell)

    {

        if (cell == null) return null;

        return UInt64.Parse(cell.Value);

    }

    public static explicit operator float(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return Single.Parse(cell.Value);

    }

    public static explicit operator float?(TableCell cell)

    {

        if (cell == null) return null;

        return Single.Parse(cell.Value);

    }

    public static explicit operator double(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return Double.Parse(cell.Value);

    }

    public static explicit operator double?(TableCell cell)

    {

        if (cell == null) return null;

        return Double.Parse(cell.Value);

    }

    public static explicit operator decimal(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return Decimal.Parse(cell.Value);

    }

    public static explicit operator decimal?(TableCell cell)

    {

        if (cell == null) return null;

        return Decimal.Parse(cell.Value);

    }

    public static implicit operator DateTime(TableCell cell)

    {

        if (cell == null) throw new ArgumentNullException("TableCell");

        return new DateTime(1900, 1, 1).AddDays(Int32.Parse(cell.Value) - 2);

    }

    public static implicit operator DateTime?(TableCell cell)

    {

        if (cell == null) return null;

        return new DateTime(1900, 1, 1).AddDays(Int32.Parse(cell.Value) - 2);

    }

}
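
To make the conversion operators more concrete, here is a small self-contained sketch of the same idea. SketchCell is a hypothetical, cut-down cell type (not the TableCell class above) with just two explicit conversions. The "- 2" in the date conversion reflects two quirks of Excel's 1900 date system: serial number 1 is 1 January 1900 rather than serial 0, and Excel counts a 29 February 1900 that never existed, so the formula only holds for dates after February 1900.

```csharp
using System;

// Hypothetical, cut-down cell type used only to illustrate explicit conversions.
public sealed class SketchCell
{
    public string Value { get; private set; }

    public SketchCell(string value)
    {
        Value = value;
    }

    public static explicit operator int(SketchCell cell)
    {
        if (ReferenceEquals(cell, null)) throw new ArgumentNullException("cell");
        return Int32.Parse(cell.Value);
    }

    // Excel stores dates as day serial numbers; convert one to a DateTime.
    public static explicit operator DateTime(SketchCell cell)
    {
        if (ReferenceEquals(cell, null)) throw new ArgumentNullException("cell");
        return new DateTime(1900, 1, 1).AddDays(Int32.Parse(cell.Value) - 2);
    }
}

public static class ConversionSketch
{
    public static void Main()
    {
        SketchCell horsePower = new SketchCell("440");
        SketchCell acquired = new SketchCell("39539"); // serial for 1 April 2008

        Console.WriteLine((int)horsePower + 10);
        Console.WriteLine(((DateTime)acquired).ToString("yyyy-MM-dd"));
    }
}
```

Keeping the conversions explicit means a cast always appears at the call site, which makes it obvious where parsing (and a possible FormatException) can occur.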


Merry Christmas

I'll be on and off line for the next few days, so there may not be a new Linq Exchange post until Jan 2.

Have a Merry Holiday (please substitute the appropriate holiday you celebrate, here), and Happy New Year.


LINQ to XML and Line Numbers

There are times when it is useful to know the line number of a node
in an XML file. This information can be helpful to users,
particularly if you want to report an error. It can also be convenient
to search for a node by line number, but that can, of course, be a very
risky endeavor, as documents can be modified accidentally, and their
line numbers changed without notice.

This post shows a few
fundamentals about working with line numbers in a LINQ to XML program.
The code shown in this post is taken from a project called XmlLineNumber. You can download this program from the LINQ Farm on Code Gallery.

Reporting a Line Number

Let’s
begin our exploration by detailing a technique for reporting the number
of a node that you have found in an XML file. To get started we need to
use code from a class called XObject. As shown in Figure 1, XObject sits at the top of the LINQ to XML class hierarchy.


Figure 1: The core objects in the LINQ to XML class hierarchy

XObject implements an interface called IXmlLineInfo:

public interface IXmlLineInfo
{
    int LineNumber { get; }
    int LinePosition { get; }
    bool HasLineInfo();
}

The eponymous LineNumber property of this interface is able to store the information we want. To enlist it in our service we need only call XDocument.Load with LoadOptions.SetLineInfo:

XDocument xml = XDocument.Load(fileName, LoadOptions.SetLineInfo);

If you load this XML file into memory using SetLineInfo from the LoadOptions enumeration, then line numbers will be associated with the nodes in your document. The file we are loading is called FirstFourPlanets.xml. It’s a sweet little file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Planets>
  <Planet>
    <Name>Mercury</Name>
    <Moons/>
  </Planet>
  <Planet>
    <Name>Venus</Name>
    <Moons/>
  </Planet>
  <Planet>
    <Name>Earth</Name>
    <Moons>
      <Moon>
        <Name>Moon</Name>
        <OrbitalPeriod UnitsOfMeasure="days">27.321582</OrbitalPeriod>
      </Moon>
    </Moons>
  </Planet>
  <Planet>
    <Name>Mars</Name>
    <Moons>
      <Moon>
        <Name>Phobos</Name>
        <OrbitalPeriod UnitsOfMeasure="days">0.318</OrbitalPeriod>
      </Moon>
      <Moon>
        <Name>Deimos</Name>
        <OrbitalPeriod UnitsOfMeasure="days">1.26244</OrbitalPeriod>
      </Moon>
    </Moons>
  </Planet>
</Planets>

Here is code that uses the IXmlLineInfo interface to report the line number of a node discovered through a standard LINQ to XML search:

XText phobos = (from x in xml.DescendantNodes().OfType<XText>()
where x.Value == "Phobos"
select x).Single();
var lineInfo = (IXmlLineInfo)phobos;
Console.WriteLine("{0} appears on line {1}", phobos, lineInfo.LineNumber);

This code looks through all the descendants of the root node for nodes of type XText which are equal to the word Phobos. It uses the LINQ query operator Single to ensure that the query returns only a single node. If the query returned more than one result, the call to Single would raise an exception, which in this case is the behavior we want. The program then casts the result as an instance of IXmlLineInfo, and reports the line number to the user:

Phobos appears on line 24
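
For readers who want to try this without the sample project, here is a minimal self-contained sketch. It parses a short inline document instead of FirstFourPlanets.xml, and the LineInfoSketch class and LineOf helper are hypothetical names, not part of the XmlLineNumber project. It also calls HasLineInfo(), which reports whether line information was recorded at load time.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class LineInfoSketch
{
    // Inline sample so the sketch does not depend on a file on disk.
    const string Xml =
        "<Planets>\n" +
        "  <Planet>\n" +
        "    <Name>Mercury</Name>\n" +
        "  </Planet>\n" +
        "  <Planet>\n" +
        "    <Name>Venus</Name>\n" +
        "  </Planet>\n" +
        "</Planets>";

    // Returns the line of the <Name> element with the given text,
    // or -1 when the document was loaded without line information.
    public static int LineOf(string planetName, LoadOptions options)
    {
        XDocument doc = XDocument.Parse(Xml, options);
        XElement name = doc.Descendants("Name").Single(e => e.Value == planetName);
        IXmlLineInfo info = name;
        return info.HasLineInfo() ? info.LineNumber : -1;
    }

    public static void Main()
    {
        Console.WriteLine(LineOf("Venus", LoadOptions.SetLineInfo));
        Console.WriteLine(LineOf("Venus", LoadOptions.None));
    }
}
```

Checking HasLineInfo() first is a cheap guard against silently reporting a meaningless line number when a document was loaded without SetLineInfo.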

Searching by Line Number

Let's now turn things around and show how to search through an XML
file and look for a node by line number. If you glance at the FirstFourPlanets.xml file, you will see that line 21 looks like this:

<Name>Mars</Name>

Here is code from the XmlLineNumbers sample showing how to search for that node by line number:

XDocument xml = XDocument.Load(fileName, LoadOptions.SetLineInfo);
var line = from x in xml.Descendants()
let lineInfo = (IXmlLineInfo)x
where lineInfo.LineNumber == 21
select x;
foreach (var item in line)
{
Console.WriteLine(item);
}

Note that the first line uses LoadOptions.SetLineInfo to ensure that line information is recorded when the document is loaded into memory.

The LINQ query shown here uses Descendants to iterate over the elements in the FirstFourPlanets.xml file. The where filter in the query checks whether any of those elements has its line number set to 21. It happens that the 15th element returned by the call to Descendants fits that search criterion, and so that node, and that node alone, is found when we foreach over the results.

Notice the cast to convert the XElement nodes returned by the call to Descendants:

let lineInfo = (IXmlLineInfo)x

This cast is necessary because XObject implements IXmlLineInfo explicitly, so the interface's members are not accessible directly on an XElement.

Once again, I want to stress that reporting the line number of a
node seems like a reasonable thing to do, but searching for an element
by line number is usually not a good idea in production code. For
unexplained reasons, code that was on line 532 has a way of migrating
to line 533 when you least expect it. In any case, you now know enough
to begin working with line numbers in a LINQ to XML program.

Download the source.

Reference: http://blogs.msdn.com/charlie/archive/2008/09/26/linq-farm-linq-to-xml-and-line-numbers.aspx


Yield Statement in C#

Often we need to collect a number of results that depend on certain conditions. The usual approach is to create a collection object, insert each result into it, and then return the collection, which can be cumbersome.

A simpler way is to use the yield statement (C# 2.0 onwards). It returns items one at a time from within a method and retains the state of the method across multiple calls.

A method that uses yield must be declared to return IEnumerator or IEnumerable, or their generic counterparts IEnumerator<T> and IEnumerable<T>.

public static IEnumerator<int> GetCounter()
{
      for (int count = 0; count < 10; count++)
      {
          yield return count;
      }
}

This returns an IEnumerator<int>, which can be used to step through the values the method yields:

IEnumerator<int> list =  GetCounter();
Console.Write(list.MoveNext() + " "+ list.Current);
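
An iterator method can also be declared to return IEnumerable<int>, which lets callers use foreach instead of calling MoveNext() by hand. The following is a minimal sketch of that variation; the CounterSketch class and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CounterSketch
{
    // Yields start, start + 1, ..., end - 1, one value per iteration.
    public static IEnumerable<int> Count(int start, int end)
    {
        for (int i = start; i < end; i++)
            yield return i;
    }

    // Consumes any sequence with foreach; no MoveNext() calls needed.
    public static int Sum(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int v in values)
            total += v;
        return total;
    }

    public static void Main()
    {
        foreach (int n in Count(0, 10))
            Console.Write(n + " ");
        Console.WriteLine();
        Console.WriteLine(Sum(Count(0, 10)));
    }
}
```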

There is also the yield break statement. When a yield break statement is hit within a method, the iteration ends and no further values are returned. Using it, a counter like the first method (here running from 5 to 9) could be written like this:

public static IEnumerator<int> GetCounter()
{
   int max = 10, min = 5;
   while (true)
   {
      if (min >= max)
      {
         yield break;
      }
      yield return min++;
   }
}

Before I go any further, it's worth remembering that an iterator block doesn't just run from start to finish.  When the method is originally called, the iterator is just created. It's only when MoveNext() is called that any of the code in the method body runs. At that point, execution starts at the top of the method as normal, and progresses as far as the first yield return or yield break statement, or the end of the method. A Boolean value is then returned to indicate whether or not the block has finished iterating. If/when MoveNext() is called again, the method continues executing from just after the yield return statement.

using System;
using System.Collections.Generic;

class Test
{
    // indents the iterator's own messages so they stand out in the output
    static readonly string Padding = new string(' ', 30);

    static IEnumerator<int> GetNumbers()
    {
        Console.WriteLine(Padding + "First line of GetNumbers()");
        Console.WriteLine(Padding + "Just before yield return 0");
        yield return 10;
        Console.WriteLine(Padding + "Just after yield return 0");

        Console.WriteLine(Padding + "Just before yield return 1");
        yield return 20;
        Console.WriteLine(Padding + "Just after yield return 1");
    }

    static void Main()
    {
        Console.WriteLine("Calling GetNumbers()");
        IEnumerator<int> iterator = GetNumbers();
        Console.WriteLine("Calling MoveNext()...");
        bool more = iterator.MoveNext();
        Console.WriteLine("Result={0}; Current={1}", more, iterator.Current);

        Console.WriteLine("Calling MoveNext() again...");
        more = iterator.MoveNext();
        Console.WriteLine("Result={0}; Current={1}", more, iterator.Current);

        Console.WriteLine("Calling MoveNext() again...");
        more = iterator.MoveNext();
        Console.WriteLine("Result={0} (stopping)", more);
    }
}

Result:
Calling GetNumbers()
Calling MoveNext()...
                              First line of GetNumbers()
                              Just before yield return 0
Result=True; Current=10
Calling MoveNext() again...
                              Just after yield return 0
                              Just before yield return 1
Result=True; Current=20
Calling MoveNext() again...
                              Just after yield return 1
Result=False (stopping)

This is very useful in LINQ query expressions in C# 3.0 as this provides the iterations required in LINQ queries.

Suppose there is CustomerCollection Class having customers list (IEnumerable<Customer> _customers)

public class CustomerCollection : IEnumerable<Customer>

This class is being used in LINQ query to iterate through the customer object contained in its list (_customers) and thus is supposed to implement GetEnumerator method as:

public IEnumerator<Customer> GetEnumerator()
{

    foreach (Customer customer in _customers)
        yield return customer;   
}
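
Putting the pieces together, a complete version of such a class might look like the sketch below. The Customer type, the Add method, and the sample data are assumptions added for illustration. Note that IEnumerable<T> also requires the non-generic GetEnumerator, which is implemented explicitly here and simply forwards to the generic one.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical customer type for this sketch.
public class Customer
{
    public string Name { get; set; }
    public string City { get; set; }
}

public class CustomerCollection : IEnumerable<Customer>
{
    private readonly List<Customer> _customers = new List<Customer>();

    public void Add(Customer customer)
    {
        _customers.Add(customer);
    }

    // Generic enumerator, written as an iterator block.
    public IEnumerator<Customer> GetEnumerator()
    {
        foreach (Customer customer in _customers)
            yield return customer;
    }

    // The non-generic interface member forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class CustomerSketch
{
    public static CustomerCollection Sample()
    {
        CustomerCollection customers = new CustomerCollection();
        customers.Add(new Customer { Name = "Around the Horn", City = "London" });
        customers.Add(new Customer { Name = "Alfreds Futterkiste", City = "Berlin" });
        customers.Add(new Customer { Name = "B's Beverages", City = "London" });
        return customers;
    }

    // Because the collection is IEnumerable<Customer>, LINQ works on it directly.
    public static List<string> NamesIn(CustomerCollection customers, string city)
    {
        return (from c in customers
                where c.City == city
                select c.Name).ToList();
    }

    public static void Main()
    {
        foreach (string name in NamesIn(Sample(), "London"))
            Console.WriteLine(name);
    }
}
```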


Iterators, Lambda, and LINQ

Since the creation of the .Net Framework, Microsoft has kept the
concept of “Type Safe” at the forefront of their design goals.  When
1.1 shipped, the framework had a “generic” collection type called an
ArrayList that seemed to break this goal.  Microsoft quickly went above
and beyond with the 2.0 framework by adding Generics and Anonymous
Methods to the mix.  Anonymous methods coupled with Generics paved the
way to Lambda expressions, Extension methods,  and LINQ (Language
INtegrated Query).  The topic of this paper is loosely defined as:  The
path to understanding Lambda and LINQ.

What came first?

Iterating through a collection with a for-each construct has been
around for a long time.  .Net has the capability and in the beginning
it was the recommended way of looping through a collection of widgets
to find something or count something.  The basic construct is:

        private int CountForEach()
        {
            int count = 0;
            foreach (DLDog Dog in Dogs)
            {
                if (Dog.FoodBrand == "Purina" )
                {
                    count++;
                }
            }
            return count;
        }

Nothing too earth shattering about this.  Assuming we have a collection
of 10 dogs and 3 of them use “Purina”, we will return a count of 3.

How would Generics and Anonymous methods change this syntax?  In the
.Net 2.0 framework we can simplify the previous code a little as
follows:

        private int CountGenericDelegate()
        {
            MyActionCount = 0;
            Dogs.ForEach(MyAction);
            return MyActionCount;
        }

This is a little misleading because we still need to write the delegate
method “MyAction” but it does give us some flexibility in that we can
pass any method to the ForEach generic method that adheres to the
Action<T> prototype which is a predefined delegate in .Net 2.0. 
Here is what the MyAction needs to look like in this example:

        private void MyAction(DLDog Dog)
        {
            if (Dog.FoodBrand == "Purina" )
            {
                MyActionCount++;
            }
        }

Another predefined delegate is the Predicate<T> delegate.  This
one returns a Boolean based on some condition.  The <T> is
typically your List<T> type.  Here is an example that uses the
Predicate<T>:

        private List<DLDog> QueryDelegate()
        {
            return Dogs.FindAll(ByTypeAndCost);
        }
        // Predicate<T> typed method for FindAll
        bool ByTypeAndCost(DLDog Dog)
        {
            if (Dog.DogType == "German Sheperd" && Dog.AnnualVet > 1500M)
                return true;
            else
                return false;
        }

Enter .NET 3.5 – SWEEEET!

In .NET 3.5 Microsoft pulled no punches and really exploited the
power of delegates, generics, and anonymous methods.  Building on that
technology they added extension methods, lambda, and LINQ to the
system.  Now our first count example can be simplified as:

        private int CountLambda()
        {
            MyActionCount = 0;
            return Dogs.Count(n => n.FoodBrand == "Purina" );
        }

This syntax is foreign as you can see but after a short explanation it
will become very natural.  The “=>” operator is loosely defined as
“Goes To”.  Under the covers – the compiler is doing this:

        private int CountLambda()
        {
            this.MyActionCount = 0;
            return this.Dogs.Count<DLDog>(delegate(DLDog n)
            {
                return (n.FoodBrand == "Purina" );
            });
        }

Previous code sample courtesy of “Reflector“…

Lambda is little more than syntactic sugar for inline anonymous
methods, and the .Count method used here is an extension method that
the .Net 3.5 framework defines on IEnumerable<T>, which .Net 2.0’s
generic list class implements.  No smoke and mirrors here!
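
One way to convince yourself of this is to write both forms side by side and compare the results. The sketch below uses a hypothetical Dog class with only a FoodBrand property (a stand-in for DLDog, whose full definition isn't shown here) and made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the DLDog class used in the examples.
public class Dog
{
    public string FoodBrand { get; set; }
}

public static class LambdaSketch
{
    // C# 2.0 style: an anonymous method passed to the Count extension method.
    public static int CountWithAnonymousMethod(List<Dog> dogs, string brand)
    {
        return dogs.Count(delegate(Dog d) { return d.FoodBrand == brand; });
    }

    // C# 3.0 style: the same predicate written as a lambda expression.
    public static int CountWithLambda(List<Dog> dogs, string brand)
    {
        return dogs.Count(d => d.FoodBrand == brand);
    }

    public static List<Dog> SampleDogs()
    {
        return new List<Dog>
        {
            new Dog { FoodBrand = "Purina" },
            new Dog { FoodBrand = "Other" },
            new Dog { FoodBrand = "Purina" },
            new Dog { FoodBrand = "Purina" }
        };
    }

    public static void Main()
    {
        List<Dog> dogs = SampleDogs();
        Console.WriteLine(CountWithAnonymousMethod(dogs, "Purina"));
        Console.WriteLine(CountWithLambda(dogs, "Purina"));
    }
}
```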

Now LINQ is another animal but simply exploits everything up to this point and the previous example looks like:

        private int CountLinq()
        {
            var query = from dog in Dogs select dog;
            return query.Count(n => n.FoodBrand == "Purina" );
        }

The query variable is of type IEnumerable<T> and the T in this
case is a DLDog object.  The compiler ends up with the following:

        private int CountLinq()
        {
            return this.Dogs.Select<DLDog, DLDog>(
                delegate(DLDog dog)
                {
                    return dog;
                }).Count<DLDog>(
                delegate(DLDog n)
                {
                    return (n.FoodBrand == "Purina" );
                }
                );
        }

Notice the use of the .Select extension method.  The method is defined as:

      public static IEnumerable<TResult> Select<TSource, TResult>(
            this IEnumerable<TSource> source,
            Func<TSource, TResult> selector)

(In the Visual Studio help system… )

So in summary – LINQ is really syntactic sugar for the extension methods provided by the .NET 3.5 framework!  Simple!
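
The same equivalence holds for queries with more clauses. As a small sketch (using plain integers rather than the DLDog collection, and hypothetical method names), the two methods below express one query in query syntax and in extension-method syntax, and produce the same sequence:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxSketch
{
    // Query syntax: filter, order, then project.
    public static int[] EvensTimesTenQuery(IEnumerable<int> xs)
    {
        return (from x in xs
                where x % 2 == 0
                orderby x
                select x * 10).ToArray();
    }

    // The equivalent chain of extension-method calls.
    public static int[] EvensTimesTenMethods(IEnumerable<int> xs)
    {
        return xs.Where(x => x % 2 == 0)
                 .OrderBy(x => x)
                 .Select(x => x * 10)
                 .ToArray();
    }

    public static void Main()
    {
        int[] input = { 5, 2, 8, 3, 4 };
        Console.WriteLine(string.Join(",", EvensTimesTenQuery(input).Select(x => x.ToString()).ToArray()));
        Console.WriteLine(string.Join(",", EvensTimesTenMethods(input).Select(x => x.ToString()).ToArray()));
    }
}
```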


New Features in Visual Studio 2010 and the .NET Framework 4.0

Visual Studio 2008 may be
better than sliced bread, but the development team at Microsoft has
already been working on the next release. They have recently given us
Visual Studio 2010 and the .NET Framework 4.0 as a Community Technology
Preview (CTP); it boasts several features that would appeal to
developers.

This
article won't go into every single feature, but will go into features
that are the most relevant to .NET developers. Please note that because
this is a CTP, it doesn't mean that the final release will be exactly
as you see in the CTP or as is described here. I can go over the
features roughly as follows:

New Features in the Visual Studio 2010 IDE and .NET Framework 4.0

  • Call Hierarchy of methods
  • A New Quick Search
  • Multi-targeting more accurate
  • Parallel Programming and Debugging
  • XSLT Profiling and Debugging
  • The XSD Designer

New ASP.NET features

  • Static IDs for ASP.NET Controls
  • The Chart control
  • Web.config transformation

New VB.NET features

  • Auto Implemented Properties for VB.NET
  • Collection Initializers
  • Implicit Line Continuations
  • Statements in Lambda Expressions

New C# features

  • Dynamic Types
  • Optional parameters
  • Named and Optional Arguments

Conclusion and Resources

New Features in the Visual Studio 2010 IDE and .NET Framework 4.0

Call Hierarchy of Methods

In complicated solutions, a single method may be used
from several different places, and attempting to follow how a
particular method is being called can be difficult. Call hierarchy
attempts to address this problem by visually presenting the flow of
method calls to and from the method you are looking at. In other words,
you can look at what calls your method and what your method calls in a
treeview format.

Obviously, I cannot present a truly complex example, but a simple example should help illustrate the point.

protected void Page_Load(object sender, EventArgs e)
{
    BindDataControls();
}
private void BindDataControls()
{
    //DataBinding here
}
protected void Button1_Click(object sender, EventArgs e)
{
    BindDataControls();
}

Now, if you want to figure out what calls BindDataControls(), you can right-click on it and choose "View Call Hierarchy."

This gives you a window with a treeview format, as
shown below, with nodes that you can expand ad infinitum (or until your
machine runs out of memory). You also can right-click on the method
names and go to their definition, or you can reset those methods as the
root of the Call Hierarchy window if you want to work your way from
there and don't care about other methods anymore. Further, you can view
call hierarchies from the Object Browser too, so you needn't always be
viewing the code. This is a helpful visual aid for the very complicated
projects that we've all worked on at some point or another.


A New Quick Search

A nifty little feature that Microsoft has added is the Quick Search
window. This isn't the same as the Search or Search and Replace window
that searches for specific textual strings. It's different in the sense
that it searches across symbols (methods, properties, and class names)
across your solution and filters them in the result view for you.

In this example, I typed in 'but' and it brought back all symbols
that contained 'but' regardless of position. You can specify multiple
words to search for by separating them with a space.


Multi-targeting more accurate

Although VS 2008 supports targeting different frameworks from the
same IDE, one problem was that the Toolbox and Intellisense displayed
types that were available to the .NET 3.5 Framework whether or not you
were working with a .NET 3.5 project. This may have caused problems
when you tried to use something, only to realize that it wasn't
actually available to you.

VS 2010 gets smarter by only displaying items that you can use in the Toolbox and Intellisense (and other relevant areas).

Further, if you change your project to use a framework version that
isn't available on your machine, you will be prompted and can choose to
retarget to another version or download the target framework that you
wanted.

Parallel Programming and Debugging

Don't let the name intimidate you. I think Microsoft has done a
great job of allowing you to take advantage of multi-processor systems
with very easy syntax to enable parallel programming that you can
quickly adapt to.

In addition, to make things easier, VS 2010 comes with a set of
visual tools that will help you debug and view simultaneously running
threads. This means that you can view the task instances and the call
stacks for each task in parallel.

If you've been using the Parallel Extensions to the .NET Framework, then most of this will be familiar to you.

A Task.StartNew will kick off your threads for you. If you then hit
a breakpoint and view the VS 2010 Threads window, you can view the
state of each thread and where it is currently. This can be seen
below—the green indicating the main thread and the yellow indicating
all the worker threads spawned.


There will also be a Multistack window that allows you to peruse the
call stacks of all of the threads currently being executed; this is
again a helpful visual cue that groups together tasks sharing a stack
frame.
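
As a rough illustration of kicking off tasks that would then show up in those windows, here is a minimal sketch. It assumes the API shape of the released .NET Framework 4, where the call is spelled Task.Factory.StartNew (the CTP naming may differ); the RunSquares helper and the squaring workload are hypothetical stand-ins.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskSketch
{
    // Hypothetical helper: starts one task per number and collects the results.
    public static int[] RunSquares(int count)
    {
        var tasks = new Task<int>[count];
        for (int n = 0; n < count; n++)
        {
            int captured = n; // copy so each lambda sees its own value
            tasks[n] = Task.Factory.StartNew(() => captured * captured);
        }

        // Block until every task has finished, then read the results in order.
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunSquares(4))); // prints 0, 1, 4, 9
    }
}
```

Setting a breakpoint inside the lambda is one way to see several worker threads paused at once in the debugger.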

Parallel Programming itself becomes a feature of the .NET Framework
4.0 as opposed to it being an extension right now. With Parallel
Programming, you get Parallel LINQ (PLINQ) and syntax to make
parallelization of algorithms easier. In fact, it's unbelievably easy.

Say you have a for loop that performed a complicated task:

for (int i = 0; i < 10; i++)
{
DoSomethingReallyComplicated(i);
}

Assuming DoSomethingReallyComplicated does something really
complicated (as the name implies), you could parallelize it by using
Parallel.For and enclosing the iteration in a lambda expression.

Parallel.For(0, 10, i => { DoSomethingReallyComplicated(i); });
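
One thing to watch for when converting a loop this way is shared state: iterations may run on different threads and in any order. The sketch below is one way to handle that, assuming a hypothetical DoSomethingReallyComplicated that simply squares its input; the running total is updated atomically with Interlocked.Add.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Hypothetical stand-in for the article's method: squares its input.
    public static long DoSomethingReallyComplicated(int i)
    {
        return (long)i * i;
    }

    public static long SumSequential(int count)
    {
        long total = 0;
        for (int i = 0; i < count; i++)
        {
            total += DoSomethingReallyComplicated(i);
        }
        return total;
    }

    public static long SumParallel(int count)
    {
        long total = 0;
        // The upper bound is exclusive, just like "i < count" in the for loop.
        Parallel.For(0, count, i =>
        {
            // Atomic add, because several iterations may finish at the same time.
            Interlocked.Add(ref total, DoSomethingReallyComplicated(i));
        });
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumSequential(10)); // prints 285
        Console.WriteLine(SumParallel(10));   // prints 285
    }
}
```

For a loop this small the sequential version will almost certainly be faster; the parallel form only pays off when each iteration does substantial work.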

Similarly, there is Parallel.ForEach<> for foreach loops. You
could also use Parallel LINQ to do the same thing. Taking a
theoretical, basic LINQ equivalent to the above, you would get this:

from i in Enumerable.Range(0, 10)
where DoSomethingAndReturnAValue(i)
select i

PLINQ would involve a very slight change; just add AsParallel():

from i in Enumerable.Range(0, 10).AsParallel()
where DoSomethingAndReturnAValue(i)
select i
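
To make the query concrete, here is a small self-contained sketch. The IsEven predicate is a hypothetical stand-in for the filtering method, and AsOrdered() is added on the assumption that you want results back in source order, which PLINQ does not guarantee by default. Note that the second argument to Enumerable.Range is a count, not an upper bound, so Range(0, 10) yields 0 through 9.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical predicate standing in for the article's filtering method.
    public static bool IsEven(int i)
    {
        return i % 2 == 0;
    }

    public static int[] EvenNumbers(int count)
    {
        // AsParallel() opts in to PLINQ; AsOrdered() preserves source order.
        var query = from i in Enumerable.Range(0, count).AsParallel().AsOrdered()
                    where IsEven(i)
                    select i;
        return query.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", EvenNumbers(10))); // prints 0, 2, 4, 6, 8
    }
}
```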

You should be able to see the real effects of these changes on
multi-core machines under intensive circumstances. Of course, there is
much more than I've gotten into here, such as parallel profiling views,
but that's beyond the scope here. You could get started now, if you'd
like, by downloading the Parallel FX CTP.

XSLT Profiling and Debugging

Everyone hates XSLT. If you don't hate XSLT, you haven't used it
enough. Part of this attitude towards XSLT stems from the difficulty
you face when debugging it—it is a huge unknown and can blow up in your
faces when you least expect it.

Visual Studio 2010 will offer an XSLT profiler to help with writing
XSLT in the context of profiling and optimization. After writing your
XSLT, you can use the "Profile XSLT" option in Visual Studio to supply
it with a sample XML file that gets used for the analysis.

I decided to test out a sample XSLT from the Gallery of Stupid XSL and XSLT Tricks page because they would undoubtedly be complicated. The XSLT Profile on Compute square roots using the Babylonian Method produced this:


Clicking on the 'offending' functions then takes you to the function
details, one of several views of the analysis. The available views
include the paths of execution taken, an assembly-level view of the
execution, the call hierarchy, statistics by function, and function details.


Also available to you in the menu options is XSLT Debugging. You can
launch an XSLT file with sample XML and step through it as you would
normal managed code.


No, your eyes aren't deceiving you. You can set breakpoints, have
locals set, and step through the template functions as you please. This
should help alleviate your collective fears of XSLT immensely.

The XSD Designer

While I'm on the subject of XML, VS 2010 also introduces a new XSD Designer.


It comes with a schema explorer, a visual view of the relationships in different levels of detail, and sample XML generation.

New ASP.NET Features

Static IDs for ASP.NET Controls

A few much-needed features make their way to ASP.NET in the .NET
Framework 4.0. You now have the ability to specify the ClientID that
gets rendered to the page instead of having to fiddle with viewstate or
hidden fields to manage the IDs that get generated out to the page. If
you've ever had to use the .ClientID property of a control or found
yourself using ClientScript.RegisterStartupScript or
ClientScript.RegisterClientScriptBlock, you will know what I mean; it
doesn't always feel good to have to write all your JavaScript in the
codebehind. It'd be nice to let the JavaScript sit on the page or in an
external JavaScript file—where it belongs—so that a change in the
JavaScript doesn't require a recompilation of the website, while at the
same time still using dynamic ASP.NET controls.

With ASP.NET 4.0, you now can create controls that support static Client
IDs. To do this, simply get your control to implement INamingContainer.
When using the control on the page, set its ClientIDMode property to
Static.

For example, here is the codebehind for a simple web user control (ASCX) with a label in it.

public partial class WebUserControl1 : System.Web.UI.UserControl,
    INamingContainer
{
    protected void Page_Load(object sender, EventArgs e)
    {
        this.ClientIDMode = ClientIDMode.Static;
    }
}

You would need to expose the ClientIDMode property so
that the parent page can set it. That's all you need to do, and the
control gets rendered to the page with an ID of Label1.

When it comes to databound controls, though, you'd
have to do things a little differently; it's really not acceptable to
have more than one control with the same ID on a page. For this
purpose, the ClientIDMode can be set to Predictable, and you also can
specify a suffix that ensures that the IDs generated are predictable
(but not fixed).

An example of its usage would be (in a gridview):

<asp:GridView ID="GridViewTest" runat="server" ClientIDMode="Predictable" RowClientIDSuffix="Pizza">
...
</asp:GridView>

Any controls rendered in the gridview should then have IDs like "GridViewTest_PizzaLabelTest_1".

The Chart Control

The Chart control is finally here! This control
sports various chart types, including point, column, bar, area,
doughnut, range, and types I hadn't even heard of before, such as Renko
and Kagi. The chart also has scale breaks for 'excessive' data, and can
even be rendered in a 3D mode. As a quick test, I got to make these
charts based on the meteor consumption of my friends:


Combined with ASP.NET AJAX functionality, this can open up a vast
array of presentation configurations for your website if you deal with
such data.

The chart works off an HTTP Image Handler. This means that you also
have the option of plugging in your own charting image handling
assembly if you want to customize it further.

Web.config transformation

In enterprise environments, your ASP.NET application will have to go
through various stages of deployment such as testing, staging,
preproduction, and production. Due to the relatively isolated nature of
these environments, you often will have had to create separate
web.config files for each environment and ensure that any change you
make to one web.config makes it into the other web.configs as well.

VS Team System 2010 has implemented a web.config transformation
feature that can perform the value-based transformation that you need
when deploying to different environments. The transformation is
performed as part of an MSBuild task, but you will need to specify the
transforms to perform across the various web.configs; this is done by
adding a set of xdt:Transform attributes to the nodes that may need
changing (such as the obvious connection string).

An example of this would be:

<add name="theSiteDB"
connectionString="Server=PreProduction;Database=mySite;
User Id=hello;password=world"
providerName="System.Data.SqlClient" xdt:Transform="Replace"
xdt:Locator="Match(name)"/>

Because these transforms are XML-based, it is also possible to perform
the transformation manually via the VS 2010 GUI, or to create
deployment packages for the various environments that require minimal
human intervention.

New VB.NET Features

Auto implemented properties

Often, you find yourself declaring public properties like this:

public string DoorColor
{
    get
    {
        return doorColorValue;
    }
    set
    {
        doorColorValue = value;
    }
}

 

In VB.NET, this would be:

Public Property DoorColor As String
    Get
        Return _doorColor
    End Get
    Set(ByVal value As String)
        _doorColor = value
    End Set
End Property

More often than not, no extra logic ever gets placed
in the get/set blocks. This is why C# was given an auto implemented
property in which the private variable is declared behind the scenes
for you.

public string DoorColor
{
get; set;
}

This feature was not present for VB.NET in .NET
Framework 3.5, but is now available in .NET Framework 4.0. It's as
simple as this:

Public Property DoorColor As String

Collection Initializers

The syntax to initialize collections in VB.NET 10 (in .NET Framework 4.0) is now slightly shorter.

Dim breakfast = {New Crumpets With {.CrumpetAge = 99,
.CrumpetSmell = "Bad"},
New Crumpets With {.CrumpetAge = 29, .CrumpetSmell = "Good"}}

Note that you don't need to specify the collection
type because it is implicitly understood. In .NET 3.5, this would not
have compiled because it would have understood 'breakfast' as type
'Object' instead of Crumpets. .NET 4.0 understands Crumpets, a
statement I never thought I'd actually write down.

Implicit Line Continuations

C# has had this for a long time—long lines of code
can be split across several lines for more readability. VB.NET has had
it, but you've always had to add an underscore (_) at the end of each
line, which could get a little annoying. Certain types of statements
can now be split across several lines without the _ required.

Therefore, the collection initializer example from the previous section can be declared like this:

Dim breakfast =
{
New Crumpets With {
.CrumpetAge = 221,
.CrumpetSmell = "Foul"
},
New Crumpets With {
.CrumpetSmell = "good",
.CrumpetAge = 1
}
}

It works on Console.WriteLine, too.

Console.WriteLine(
DoorColor)

And with nested method calls.

Console.WriteLine(
breakfast(
1
).GetSomeString(
99)
)

Changes to Lambda Expressions: statements and subs

In VB.NET 9 (.NET Framework 3.5), Lambda Expressions
always needed to return a value regardless of whether it was required
or not. This could often cause confusion when switching between C# and
VB.NET. The .NET Framework 4.0 addresses this issue by allowing VB.NET
to create Lambda Expressions that don't return anything. And, Lambda
Expressions can now also contain statements instead of having to pass
the logic to other methods as you had to do previously.

To write expressions that return nothing:

Dim lambdaNothing = Sub() Nothing
Dim lambdaSomething = Sub(Message) Console.WriteLine(Message)

To write expressions that contain statements:

Dim lambdaReturn = Function(i)
    If i >= 100 Then
        Return 3
    ElseIf i >= 10 AndAlso i < 100 Then
        Return 2
    Else
        Return 1
    End If
End Function

New C# Features

Dynamic Types and Dynamic Programming

C# should now support a new static type called 'dynamic'. This
essentially allows for dynamic dispatch or late binding of the variable
in question. For example, suppose you have two simple classes with a
common-name method:

public class Coffee
{
    public int GetZing()
    {
        return 1;
    }
}
public class Juice
{
    public string GetZing()
    {
        return "Orange";
    }
}

You also have a method that returns one of the two object types:

private Object GetOneOfThem(int i)
{
    if (i > 10)
    {
        return new Juice();
    }
    else
    {
        return new Coffee();
    }
}

You can then make a call to GetOneOfThem() without
knowing what type you're going to have returned, but you can still
attempt to call method names on it.

dynamic drink = GetOneOfThem(someVariable);
Console.WriteLine(drink.GetZing());

At runtime, GetOneOfThem() returns either a Juice or a Coffee
instance, and the corresponding GetZing() method is then called based
on what the runtime (not the compiler) resolves 'drink' to be. The
above is a very simple example of dynamic programming; there is more to
this. The underlying implementation of dynamic programming is the DLR,
the Dynamic Language Runtime, which is what allows you to invoke things
in a dynamic fashion.
You can read more about dynamic programming and other new features in C#.
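
The flip side of late binding is that a missing member is only discovered when the call runs. The following self-contained sketch reuses the Coffee and Juice classes from above and adds a hypothetical Water class with no GetZing() method, plus a hypothetical Describe helper, to show what happens in that case. It assumes the released C# 4 behavior, where a failed dynamic call throws RuntimeBinderException.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public class Coffee { public int GetZing() { return 1; } }
public class Juice { public string GetZing() { return "Orange"; } }
public class Water { } // hypothetical: deliberately has no GetZing()

public static class DynamicSketch
{
    public static object GetOneOfThem(int i)
    {
        if (i > 10)
        {
            return new Juice();
        }
        return new Coffee();
    }

    // Hypothetical helper: calls GetZing() through dynamic dispatch.
    public static string Describe(object o)
    {
        dynamic drink = o;
        try
        {
            // Bound at run time against the actual type of the object.
            object zing = drink.GetZing();
            return zing.ToString();
        }
        catch (RuntimeBinderException)
        {
            // The call compiled fine, but the member does not exist.
            return "no zing";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(GetOneOfThem(5)));  // prints 1
        Console.WriteLine(Describe(GetOneOfThem(20))); // prints Orange
        Console.WriteLine(Describe(new Water()));      // prints no zing
    }
}
```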

Optional Parameters

VB.NET has had this for a long time, and now C# gets it too—optional parameters in your function signatures.

private string GetMeaninglessDrivel(string drivelSeed="bork")
{
//...
}

In the past, the lack of optional parameters in C# was overcome by
overloading the method several times and calling the same common method
from all overloads with different values being passed in. Optional
parameters can help you avoid these unnecessary overloads, which make
a class more complex than it needs to be.
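
As a quick sketch of how calls look with and without the defaults, assuming the released C# 4 syntax: the second parameter, repeat, and the method body are hypothetical additions for illustration only.

```csharp
using System;
using System.Linq;

public static class OptionalParameterSketch
{
    // Both parameters are optional; defaults must be compile-time constants.
    public static string GetMeaninglessDrivel(string drivelSeed = "bork", int repeat = 2)
    {
        return string.Join(" ", Enumerable.Repeat(drivelSeed, repeat));
    }

    public static void Main()
    {
        Console.WriteLine(GetMeaninglessDrivel());         // prints bork bork
        Console.WriteLine(GetMeaninglessDrivel("moo"));    // prints moo moo
        Console.WriteLine(GetMeaninglessDrivel("moo", 3)); // prints moo moo moo
    }
}
```

Without optional parameters, the same three call shapes would need three overloads that forward to one another.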

Named and Optional Arguments

This feature is somewhat related to Optional Parameters. Suppose you have a method that has several optional parameters.

private void SaveTheseValues(int i=1, int j=2, int k=3)
{
//...
}

If you want to pass in values just for j and k, but not i, you can
now name the arguments that you are passing in. C# has never allowed
you to skip a positional argument by leaving it empty between commas.

So, instead of something like this (which does not compile):

	SaveTheseValues(,5,4);
	

You can do this:

	SaveTheseValues(j:5,k:4);
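
Here is a small runnable sketch of named arguments, assuming the released C# 4 rules. Unlike the method above, this version of SaveTheseValues returns a string so that the effect of each call is visible; that change is purely illustrative.

```csharp
using System;

public static class NamedArgumentSketch
{
    // Returns a description of the values received (illustrative only).
    public static string SaveTheseValues(int i = 1, int j = 2, int k = 3)
    {
        return string.Format("i={0}, j={1}, k={2}", i, j, k);
    }

    public static void Main()
    {
        // Skip i entirely by naming the arguments you do pass.
        Console.WriteLine(SaveTheseValues(j: 5, k: 4)); // prints i=1, j=5, k=4

        // Named arguments can appear in any order.
        Console.WriteLine(SaveTheseValues(k: 4, j: 5)); // prints i=1, j=5, k=4

        // Positional arguments come first, named ones after.
        Console.WriteLine(SaveTheseValues(7, k: 9));    // prints i=7, j=2, k=9
    }
}
```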
	

 

Conclusion and Resources

That was a quick overview of what will probably be available to you
in Visual Studio 2010 and the .NET Framework 4.0. To reiterate, you
must keep in mind that because it is a CTP, not everything that you see
will be exactly as it is when it is finally released (which is why I
often discourage books that come out at the same time as the Visual
Studio release – they're based on Betas and CTPs and so will not be
completely reliable).

You can have a look at VS 2010 yourself if you'd like and if you can
meet the hefty requirements for the Virtual PC image. You will need
about 70 to 80 GB of hard disk space available, 2 GB of RAM, and a
dual-core system at the least. The download is here.

The Parallel FX Library is also available as a CTP that works with the .NET 3.5 framework. You can download the Parallel FX CTP here.

Don't forget to have a read through the C# Future documentation as well for more information on the DLR.

Finally, you can also participate in the feedback process by submitting any bugs you find to the Microsoft Connect program for VS 2010.

About the Author
Mendhak is a web developer and a
Microsoft MVP who works with ASP.NET and PHP among the usual array[] of
web technologies. He is also rumored to be a three eyed frog, but the
evidence is lacking. He can be contacted via his website, www.mendhak.com
