# Optimizing LINQ combining multiple lists into new generic list

### Problem description

Given the following three lists:

```
var FirstNames = new List<string>() { "Bob", "Sondra", "Avery", "Von", "Randle", "Gwen", "Paisley" };
var LastNames = new List<string>() { "Anderson", "Carlson", "Vickers", "Black", "Schultz", "Marigold", "Johnson" };
var Birthdates = new List<DateTime>()
{
    Convert.ToDateTime("11/12/1980"),
    Convert.ToDateTime("09/16/1978"),
    Convert.ToDateTime("05/18/1985"),
    Convert.ToDateTime("10/29/1980"),
    Convert.ToDateTime("01/19/1989"),
    Convert.ToDateTime("01/14/1972"),
    Convert.ToDateTime("02/20/1981")
};
```

I'd like to combine them into a new generic type where the relationship the lists share is their position in the collection. i.e. FirstNames[0], LastNames[0], Birthdates[0] are related.

So I have come up with this LINQ, matching the indices, which seems to work fine for now:

```
var students = from fn in FirstNames
               from ln in LastNames
               from bd in Birthdates
               where FirstNames.IndexOf(fn) == LastNames.IndexOf(ln)
               where FirstNames.IndexOf(fn) == Birthdates.IndexOf(bd)
               select new { First = fn, Last = ln, Birthdate = bd.Date };
```

However, I have stress-tested this code (each `List<string>` and `List<DateTime>` loaded with a few million records) and I run into a `System.OutOfMemoryException`.

Is there any other way of writing out this query to achieve the same results more effectively using Linq?

## Recommended answer

That is what Zip is for.

```
var result = FirstNames
    .Zip(LastNames, (f, l) => new { f, l })
    .Zip(Birthdates, (fl, b) => new { First = fl.f, Last = fl.l, BirthDate = b });
```
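Beyond the cost (the query in the question walks every combination of the three lists, n³ of them for lists of length n, and runs `IndexOf`, itself a linear scan, on each), matching by `IndexOf` also mispairs duplicate values, because `IndexOf` always returns the first match. A minimal sketch with made-up data, using two lists for brevity:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up data with a repeated first name (not from the question).
var firstNames = new List<string> { "Bob", "Ann", "Bob" };
var lastNames = new List<string> { "Anderson", "Carlson", "Vickers" };

// Index matching via IndexOf, in the style of the question's query.
var viaIndexOf = (from fn in firstNames
                  from ln in lastNames
                  where firstNames.IndexOf(fn) == lastNames.IndexOf(ln)
                  select fn + " " + ln).ToList();

// Positional pairing via Zip.
var viaZip = firstNames.Zip(lastNames, (f, l) => f + " " + l).ToList();

Console.WriteLine(string.Join(", ", viaIndexOf)); // Bob Anderson, Ann Carlson, Bob Anderson
Console.WriteLine(string.Join(", ", viaZip));     // Bob Anderson, Ann Carlson, Bob Vickers
```

The second "Bob" is paired with "Anderson" by the `IndexOf` query, while `Zip` pairs strictly by position.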

Regarding scaling:

```
using System;
using System.Diagnostics;
using System.Linq;

int count = 50000000;
var FirstNames = Enumerable.Range(0, count).Select(x => x.ToString());
var LastNames = Enumerable.Range(0, count).Select(x => x.ToString());
var BirthDates = Enumerable.Range(0, count).Select(x => DateTime.Now.AddSeconds(x));

var sw = new Stopwatch();
sw.Start();

// All three sources stay lazy; nothing is materialized into a list.
var result = FirstNames
    .Zip(LastNames, (f, l) => new { f, l })
    .Zip(BirthDates, (fl, b) => new { First = fl.f, Last = fl.l, BirthDate = b });

// Enumerate everything so the whole pipeline actually runs.
foreach (var r in result)
{
    var x = r;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds); // Prints 69191 on my machine.
```

While these indexed versions blow up with an out-of-memory error:

```
using System;
using System.Diagnostics;
using System.Linq;

int count = 50000000;
var FirstNames = Enumerable.Range(0, count).Select(x => x.ToString());
var LastNames = Enumerable.Range(0, count).Select(x => x.ToString());
var BirthDates = Enumerable.Range(0, count).Select(x => DateTime.Now.AddSeconds(x));

var sw = new Stopwatch();
sw.Start();

var FirstNamesList = FirstNames.ToList(); // Blows up in 32-bit .NET with an out-of-memory error
var LastNamesList = LastNames.ToList();
var BirthDatesList = BirthDates.ToList();

// Indexed version 1: range generator.
var result = Enumerable.Range(0, FirstNamesList.Count)
    .Select(i => new
    {
        First = FirstNamesList[i],
        Last = LastNamesList[i],
        BirthDate = BirthDatesList[i]
    });

// Indexed version 2: Select with index. It replaces version 1 (both are lazy,
// so only this one is enumerated below). The property names must match
// version 1 exactly, or the assignment does not compile.
result = BirthDatesList.Select((bd, i) => new
{
    First = FirstNamesList[i],
    Last = LastNamesList[i],
    BirthDate = bd
});

foreach (var r in result)
{
    var x = r;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
```

At lower counts as well, converting the enumerables to a `List` is much more expensive than the additional object creation. Zip was approximately 30% faster than the indexed versions. As you add more columns, Zip's advantage would likely shrink.

The performance characteristics are also very different. The Zip routine starts outputting results almost immediately, while the others start only after the entire enumerables have been read and converted to lists. So if you paginate the results with `.Skip(x).Take(y)`, or check whether something exists with `.Any(...)`, Zip will be orders of magnitude faster, as it doesn't have to convert the entire enumerable.
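To make the streaming behavior concrete, here is a small sketch that counts how many source elements are actually generated when only the first few zipped results are consumed. The `Numbers` helper and the `produced` counter are names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Counts how many elements the lazy sources have produced so far (illustration only).
int produced = 0;

// Hypothetical lazy source: yields 0..count-1 as strings and bumps the counter.
IEnumerable<string> Numbers(int count)
{
    for (int i = 0; i < count; i++)
    {
        produced++;
        yield return i.ToString();
    }
}

var zipped = Numbers(1_000_000)
    .Zip(Numbers(1_000_000), (f, l) => new { First = f, Last = l });

// Zip is deferred: nothing has been pulled from the sources yet.
Console.WriteLine(produced); // 0

// Take(3) stops pulling once it has three pairs.
var page = zipped.Take(3).ToList();
Console.WriteLine(page.Count); // 3

// Only a handful of elements were generated, not two million.
Console.WriteLine(produced);
```

Had the sources been converted with `ToList()` first, all two million strings would have been created before the first result appeared.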

Lastly, if it becomes performance critical and you need to combine many sequences, you could consider extending Zip to take more enumerables, as in this three-sequence version (shamelessly stolen from Jon Skeet - https://codeblog.jonskeet.uk/2011/01/14/reimplementing-linq-to-objects-part-35-zip/):

```
// Extension methods must live in a static, non-generic class; the class name is arbitrary.
public static class EnumerableExtensions
{
    // "this" on the first parameter makes it callable as first.Zip(second, third, selector).
    public static IEnumerable<TResult> Zip<TFirst, TSecond, TThird, TResult>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        IEnumerable<TThird> third,
        Func<TFirst, TSecond, TThird, TResult> resultSelector)
    {
        using (IEnumerator<TFirst> iterator1 = first.GetEnumerator())
        using (IEnumerator<TSecond> iterator2 = second.GetEnumerator())
        using (IEnumerator<TThird> iterator3 = third.GetEnumerator())
        {
            // Stops as soon as any of the three sequences runs out.
            while (iterator1.MoveNext() && iterator2.MoveNext() && iterator3.MoveNext())
            {
                yield return resultSelector(iterator1.Current, iterator2.Current, iterator3.Current);
            }
        }
    }
}
```

Then you can do this:

```
var result = FirstNames
    .Zip(LastNames, BirthDates, (f, l, b) => new { First = f, Last = l, BirthDate = b });
```

And now you don't even have the intermediate object being created, so you get the best of all worlds.

Or, to handle any number of sequences generically, use the implementation from the question "Zip multiple/abitrary number of enumerables in C#".
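On newer runtimes a hand-written overload may not be needed: .NET 6 and later ship an `Enumerable.Zip` overload that takes three sequences and yields value tuples. A minimal sketch, assuming .NET 6 or later and a shortened copy of the sample data (dates are built with `new DateTime(...)` here to avoid culture-dependent parsing):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var firstNames = new List<string> { "Bob", "Sondra", "Avery" };
var lastNames = new List<string> { "Anderson", "Carlson", "Vickers" };
var birthdates = new List<DateTime>
{
    new DateTime(1980, 11, 12),
    new DateTime(1978, 9, 16),
    new DateTime(1985, 5, 18)
};

// Built-in three-sequence Zip (assumes .NET 6+); each item is a
// (First, Second, Third) value tuple, projected here into named properties.
var students = firstNames
    .Zip(lastNames, birthdates)
    .Select(t => new { First = t.First, Last = t.Second, Birthdate = t.Third })
    .ToList();

foreach (var s in students)
{
    Console.WriteLine($"{s.First} {s.Last} {s.Birthdate:yyyy-MM-dd}");
}
```

Like the hand-written version, this pairs elements by position and stops at the shortest sequence.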

## Other recommended answer

Another option is to use the `Select` overload that also supplies each element's index:

```
var result = Birthdates.Select((bd, i) => new
{
    First = FirstNames[i],
    Last = LastNames[i],
    Birthdate = bd
});
```
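One difference worth knowing about the indexed approach: it assumes all lists have the same length. A small sketch with made-up data, where one list is deliberately shorter, shows that `Zip` quietly stops at the shorter sequence while the indexed `Select` throws:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var firstNames = new List<string> { "Bob", "Sondra", "Avery" };
var lastNames = new List<string> { "Anderson", "Carlson" }; // deliberately one short

// Zip stops at the end of the shorter sequence.
int zippedCount = firstNames
    .Zip(lastNames, (f, l) => new { First = f, Last = l })
    .Count();

// The indexed Select walks all of firstNames and indexes past the end of lastNames.
bool indexedThrew = false;
try
{
    firstNames.Select((f, i) => new { First = f, Last = lastNames[i] }).ToList();
}
catch (ArgumentOutOfRangeException)
{
    indexedThrew = true;
}

Console.WriteLine(zippedCount);  // 2
Console.WriteLine(indexedThrew); // True
```

Which behavior is preferable depends on whether a length mismatch should be treated as an error in your data.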

## Other recommended answer

Yep, use a range generator:

```
var result = Enumerable.Range(0, FirstNames.Count)
    .Select(i => new
    {
        First = FirstNames[i],
        Last = LastNames[i],
        Birthdate = Birthdates[i]
    });
```