Some collections, like List&lt;T&gt;, have a predefined initial size.
Whenever you add a new item to the collection, there are two scenarios:
- The collection still has free space (allocated but not yet populated), so adding an item is immediate;
- The collection is already full: internally, .NET allocates a new, larger backing array and copies the existing items into it, so the next time you add an item we're back to scenario #1.
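Conceptually, the resize step works like a growable array. Here is a deliberately simplified sketch (the real List&lt;T&gt; implementation has more details, such as overflow checks, but the doubling idea is the same; `NaiveList` is just a name I made up for this illustration):

```cs
using System;

var sketch = new NaiveList<int>();
for (int i = 0; i < 6; i++)
{
    sketch.Add(i);
    Console.WriteLine($"Count: {sketch.Count} - Capacity: {sketch.Capacity}");
}

// A deliberately simplified sketch of a growable list; the real List<T>
// differs in several details (error checks, growth strategy, etc.).
public class NaiveList<T>
{
    private T[] _items = Array.Empty<T>();

    public int Count { get; private set; }
    public int Capacity => _items.Length;

    public void Add(T item)
    {
        if (Count == _items.Length)
        {
            // Scenario 2: the backing array is full. Allocate a bigger one
            // and copy the existing items over (this is the costly part).
            int newCapacity = _items.Length == 0 ? 4 : _items.Length * 2;
            Array.Resize(ref _items, newCapacity);
        }

        // Scenario 1: free space is available, so inserting is immediate.
        _items[Count++] = item;
    }
}
```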
Clearly, the second approach has an impact on overall performance. Can we prove it?
Here’s a benchmark you can run using BenchmarkDotNet:
```cs
[MemoryDiagnoser]
public class CollectionsBenchmark
{
    [Params(2, 100, 1000, 10000, 100_000)]
    public int Size;

    [Benchmark]
    public void SizeDefined()
    {
        int itemsCount = Size;

        List<int> set = new List<int>(itemsCount);
        foreach (var i in Enumerable.Range(0, itemsCount))
        {
            set.Add(i);
        }
    }

    [Benchmark]
    public void SizeNotDefined()
    {
        int itemsCount = Size;

        List<int> set = new List<int>();
        foreach (var i in Enumerable.Range(0, itemsCount))
        {
            set.Add(i);
        }
    }
}
```
These two methods are almost identical: the only difference is that in one of them we specify the initial size of the list with `new List<int>(itemsCount)`.
Take a look at the results of the comparison, run on .NET 7:
| Method | Size | Mean | Error | StdDev | Median | Gen0 | Gen1 | Gen2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| SizeDefined | 2 | 49.50 ns | 1.039 ns | 1.678 ns | 49.14 ns | 0.0248 | - | - | 104 B |
| SizeNotDefined | 2 | 63.66 ns | 3.016 ns | 8.507 ns | 61.99 ns | 0.0268 | - | - | 112 B |
| SizeDefined | 100 | 798.44 ns | 15.259 ns | 32.847 ns | 790.23 ns | 0.1183 | - | - | 496 B |
| SizeNotDefined | 100 | 1,057.29 ns | 42.100 ns | 121.469 ns | 1,056.42 ns | 0.2918 | - | - | 1224 B |
| SizeDefined | 1000 | 9,180.34 ns | 496.521 ns | 1,400.446 ns | 8,965.82 ns | 0.9766 | - | - | 4096 B |
| SizeNotDefined | 1000 | 9,720.66 ns | 406.184 ns | 1,184.857 ns | 9,401.37 ns | 2.0142 | - | - | 8464 B |
| SizeDefined | 10000 | 104,645.87 ns | 7,636.303 ns | 22,395.954 ns | 99,032.68 ns | 9.5215 | 1.0986 | - | 40096 B |
| SizeNotDefined | 10000 | 95,192.82 ns | 4,341.040 ns | 12,524.893 ns | 92,824.50 ns | 31.2500 | - | - | 131440 B |
| SizeDefined | 100000 | 1,416,074.69 ns | 55,800.034 ns | 162,771.317 ns | 1,402,166.02 ns | 123.0469 | 123.0469 | 123.0469 | 400300 B |
| SizeNotDefined | 100000 | 1,705,672.83 ns | 67,032.839 ns | 186,860.763 ns | 1,621,602.73 ns | 285.1563 | 285.1563 | 285.1563 | 1049485 B |
Note that the two methods generally take a similar amount of time; for example, with 100,000 items the execution times are in the same ballpark: 1,416,074.69 ns versus 1,705,672.83 ns.
The huge difference is in the allocated memory: 400,300 B vs 1,049,485 B. Roughly 2.6 times less!
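That gap is what the doubling behavior predicts: to reach 100,000 items without a predefined size, the list allocates a whole series of intermediate backing arrays, and the sum of a doubling series is roughly twice its last term. A quick back-of-the-envelope check (assuming the capacity starts at 4 and doubles on every resize):

```cs
using System;

// Sum the sizes of every backing array allocated while growing to
// 100_000 items, assuming capacity starts at 4 and doubles on each resize.
int target = 100_000;
long totalElements = 0;
for (int capacity = 4; ; capacity *= 2)
{
    totalElements += capacity; // each resize allocates a fresh int[]
    if (capacity >= target)
        break;                 // the last array (131,072 slots) is big enough
}

Console.WriteLine(totalElements * sizeof(int)); // 1048560
```

That's about 1,048,560 B of array allocations: very close to the ~1,049,485 B that BenchmarkDotNet measured for `SizeNotDefined`.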
OK, it works. Next question: how can we check a list's capacity?
We've just learned that capacity affects the performance of a list.
How can we see it live? Easy: look at the Capacity property!
```cs
List<int> myList = new List<int>();

foreach (var element in Enumerable.Range(0, 50))
{
    myList.Add(element);
    Console.WriteLine($"Items count: {myList.Count} - List capacity: {myList.Capacity}");
}
```
If you run this method, you will see this output:
```text
Items count: 1 - List capacity: 4
Items count: 2 - List capacity: 4
Items count: 3 - List capacity: 4
Items count: 4 - List capacity: 4
Items count: 5 - List capacity: 8
Items count: 6 - List capacity: 8
Items count: 7 - List capacity: 8
Items count: 8 - List capacity: 8
Items count: 9 - List capacity: 16
Items count: 10 - List capacity: 16
Items count: 11 - List capacity: 16
Items count: 12 - List capacity: 16
Items count: 13 - List capacity: 16
Items count: 14 - List capacity: 16
Items count: 15 - List capacity: 16
Items count: 16 - List capacity: 16
Items count: 17 - List capacity: 32
Items count: 18 - List capacity: 32
Items count: 19 - List capacity: 32
Items count: 20 - List capacity: 32
Items count: 21 - List capacity: 32
Items count: 22 - List capacity: 32
Items count: 23 - List capacity: 32
Items count: 24 - List capacity: 32
Items count: 25 - List capacity: 32
Items count: 26 - List capacity: 32
Items count: 27 - List capacity: 32
Items count: 28 - List capacity: 32
Items count: 29 - List capacity: 32
Items count: 30 - List capacity: 32
Items count: 31 - List capacity: 32
Items count: 32 - List capacity: 32
Items count: 33 - List capacity: 64
Items count: 34 - List capacity: 64
Items count: 35 - List capacity: 64
Items count: 36 - List capacity: 64
Items count: 37 - List capacity: 64
Items count: 38 - List capacity: 64
Items count: 39 - List capacity: 64
Items count: 40 - List capacity: 64
Items count: 41 - List capacity: 64
Items count: 42 - List capacity: 64
Items count: 43 - List capacity: 64
Items count: 44 - List capacity: 64
Items count: 45 - List capacity: 64
Items count: 46 - List capacity: 64
Items count: 47 - List capacity: 64
Items count: 48 - List capacity: 64
Items count: 49 - List capacity: 64
Items count: 50 - List capacity: 64
```
So, as you can see, the list capacity is doubled whenever the current capacity is not enough.
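Given that growth pattern, you can even predict the capacity a list ends up with after adding n items one by one. Here's a small sketch, assuming the default behavior we just observed (start at 4, double when full); `PredictedCapacity` is just a helper name I made up for this example:

```cs
using System;
using System.Collections.Generic;

var myList = new List<int>();
for (int i = 0; i < 50; i++)
{
    myList.Add(i);
}

Console.WriteLine(PredictedCapacity(50)); // 64
Console.WriteLine(myList.Capacity);

// Smallest capacity in the 4, 8, 16, ... progression that fits n items.
static int PredictedCapacity(int n)
{
    if (n == 0) return 0; // a new List<int>() starts with Capacity 0
    int capacity = 4;
    while (capacity < n) capacity *= 2;
    return capacity;
}
```

Keep in mind that the exact growth strategy is an implementation detail of the runtime, so code should not rely on these numbers.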
Additional readings
To populate the lists in our benchmarks we used Enumerable.Range. Do you know how it works? Take a look at this C# tip:
🔗 C# Tip: LINQ’s Enumerable.Range to create a sequence of consecutive numbers
This article first appeared on Code4IT 🐧
Wrapping up
In this article, we learned that even a minimal change can affect the performance of our application.
Here, we simply used a different constructor, but the difference is remarkable. Obviously, this trick only works if you already know the final length of the list (or, at least, an estimate). The more accurate, the better!
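One related note: if the list is created before you know its size (for example, because it comes from somewhere else), since .NET 6 you can also call List&lt;T&gt;.EnsureCapacity to grow the backing array once, up front, instead of paying for several intermediate resizes. A quick sketch:

```cs
using System;
using System.Collections.Generic;

var numbers = new List<int>();

// Available on List<T> since .NET 6: resize the backing array once,
// so the 10,000 Add calls below never trigger an intermediate resize.
numbers.EnsureCapacity(10_000);
Console.WriteLine(numbers.Capacity >= 10_000); // True

for (int i = 0; i < 10_000; i++)
{
    numbers.Add(i);
}
```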
I hope you enjoyed this article! Let’s keep in touch on Twitter or LinkedIn, if you want! 🤜🤛
Happy coding!
🐧