SQL Insert Race! (Performance TEST) - C#
Single thread vs parallel threads vs EF Core bulk extension
Introduction
SqlBulkCopy is a .NET class that helps developers achieve high-performance bulk writes to a SQL Server database from C#.
If you have no idea what this topic is about, you can check these two articles to get a head start: Sql Bulk Insert, Sql Bulk Update.
On a single thread, SqlBulkCopy already excels at writing data, but some folks on the internet suggest that better performance can be achieved by running SqlBulkCopy on parallel threads.
My test machine is:
- Processor: Core i5-6600
- RAM: 16 GB DDR4
- OS: Windows 10 20H2
- Dev env:
  - .NET 5
  - SQL Server 2019
In this test, I use three lists containing 0.5 million, 2.5 million, and 15 million records.
What we measure:
- Single thread bulk copy with all default options.
- Single thread bulk copy with table lock enabled.
- Parallel thread bulk copy with table lock enabled.
- EF Core bulk extension (although this uses SqlBulkCopy under the hood).
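The first three variants differ only in the SqlBulkCopyOptions value and in how many writers run at once. Here is a minimal sketch of how they could look; it is not the exact benchmark code. It assumes the Microsoft.Data.SqlClient package, a hypothetical destination table named dbo.Employees, and DataTable inputs whose columns are in the same order as the table's columns (SqlBulkCopy maps columns by position when no mappings are given).

```csharp
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient; // assumption: System.Data.SqlClient exposes the same class

public static class BulkInsertVariants
{
    // Hypothetical table name, used for illustration only.
    private const string TableName = "dbo.Employees";

    // Single thread, all default options.
    public static void SingleDefault(string connectionString, DataTable rows)
        => Write(connectionString, rows, SqlBulkCopyOptions.Default);

    // Single thread, table lock held for the duration of the copy.
    public static void SingleLocked(string connectionString, DataTable rows)
        => Write(connectionString, rows, SqlBulkCopyOptions.TableLock);

    // Parallel threads: each chunk is written by its own SqlBulkCopy instance.
    public static void ParallelLocked(string connectionString, IEnumerable<DataTable> chunks)
        => Parallel.ForEach(chunks,
            chunk => Write(connectionString, chunk, SqlBulkCopyOptions.TableLock));

    private static void Write(string connectionString, DataTable rows, SqlBulkCopyOptions options)
    {
        // This constructor opens and closes its own connection.
        using var bulkCopy = new SqlBulkCopy(connectionString, options)
        {
            DestinationTableName = TableName,
            BulkCopyTimeout = 0 // 0 means no timeout; large loads can exceed the default
        };
        bulkCopy.WriteToServer(rows);
    }
}
```

For the parallel variant, the source list is split into chunks up front (one DataTable per chunk), and each chunk gets its own SqlBulkCopy instance and connection.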
Code
So our data model looks like this:

    using System;

    // One row of test data; each property maps to one column.
    public class Employee
    {
        public int Id { get; set; }
        public string EmpId { get; set; }
        public string EmpName { get; set; }
        public bool IsEmployee { get; set; }
        public string Address { get; set; }
        public decimal Salary { get; set; }
        // Extra column that only adds weight to each row for the test.
        public string AddingBunchOfPropsForTest { get; set; }
        public DateTime CreatedOn { get; set; }
    }
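SqlBulkCopy does not accept a list of objects directly; one common approach is to copy the list into a DataTable first. The following is a sketch of such a conversion for the Employee model above, not necessarily how the benchmark did it. The EmployeeTable class and ToDataTable method are hypothetical names, and the model is repeated in compact form only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Same model as above, repeated so this snippet is self-contained.
public class Employee
{
    public int Id { get; set; }
    public string EmpId { get; set; }
    public string EmpName { get; set; }
    public bool IsEmployee { get; set; }
    public string Address { get; set; }
    public decimal Salary { get; set; }
    public string AddingBunchOfPropsForTest { get; set; }
    public DateTime CreatedOn { get; set; }
}

public static class EmployeeTable
{
    // Builds an in-memory DataTable with one column per Employee property.
    public static DataTable ToDataTable(IEnumerable<Employee> employees)
    {
        var table = new DataTable("Employees");
        table.Columns.Add(nameof(Employee.Id), typeof(int));
        table.Columns.Add(nameof(Employee.EmpId), typeof(string));
        table.Columns.Add(nameof(Employee.EmpName), typeof(string));
        table.Columns.Add(nameof(Employee.IsEmployee), typeof(bool));
        table.Columns.Add(nameof(Employee.Address), typeof(string));
        table.Columns.Add(nameof(Employee.Salary), typeof(decimal));
        table.Columns.Add(nameof(Employee.AddingBunchOfPropsForTest), typeof(string));
        table.Columns.Add(nameof(Employee.CreatedOn), typeof(DateTime));

        foreach (var e in employees)
        {
            // Values are passed in the same order as the columns above.
            table.Rows.Add(e.Id, e.EmpId, e.EmpName, e.IsEmployee,
                           e.Address, e.Salary, e.AddingBunchOfPropsForTest, e.CreatedOn);
        }
        return table;
    }
}
```

The column order here matters if the table is later passed to SqlBulkCopy without explicit column mappings, because the default mapping is by position.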
Performance result:
I ran each test 3 times to check these numbers, so here we go.
| Data Size   | Single Default | Single Locked | Parallel Locked | EF Core Bulk Extension |
|-------------|----------------|---------------|-----------------|------------------------|
| 0.5 million | 2.69 seconds   | 8.06 seconds  | 4.165 seconds   | 10.50 seconds          |
| 2.5 million | 30 seconds     | 20.47 seconds | 25.16 seconds   | 54.45 seconds          |
| 15 million  | 7 minutes      | 6 minutes     | 5 minutes       | 11 minutes             |
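Timings like the ones in the table can be collected with a small Stopwatch harness. This is only a sketch under my own assumptions, since the measurement code itself is not shown here; Timing, Measure, and RowsPerSecond are hypothetical names.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action several times and returns the elapsed time of each run.
    public static TimeSpan[] Measure(Action action, int runs = 3)
    {
        var results = new TimeSpan[runs];
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            results[i] = stopwatch.Elapsed;
        }
        return results;
    }

    // Converts a row count and an elapsed time into rows per second.
    public static double RowsPerSecond(long rowCount, TimeSpan elapsed)
        => rowCount / elapsed.TotalSeconds;
}
```

Converting the table to throughput makes the rows easier to compare: 0.5 million rows in 2.69 seconds is roughly 186,000 rows per second, while 15 million rows in 5 minutes is 50,000 rows per second.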
To be frank, I am really surprised by the performance of the EF Core bulk extension. It's an open-source third-party extension, very easy to integrate, and it's free.
Observation:
If you are working with this amount of data (I mean the last category, 15 million records), some kind of raw SQL operation should be your first choice. If not, I think an ORM will do just fine.
Study:
If you want to know what the Default and TableLock options mean, I think this official doc can help you understand the concept very quickly.
Happy learning!