marko devcic

  • github:
    deva666
  • email:
    madevcic {at} gmail.com

Bulk Database insert

Posted on 23 August 2014

I was really interested in performance of using a classic for loop for inserting multiple rows into a database, or to be more precise, I was interested how bad the performance of the for loop actually is.

I wanted to compare Entity framework, ADO.NET, SQL Server stored procedure and SqlBulkCopy class to see how they handle inserting 10, 100, 1000 and 10000 rows at once. So I wrote a really simple app to test them. I used EF v.6.01 and SQL Server 2012.

The database I created is also simple, it has only one column.

CREATE TABLE TestTable(
    RowKey int NOT NULL,
 CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED 
(
    RowKey ASC
))

The stored procedure accepts a XML parameter with values for inserting:

CREATE PROCEDURE XMLInsert
@pXml Xml
AS
BEGIN
    BEGIN TRANSACTION;
    SET NOCOUNT ON;
    select 
        T.node.value('.', 'INT') as Num
    into #temp
    from @pXml.nodes('/root/element') as T (node)

    insert into TestTable
    (RowKey)    
    (select Num from #temp)

    if @@ERROR <> 0
        ROLLBACK TRANSACTION;
    else
        COMMIT TRANSACTION;

END

EF supports atomic transactions, i.e. if exception is raised during SaveChanges call, the whole transaction is rolled back and none of the rows will be inserted. I presume this doesn't bring a big performance hit, but nevertheless the stored procedure is under the transaction, so is the code using ADO.NET.

Here's the complete code for testing them:

        private void test_Click(object sender, RoutedEventArgs e)
        {
            Task.Run(() => TestBulkInsert(1))
               .ContinueWith(t => TestBulkInsert(10))
               .ContinueWith(t => TestBulkInsert(100))
               .ContinueWith(t => TestBulkInsert(1000))
               .ContinueWith(t => TestBulkInsert(10000));
        }

        private void TestBulkInsert(int rows)
        {
            List<TestTable> addRangeRows = GetAddRangeRows(rows);

            List<TestTable> forLoopRows = GetForLoopRows(rows);

            XElement xml = GetXml(rows);

            DataTable table = GetDataTable(rows);

            Stopwatch sw = new Stopwatch();
            using (var context = new TestInsertEntities())
            {
                sw.Start();

                context.TestTable.AddRange(addRangeRows);
                context.SaveChanges();

                Console.WriteLine(string.Format("EF AddRange insert for {0} rows: {1} ", rows, sw.Elapsed));

                sw.Restart();

                foreach (var t in forLoopRows)
                {
                    context.TestTable.Add(t);
                }

                context.SaveChanges();
                sw.Stop();
                Console.WriteLine(string.Format("EF for loop insert for {0} rows: {1} ", rows, sw.Elapsed));

                sw.Restart();

                context.XMLInsert(xml.ToString());

                sw.Stop();

                Console.WriteLine(string.Format("Stored procedure XML insert for {0} rows: {1} ", rows, sw.Elapsed));

                sw.Stop();
            }

            string insertCommand = "INSERT INTO TESTTABLE VALUES(@param);";

            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                SqlCommand command = new SqlCommand(insertCommand, connection);
                command.CommandType = CommandType.Text;
                command.Parameters.Add("@param", SqlDbType.Int);

                sw.Start();

                connection.Open();
                var transaction = connection.BeginTransaction();
                command.Transaction = transaction;
                try
                {
                    for (int i = 3 * rows + 1; i < 4 * rows + 1; i++)
                    {
                        command.Parameters["@param"].Value = i;
                        command.ExecuteNonQuery();
                    }
                }
                catch (Exception)
                {
                    transaction.Rollback();
                }

                transaction.Commit();

                connection.Close();
                sw.Stop();
                Console.WriteLine(string.Format("ADO.net for loop insert for {0} rows: {1} ", rows, sw.Elapsed));

                sw.Reset();
                sw.Start();

                connection.Open();
                using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
                {
                    bulkCopy.DestinationTableName = "TestTable";
                    bulkCopy.WriteToServer(table);
                }
                connection.Close();
                sw.Stop();

                Console.WriteLine(string.Format("SqlBulkCopy insert for {0} rows: {1} ", rows, sw.Elapsed));

                connection.Open();
                command = new SqlCommand("DELETE FROM TESTTABLE;", connection);
                command.ExecuteNonQuery();
                connection.Close();
            }
        }

        private static DataTable GetDataTable(int rows)
        {
            DataTable table = new DataTable();
            table.Columns.Add("RowKey", typeof(int));
            for (int i = 4 * rows + 1; i < 5 * rows + 1; i++)
            {
                table.Rows.Add(i);
            }
            return table;
        }

        private static XElement GetXml(int rows)
        {
            XElement xml = new XElement("root");
            for (int i = 2 * rows + 1; i < 3 * rows + 1; i++)
            {
                XElement x = new XElement("element");
                x.Add(i);
                xml.Add(x);
            }
            return xml;
        }

        private static List<TestTable> GetForLoopRows(int rows)
        {
            List<TestTable> forLoopRows = new List<TestTable>(1000);

            for (int i = 1 + rows; i < 2 * rows + 1; i++)
            {
                TestTable t = new TestTable() { RowKey = i };
                forLoopRows.Add(t);
            }
            return forLoopRows;
        }

        private static List<TestTable> GetAddRangeRows(int rows)
        {
            List<TestTable> addRangeRows = new List<TestTable>(rows);
            for (int i = 1; i < 1 + rows; i++)
            {
                TestTable t = new TestTable() { RowKey = i };
                addRangeRows.Add(t);
            }
            return addRangeRows;
        }

Results

Results 1 row

EF AddRange : 00.8875315 sec

EF for loop : 00.0109685 sec

Stored procedure : 00.0605031 sec

ADO.NET for loop : 00.0665507 sec

SqlBulkCopy : 00.0070696 sec

Results 10 rows

EF AddRange : 00.0865438 sec

EF for loop : 00.0878373 sec

Stored procedure : 00.0099671 sec

ADO.NET for loop : 00.0509694 sec

SqlBulkCopy : 00.0068580 sec

Results 100 rows

EF AddRange : 00.6969128 sec

EF for loop : 00.7041873 sec

Stored procedure : 00.0152227 sec

ADO.NET for loop : 00.3681421 sec

SqlBulkCopy : 00.0065223 sec

Results 1000 rows

EF AddRange : 06.8745769 sec

EF for loop : 08.1898612 sec

Stored procedure : 00.0192753 sec

ADO.NET for loop : 03.7156545 sec

SqlBulkCopy : 00.0163246 sec

Results 10000 rows

EF AddRange : 67.2680139 sec

EF for loop : 193.0319509 sec

Stored procedure :00.1364101 sec

ADO.NET for loop : 34.9175692 sec

SqlBulkCopy : 00.0533580 sec

OK, no surprises here, if you are not using a file based databased, like SQLite, SqlBulkCopy or Stored procedures are the way to go.

It's also worth noting how ADO.NET really outperforms the EF when inserting 10 or more rows.