
Delete data in SQL Server

Create Date: August 24, 2020 at 06:12 PM         Tag: DATABASE ADMINISTRATION         Author Name: Sun, Charles

Best Practices For Deleting SQL Server Data

Problem

Deleting data from a SQL Server database is something I really don't like to do; it's always better when it can be done through an application that has been thoroughly tested. However, inevitably there comes a time when some backend data cleanup needs to be done. Below are some best practices I like to follow when deleting data.

Solution

Before we get into the details of this tip I just want to mention that you should ALWAYS HAVE A BACKUP when you are going to be deleting data. I know everyone knows how important backups are, but before you delete anything you probably want to verify you have a good one by restoring it somewhere first. You can never be too careful when it comes to removing data.
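As a minimal sketch of that backup-and-verify step (the database name, logical file names, and paths below are placeholders; adjust them for your environment):

-- Take a backup before touching any data.
BACKUP DATABASE MyDatabase
   TO DISK = N'C:\Backups\MyDatabase_PreDelete.bak'
   WITH CHECKSUM, INIT;

-- Quick check that the backup media is readable and complete.
RESTORE VERIFYONLY
   FROM DISK = N'C:\Backups\MyDatabase_PreDelete.bak';

-- Even better: restore it as a copy and rehearse the cleanup script there first.
RESTORE DATABASE MyDatabase_Copy
   FROM DISK = N'C:\Backups\MyDatabase_PreDelete.bak'
   WITH MOVE N'MyDatabase' TO N'C:\Data\MyDatabase_Copy.mdf',
        MOVE N'MyDatabase_log' TO N'C:\Data\MyDatabase_Copy_log.ldf';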

With that said, let's set up a couple of sample tables to run our example queries against. The first table below contains our actual data and the second one contains a list of records to be deleted from the main table. Here is the T-SQL to create and load these tables.

--Table Setup
CREATE TABLE Main (col1 INT, col2 INT);
CREATE TABLE ToDelete (col3 INT);

DECLARE @val INT
SELECT @val=1
WHILE @val < 50000
BEGIN  
   INSERT INTO Main VALUES (@val,round(rand()*100000,0));
   IF (@val % 1000) = 0
      INSERT INTO ToDelete VALUES (@val);
   SELECT @val=@val+1;
END;

Now let's remove the records from the Main table based on the records in the ToDelete table using the following simple query.

DELETE FROM Main WHERE col1 IN (SELECT col1 From ToDelete);

(49999 row(s) affected)

Oops. It looks like a few too many rows were deleted. What happened? If we look closely at the query, I put the wrong column name in the subquery by accident. Because col1 does not exist in ToDelete, SQL Server resolved it against the outer Main table as a correlated reference, so the IN condition was true for every row and every record was removed. Let's look at a few different ways we could have avoided this issue.

For our first solution, had I used a table alias for each table and qualified the column in the subquery, this would not have happened, as the query would have errored out as follows:

DELETE FROM Main WHERE col1 IN (SELECT td.col1 FROM ToDelete td);

Msg 207, Level 16, State 1, Line 2
Invalid column name 'col1'.

In this case I could then fix the query to use the correct column name and it would remove the records that were supposed to be removed in the first place.

DELETE FROM Main WHERE col1 IN (SELECT td.col3 FROM ToDelete td)
 
(49 row(s) affected)

Another solution to this issue, and one I use quite often, is to always check how many records are going to be deleted before running the delete. If we had created and run the script that follows, the SELECT returning 49999 would have warned us that far too many records were about to be deleted. I always comment out the DELETE part of the script so it does not get executed by accident.

SELECT COUNT(1)
--DELETE 
FROM Main WHERE col1 IN (SELECT col1 From ToDelete);

(No column name)
49999

Once I see that it has returned too many records, I can fix my query, confirm the count, and then run the DELETE successfully as shown below.

SELECT COUNT(1)
--DELETE 
FROM Main WHERE col1 IN (SELECT col3 From ToDelete);

-- SELECT output
(No column name)
49

-- DELETE output
(49 row(s) affected)

One final good practice to use when deleting data from a database is to always wrap your DELETE statement in a BEGIN TRAN - COMMIT/ROLLBACK TRAN code block. With the method outlined below you can run the BEGIN TRAN and your DELETE, then verify how many records were affected before you COMMIT your changes. As I did in the previous solution I've commented out the COMMIT part of my script so it does not get run by accident.

BEGIN TRAN
DELETE FROM Main WHERE col1 IN (SELECT col1 From ToDelete);
--ROLLBACK TRAN
--COMMIT TRAN

-- DELETE output
(49999 row(s) affected)

As with the previous example, once I see that too many rows were deleted I can roll back the transaction and fix the script as shown below. In this case it returned the correct number, and then I can run my COMMIT statement.

BEGIN TRAN
DELETE FROM Main WHERE col1 IN (SELECT col3 From ToDelete);
--ROLLBACK TRAN
--COMMIT TRAN

-- DELETE output
(49 row(s) affected)
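A variation on the same idea, not part of the original tip: have the script compare @@ROWCOUNT against the count you expect and roll back automatically if they differ. The expected value of 49 comes from the sample data above.

DECLARE @expected INT = 49;
DECLARE @actual INT;

BEGIN TRAN
DELETE FROM Main WHERE col1 IN (SELECT td.col3 FROM ToDelete td);
SET @actual = @@ROWCOUNT;   -- capture the row count immediately after the DELETE

IF @actual = @expected
   COMMIT TRAN;
ELSE
BEGIN
   PRINT 'Expected ' + CAST(@expected AS VARCHAR(10)) + ' rows, deleted '
       + CAST(@actual AS VARCHAR(10)) + '; rolling back.';
   ROLLBACK TRAN;
END;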

I understand this is a trivial example, but in the real world where related or foreign key columns are not always named to follow a standard or there is no good environment to test your scripts in, issues like this can pop up. It's always best to make sure you have a backup of your data and if possible test your scripts on a copy of the production database before running any updates or deletes. Even what you think might be a small update could end up being a big issue with the wrong syntax.

Next Steps

SQL Server: GOTO Statement 

 

This SQL Server tutorial explains how to use the GOTO statement in SQL Server (Transact-SQL) with syntax and examples.

Description

The GOTO statement causes execution to branch to the label named after the GOTO keyword.

Syntax

The syntax for the GOTO statement in SQL Server (Transact-SQL) consists of two parts - the GOTO statement and the Label Declaration:

GOTO statement

The GOTO statement consists of the GOTO keyword, followed by a label_name.

GOTO label_name;

Label Declaration

The Label Declaration consists of the label_name, followed by at least one statement to execute.

label_name:
 {...statements...}

Note

Example

Let's look at how to use the GOTO statement in SQL Server (Transact-SQL).

For example:

DECLARE @site_value INT;
SET @site_value = 0;

WHILE @site_value <= 10
BEGIN

   IF @site_value = 2
      GOTO TechOnTheNet;

   SET @site_value = @site_value + 1;

END;

TechOnTheNet:
   PRINT 'TechOnTheNet.com'; 

GO

In this GOTO example, we have created one label called TechOnTheNet. If @site_value equals 2, the code branches to the label called TechOnTheNet and prints 'TechOnTheNet.com'.

Example (a real-world cleanup script that uses GOTO in a CATCH block to jump to a final message):

--backup table (fragment from a larger script: the variables and the
--claim_procedure_id_cursor are declared and opened earlier in the full script)
set @table_name = ''
set @backup_table_name = 'Temp_' + @table_name + '_' + @comment + '_' + @ticketID
if object_id(@backup_table_name) is null
begin
    set @sql = 'select * into ' + @backup_table_name + ' from table WITH (NOLOCK) where id = ' + @id
    exec(@sql)
end
else
begin
    -- note the leading space before "select" so the dynamic SQL stays valid
    set @sql = 'insert into ' + @backup_table_name + ' select * from table WITH (NOLOCK) where id = ' + @id
    exec(@sql)
end

BEGIN TRAN T1
BEGIN TRY
    delete from table WITH (ROWLOCK)
    where id = @id
    COMMIT TRAN T1   -- commit the delete for this id
END TRY
BEGIN CATCH
    ROLLBACK TRAN T1
    SET @trancode = 'T1 Broken, cannot delete records for ' + @id
    SELECT @trancode, ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage
    goto done
END CATCH

fetch next from claim_procedure_id_cursor into @id
end
close claim_procedure_id_cursor
deallocate claim_procedure_id_cursor

done:
    print 'If there is an error message, please re-run the script during off-hours. Otherwise, the script ran successfully.'

 


Unique index in SQL Server

Create Date: July 08, 2020 at 03:15 PM         Tag: DATABASE ADMINISTRATION         Author Name: Sun, Charles

Create Unique Indexes

This topic describes how to create a unique index on a table in SQL Server 2019 (15.x) by using SQL Server Management Studio or Transact-SQL. A unique index guarantees that the index key contains no duplicate values and therefore every row in the table is in some way unique. There are no significant differences between creating a UNIQUE constraint and creating a unique index that is independent of a constraint. Data validation occurs in the same manner, and the query optimizer does not differentiate between a unique index created by a constraint or manually created. However, creating a UNIQUE constraint on the column makes the objective of the index clear. For more information on UNIQUE constraints, see Unique Constraints and Check Constraints.
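For illustration only, here is a minimal sketch of the two equivalent approaches on a hypothetical dbo.Employee table with a NationalID column (you would pick one or the other, not both):

-- Option 1: UNIQUE constraint (SQL Server backs it with a unique index automatically)
ALTER TABLE dbo.Employee
   ADD CONSTRAINT UQ_Employee_NationalID UNIQUE (NationalID);

-- Option 2: unique index created directly, independent of any constraint
CREATE UNIQUE NONCLUSTERED INDEX UX_Employee_NationalID
   ON dbo.Employee (NationalID);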

When you create a unique index, you can set an option to ignore duplicate keys. If this option is set to Yes and you attempt to create duplicate keys by adding data that affects multiple rows (with the INSERT statement), the row containing a duplicate is not added. If it is set to No, the entire insert operation fails and all the data is rolled back.
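In T-SQL this option corresponds to the IGNORE_DUP_KEY index setting. A sketch using made-up table and index names:

CREATE TABLE dbo.Codes (Code VARCHAR(10));

CREATE UNIQUE INDEX UX_Codes_Code
   ON dbo.Codes (Code)
   WITH (IGNORE_DUP_KEY = ON);

-- The second 'A' is skipped with the warning "Duplicate key was ignored." and the
-- other rows are inserted; with IGNORE_DUP_KEY = OFF the whole INSERT would fail.
INSERT INTO dbo.Codes (Code) VALUES ('A'), ('B'), ('A');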

 Note

You cannot create a unique index on a single column if that column contains NULL in more than one row. Similarly, you cannot create a unique index on multiple columns if the combination of columns contains NULL in more than one row. These are treated as duplicate values for indexing purposes.
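A quick sketch of that behavior, using throwaway object names:

CREATE TABLE dbo.NullDemo (Col1 INT NULL);
CREATE UNIQUE INDEX UX_NullDemo_Col1 ON dbo.NullDemo (Col1);

INSERT INTO dbo.NullDemo (Col1) VALUES (NULL);  -- succeeds
INSERT INTO dbo.NullDemo (Col1) VALUES (NULL);  -- fails with a duplicate key error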

In This Topic

Before You Begin

Benefits of a Unique Index

Typical Implementations

Unique indexes are implemented in the following ways: a PRIMARY KEY or UNIQUE constraint automatically creates a unique index on the specified column or columns, or a unique index can be created directly with CREATE UNIQUE INDEX, independent of any constraint.

Limitations and Restrictions

Security

Permissions

Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio

To create a unique index by using the Table Designer

  1. In Object Explorer, expand the database that contains the table on which you want to create a unique index.

  2. Expand the Tables folder.

  3. Right-click the table on which you want to create a unique index and select Design.

  4. On the Table Designer menu, select Indexes/Keys.

  5. In the Indexes/Keys dialog box, click Add.

  6. Select the new index in the Selected Primary/Unique Key or Index text box.

  7. In the main grid, under (General), select Type and then choose Index from the list.

  8. Select Columns, and then click the ellipsis (...).

  9. In the Index Columns dialog box, under Column Name, select the columns you want to index. You can select up to 16 columns. For optimal performance, select only one or two columns per index. For each column you select, indicate whether the index arranges values of this column in ascending or descending order.

  10. When all columns for the index are selected, click OK.

  11. In the grid, under (General), select Is Unique and then choose Yes from the list.

  12. Optional: In the main grid, under Table Designer, select Ignore Duplicate Keys and then choose Yes from the list. Do this if you want to ignore attempts to add data that would create a duplicate key in the unique index.

  13. Click Close.

  14. On the File menu, click Save table_name.

Create a unique index by using Object Explorer

  1. In Object Explorer, expand the database that contains the table on which you want to create a unique index.

  2. Expand the Tables folder.

  3. Expand the table on which you want to create a unique index.

  4. Right-click the Indexes folder, point to New Index, and select Non-Clustered Index....

  5. In the New Index dialog box, on the General page, enter the name of the new index in the Index name box.

  6. Select the Unique check box.

  7. Under Index key columns, click Add....

  8. In the Select Columns from table_name dialog box, select the check box or check boxes of the table column or columns to be added to the unique index.

  9. Click OK.

  10. In the New Index dialog box, click OK.

Using Transact-SQL

To create a unique index on a table

  1. In Object Explorer, connect to an instance of Database Engine.

  2. On the Standard bar, click New Query.

  3. Copy and paste the following example into the query window and click Execute.


    USE AdventureWorks2012;  
    GO  
    -- Find an existing index named AK_UnitMeasure_Name and delete it if found  
    IF EXISTS (SELECT name from sys.indexes  
               WHERE name = N'AK_UnitMeasure_Name')   
       DROP INDEX AK_UnitMeasure_Name ON Production.UnitMeasure;   
    GO  
    -- Create a unique index called AK_UnitMeasure_Name  
    -- on the Production.UnitMeasure table using the Name column.  
    CREATE UNIQUE INDEX AK_UnitMeasure_Name   
       ON Production.UnitMeasure (Name);   
    GO  
    

For more information, see CREATE INDEX (Transact-SQL).

SQL SERVER – Performance Impact of Unique Index

 

Today I am going to share a story about a client who has hired me more than 7 times for my most popular consulting service, the Comprehensive Database Performance Health Check, for different servers and situations. In the latest engagement, we ran into a very interesting situation involving the performance impact of a unique index.

Background Story of Unique Index

In the recent engagement, the client wanted a minimum 20% improvement in the performance of their system. We had a very interesting situation: in the first round of the health check we had already gone through all the low-hanging fruit and boosted SQL Server performance by over 100%. Now in this round, they still wanted to push the performance a further 20%. This time we had to look beyond the obvious issues and dig deeper into their system.

It would have been easy for me to help them tune their queries and get the additional performance, but they had a very straightforward requirement: no change in code. In their business, changing code would require a lengthy deployment and release cycle, and they wanted to stay away from that.

Performance Impact of Unique Index

As we had already gone through the server configurations and database settings along with the indexes, the remaining option was to look at the queries. However, as mentioned earlier in this blog post, I was not allowed to change the code, and that led me to look at their table structures in a bit more depth.

While looking into the table structures, I realized that most of the indexes were not marked as unique. After talking to a developer and a senior architect, we confirmed that the indexed columns actually contained unique data, so the indexes could safely be made unique. We changed nearly 70% of the indexes from non-clustered indexes to unique non-clustered indexes and noticed a lot of performance improvement. At the end of the day, when my client ran the final performance test, they discovered that just by changing the index type we had gained nearly 23% in performance, which was 3% more than the target.
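Not the author's script, but a sketch of how one might check whether an existing index key actually qualifies before converting it. The table dbo.MyTable, column Col1, and index name IX_MyTable_Col1 are hypothetical; with DROP_EXISTING = ON the name must match the existing index.

-- Check whether the key column really is unique (COUNT(DISTINCT ...) ignores NULLs,
-- so also make sure Col1 contains at most one NULL).
SELECT COUNT(*) AS TotalRows, COUNT(DISTINCT Col1) AS DistinctNonNullValues
FROM dbo.MyTable;

-- If the data qualifies, rebuild the existing index as unique in place.
CREATE UNIQUE NONCLUSTERED INDEX IX_MyTable_Col1
   ON dbo.MyTable (Col1)
   WITH (DROP_EXISTING = ON);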

Reproduction of the Scenario

First, we will create two tables. On one table we will create a non-clustered index and on another one, we will create a non-clustered unique index.

CREATE TABLE Table1 (ID INT IDENTITY (1,1), Col1 VARCHAR(256) DEFAULT '');
CREATE INDEX IX_NC_NOT_Unique ON Table1 (Col1);

CREATE TABLE Table2 (ID INT IDENTITY (1,1), Col1 VARCHAR(256) DEFAULT '');
CREATE UNIQUE INDEX IX_NC_Unique ON Table2 (Col1);

Now let us populate both tables with data.

INSERT INTO Table1 (Col1)
SELECT TOP 1000000 CAST(a1.name AS VARCHAR(100))
   + CAST(ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS VARCHAR(100))
FROM sys.all_columns a1
CROSS JOIN sys.all_columns a2
CROSS JOIN sys.all_columns a3
GO

INSERT INTO Table2 (Col1)
SELECT TOP 1000000 CAST(a1.name AS VARCHAR(100))
   + CAST(ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS VARCHAR(100))
FROM sys.all_columns a1
CROSS JOIN sys.all_columns a2
CROSS JOIN sys.all_columns a3
GO

Once the data is populated, let us turn on the statistics.

SET STATISTICS IO ON
GO

Additionally, let us turn on the actual execution plan. In SQL Server Management Studio (SSMS) you can enable it from the Query menu by selecting Include Actual Execution Plan (Ctrl+M) before running the batch.
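If you prefer to do this from T-SQL rather than the SSMS menu, the following option returns the actual execution plan as XML alongside each statement's results:

SET STATISTICS XML ON
GO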

Now run the following two statements in a single batch.

SELECT DISTINCT Col1
FROM Table1
GO

SELECT DISTINCT Col1
FROM Table2
GO

At this point, we will observe two things: 1) the statistics and 2) the execution plan.

Statistics Details in Messages

Let us see details about the statistics in the messages. I have removed details which were not relevant and kept only the logical reads.

Table 'Table1'. Scan count 1, logical reads 5612
Table 'Table2'. Scan count 1, logical reads 5435

You can see in the statistics details that the logical reads are lower for Table2, which has the unique non-clustered index: 5,435 reads versus 5,612. The unique non-clustered index helped reduce the IO from disk by a couple of hundred pages.

Winner: Unique Indexes by saving IO

Execution Plan

When we looked at the execution plans, we noticed that the query against Table1 (non-unique index) accounted for 55% of the batch cost, while the query against Table2 (unique index) accounted for 45%.


It is clear that the unique index also removed an operator, the Stream Aggregate, from the execution plan and performed better in terms of overall query cost. Because the index guarantees that Col1 contains no duplicates, the optimizer knows the DISTINCT is redundant and can skip the aggregation step entirely.

Winner: Unique Indexes by saving overall resources

Here is the code to clean up our scenario.

DROP TABLE Table1;
DROP TABLE Table2;

Final Verdict

In the various tests I have done in the past, converting an existing nonclustered index to a unique nonclustered index has always helped performance.

However, SQL Server is a huge world, and it is entirely possible that I simply have not come across a scenario where unique nonclustered indexes are slower. I am sure such a scenario exists somewhere, and if you run into it, please share the details with me; I will publish them on this blog with due credit to you.

Well, until we find such a counterexample, the unique nonclustered index is a clear winner.

Reference: Pinal Dave (https://blog.sqlauthority.com)


SQL Server automatically ignores trailing spaces in the WHERE clause

Create Date: May 11, 2020 at 02:07 PM         Tag: SQL         Author Name: Sun, Charles

When I compared a string value with trailing white space to the same value without it, SQL Server returned the same results. It looks like SQL Server effectively ignores trailing spaces in string comparisons. I found the useful post below.

Spaces in WHERE clause for SQL Server [duplicate]

I want to find out the records in which a certain column contains exactly one space and nothing else. So I wrote the first of the following queries:

select COUNT(*)
from mytable
where col = ' ' -- One space

select COUNT(*)
from mytable
where col = '  ' -- Two spaces

select COUNT(*)
from mytable
where col = '   ' -- Three spaces

However, all three queries return the exact same records. Does Microsoft SQL Server not distinguish between different numbers of trailing spaces? How can I query for exactly one, two, or more spaces?

 

Yes, it ignores trailing spaces in comparisons. SQL Server follows the ANSI SQL padding rules for the = comparison: the shorter string is padded with spaces to the length of the longer one before comparing, so ' ', '  ', and '   ' all compare as equal.

You can try to append a delimiting character.

SELECT count(*)
FROM mytable
WHERE col + 'X' = ' X';
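
Another option, assuming col is VARCHAR rather than NVARCHAR: DATALENGTH counts trailing spaces (unlike LEN), so it can pin the value to exactly one space.

-- Matches rows whose value is exactly one space (1 byte for VARCHAR).
SELECT COUNT(*)
FROM mytable
WHERE col = ' '
  AND DATALENGTH(col) = 1;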