Shrinking large half-empty MDF

Shrinking your database is not a best practice (sometimes can be acceptable) but it might be trickier than you expect Once again I’ve found my inspiration browsing linkedin SQL Server groups. Love it!

First I have to say I’m a big fan of DBCC SHRINK, I use it all the time…

NO, that was a bad joke.

This case is possible one of the unique scenarios where you might want to SHRINK your database, because as we all know, SHRINK is evil.

The scenario

You have a large database and you’ve done a massive cleanup, so the result is that half your data file is empty and your sysadmin is pushing you to give that precious space back.

Up to here everything sounds reasonable, you’re not using that space and you don’t have plans to use it (important!!), that can be because the database is no longer active for new transactions but still need to be available or some other reason.

The problem comes when you see that SHRINKing the database does not work as you’d expect, so no much space is given back to the OS.

There is an additional thing to consider, thanks to how SHRINK works, every time you run it you get a very very fragmented database which might hurt the performance of your application.

Create the playground

I always like showing you real examples (simplified) so you can get a better understanding of what is going on, so let’s go prepare the field for the battle.

--==========================================================================
-- Create the playground
--==========================================================================
USE [master]
GO

IF DB_ID('DBCC_SHRINKFILE') IS NOT NULL BEGIN 
	ALTER DATABASE [DBCC_SHRINKFILE] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
	DROP DATABASE [DBCC_SHRINKFILE] 
END
GO

USE [master]
GO

CREATE DATABASE [DBCC_SHRINKFILE]
GO

USE [DBCC_SHRINKFILE]
GO

SELECT IDENTITY(INT, 1, 1) AS Id, o.*
	INTO dbo.random_objects
	FROM sys.objects AS o
		CROSS JOIN sys.objects AS o2
		CROSS JOIN sys.objects AS o3

ALTER TABLE dbo.random_objects ADD CONSTRAINT PK_random_objects PRIMARY KEY (Id)
GO

SELECT * 
	INTO dbo.random_objects_NEW
	FROM dbo.random_objects

ALTER TABLE dbo.random_objects_NEW ADD CONSTRAINT PK_random_objects_NEW PRIMARY KEY (Id)
GO

-- See file size and usage
SELECT CONVERT(DECIMAL(10,2), df.size * 8. / 1024) AS Size_MB
		, CONVERT(DECIMAL(10,2), FILEPROPERTY(df.name, 'SpaceUsed') * 8. / 1024) AS SpaceUsed_MB 
	FROM sys.database_files AS df
	WHERE df.type_desc = 'ROWS'
GO

Now you have 2 tables, one to be deleted and another to be kept, so let’s drop that useless table.

--==========================================================================
-- We do some cleanup, old tables are removed
--==========================================================================

DROP TABLE dbo.random_objects
-- Empty space at the begining of the file

-- See file size and usage, size still the same, usage is less so there is some empty space
SELECT CONVERT(DECIMAL(10,2), df.size * 8. / 1024) AS Size_MB
		, CONVERT(DECIMAL(10,2), FILEPROPERTY(df.name, 'SpaceUsed') * 8. / 1024) AS SpaceUsed_MB 
	FROM sys.database_files AS df
	WHERE df.type_desc = 'ROWS'
GO

First Approach

At this point if we are certainly sure we will not need that space never again, we might want to SHRINK our database (or data file) to see if it does what is says on the tin

--==========================================================================
-- First approach, 
--==========================================================================

-- SHRINK
DBCC SHRINKDATABASE([DBCC_SHRINKFILE])

-- See file size and usage, size still the same, usage is less so there is some empty space
SELECT CONVERT(DECIMAL(10,2), df.size * 8. / 1024) AS Size_MB
		, CONVERT(DECIMAL(10,2), FILEPROPERTY(df.name, 'SpaceUsed') * 8. / 1024) AS SpaceUsed_MB 
	FROM sys.database_files AS df
	WHERE df.type_desc = 'ROWS'
GO

-- SHRINK = Fragmentation
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED')
GO

So after all this effort, we need to rebuild our index

-- Remove fragmentation
ALTER INDEX PK_random_objects_NEW ON dbo.random_objects_NEW REBUILD

-- We have save spaced from the dropped table, but still 50% of the file is empty
SELECT CONVERT(DECIMAL(10,2), df.size * 8. / 1024) AS Size_MB
		, CONVERT(DECIMAL(10,2), FILEPROPERTY(df.name, 'SpaceUsed') * 8. / 1024) AS SpaceUsed_MB 
	FROM sys.database_files AS df
	WHERE df.type_desc = 'ROWS'
GO

You can see that although we managed to get some space back, we are still wasting half the size, which was used to REBUILD our index and remove the fragmentation.
That probably will happen on the next maintenance window because we are proactive DBAs and do regular index maintenance, right??

Second Approach

So how to overcome this problem, the approach above will make our data file to become a yo-yo, grow-shrink, grow-shrink and so on, and the space will never be given back.

But there is another approach you might want to try. First you might want run the first script again to reset everything.

--==========================================================================
-- Second approach, 
--==========================================================================

-- Create a new FILEGROUP
ALTER DATABASE [DBCC_SHRINKFILE] ADD FILEGROUP [USER_DATA]

-- Add a file
ALTER DATABASE [DBCC_SHRINKFILE] 
ADD FILE (
	NAME = N'DBCC_SHRINKFILE_user_data'
	, FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQL2014\MSSQL\DATA\DBCC_SHRINKFILE_user_data.ndf')
TO FILEGROUP [USER_DATA]

-- Rebuild your tables in the new filegroup
USE [DBCC_SHRINKFILE]
GO

CREATE UNIQUE CLUSTERED INDEX PK_random_objects_NEW ON dbo.random_objects_NEW ([Id] ASC)
WITH (DROP_EXISTING = ON)
ON [USER_DATA]
GO

-- Now shrink the file which is now almost empty
DBCC SHRINKFILE(DBCC_SHRINKFILE)

-- Now both files take just as much as they need
SELECT df.name
		, CONVERT(DECIMAL(10,2), df.size * 8. / 1024) AS Size_MB
		, CONVERT(DECIMAL(10,2), FILEPROPERTY(df.name, 'SpaceUsed') * 8. / 1024) AS SpaceUsed_MB 
	FROM sys.database_files AS df
	WHERE df.type_desc = 'ROWS'
GO

This approach as you can see, allows you to move all your data from one filegroup to another, so you can SHRINK the mdf file once the user data is somewhere else.

And what is better, you will get brand new indexes without any fragmentation, so this is a win-win approach I’d say.

One more little thing, it’s a good idea to leave the PRIMARY filegroup for metadata only, so you can run this in case you forget specifying the filegroup when creating tables or indexes.

ALTER DATABASE DBCC_SHRINKFILE MODIFY FILEGROUP [USER_DATA] DEFAULT

Thanks for reading and feel free to throw me any question you may have.

2 thoughts on “Shrinking large half-empty MDF”

Thomas Franz says:

Monday May 2nd, 2016 at 04:41 PM

Beside the valid point regarding the metadata-only-PRIMARY-filegroup, your approach would need a lot of scripting on real databases with many tables, indexes (maybe some SPATIAL or XML, that prevents you from dropping the PK). And as far I know NONCLUSTERED indexes will not automatical be moved to another FG, just because you move the CLUSTERED index …

It would be much easier to add another file to the existing filegroup(s) and run
DBCC SHRINKFILE (N’my_old_file’ , EMPTYFILE)
This will move the data out of the original file into a new one and you could drop the old one (or shrink it again to e.g. 1 MB and move the data back using the same command, if you had disk space issues and placed the new file temporary on another drive)

1. Raul says:
  
  Monday May 2nd, 2016 at 05:15 PM
  
  Thanks for your comment!
  
  Well, I see your point too, but I can tell from my experience that DBCC SHRINKFILE (xxx, EMPTYFILE) is quite expensive to run too, to the point I’ve never seen it finishing in large size databases with files of many GB’s
  
  For the PK constraint in my example you can see that rebuilding the clustered index with DROP_EXISTING does the job nicely, no need to drop the constraint, hence no need to drop FK’s.
  It does not move all non clustered indexes, true, but most likely you have code to script all your indexes already, or you can use SSMS to do it for you.
  
  Also if you do DBCC SHRINKFILE (N’my_old_file’ , EMPTYFILE) you will get eventually all your indexes with a nice 99% fragmentation, so in order to rebuild them, you will need extra storage and have the same problem on your new file.
  
  Cheers!

2 thoughts on “Shrinking large half-empty MDF”

Leave a Reply Cancel reply