I love SQL Server internals, I do and I just said it.
Why? because thanks to all the tools, documentation and community members that share their knowledge, folks like me can understand how a super complex piece of software like a relational database engine works (or at least a small part of it).
And I’m definitely not a genius, unfortunately 🙂
Most of us DBAs are familiar with the concept of index fragmentation and understand that in order to fix it, we need to have appropriate routines to remove it.
Also is wide known that in order to minimize it (fragmentation) we can tweak something called FILLFACTOR.
When an index is created or rebuilt, the fill-factor value determines the percentage of space on each leaf-level page to be filled with data, reserving the remainder on each page as free space for future growth. For example, specifying a fill-factor value of 80 means that 20 percent of each leaf-level page will be left empty
By reading the documentation and taking the example of FillFactor 80, you might get the impression that in the index (clustered or not) we will always keep this 20% free space to prevent fragmentation, including the new pages allocated after you create or rebuild the index.
That sounds really cool, if each page keeps 20 percent free, future rows can be allocated or current rows with variable length columns can be expanded without incurring in page splits.
Let’s see it in an example, I will create and load one table with FILLFACTOR 100 and another with FILLFACTOR 10 (to see the extreme), then we will see how the content of a page from each table are different.
USE fill_factor DROP TABLE IF EXISTS dbo.Seq_100 CREATE TABLE dbo.Seq_100 ( Id INT IDENTITY(1,10) NOT NULL, Col1 CHAR(50) DEFAULT REPLICATE('A', 50) ) ALTER TABLE dbo.Seq_100 ADD CONSTRAINT Pk_Seq_100 PRIMARY KEY (Id) WITH( FILLFACTOR = 100) GO INSERT INTO dbo.Seq_100 DEFAULT VALUES GO 1000 DROP TABLE IF EXISTS dbo.Seq_10 CREATE TABLE dbo.Seq_10 ( Id INT IDENTITY(1,10) NOT NULL, Col1 CHAR(50) DEFAULT REPLICATE('A', 50) ) ALTER TABLE dbo.Seq_10 ADD CONSTRAINT Pk_Seq_10 PRIMARY KEY (Id) WITH( FILLFACTOR = 10) GO INSERT INTO dbo.Seq_10 DEFAULT VALUES GO 1000
Tables are created and loaded, so let’s have a look to both table’s metadata:
SELECT OBJECT_NAME(object_id) AS [object_name], * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Seq_100'),1, NULL, 'DETAILED') SELECT OBJECT_NAME(object_id) AS [object_name], * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Seq_10'),1, NULL, 'DETAILED')
Don’t know you, but I’d say those
tables clustered indexes look very much the identical, same number of rows, fragmentation, percentage of space used? hm what is going on?
Let’s get a random page from each and compare the headers
DBCC IND('fill_factor', 'dbo.Seq_100', 1) DBCC TRACEON (3604) -- Your page number will be different DBCC PAGE (fill_factor, 1, 328, 3);
DBCC IND('fill_factor', 'dbo.Seq_10', 1) DBCC TRACEON (3604) -- Your page number will be different DBCC PAGE (fill_factor, 1, 384, 3);
m_freeCnt stores the number for bytes free in the page, in both cases 32 bytes.
So, both tables’ pages are filled until there wasn’t enough space to add another row, regardless of the FILLFACTOR, yes, not joking.
What is the point of configuring it then? Let’s rebuild the one with FILLFACTOR 10 and check again physical stats.
ALTER INDEX Pk_Seq_10 ON dbo.Seq_10 REBUILD SELECT OBJECT_NAME(object_id) AS [object_name], * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Seq_10'),1, NULL, 'DETAILED')
Now after rebuilding, we can see that the percentage used per page matches the FILLFACTOR, about 10%.
If we add more rows, subsequent new pages created will be filled to 100% again, which makes me really sad.
And the key point was When an index is created or rebuilt, which obviously does not affect to new allocated pages and that includes all the data inserted after creating a new table for instance.
Can we do better, Microsoft?
I believe yes, we can do better, let me explain.
SQL Server index pages are a double linked chain, so each page points to the previous and to the next, this can be seen using DBCC IND or in the page header itself.
Would it be possible that the SQL Server engine honor the FILLFACTOR if the page is the last page (does not link to a next page)? I think yes, in the same way SQL Server needs to know how much free space is in a page before trying to put some data there.
Once the page is no longer the last of the table/index, SQL Server should ignore FILLFACTOR and write until 100% full or even split if that was the case.
Proceeding as explained, all pages would initially have space to prevent the terrible fragmentation we have been fighting every night since decades ago.
This is just an idea.
I whish FILLFACTOR had some sort of mechanism that will start writing new pages once we reach the magical number instead of applying only to the existing data when we create or rebuild an index, clustered or not.
Unfortunately, it is not the case and new pages will be always written until there is no more space, leading to future fragmentation, which was what we were trying to avoid by changing FILLFACTOR.
This problem is somehow exacerbated in clustered indexes or in Standard Edition, where online rebuilds are not a thing or if you cannot allow a maintenance window every so often.
Hope you enjoyed reading and please feel free to use the comments below.