{"id":342,"date":"2016-10-19T09:39:40","date_gmt":"2016-10-19T08:39:40","guid":{"rendered":"https:\/\/sqldoubleg.live-website.com\/?p=342"},"modified":"2016-10-19T09:41:43","modified_gmt":"2016-10-19T08:41:43","slug":"successful-anti-patterns-storage-requirements","status":"publish","type":"post","link":"https:\/\/www.sqldoubleg.com\/es\/2016\/10\/19\/successful-anti-patterns-storage-requirements\/","title":{"rendered":"Successful Anti-Patterns, Storage Requirements"},"content":{"rendered":"<p>One of the things you realize with experience is that there is not one absolute truth when speaking about SQL Server\u2026 that comes along with the magical answer for most of the questions, \u201cit depends\u201d. But what makes the difference it\u2019s to know what it depends on.&nbsp;One of the things you realize with experience is that there is not one absolute truth when speaking about SQL Server&#8230; that comes along with the magical answer for most of the questions, \u00abit depends\u00bb.<\/p>\n<p>But what makes the difference it&#8217;s to know what it depends on.<\/p>\n<p>Lots have been written, including myself, about the advantage and disadvantages of the different types of tables that exist in SQL Server, <a href=\"https:\/\/sqldoubleg.live-website.com\/tag\/clustered-index\/\" target=\"_blank\">Clustered<\/a> and <a href=\"https:\/\/sqldoubleg.live-website.com\/tag\/heap-tables\/\" target=\"_blank\">Heaps<\/a>.<\/p>\n<p>Today I will  again, because at the end of the day, a table is the most basic structure in a database and we need them to store the data, which is indeed the purpose of having a database, right?<\/p>\n<p>When explaining Heaps, <a href=\"http:\/\/https:\/\/msdn.microsoft.com\/en-us\/library\/hh213609.aspx\" target=\"_blank\">Books online<\/a> specifies that <\/p>\n<blockquote><p>Most tables should have a carefully chosen clustered index unless a good reason exists for leaving the table as a heap.<\/p><\/blockquote>\n<p>So if we see what a clustered index is, we also find in <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/ms190457.aspx\" target=\"_blank\">Books online<\/a> the following definition:<\/p>\n<blockquote><p>Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.<\/p><\/blockquote>\n<p>I explained before that <a href=\"https:\/\/sqldoubleg.live-website.com\/2016\/02\/29\/clustered-index-row-order\/\" target=\"_blank\">this order might be a bit different than you think<\/a>, but it&#8217;s pretty similar, and that bring some benefits, but also some limitations.<\/p>\n<p>&nbsp;<br \/>\n<strong>Clustered index and page free space<\/strong><\/p>\n<p>On one hand we have the order of the rows within a clustered index, which is enforced by the clustering key, and on the other we have that each page in SQL Server is exactly 8192 bytes, from those there are 8096 bytes usable and 96 bytes reserved for internal use, not much.<\/p>\n<p>That shouldn&#8217;t be a problem when we have narrow tables that mostly use fixed length data types, but if we need to store very variable length columns your rows, your rows can vary from a few bytes to hundreds or thousands. Think about &#8216;Comments&#8217; or &#8216;Long description&#8217; we&#8217;ve all seen in the wild.<\/p>\n<p>Let&#8217;s have a look to some queries <\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nUSE master\r\nGO\r\nIF DB_ID('test') IS NOT NULL BEGIN\r\n\tALTER DATABASE test SET SINGLE_USER WITH ROLLBACK IMMEDIATE\r\n\tDROP DATABASE test \r\nEND\r\nGO\r\nCREATE DATABASE test\r\nGO \r\n\r\nUSE test\r\nGO\r\n\r\nCREATE TABLE dbo.clustered_table(\r\nID INT NOT NULL IDENTITY PRIMARY KEY CLUSTERED \r\n, Col1 NVARCHAR(4000) NOT NULL)\r\nGO \r\n\r\nINSERT INTO dbo.clustered_table (Col1) VALUES (REPLICATE (N'A', 3000))\r\nGO 1000\r\n\r\nDBCC IND('test', 'dbo.clustered_table', 1)\r\nGO\r\n\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/01_DBCC_IND_clustered_table.png\" table=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/01_DBCC_IND_clustered_table.png\" alt=\"01_dbcc_ind_clustered_table\" width=\"1152\" height=\"245\" class=\"aligncenter size-full wp-image-343\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/01_DBCC_IND_clustered_table.png 1152w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/01_DBCC_IND_clustered_table-300x64.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/01_DBCC_IND_clustered_table-1024x218.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/01_DBCC_IND_clustered_table-150x32.png 150w\" sizes=\"(max-width: 1152px) 100vw, 1152px\" \/><\/a><\/p>\n<p>&nbsp;<br \/>\nIf we inspect one of the data pages we can see interesting information on its header.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nDBCC TRACEON(3604)\r\nGO\r\n\r\nDBCC PAGE('test',1,144, 3)\r\nGO\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/02_DBCC_Page_freeSpace.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/02_DBCC_Page_freeSpace.png\" alt=\"02_dbcc_page_freespace\" width=\"1150\" height=\"317\" class=\"aligncenter size-full wp-image-344\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/02_DBCC_Page_freeSpace.png 1150w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/02_DBCC_Page_freeSpace-300x83.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/02_DBCC_Page_freeSpace-1024x282.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/02_DBCC_Page_freeSpace-150x41.png 150w\" sizes=\"(max-width: 1150px) 100vw, 1150px\" \/><\/a><\/p>\n<p>&nbsp;<br \/>\nSo we have that for each page the free space is 2079 bytes which makes more than 25% of the usable space (8096 bytes) not being used.<\/p>\n<p>&nbsp;<br \/>\n<strong>Heap Table and page free space<\/strong><\/p>\n<p>If now we do the same with a HEAP table, what&#8217;s going to happen?<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nUSE test\r\nGO\r\n\r\nCREATE TABLE dbo.heap_table(\r\nID INT NOT NULL IDENTITY  \r\n, Col1 NVARCHAR(4000) NOT NULL)\r\n\r\nGO \r\n\r\nINSERT INTO dbo.heap_table (Col1) VALUES (REPLICATE (N'A', 3000))\r\nGO 1000\r\n\r\nDBCC IND('test', 'dbo.heap_table', 1)\r\nGO\r\n\r\nDBCC TRACEON(3604)\r\n\r\nDBCC PAGE('test',1,179, 3)\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/03_DBCC_Page_freeSpace_heap.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/03_DBCC_Page_freeSpace_heap.png\" alt=\"03_dbcc_page_freespace_heap\" width=\"833\" height=\"312\" class=\"aligncenter size-full wp-image-345\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/03_DBCC_Page_freeSpace_heap.png 833w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/03_DBCC_Page_freeSpace_heap-300x112.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/03_DBCC_Page_freeSpace_heap-150x56.png 150w\" sizes=\"(max-width: 833px) 100vw, 833px\" \/><\/a><\/p>\n<p>&nbsp;<br \/>\nWe can also see that the number of pages allocated for each table are pretty similar, <\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nSELECT OBJECT_NAME(ix.object_id)\r\n\t\t, ix.name\r\n\t\t, au.*\r\n\t\t, p.rows\r\n\tFROM sys.indexes AS ix\r\n\t\tINNER JOIN sys.partitions AS p\r\n\t\t\tON p.object_id = ix.object_id \r\n\t\t\t\tAND p.index_id = ix.index_id\r\n\t\tINNER JOIN sys.allocation_units AS au\r\n\t\t\tON au.container_id = p.hobt_id\t\r\nWHERE ix.object_id IN (OBJECT_ID('dbo.heap_table'), OBJECT_ID('dbo.clustered_table'))\r\nGO\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/04_allocated_pages.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/04_allocated_pages.png\" alt=\"04_allocated_pages\" width=\"1052\" height=\"153\" class=\"aligncenter size-full wp-image-346\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/04_allocated_pages.png 1052w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/04_allocated_pages-300x44.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/04_allocated_pages-1024x149.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/04_allocated_pages-150x22.png 150w\" sizes=\"(max-width: 1052px) 100vw, 1052px\" \/><\/a><\/p>\n<p>&nbsp;<br \/>\nNot very different, right? so where is the benefit? Why am I writing this then? \ud83d\ude42<\/p>\n<p>&nbsp;<br \/>\n<strong>Reusing empty space<\/strong><\/p>\n<p>So far, we&#8217;ve seen a similar behavior on both, clustered tables and heap, but we&#8217;ll soon see that they behave completely different for the future inserts we have going to make.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nUSE test \r\nGO\r\n\r\nINSERT INTO dbo.heap_table\t\t(Col1) VALUES (REPLICATE (N'A', 500))\r\nINSERT INTO dbo.clustered_table (Col1) VALUES (REPLICATE (N'A', 500))\r\nGO 1000\r\n<\/pre>\n<p>Let&#8217;s first have a look to the number of allocated pages <\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nSELECT OBJECT_NAME(ix.object_id)\r\n\t\t, ix.name\r\n\t\t, au.*\r\n\t\t, p.rows\r\n\tFROM sys.indexes AS ix\r\n\t\tINNER JOIN sys.partitions AS p\r\n\t\t\tON p.object_id = ix.object_id \r\n\t\t\t\tAND p.index_id = ix.index_id\r\n\t\tINNER JOIN sys.allocation_units AS au\r\n\t\t\tON au.container_id = p.hobt_id\t\r\nWHERE ix.object_id IN (OBJECT_ID('dbo.heap_table'), OBJECT_ID('dbo.clustered_table'))\r\nGO\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/05_allocated_pages_2000_rows.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/05_allocated_pages_2000_rows.png\" alt=\"05_allocated_pages_2000_rows\" width=\"1049\" height=\"153\" class=\"aligncenter size-full wp-image-347\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/05_allocated_pages_2000_rows.png 1049w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/05_allocated_pages_2000_rows-300x44.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/05_allocated_pages_2000_rows-1024x149.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/05_allocated_pages_2000_rows-150x22.png 150w\" sizes=\"(max-width: 1049px) 100vw, 1049px\" \/><\/a><\/p>\n<p>Now the fun starts, we can observe that while the Clustered index data pages have grown by 143 pages, the Heap has done it for just 2. Where is the trick then?<\/p>\n<p>There is an undocumented function called <a href=\"http:\/\/www.sqlskills.com\/blogs\/paul\/sql-server-2008-new-undocumented-physical-row-locator-function\/\" target=\"_blank\">sys.fn_PhysLocFormatter<\/a> which does what it says on the tin and returns the formatted physical location of any specific row within a table. (FileId:Page:SlotId)<\/p>\n<p>Let&#8217;s have a look to the clustered index first:<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nSELECT *\r\n\t\t, sys.fn_PhysLocFormatter (%%physloc%%) AS row_location \r\n\tFROM dbo.clustered_table\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/06_physlocformatter_clustered_index.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/06_physlocformatter_clustered_index.png\" alt=\"06_physlocformatter_clustered_index\" width=\"877\" height=\"323\" class=\"aligncenter size-full wp-image-348\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/06_physlocformatter_clustered_index.png 877w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/06_physlocformatter_clustered_index-300x110.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/06_physlocformatter_clustered_index-150x55.png 150w\" sizes=\"(max-width: 877px) 100vw, 877px\" \/><\/a><\/p>\n<p>Every row we were inserting was (at the first thousand) getting a new page for itself and if we navigate till the end we can see how rows are consecutively placed one after another, because at the end of the day, that what a clustered index is.<\/p>\n<p>Now have a look to the Heap:<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nSELECT *\r\n\t\t, sys.fn_PhysLocFormatter (%%physloc%%) AS row_location \r\n\tFROM dbo.heap_table\t\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/07_physlocformatter_heap.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/10\/07_physlocformatter_heap.png\" alt=\"07_physlocformatter_heap\" width=\"850\" height=\"323\" class=\"aligncenter size-full wp-image-349\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/07_physlocformatter_heap.png 850w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/07_physlocformatter_heap-300x114.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/10\/07_physlocformatter_heap-150x57.png 150w\" sizes=\"(max-width: 850px) 100vw, 850px\" \/><\/a><\/p>\n<p>See how it&#8217;s true and the heap behaves completely different, <\/p>\n<p>&nbsp;Id&#8217;s are completely scrambled. This is because new rows are not forced to occupy any specific place within the table, so they can freely go to whichever page, as far as there is enough free space for them to fit.<\/p>\n<p>And that is how space can be reused for Heaps but not for clustered indexes.<\/p>\n<p>&nbsp;<br \/>\n<strong>Conclusion<\/strong><\/p>\n<p>We have seen that in general we should create our tables as Clustered index, but Heaps exist and can be useful in the right situation, and the case I exposed above well might be one of those special cases where it can make sense store our data in a Heap if we have storage constraints. <\/p>\n<p>Obviously I wouldn&#8217;t recommend go wild and start dropping your clustered indexes to save a few bytes, <strong>you should deeply understand your data and how it is accessed<\/strong> before taking such decision.<\/p>\n<p>And as always, thanks for reading and please ask any question you may have!<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the things you realize with experience is that there is not one absolute truth when speaking about SQL Server\u2026 that comes along with the magical answer for most of the&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[35,36,12,25,49],"_links":{"self":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts\/342"}],"collection":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/comments?post=342"}],"version-history":[{"count":1,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts\/342\/revisions"}],"predecessor-version":[{"id":350,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts\/342\/revisions\/350"}],"wp:attachment":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/media?parent=342"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/categories?post=342"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/tags?post=342"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}