{"id":289,"date":"2016-09-06T11:05:48","date_gmt":"2016-09-06T10:05:48","guid":{"rendered":"https:\/\/sqldoubleg.live-website.com\/?p=289"},"modified":"2016-09-08T14:09:10","modified_gmt":"2016-09-08T13:09:10","slug":"database-design-matters-choosing-the-right-data-type","status":"publish","type":"post","link":"https:\/\/www.sqldoubleg.com\/es\/2016\/09\/06\/database-design-matters-choosing-the-right-data-type\/","title":{"rendered":"Database design Matters, choosing the right data type"},"content":{"rendered":"<p>This is not the first time I write about best practices or how a good design can help even from a performance point of view, but today I\u2019ll go to an earlier stage in database design, choosing data types&nbsp;This is not the first time I write about best practices or how <a href=\"https:\/\/sqldoubleg.live-website.com\/2016\/06\/16\/database-design-matters-seriously\/\" target=\"_blank\">a good design can help even from a performance point of view<\/a>, but today I\u2019ll go to an earlier stage in database design, choosing data types.<\/p>\n<p>In general, we want always to use the smallest data type possible in order to minimize storage requirements.<br \/>\nNumeric or date\/time data types might be a bit more complicated since there are several and depend on the different ranges and precision, we can find the reference in <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/ms187752.aspx\" target=\"_blank\">Books Online<\/a>.<\/p>\n<p>But if we speak about alphanumeric data, the choices are more limited, we can only choose between fixed or variable length and to allow UNICODE or not. Even though it may sound not a big deal, the consequences can lead us to&#8230; downtime.<\/p>\n<p>&nbsp;<br \/>\n<strong>Set up the playground<\/strong><\/p>\n<p>As I said choosing the smallest data type is usually a good idea, but we need to know well our data and foresee future requirements.<\/p>\n<p>It\u2019s also a common believe that we only will have problems if we go the way from a big data type to a smaller one. This is not true, we can have problems the other way too due to how SQL Server internally will handle the change. <\/p>\n<p>Let me show you from the beginning so let\u2019s create a table with a VARCHAR column that later we will convert to NVARCHAR<\/p>\n<p><strong>**All these demos are run in SQL Server 2016 Developer Edition.<\/strong><\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nCREATE DATABASE var_to_nvar\r\nGO\r\nUSE var_to_nvar\r\nGO\r\n\r\nCREATE TABLE dbo.t (\r\nID INT IDENTITY PRIMARY KEY\r\n, var_col VARCHAR(255) )\r\nGO\r\n\r\nINSERT INTO dbo.t\r\nSELECT TOP (100) 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'\r\nFROM sys.all_columns AS t1\r\nGO \r\n\r\n<\/pre>\n<p>Now that we have our table, we can see some internals like how many pages are allocated to the clustered index.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\n\r\nDBCC IND ('var_to_nvar', 'dbo.t', 1)\r\n-- 1 data page, 224\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/01_DBCC_IND.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/01_DBCC_IND.png\" alt=\"01_DBCC_IND\" width=\"1154\" height=\"205\" class=\"aligncenter size-full wp-image-290\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/01_DBCC_IND.png 1154w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/01_DBCC_IND-300x53.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/01_DBCC_IND-1024x182.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/01_DBCC_IND-150x27.png 150w\" sizes=\"(max-width: 1154px) 100vw, 1154px\" \/><\/a><\/p>\n<p>&nbsp;<br \/>\nIf we look inside this page, we can see how the data is internally stored. We need to know the database id and activate trace flag 3604 for the current session to see the output of DBCC PAGE()<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nSELECT DB_ID()\r\n-- 10\r\n\r\nDBCC TRACEON (3604) \r\nDBCC PAGE(10, 1, 224, 3)\r\n\r\n\/*\r\nSlot 0 Offset 0x60 Length 53\r\n\r\nRecord Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS\r\nRecord Size = 53                    \r\nMemory Dump @0x0000000CDDE7A060\r\n\r\n0000000000000000:   30000800 01000000 02000001 00350061 61616161  0............5.aaaaa\r\n0000000000000014:   61616161 61616161 61616161 61616161 61616161  aaaaaaaaaaaaaaaaaaaa\r\n0000000000000028:   61616161 61616161 61616161 61                 aaaaaaaaaaaaa\r\n\r\nSlot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4\r\n\r\nID = 1                              \r\n\r\nSlot 0 Column 2 Offset 0xf Length 38 Length (physical) 38\r\n\r\nvar_col = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa                         \r\n\r\nSlot 0 Offset 0x0 Length 0 Length (physical) 0\r\n*\/\r\n<\/pre>\n<p>See how every &#8216;a&#8217; is represented by the byte &#8217;61&#8217; so we have as many as thirty eight of them.<\/p>\n<p>&nbsp;<br \/>\n<strong>Changing the data type<\/strong><\/p>\n<p>If it wasn&#8217;t the need of storing UNICODE or you didn&#8217;t ask the right questions before creating your table, you can find yourself having to change the data type.<\/p>\n<p>You&#8217;ll see soon how this is a problem.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nUSE var_to_nvar\r\nGO\r\n\r\nBEGIN TRAN\r\n\r\nALTER TABLE dbo.t ALTER COLUMN var_col NVARCHAR(255)\r\n<\/pre>\n<p>&nbsp;<br \/>\nWhile our transaction is running the table is not accessible by any other connection, so the rest of your users have to wait.<\/p>\n<p>In another window you can run <\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nUSE var_to_nvar\r\nGO\r\nSELECT * FROM sys.dm_tran_locks\r\nWHERE resource_associated_entity_id = 565577053\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/02_dm_tran_locks.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/02_dm_tran_locks.png\" alt=\"02_dm_tran_locks\" width=\"1160\" height=\"151\" class=\"aligncenter size-full wp-image-294\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/02_dm_tran_locks.png 1160w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/02_dm_tran_locks-300x39.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/02_dm_tran_locks-1024x133.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/02_dm_tran_locks-150x20.png 150w\" sizes=\"(max-width: 1160px) 100vw, 1160px\" \/><\/a><\/p>\n<p>&nbsp;<br \/>\nSchema Modification Lock (Sch-M) is not compatible with any other lock, so bad luck if someone wants to get any of that data.<\/p>\n<p>But you can say \u00abhey, you didn&#8217;t commit your transaction!\u00bb and that is true, only for the purpose of seeing the lock, because to be honest, 100 rows won&#8217;t make your life complicated, but if you keep reading I&#8217;ll show you all the changes behind the scenes and you can see by yourself how much work for this little number of rows.<\/p>\n<p>Let&#8217;s get back and COMMIT our transaction and then check again how our clustered index look like.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nCOMMIT\r\n\r\nDBCC IND ('var_to_nvar', 'dbo.t', 1)\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/03_DBCC_IND_after_ALTER_TABLE.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/03_DBCC_IND_after_ALTER_TABLE.png\" alt=\"03_DBCC_IND_after_ALTER_TABLE\" width=\"1146\" height=\"184\" class=\"aligncenter size-full wp-image-292\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/03_DBCC_IND_after_ALTER_TABLE.png 1146w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/03_DBCC_IND_after_ALTER_TABLE-300x48.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/03_DBCC_IND_after_ALTER_TABLE-1024x164.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/03_DBCC_IND_after_ALTER_TABLE-150x24.png 150w\" sizes=\"(max-width: 1146px) 100vw, 1146px\" \/><\/a><\/p>\n<p>Now we see there are 2 levels in our index and 3 data pages (page type 1) as opposed to one page we had before, and if we check one of those, we can see how a row looks like after the change.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nDBCC TRACEON (3604) \r\nDBCC PAGE(10, 1, 224, 3)\r\n\r\n\/*\r\nSlot 0 Offset 0xe3f Length 131\r\n\r\nRecord Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS\r\nRecord Size = 131                   \r\nMemory Dump @0x00000026EE67AE3F\r\n\r\n0000000000000000:   30000800 01000000 03000002 00370083 00616161  0............7.\u0083.aaa\r\n0000000000000014:   61616161 61616161 61616161 61616161 61616161  aaaaaaaaaaaaaaaaaaaa\r\n0000000000000028:   61616161 61616161 61616161 61616161 00610061  aaaaaaaaaaaaaaaa.a.a\r\n000000000000003C:   00610061 00610061 00610061 00610061 00610061  .a.a.a.a.a.a.a.a.a.a\r\n0000000000000050:   00610061 00610061 00610061 00610061 00610061  .a.a.a.a.a.a.a.a.a.a\r\n0000000000000064:   00610061 00610061 00610061 00610061 00610061  .a.a.a.a.a.a.a.a.a.a\r\n0000000000000078:   00610061 00610061 006100                      .a.a.a.a.a.\r\n\r\nSlot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4\r\n\r\nID = 1                              \r\n\r\nSlot 0 Column 67108865 Offset 0x11 Length 0 Length (physical) 38\r\n\r\nDROPPED = NULL                      \r\n\r\nSlot 0 Column 2 Offset 0x37 Length 76 Length (physical) 76\r\n\r\nvar_col = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa                         \r\n*\/\r\n<\/pre>\n<p>Here we can see something very important to be taken in mind, after issuing ALTER TABLE and changing the data type, the column with the new data type is added after the old one, which is left behind and only logically removed (see DROPPED = NULL), that makes our storage requirements grow quite a lot (hence now we have 3 data pages)<\/p>\n<p>If we then rebuild the index we can see how the column with the old data type is not populated to the new structure and therefore we only use the minimum space needed.<\/p>\n<p>&nbsp;<br \/>\n<strong>SQL Server 2016 to the rescue<\/strong><\/p>\n<p>If you are running SQL Server 2016 (I suspect only Enterprise or Developer Edition) you will have some relieve for this pain, there is an ONLINE option, which does not do what is says on the tin, but somehow helps a lot.<\/p>\n<p>Let&#8217;s try it, you can drop the database and get to this point to see it.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nBEGIN TRAN \r\n\r\nALTER TABLE dbo.t ALTER COLUMN var_col NVARCHAR(255) WITH (ONLINE = ON)\r\n<\/pre>\n<p>The operation I said IT IS NOT ONLINE, as the we still have the Schema Modification lock, furthermore, we have two<\/p>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/04_dm_tran_locks_ONLINE_ON.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/04_dm_tran_locks_ONLINE_ON.png\" alt=\"04_dm_tran_locks_online_on\" width=\"1183\" height=\"168\" class=\"aligncenter size-full wp-image-296\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/04_dm_tran_locks_ONLINE_ON.png 1183w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/04_dm_tran_locks_ONLINE_ON-300x43.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/04_dm_tran_locks_ONLINE_ON-1024x145.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/04_dm_tran_locks_ONLINE_ON-150x21.png 150w\" sizes=\"(max-width: 1183px) 100vw, 1183px\" \/><\/a><\/p>\n<p>And this is because this way (ONLINE) it will rebuild the index, so we don&#8217;t leave behind all that junk in our data pages.<\/p>\n<p>See how the index is just one page as it used to be (because 100 rows still fit in one page), <\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nCOMMIT\r\n\r\nDBCC IND ('var_to_nvar', 'dbo.t', 1)\r\n<\/pre>\n<p><a href=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/05_DBCC_IND_after_ALTER_TABLE_online.png\" target=\"_blank\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqldoubleg.live-website.com\/wp-content\/uploads\/2016\/09\/05_DBCC_IND_after_ALTER_TABLE_online.png\" alt=\"05_dbcc_ind_after_alter_table_online\" width=\"1128\" height=\"156\" class=\"aligncenter size-full wp-image-297\" srcset=\"https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/05_DBCC_IND_after_ALTER_TABLE_online.png 1128w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/05_DBCC_IND_after_ALTER_TABLE_online-300x41.png 300w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/05_DBCC_IND_after_ALTER_TABLE_online-1024x142.png 1024w, https:\/\/www.sqldoubleg.com\/wp-content\/uploads\/2016\/09\/05_DBCC_IND_after_ALTER_TABLE_online-150x21.png 150w\" sizes=\"(max-width: 1128px) 100vw, 1128px\" \/><\/a><\/p>\n<p>And this page is a new page (see page Id) and looks brand new if we look inside.<\/p>\n<pre class=\"brush: tsql; title: ; notranslate\" title=\"\">\r\nDBCC TRACEON (3604) \r\nDBCC PAGE(10, 1, 256, 3)\r\n\r\n\/*\r\nSlot 0 Offset 0x60 Length 35\r\n\r\nRecord Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS\r\nRecord Size = 35                    \r\nMemory Dump @0x000000D25C9FA060\r\n\r\n0000000000000000:   30000800 01000000 02000001 00230061 00610061  0............#.a.a.a\r\n0000000000000014:   00610061 00610061 00610061 006100             .a.a.a.a.a.a.a.\r\n\r\nSlot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4\r\n\r\nID = 1                              \r\n\r\nSlot 0 Column 2 Offset 0xf Length 20 Length (physical) 20\r\n\r\nvar_col = aaaaaaaaaa                \r\n\r\nSlot 0 Offset 0x0 Length 0 Length (physical) 0\r\n\r\nKeyHashValue = (8194443284a0)   \r\n*\/    \r\n<\/pre>\n<p>Completely clean.<\/p>\n<p>&nbsp;<br \/>\n<strong>Workaround<\/strong><\/p>\n<p>If you want to avoid some of these problems my recommendation would be to create a new table, dump the data, rename both old and new one and then remove the old one, which is basically what you get with the 2016 online hint, but better since you won&#8217;t be locking the table for longer than the milliseconds to change both names.<\/p>\n<p>&nbsp;<br \/>\n<strong>Conclusion<\/strong><\/p>\n<p>We have seen there are some problems when data exists in our table and we want to change a column&#8217;s data type (the more rows the bigger problem we might have). These are:<\/p>\n<ul>\n<li>Concurrency, ALTER TABLE will hold a Schema Modification lock until completes, so if we have a lot of data we will lock the whole table until the operation is completed, not allowing any access.<\/li>\n<li>Transaction Log Generation, there is a <a href=\"http:\/\/www.sqlskills.com\/blogs\/paul\/how-expensive-are-page-splits-in-terms-of-transaction-log\/\" target=\"_blank\">fantastic article by Paul Randal<\/a> where is explained how expensive page splits are and this operation since we are increasing the length of every and each row in the table (if has data for that column) will result in page splits.<\/li>\n<li>Network, If you have AG or another HA\/DR solution, all that generated transaction log will need to be moved across the network.<\/li>\n<li>Storage, this point might be the least critical as our databases usually are in constant growth, but good if you take it in mind.<\/li>\n<li>Fragmentation, on top of the above, once we have finished, our clustered index will be heavily fragmented, so an INDEX REBUILD would be necessary, adding more CPU, concurrency and storage requirements to clean up all that mess.<\/li>\n<\/ul>\n<p>&nbsp;<br \/>\nAs always thanks for reading and feel free to ask any question you may have.<\/p>\n<p>&nbsp;<br \/>\n&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is not the first time I write about best practices or how a good design can help even from a performance point of view, but today I\u2019ll go to an earlier&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[40,5,12],"_links":{"self":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts\/289"}],"collection":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/comments?post=289"}],"version-history":[{"count":1,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts\/289\/revisions"}],"predecessor-version":[{"id":295,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/posts\/289\/revisions\/295"}],"wp:attachment":[{"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/media?parent=289"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/categories?post=289"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqldoubleg.com\/es\/wp-json\/wp\/v2\/tags?post=289"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}