// Week 2: Data Pages - the Foundation of SQL Server
Last week we have laid out the foundation about how SQL Server executes queries. I have also already talked here a little bit about pages that are buffers of 8kb. Today we are further concentrating on these pages and drill into more details and what it means from a performance tuning perspective.
Pages are the foundation of SQL Server, everything in SQL Server is about pages. When we want to improve the performance of our queries, we try to lower down the page reads SQL Server needs for a specific query. When we are talking about indexing in the 2nd month, that even index structures are composed of pages. When you don’t know what a page is, you can’t tune and troubleshoot your SQL Server.
Data Page Structure
A page in SQL Server is always 8kb large, and there are different types of pages, like data pages, index pages, system pages, etc. Today we have a more detailed look on data types, where SQL Server stores our table data. A data page always consists of 3 parts:
- Page Header
- Row Offset Array
The page header is always 96 bytes long in SQL Server (independent from the type of the page), and stores general information like the Page ID, Object ID, etc. The most interesting part of the data page is the payload area, because our records are stored in that area. SQL Server gives you from the 8192 bytes (8kb), 8060 bytes for the payload section. Therefore it’s an easy task to calculate how many rows for a given table fit on a page - just divide 8060 by the record size (incl. the internal overhead of at least 7 bytes). If you round down the result, you have the number of records you can store on a data page.
The goal is always to have as much records as possible on a page, because SQL Server has to read and write complete pages. SQL Server isn’t able to read a part of a page from your storage, or write part of a page out to your storage. I/O operations are always done (at least) on a page level.
And finally at the end of the page you have the so-called Row Offset Array. The Row-Offset Array just stores with 2 bytes for every record at which offset on the page the record is located. The first record always begins at the decimal offset of 96 - directly after the page header. The following picture gives you an overview about the described structure of the data page.
Data Page Internals
Let’s have a look on a simple table definition, like the following one:
CREATE TABLE Customers ( FirstName CHAR(50) NOT NULL, LastName CHAR(50) NOT NULL, Address CHAR(100) NOT NULL, ZipCode CHAR(5) NOT NULL, Rating INT NOT NULL, ModifiedDate DATETIME NOT NULL, ) GO
With such a table definition it’s now very easy to calculate how many records we can store on one page. The size of a record is here 217 bytes long (50 + 50 + 100 + 5 + 4 + 8). When you now divide 8060 by 217, you get 37.14, mean you are able to store 37 records of that table on one data page. The other remaining space of the table - in our case 31 bytes (8060 - 217 * 37) are just waisted, because a data page always belongs to a specific database object, and can’t be shared amoung other objects. In the worst case, when your table definition has a record size of 4031 bytes, you are waisting 4029 bytes on every page. Things will change here when you introduce variable length data types, like VARCHAR, because SQL Server is then able to store these column on different pages.
If you want to know how much space on your pages is waisted by your table design, you can query the buffer pool through the Dynamic Management View sys.dm_os_buffer_descriptors. Every record from this DMV represents one page that you are currently storinng in the buffer pool, so please be aware of this, when you are querying this DMV on machines with a larger amount of RAM. The column free_space_in_bytes tells you how much space is currently free on the specific page. The following query returns you, how much space is waisted by every database on your SQL Server instance.
SELECT DB_NAME(database_id), SUM(free_space_in_bytes) / 1024 AS 'Free_KB' FROM sys.dm_os_buffer_descriptors WHERE database_id <> 32767 GROUP BY database_id ORDER BY SUM(free_space_in_bytes) DESC GO
This is always a query that I’m running on a system (e.g. during a SQL Server Health Check), to find out which database maybe has a bad table design.
I hope that I have given you with the 2nd newsletter of the SQL Server Performance Tuning series a better understanding about data pages in SQL Server, and why they are important for performance tuning. As you also have seen you can directly influence how many data pages a given table needs by concentrating on the table design.