Today the topic is the limitations that apply to data pages. Opinions on these limitations are divided, and maybe you have one of your own – feel free to leave it as a comment below once you have read the post.
As you know from week 2, a data page is always 8 KB, of which 8,060 bytes can hold data. The size of a single row therefore determines how many rows fit on one data page. When you work with fixed-length datatypes (e.g. CHAR, INT, DATETIME etc.), there is a hard limit on the size of a row: it cannot exceed 8,060 bytes, including SQL Server's internal overhead.
Limitations – the good ones
Let's take a table with fewer than 8 columns as an example. Here you add 7 bytes for SQL Server's internal overhead, and for every additional 8 columns you add 1 more byte. So at 17 columns the overhead is 9 bytes.
If you try to create a table whose rows would hold more data than fits on a data page (including the overhead), SQL Server returns a message like the one below.
As you can see below, this table requires 8,061 bytes per row (5000+3000+54+7). That is more than fits on a single data page.
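A minimal sketch of a table definition matching those numbers – the column names and widths are my assumptions, chosen so that 3 columns plus the 7 bytes of overhead add up to 8,061 bytes:
CREATE TABLE LargeTable
(
Kolonne1 CHAR(5000) NOT NULL,
Kolonne2 CHAR(3000) NOT NULL,
Kolonne3 CHAR(54) NOT NULL
)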
Msg 1701, Level 16, State 1, Line 1 Creating or altering table ‘LargeTable’ failed because the minimum row size would be 8061, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.
If you create a table with more than 8 columns, remember to add the extra bytes for every 8 columns. For example:
CREATE TABLE LargeTable
(
Kolonne1 CHAR(1000) NOT NULL,
Kolonne2 CHAR(1000) NOT NULL,
Kolonne3 CHAR(1000) NOT NULL,
Kolonne4 CHAR(1000) NOT NULL,
Kolonne5 CHAR(1000) NOT NULL,
Kolonne6 CHAR(1000) NOT NULL,
Kolonne7 CHAR(1000) NOT NULL,
Kolonne8 CHAR(1000) NOT NULL,
Kolonne9 CHAR(53) NOT NULL
)
Again this adds up to more than the 8,060 bytes allowed on a data page, and SQL Server returns the same error as before:
Msg 1701, Level 16, State 1, Line 1 Creating or altering table 'LargeTable' failed because the minimum row size would be 8061, including 8 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.
Limitations – the not-so-good ones
The previous section covered the good limitations. They are good because SQL Server tells you, already when you create the table, that something is wrong.
But there are also limitations you will find less pleasant. Here SQL Server happily lets you create the table, but as soon as you start inserting data into it, the errors come pouring in. Sometimes you can be glad it succeeds, other times it fails.
The problem is variable-length datatypes such as VARCHAR. When these columns do not fit on a single data page, SQL Server tries to offload them to a so-called off-row location on other data pages. This is also called a Row-Overflow Page (another page type for the list…). On the original data page a 24-byte pointer is then left behind, pointing to the new row-overflow page. In some cases this pointer, together with the length of the other columns, can exceed the 8,060 allowed bytes.
An example of such a table definition could be:
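A sketch of what that definition could look like – the column names and the fixed-length widths (5000, 3000 and 30 bytes) are my assumptions, chosen to match the row-size calculation further down:
CREATE TABLE LargeTable
(
Kolonne1 CHAR(5000) NOT NULL,
Kolonne2 CHAR(3000) NOT NULL,
Kolonne3 CHAR(30) NOT NULL,
Kolonne4 VARCHAR(3000) NOT NULL
)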
Notice that the last column is a VARCHAR(3000). SQL Server gives you a warning here: the table can be created, but INSERT/UPDATE operations against it may fail.
Warning: The table "LargeTable" has been created, but its maximum row size exceeds the allowed maximum of 8060 bytes. INSERT or UPDATE to this table will fail if the resulting row exceeds the size limit.
An INSERT statement that keeps the value in the VARCHAR column small will go through just fine. But watch what happens when the value is large:
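A sketch of the two situations, assuming the column layout above (the values are only illustrative):
-- Small value in the VARCHAR column: everything fits in-row and the INSERT succeeds
INSERT INTO LargeTable VALUES
(REPLICATE('A', 5000), REPLICATE('B', 3000), REPLICATE('C', 30), 'short value')
-- Large value in the VARCHAR column: SQL Server must push it off-row, and the INSERT fails
INSERT INTO LargeTable VALUES
(REPLICATE('A', 5000), REPLICATE('B', 3000), REPLICATE('C', 30), REPLICATE('D', 3000))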
With a large value, SQL Server has to move the data in the 4th column to a row-overflow data page, because the 3,000 bytes together with the first columns cannot fit on the same data page. SQL Server therefore leaves a 24-byte pointer on the data page, pointing to the extra data page. The row is now 8,061 bytes long (5000+3000+30+24+7 bytes).
The row now takes up more than 8,060 bytes, and the INSERT statement fails.
This is the bad data page limitation, the one that only hits you once the real transactions start running against the database.
It is a good idea to keep this limitation in mind when you create the tables in your database.
Summary – Data Page limitations
When you design tables in SQL Server, you really have to think about what you are doing. As shown above, there are limitations that can cause a pile of errors once the data flow and everything else is up and running.
Sometimes you are lucky and it goes through, and other times SQL Server helps you with an error message.
Even though you get a warning, I have seen installations that still hit these errors. So do keep it in mind 🙂
Next time I'll cover HEAP tables and what they are in a SQL Server database.
Remember to sign up for the mailing list below so you don't miss the next installment of this blog series. I promise to only send you an email when there is news in this series…
At a high level, an extent is a collection of 8 pages of 8 KB each, which makes it a 64 KB unit of data. SQL Server has two different types of extents:
Mixed Extents
Uniform Extents
Mixed and Uniform Extents
In a mixed extent the 8 pages can belong to different database objects, such as tables and indexes, so a mixed extent can point to up to 8 different objects. A uniform extent, on the other hand, only contains pages from one and the same database object.
The interesting part is how SQL Server works with this difference, and why it is designed this way.
Before the turn of the millennium, disk capacity was expensive, very expensive. Back then the goal was to use that capacity as efficiently as possible. That is why the first 8 pages of a new table or index are always allocated from mixed extents, which also means tables and indexes grow 8 KB at a time. The result is that small tables stay very small – the capacity is used as efficiently as possible.
As soon as a database object exceeds 8 pages and the 9th page has to be allocated, a whole uniform extent is allocated to that object. The allocated size therefore jumps from 64 KB (8 single pages) to 128 KB (64 KB + 64 KB), with the 17th page to 192 KB, and so on.
Given today's prices on disk capacity this design choice may seem a bit odd, but historically it was an important way to save disk space.
Extent Management
But how does SQL Server manage all these extents? Imagine a 1 TB database – that is an enormous pile of extents and even more pages. SQL Server uses 2 special pages for this, which again (and as always) are 8 KB:
Global Allocation Map Pages (GAM)
Shared Global Allocation Map Pages (SGAM)
Uniform extents are always managed by GAM pages. SQL Server uses roughly 8,000 bytes (64,000 bits) of a GAM page as a bitmap in which each bit represents one extent. If a bit is set, the extent is not in use (it is free); if it is cleared, the extent is in use. One GAM page can therefore keep track of almost 4 GB of data (64,000 x 64 KB / 1024 / 1024), which is also why GAM pages are spread throughout the data file – one for roughly every 4 GB. The same goes for SGAM pages: they also cover about 4 GB of data, but for mixed extents.
When you insert a row into a table, SQL Server uses the SGAM pages to find a mixed extent with at least one free data page. If a table or index is larger than 64 KB, SQL Server finds the next free uniform extent by scanning the GAM pages.
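If you want to see the allocation behaviour yourself, one option is the undocumented DMF sys.dm_db_database_page_allocations (SQL Server 2012 and later). A minimal sketch, assuming a small table called dbo.MySmallTable:
SELECT allocated_page_page_id, extent_page_id, is_mixed_page_allocation
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID('dbo.MySmallTable'), NULL, NULL, 'DETAILED')
WHERE is_allocated = 1
Pages with is_mixed_page_allocation = 1 come from mixed extents – for a table with fewer than 8 pages that should be all of them (note that from SQL Server 2016 onwards user databases allocate uniform extents by default, so you may not see mixed allocations there).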
When I later get to TempDb, I will also get to some serious performance problems that show up when your workload creates large amounts of data in TempDb. I will also share tips and tricks on how to work around it through TempDb configuration (on-premises and managed instance in Azure).
Summary – Extent Management
With this walkthrough of extent management and the earlier posts about pages and the engine room, the foundation for performance tuning is starting to fall into place.
If you are hungry to learn more about extents in SQL Server, there are a few links below.
Last week was the tough introduction with all the new terms around SQL Server. I wrote about how SQL Server executes the queries we send to it. If you missed that post, you can read it here:
I also very briefly touched on data pages, which are 8 KB (kilobyte) buffers. In this post I take a closer look at that concept and describe what data pages mean for SQL Server performance.
Pages are the foundation of SQL Server – at the end of the day, everything in SQL Server is about pages. When you want to improve the performance of the queries you are working on, you always try to reduce the number of page reads SQL Server has to perform. Later in this blog series I will get to indexes, which are also based on pages.
So understanding what a page is lays the groundwork for good performance tuning on SQL Server.
Data Pages and their structure
A page in SQL Server is always 8 KB, and there are different types of pages, such as data pages, index pages, system pages and so on. In this post I will look in detail at data pages, which SQL Server uses to store the data from your tables. A data page consists of 3 parts:
Page Header
Payload
Row Offset Array
The page header is always 96 bytes long in SQL Server (regardless of the page type). The header contains information such as the Page ID and the Object ID. The payload area of a data page is where the table data is physically stored; SQL Server allocates 8,060 bytes for this purpose. The observant reader will notice that you can calculate how many rows fit on a single data page by taking the size of one row (the lengths of all the columns' datatypes in bytes) and dividing it into the 8,060 bytes, discarding any decimals.
The goal when storing data in SQL Server is always to fit as many rows as possible on a single data page, because, as you may remember from the last post, SQL Server has to fetch all the data pages needed to work with the data for a given query. SQL Server cannot read just a part of a data page; I/O operations always happen (at a minimum) at page level.
The last part of a data page is the row offset array. This small part of the page holds 2 bytes per row, pointing to where on the page that row is located – a bit like a mini index in a book.
The first row on a page always starts immediately after the page header, that is, after byte position 96. The positions of the following rows are given by the row lengths, as described a little earlier.
In the image below you can see the structure of a data page, as I have tried to describe it above.
Data Page Internals
A fairly simple table in SQL Server could look like the one below.
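A sketch of a table that matches the calculation below – the column names and exact datatypes are my assumptions; the sizes add up to the 217 bytes used in the example:
CREATE TABLE dbo.Customers
(
FirstName CHAR(50) NOT NULL,
LastName CHAR(50) NOT NULL,
Address CHAR(100) NOT NULL,
ZipCode CHAR(5) NOT NULL,
CustomerId INT NOT NULL,
CreatedDate DATETIME NOT NULL
)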
Based on the table structure above, we can now calculate how many rows fit on a data page.
One row takes 217 bytes (50+50+100+5+4+8) – an INT uses 4 bytes and a DATETIME is stored in 8 bytes on the data page.
Dividing the allocated 8,060 bytes by 217 gives 37 whole rows with 31 bytes to spare. Those 31 bytes cannot be used for anything else, because SQL Server cannot put anything on a data page other than what belongs to that specific table. So if a row took 4,031 bytes, you would waste 4,029 bytes per data page – and the page would still take up 8 KB on disk.
Things change if you use variable-length datatypes such as VARCHAR (everything with "var" in front), because SQL Server can then store the columns on different data pages and optimize the page usage a bit.
SQL Server can actually help you get an overall picture of how much space is wasted across all data pages, and thereby wasted by the table design. Through the Dynamic Management View (DMV) sys.dm_os_buffer_descriptors you get one row per data page in the buffer pool (be careful running this on systems with a lot of RAM, as the result set will be correspondingly large). The column free_space_in_bytes shows how much free space there is on each data page.
A good standard query to have at hand could be the one below:
select db_name(database_id), sum(free_space_in_bytes)/1024 as 'FreeSpaceInKb'
from sys.dm_os_buffer_descriptors
where database_id <> 32767
group by database_id
order by 2 desc
Summary – Data Pages
I hope this has shed a bit more light on what a data page in SQL Server is, how you should think about the structure of your tables when creating them, and how SQL Server uses data pages to store data on disk.
If you are hungry for more reading about data pages, I can recommend the following pages:
This is the first post in a longer series on SQL performance tuning. Before we dive straight into all the juicy details of performance tuning on SQL Server, it is important to get the foundation in place.
In this post I therefore take a closer look at how SQL Server executes a query. Understanding how SQL Server works here is important, because the coming posts will build on it.
As shown above, SQL Server is internally split into a Relational Engine and a Storage Engine. The biggest part of the Relational Engine is the Query Optimizer (often just called the optimizer). The optimizer's only job is to generate a physical execution plan (more on that in a later post) for the queries we send to SQL Server.
Reading data
All queries we send to SQL Server pass through the Protocol Layer to the Command Parser. The Command Parser checks the code we send – for example whether it is valid T-SQL and whether all referenced tables and columns exist in the database. The result of this work is a so-called query tree, a tree structure representing the submitted query. The optimizer then uses this tree structure to generate an execution plan.
The finished execution plan is handed over to the Query Executor, whose job is to perform the operations the execution plan prescribes. Before that happens, the execution plan is stored in the plan cache, which allows SQL Server to reuse execution plans later. This caching and reuse of execution plans is at the same time a very powerful and a very dangerous concept in SQL Server. More on that in a later post about the plan cache.
Once the execution plan is stored in the cache, the query executor starts communicating with the storage engine and executes each small part of the execution plan – the so-called operators.
When data is accessed from the execution plan (this is the only place we can get at data), the Access Methods communicate with the Buffer Manager to read specific pages – more about pages in the next posts; for now it is enough to know that a page is an 8 KB buffer in which data and indexes are stored. The Buffer Manager manages the Buffer Pool, where the pages live. The buffer pool is the actual memory (RAM) you can see SQL Server using in the operating system.
If a page is already in the buffer pool, it is returned immediately – in SQL Server this is a Logical Read. If the page is not in the buffer pool, the buffer manager issues an asynchronous I/O request against the physical disk to bring the page into the buffer pool – this is a Physical Read. While this asynchronous operation runs, the query waits for it to finish; these waits are called Waits, and I will get to that topic and Wait Statistics in a later post.
As soon as a page has been read into the buffer pool, it is handed to the process that asked for it. When the execution plan is done, the result of the processed data is returned to the user or application through the Protocol Layer.
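You can see the difference between logical and physical reads yourself with SET STATISTICS IO – a small sketch (the table name is just an example, and only clear the buffer pool on a test system):
SET STATISTICS IO ON;
DBCC DROPCLEANBUFFERS;         -- empties the buffer pool, test systems only!
SELECT * FROM dbo.Customers;   -- first run: physical reads from disk
SELECT * FROM dbo.Customers;   -- second run: logical reads from the buffer pool
The Messages tab then shows the number of logical and physical reads per table.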
Modifying data
When you work with T-SQL statements that modify existing data (INSERT, DELETE, UPDATE and MERGE), the storage engine also works with the Transaction Manager. Its job is to write records to the transaction log describing the operations performed in the transaction. As soon as those records have been written to the log, the transaction can complete – which also means SQL Server can only ever be as fast as its transaction log.
Pages that have been modified in memory are written to disk by the so-called CHECKPOINT process. By default the checkpoint process runs roughly every minute and writes dirty pages from the buffer manager to disk. A dirty page is a page with changes that have not yet been written to disk.
Once a page has been written to disk, it is marked as "clean" again.
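If you are curious, you can count the dirty pages per database via sys.dm_os_buffer_descriptors and watch the number drop after a manual CHECKPOINT – a small sketch:
SELECT DB_NAME(database_id) AS DatabaseName, COUNT(*) AS DirtyPages
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id
ORDER BY DirtyPages DESC;
CHECKPOINT;  -- run this in the database you want to flush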
Summary – Performance Tuning
As you have probably gathered from this introduction to SQL performance tuning, a lot of things happen inside SQL Server when you execute a query. Many of the words and terms used in this post will be explained in more detail in later posts.
If you are hungry to read more about how SQL Server handles queries and data processing, I recommend reading some of the blogs below:
This blogpost will cover some of the basics of recursive CTEs and explain the approach taken by the SQL Server engine.
First of all, a quick recap of what a recursive query is.
Recursive queries are useful when building hierarchies, traversing datasets, generating arbitrary rowsets and so on. The recursive part (simply put) means joining a rowset with itself an arbitrary number of times.
A recursive query is defined by an anchor set (the base rowset of the recursion) and a recursive part (the operation that should be done over the previous rowset).
The basics of a recursive CTE
A recursive query helps in a lot of scenarios – for instance where a dataset is built as a parent-child relationship and the requirement is to "unfold" this dataset and show the hierarchy in a ragged format.
A recursive CTE has a defined syntax and can be written in general terms like this – and don't run away because of the general syntax, a lot of examples (in real code) will come:
select result_from_previous.*
from result_from_previous
union all
select result_from_current.*
from set_operation(result_from_previous, mytable) as result_from_current
Or rewritten in another way:
select result_from_previous.*
from result_from_previous
union all
select result_from_current.*
from result_from_previous
join mytable
on condition(result_from_previous)
Another way to write the query (using cross apply):
select result_from_current.*
from result_from_previous
cross apply (
select result_from_previous.*
union all
select *
from mytable
where condition(result_from_previous.*)
) as result_from_current
The last one – with the cross apply – is row based and a lot slower than the other two. It iterates over every row from the previous result and computes the scalar condition (which returns true or false). The same row then gets compared to each row in mytable and the current row of result_from_previous. When these conditions hold, the query can be rewritten as a join – which is why you should not use cross apply for recursive queries.
The reverse – going from a join to a cross apply – is not always possible. To understand why, we need to look at the algebra of distributivity.
Distributivity algebra
Most of us have already learned that the following is true:
X x (Y + Z) = (X x Y) + (X x Z)
But the following is not always true:
X ^ (Y x Z) = (X ^ Z) x (X ^ Y)
Or said in words, distributivity means that the order of operations is not important: the multiplication can be done after the addition and the addition can be done after the multiplication, and the result will be the same no matter what.
This arithmetic can be carried over to relational algebra – it's pretty straightforward:
set_operation(A union all B, C) = set_operation(A, C) union all set_operation(B, C)
The condition above holds, just like the first condition in the arithmetic.
So the union all over the operations is the same as the operations over the union all. This also implies that you cannot use operators like top, distinct or outer join (there are more exceptions), because the distribution is not the same between top over union all and union all over top. Microsoft has done a lot of good thinking in the recursive approach to reach one ultimate goal – forbid operators that do not distribute over union all.
With this information and knowledge, our baseline for building a recursive CTE is now in place.
The first recursive query
Based on the intro and the algebra above, we can now begin to build our first recursive CTE.
Consider a sample rowset (sampletree):
id   parentId   name
1    NULL       Ditlev
2    NULL       Claus
3    1          Jane
4    2          John
5    3          Brian
From the above we can see that Brian refers to Jane, who refers to Ditlev, and that John refers to Claus. This is fairly easy to read from this rowset – but what if the hierarchy is more complex and unreadable?
A sample requirement could be to "unfold" the hierarchy into a ragged hierarchy so it is directly readable.
The anchor
We start with the anchor set (Ditlev and Claus). In this dataset the anchor is defined by parentId being null.
This gives us an anchor query like the one below:
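A sketch of that anchor query (using the sampletree rowset above):
select s.id, s.parentId, s.name
from sampletree s
where s.parentId is null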
Now on to the next part.
The recursive part
After the anchor part, we are ready to build the recursive part of the query.
The recursive part is actually the same query with small differences: the main select is the same as the anchor part, but we need to add a self join in the select statement for the recursive part.
Before we dive further into the complete statement, I'll show it below and then run through the details.
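A sketch of the full statement (the CTE name, the blank field and the parentName column are described right after):
with hierarchy_cte as (
    select s.id, s.parentId, s.name, cast(null as varchar(50)) as parentName
    from sampletree s
    where s.parentId is null
    union all
    select s.id, s.parentId, s.name, c.name as parentName
    from sampletree s
    join hierarchy_cte c on s.parentId = c.id
)
select * from hierarchy_cte;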
Back to the self-reference. Notice two things in the code: the CTE's name at the top, and the reference to that name inside the recursive part. The CTE is joined directly in the recursive part in order to do the arithmetic logic in the statement. The join is made between the recursive part's parentId and the id from the anchor result. This gives us the possibility to get the name column from the anchor statement.
Notice that I've also put an extra blank field into the anchor statement and added the parentName field in the recursive statement. This gives us the "human readable" output where I can read the hierarchy directly from left to right.
To get data from the above CTE, I just have to add a select statement reading from it:
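For example (the select has to follow the CTE definition directly):
select id, parentId, name, parentName
from hierarchy_cte;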
And the results:
I can now directly read that Jane refers to Ditlev and Brian refers to Jane.
But how is this done when the SQL engine executes the query? The next part tries to explain that.
The SQL engine's handling
Given the full CTE statement above, I'll try to explain what the SQL engine does to handle it:
1. Run the anchor member, creating the first base result set (T0)
2. Run the recursive member with Ti as input and Ti+1 as output
3. Repeat the previous step until an empty result set is returned
4. Return the result set – a union all of T0 to Tn
So let me try to rewrite the above query to match this sequence.
The anchor statement we already know:
The first recursive query:
The second recursive query:
The nth recursive query:
The union all statement:
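A sketch of this manual unrolling against sampletree – I use temporary tables to hold T0, T1, … purely to illustrate the sequence:
-- T0: the anchor
select s.id, s.parentId, s.name, cast(null as varchar(50)) as parentName
into #T0
from sampletree s
where s.parentId is null;
-- T1: the first recursive query, with T0 as input
select s.id, s.parentId, s.name, t.name as parentName
into #T1
from sampletree s
join #T0 t on s.parentId = t.id;
-- T2: the second recursive query, with T1 as input
select s.id, s.parentId, s.name, t.name as parentName
into #T2
from sampletree s
join #T1 t on s.parentId = t.id;
-- T3 (Tn): this one returns an empty result set, which stops the recursion
select s.id, s.parentId, s.name, t.name as parentName
into #T3
from sampletree s
join #T2 t on s.parentId = t.id;
-- the union all statement: T0 to Tn
select * from #T0
union all
select * from #T1
union all
select * from #T2
union all
select * from #T3;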
This gives us exactly the same result as we saw before with the rewrite:
Notice that the statement I've named Tn above is actually empty. This illustrates the empty result set that makes the SQL engine stop its execution of the recursive CTE.
This is how I would describe the SQL engine's handling of a recursive CTE.
Based on this very simple example, I guess you can already think of ways to use this in your projects and daily tasks.
But what about the performance and execution plan?
Performance
The execution plan for the original recursive CTE looks like this:
The top part of this execution plan is the anchor statement and the bottom part is the recursive statement.
Notice that I haven't created any indexes on the table, so we are reading from heaps here.
But what if the data is more complex in structure and depth? Let's base the answer on an example:
In the attached SQL code you'll find a script that generates 20,000+ rows in a new table called complextree. This data is from a live solution and contains medical procedure names in a hierarchy. The data is used to show the relationships between medical procedures performed by the Danish hospital system. It is both deep and complex in structure. (Sorry for the Danish letters in the data…)
When we run a recursive CTE on this data, we get exactly the same execution plan:
This is also what I would expect, as the amount of data, when read from heaps, very seldom impacts the generated execution plan.
The query runs for 25 seconds on my PC.
Now let me put an index on the table and see the performance and execution plan.
The index is only put on the parentDwId column since, as we know from this article, that is the join column of the recursive part.
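A sketch of that index:
create nonclustered index IX_complextree_parentDwId
on complextree (parentDwId);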
The query now runs to completion in 1 second and generates this execution plan:
The top line is still the anchor and the bottom part is the recursive part. Notice that the SQL engine now uses the non-clustered index to perform the execution, and the performance gain is noticeable.
Conclusion
I hope you've now become more familiar with the recursive CTE statement and are willing to try it on your own projects and tasks.
The basics are somewhat straightforward – but beware that the query can become complex and hard to debug as the demands on data and output grow. But don't be scared. As I always say: "Don't do a complex query all at once, start small and build it up as you go along".
I attended a TDWI conference in May 2016 in Chicago. There I got a hint about the hierarchyid datatype in SQL Server, which could optimize and eliminate the good old parent/child hierarchy.
Until then I (and several others in the class) hadn't heard about the hierarchyid datatype in SQL Server. So I had to find out and learn it.
Here's a blogpost covering some of the aspects of the hierarchyid datatype, including:
Introduction
How to use it
How to optimize data in the table
How to work with data in the hierarchy structure
Goodies
Introduction
The hierarchyid datatype was introduced in SQL Server 2008. It is a variable-length system datatype. The datatype can be used to represent a given element's position in a hierarchy – e.g. an employee's position within an organization.
The datatype is extremely compact. The storage depends on the average fanout (fanout = the number of children per node). For smaller fanouts (0-7) the typical storage is about 6 * logA(n) bits, where A is the average fanout and n is the total number of nodes in the tree. Given that formula, a node in an organization with 100,000 employees and an average fanout of 6 takes around 38 bits, rounded up to 5 bytes of total storage for the hierarchy structure.
Though the limit of the datatype is 892 bytes, there is a lot of room for extremely complex and deep structures.
When representing values to and from the hierarchyid datatype, the syntax is:
/[level id 1]/[level id 2]/../[level id n]/
Example:
/1/7/3/
The values between the '/' characters can be of decimal types, e.g. 0.1, 2.3 etc.
Given two specific levels in the hierarchy, a and b, a < b means that b comes after a in a depth-first traversal of the tree structure. Any search and comparison on the tree is done this way by the SQL engine.
The datatype directly supports deletions and inserts through the GetDescendant method (see later for the full list of methods). This method enables generation of siblings to the right of any given node and to the left of any given node – even between two siblings. NOTE: inserting a new node between two siblings produces values that are slightly less compact.
Hierarchyid in SQL Server – how to use it
Given an example of data – see the complete SQL script at the end of this post to generate the example used here.
The Num field is a simple ascending counter for each level member in the hierarchy.
There are some basic methods to be used in order to build the hierarchy using the hierarchyid datatype.
GetRoot method
The GetRoot method gives the hierarchyid of the root node in the hierarchy – represented by EmployeeId 1 in the example above.
The code and result could look like this:
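A sketch, assuming the example table is called dbo.EmployeeOrg and has a hierarchyid column called OrgPath:
select EmployeeId, Name, OrgPath
from dbo.EmployeeOrg
where OrgPath = hierarchyid::GetRoot();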
The value '0x' in the OrgPath field is the binary representation of the string '/', the root of the hierarchy. This can be seen using a simple cast to varchar:
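For example (same assumed table):
select EmployeeId, Name, cast(OrgPath as varchar(100)) as OrgPathString
from dbo.EmployeeOrg
where OrgPath = hierarchyid::GetRoot();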
Building the new structure with the hierarchyid datatype using a recursive SQL statement:
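A sketch of how that could look, assuming a source table dbo.Employees with EmployeeId, ManagerId, Name and the Num counter mentioned above – the result could then be written into the OrgPath column:
with paths as (
    -- the root gets the path '/'
    select EmployeeId, ManagerId, Name, cast('/' as varchar(100)) as OrgPathString
    from dbo.Employees
    where ManagerId is null
    union all
    -- each child appends its Num counter to its parent's path
    select e.EmployeeId, e.ManagerId, e.Name,
           cast(p.OrgPathString + cast(e.Num as varchar(10)) + '/' as varchar(100))
    from dbo.Employees e
    join paths p on e.ManagerId = p.EmployeeId
)
select EmployeeId, Name, cast(OrgPathString as hierarchyid) as OrgPath
from paths;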
Notice how the path is built after the union all. This complies with the above-mentioned syntax for building the hierarchy structure so it can be converted to the hierarchyid datatype.
If I were to build the path for EmployeeId 10 (Name = 'Mads') in the example above, it would look like this: '/2/2/'. A select statement converting the hierarchyid field OrgPath for the same record reveals the same thing:
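For example:
select EmployeeId, Name, OrgPath.ToString() as OrgPathString
from dbo.EmployeeOrg
where EmployeeId = 10;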
Notice the use of the ToString method here – another built-in method of the hierarchyid in SQL Server.
GetLevel method
The GetLevel method returns the current node's level, with an index of 0 from the top:
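A sketch:
select EmployeeId, Name, OrgPath.ToString() as OrgPathString, OrgPath.GetLevel() as OrgLevel
from dbo.EmployeeOrg;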
GetDescendant method
This method returns a new hierarchyid based on the two parameters child1 and child2.
The use of these parameters is described in the BOL HERE.
Below are some short examples of the usage.
Getting a new hierarchyid when a new employee reporting to the top manager is hired:
Getting a new hierarchyid when a new hire reports to Jane in the hierarchy:
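Sketches of both, against the assumed dbo.EmployeeOrg table:
-- a new employee under the top manager
declare @Root hierarchyid = hierarchyid::GetRoot();
declare @LastChildOfRoot hierarchyid;
select @LastChildOfRoot = max(OrgPath)
from dbo.EmployeeOrg
where OrgPath.GetAncestor(1) = @Root;
select @Root.GetDescendant(@LastChildOfRoot, NULL) as NewOrgPath;
-- a new hire under Jane
declare @Jane hierarchyid, @LastChildOfJane hierarchyid;
select @Jane = OrgPath from dbo.EmployeeOrg where Name = 'Jane';
select @LastChildOfJane = max(OrgPath)
from dbo.EmployeeOrg
where OrgPath.GetAncestor(1) = @Jane;
select @Jane.GetDescendant(@LastChildOfJane, NULL) as NewOrgPath;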
Dynamically inserting new records into the hierarchy table – this can easily be converted into a stored procedure:
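A sketch (the manager name, the new EmployeeId and the new name are only illustrative):
declare @ManagerPath hierarchyid, @LastChild hierarchyid;
select @ManagerPath = OrgPath from dbo.EmployeeOrg where Name = 'Jane';
select @LastChild = max(OrgPath)
from dbo.EmployeeOrg
where OrgPath.GetAncestor(1) = @ManagerPath;
insert into dbo.EmployeeOrg (EmployeeId, Name, OrgPath)
values (11, 'NewHire', @ManagerPath.GetDescendant(@LastChild, NULL));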
Notice the GetAncestor method, which takes one parameter (the number of steps up the hierarchy) and returns that level's hierarchyid – in this case just 1 step up the hierarchy.
More methods
There are several more methods to use when working on a hierarchy table, as found in BOL:
GetDescendant – returns a new child node of a given parent. Takes two parameters.
GetLevel – returns the level of a given node (0-indexed).
Read – used implicitly by the ToString method. Cannot be called from T-SQL.
GetReparentedValue – returns the node relative to a new root, for use when moving a given node.
Write – returns a binary representation of the hierarchyid. Cannot be called from T-SQL.
Optimization
As in many other scenarios on SQL Server, the usual approach to indexing and optimization can be used.
To support the usual and most-used queries, I would create the two indexes below on the example table:
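A sketch of a depth-first and a breadth-first index on the assumed dbo.EmployeeOrg table:
-- depth-first: the OrgPath itself keeps whole subtrees together
create unique index IX_EmployeeOrg_Depth
on dbo.EmployeeOrg (OrgPath);
-- breadth-first: persist the level and index on (level, path)
alter table dbo.EmployeeOrg
add OrgLevel as OrgPath.GetLevel() persisted;
create index IX_EmployeeOrg_Breadth
on dbo.EmployeeOrg (OrgLevel, OrgPath);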
But as with any other indexing strategy – base it on the given scenario and usage.
Goodies
So why use this feature and all the coding work that comes with it?
Well – from my perspective – it has just become very easy to quickly get all elements either up or down from a given node in the hierarchy.
Get all descendants from a specific node
If I want to get all elements below Jane in the hierarchy, I just have to run this command:
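A sketch using the IsDescendantOf method:
declare @Jane hierarchyid;
select @Jane = OrgPath from dbo.EmployeeOrg where Name = 'Jane';
select EmployeeId, Name, OrgPath.ToString() as OrgPathString
from dbo.EmployeeOrg
where OrgPath.IsDescendantOf(@Jane) = 1;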
Think of the work you would have to do if this was a non-hierarchyid table using only parent/child and recursive SQL, and the structure was very complex and deep.
I know what I would choose.
Conclusion
As seen above, the hierarchyid datatype can be used to give order to the structure of a hierarchy in a way that is both efficient and fairly easy to maintain.
If one wanted to optimize the structure even further, the EmployeeId and the ManagerId could be dropped, as the OrgPath is now just as distinct as the EmployeeId and can replace it. The ManagerId is only used to build the structure – and that is now also given by the OrgPath.
Just recently I needed a big data warehouse solution to test some performance optimization using BIML. I could use the AdventureWorks2012 database, but I needed the clean data warehouse tables in order to have minimal data maintenance when testing the BIML scripts.
I could not find one, and figured out it was faster to make my own.
So, heavily inspired by this post from Jonathan Kehayias (blog), I've made a script that can be used to enlarge the dbo.FactInternetSales table.
The script creates a new table called dbo.FactInternetSalesEnlarged and copies data from dbo.FactInternetSales into it with a randomizer, exploding the data into a table roughly 100 times bigger – an estimated 6 million rows.
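The attached script has the full column list and the randomization; the general pattern is roughly like this sketch (columns shortened, and the date/key randomization left out):
-- a small numbers table deciding how many copies we want
select top (100) identity(int, 1, 1) as CopyNo
into #numbers
from sys.all_objects;

select
      f.ProductKey
    , f.OrderDateKey
    , f.CustomerKey
    , f.SalesOrderNumber + '-' + cast(n.CopyNo as varchar(10)) as SalesOrderNumber
    , f.OrderQuantity
    , f.SalesAmount
into dbo.FactInternetSalesEnlarged
from dbo.FactInternetSales f
cross join #numbers n;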
On several occasions I've been assigned the task of splitting a delimited string into rows.
I've done this in different ways, but never thought about the performance or stability of the different approaches.
So here are my 25 cents and findings.
My first solution was to code a function that traverses the string and inserts a new value into a table variable for every delimiter found in the string:
CREATE FUNCTION [dbo].[list_to_table] (
@list varchar(4000)
)
RETURNS @tab TABLE (
item int
)
BEGIN
IF CHARINDEX(',',@list) = 0 or CHARINDEX(',',@list) is null
BEGIN
INSERT INTO @tab (item) VALUES (@list);
RETURN;
END
DECLARE @c_pos INT;
DECLARE @n_pos INT;
DECLARE @l_pos INT;
SET @c_pos = 0;
SET @n_pos = CHARINDEX(',',@list,@c_pos);
WHILE @n_pos > 0
BEGIN
INSERT INTO @tab (item) VALUES (CAST(SUBSTRING(@list,@c_pos+1,@n_pos - @c_pos-1) as int));
SET @c_pos = @n_pos;
SET @l_pos = @n_pos;
SET @n_pos = CHARINDEX(',',@list,@c_pos+1);
END;
INSERT INTO @tab (item) VALUES (CAST(SUBSTRING(@list,@l_pos+1,4000) as int));
RETURN;
END
Then I ran into performance issues with very long strings, pushing me to find a solution that performs better. I began to look into an XML approach for solving the issue – and ended up with this:
declare @string as nvarchar(4000) = N'1,2,3,4,5'  -- example input
select
    r.value('@value', 'int') as Kategori
from (
    select cast('<A value = "' + replace(@string, ',', '"/><A value = "') + '"/>' as xml) as xml_str
) xml_cte
cross apply xml_str.nodes('/A') as x(r)
Performance for the two different solutions is shown below.
The string contains all numbers from 0 to 100 with comma as delimiter, on a machine with 4 cores, 16 GB RAM and an SSD.
Function: 7 ms (on average) – XML: 3 ms (on average)
No matter how long a string I send to the XML version, it runs in around 3 ms – the function just climbs and climbs in execution time (naturally).
Has anybody done the same and found an even better way to dynamically split a delimited string into rows – and want to share?
In every Business Intelligence project there comes a time when the code needs to be deployed to the production environment. No more development, no more manual work. It is time for dynamic partitioning.
But what about the partitions on the tabular cube? Do we really need to teach the DBA how to handle that on a recurring schedule?
The answer is simple: No!
Thanks to the XMLA language, the DMVs for SSAS instances (both tabular and multidimensional) and SSIS, we can do the partitioning dynamically, based on the current data in the data warehouse.
In the example below I've made partitions for every month, but as always you need to take the current architecture, dataflow, data deliveries etc. into account when creating your solution.
Here we go:
We need a table in order to keep track of the partitions and their metadata, and also a table to hold data on the existing partitions in the Tabular cube:
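A sketch of what those two tables could look like (names and columns are my assumptions, matching how they are used later in the post):
create table dbo.PartitionLog (
    PartitionLogId int identity(1,1) primary key,
    TableName      nvarchar(200),
    PartitionName  nvarchar(200),
    FromDate       date,
    ToDate         date,
    CreatedTime    datetime default (getdate())
);

create table dbo.ExitingPartitionsList (
    TableName      nvarchar(200),
    PartitionName  nvarchar(200),
    LoadTime       datetime default (getdate())
);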
In SSIS I’ve created a dataflow from the SSAS Tabular instance with the list of partitions on the table I want to partition to the table ExitingPartitionsList. We can do this thanks to the DMV’s for SSAS which also works for Tabular instances – see this link for further information.
The tableID is found in the tables properties (right-click and choose ‘Properties’):
The controlflow looks like this. I hope it is somewhat self-explainable:
Now we need to take a look at the naming convention of the partitions. There is a need for the partition-name to have a suffix that tells the span of the partition. I’ve choosen [Tablename]-[fromdate]-[todate]. And in order to get usable data from the previous dataflow, I’ve made a view as below:
Giving this output:
My fact table (Måleraflæsninger) has a field that is used to filter the partition – in this case it is 'datekey'. In order to get all possible partitions needed for the project, I've made this view:
Notice that I've made the dates end on the last day of the month; this is used to generate the view for the partitions later on. Based on these two views we can now generate the list of partitions that need to be created in the Tabular project:
Now I need to generate a set of XMLA scripts: one to create a partition and one to process a partition. The easiest way to get these is to script them from the GUI on the SSAS Tabular instance.
1: Right-click the table that I'm working with and select 'Partitions…'
2: Click 'New partition'
3: Here specify the name – remember the convention – and the SQL statement. It is important to remember the filter criteria for the partitions; in this case it is one partition for every month.
4: Finally click the Script button and choose 'Script Action to New Query Window'
Result – the areas I’ve highlighted are the ones that we need to parameterize in SSIS – more to come on that part in a bit.
The same way I’ve made a XMLA script to process the partition – the highlighted area is, again, to be parameterized later:
Now we need to go to SSIS and make the logic and steps to accuire the dynamic partitioning.
First of all we need to make sure that all partitions that should be in the model also exists, and if they do not exists, we’øll create them.
The SSIS project now needs some variables as listed below:
The scope varies from project to project.
After all the variables have been defined, we need to set up the two XMLA variables as expressions.
In the Expression Builder, add the code below to the respective variables:
CreatePartitionXMLA – replace the metadata with what matches your solution:
ProcessPartitionXMLA – replace the metadata with what matches your solution:
Now we are all set and just need to build the package.
The focus is now on a container like this:
The steps are to get all the missing partitions from our defined view and loop the result with both Create partition, Process partition and Insert data to PartitionLog.
Step 1:
Define a SQL task with the following statement:
Define the output to 'Full result set' and map the result to the variable MissingPartitions.
Add a Foreach Loop Container and configure it to use the MissingPartitions object as an ADO source, mapping the data to the variables as below:
Inside this container, add two Analysis Services Execute DDL Tasks and define an Analysis Services connection that matches the environment.
The first DDL (Create Partition) is defined like this:
The second DDL (Process Partition) is defined like this:
The last step is to add a record in the PartitionLog table – add a SQL task with this statement:
And map the parameters like this:
Now, when the package runs, it will get a dataset of missing partitions, loop over the dataset, and create and process the partitions dynamically. At the end it inserts a record into the PartitionLog to keep track of this.
The last thing we need to do is to add a container to process the latest partition every time the package executes.
We need to build this:
Again, add a SQL task and define it to get data from the view with the current partition that we created earlier:
Map the 'Full result set' to the variable PartitionName.
The Foreach Loop Container must be defined to use this variable, with the data mapped like this:
Add an Analysis Services Execute DDL task and define it to use the variable ProcessPartitionXMLA. We can reuse the expression, as it is defined as an expression and uses the same logic.
Finally add a SQL task with the code below:
Map the parameters like this:
And there it is. All done.
Now, every time the package is executed, it will make sure that missing partitions are created (in this case one for every month start) and processed. And it will make sure that the latest partition is updated with the latest data.
Processing now takes very little time, as the only data that is processed is the latest partition. Of course, the first time the package runs it will create all the partitions and process them.
Ever had that awesome SQL trace built up that does just the right thing for your system when doing some performance monitoring? Well, I know I have. And someday you might need that same trace again – but now you have to build it all over…
Here the Extended Events feature comes into the picture. It was first introduced in SQL Server 2008. The feature is a good and lightweight event-driven mechanism for collecting information about your SQL Server. Extended Events have a lighter footprint than the old Trace Events, and a more programmatic approach to the events and the information they respond to. Extended Events have another cool feature – they are stored inside the SQL Server and can be turned on and off very simply, even per job, rather than the old-fashioned way of recreating and rethinking the whole semantics for every trace you need.
Also, Trace Events are not guaranteed to exist in future versions of SQL Server – so now is the time to learn Extended Events, before you are forced to.
But it was all T-SQL-based to use the feature – for me that was too many arguments and variables to know by heart. The output from the collection was XML and therefore needed more decoding to be used for optimization. So I never got around to using them on a daily basis.
And along came Polly: the Extended Events GUI in SQL 2012.
Finally a GUI for the Extended Events handler that eases up the daily work with SQL Server. A set of GUI interfaces has been introduced for dealing with Extended Events: a Wizard, an Extended Events Properties Editor, and a Data Viewer. The Wizard is a handy way to get walked through creating an Extended Events Session, but I'm going to skip past that and talk about the Properties Editor and the Data Viewer. These two interfaces are where you're going to spend most of your time.
You can find the Extended Events Sessions under the Management folder in SQL Server Management Studio (SSMS). As a replacement (and enhancement, because it does more) for the default trace, a Session comes pre-installed: system_health. This is a great way to learn how Extended Events Sessions are set up, because the Session includes many different types of Events, Targets, Filters, and Actions. The same set of windows we're about to go over can also be used to create new Sessions in addition to editing existing ones.
Defining a Session
For illustration purposes, I'm going to stop the system_health Session while we examine its properties, just so that they are all accessible. All I have to do to make this happen is right-click the Session and select Stop Session from the context menu. Right-click the Session again and select Properties, and the Editor opens.
Even though there are a lot of details in the window, most of them are self-explanatory.
It's a matter of supplying a Session name, and when you are creating a new Session you can pull from a list of Session Templates.
The nifty feature is that you have full control over when the Session starts. When I start from scratch, I normally start the event session immediately and watch the live data in the Data Viewer window (more to come on this later).
Selecting Events
The next page down is the Events page. Clicking it on the left, you'll see all the power and flexibility of Extended Events on display.
Yup, there are a lot of things and corners here. Let's take them one at a time. The left side of the screen is where the main work is done – at the top there is a text field and a drop-down selection box. This lets you search the collection of events in the system. For example, in the screenshot above there is no search in place, so all events are displayed. If I typed something into the text box, the events below the box would be filtered according to my search. This is a great way to quickly find the specific event you might want to use. The drop-down can then help you on the way by selecting e.g. 'Event names only', 'Event Fields only', 'Event names and Description' or 'All'.
In the list itself you can filter the Extended Events you are interested in even further. The column headings can be sorted just by clicking on them; in the picture above, Name is sorted (the little triangle to the right of 'Name' indicates the sort order). The Category and Channel columns are drop-downs that let you filter the list even further.
The names of the Extended Events can be very cryptic. Therefore you'll find an additional description of the selected Event in the lower left corner of the window. Right next to this there is a list of the Fields for the selected Extended Event. These Fields can roughly be compared to the old-fashioned Columns in the Trace, though they are somewhat more inherent and unique to each Event. When capturing an Event you also, by default, capture most of its Fields. More on the exceptions later.
Once you are satisfied with your selection of Events, you click the large right arrow in the center of the screen to move the Event to the Selected Events list. Removing items from this list is easy-peasy. Again, below this list you get a second description of the Extended Event you've selected.
Configuring Events
There’s even more functionality on this page. Notice that button in the top right corner that says Configure and points to the right? Click that and you get to a whole new set of functionality.
Here you configure the Events. If you want to go back – just click the Select button.
On the left side of the window you have a list of Events, and some extra goodies. First the name, which you can, again, sort as you like, and below it a description of the selected Event. Right next to the name column there is a column (the one with the lightning bolt) showing how many Global Fields, also known as Actions, have been selected. The last column shows whether the Event is filtered. This is a great way to quickly and easily identify the Filters and Actions you have. In the picture above, you can see that error_reported has a filter and none of the other Events do.
To the right there are 3 tabs: Global Fields (Actions), Filter (Predicate) and Event Fields. The first one – an Action, or global field – is an additional column that you can add to any event that you want to capture. An example: the error_reported event that is currently highlighted does not have a database_id Event Field. If you want to capture that field when an error occurs (might be a good idea), you will have to use an Action.
The thing is that an Action is captured after the Event and is executed synchronously. This means that if anything is going to cause performance bottlenecks as part of your Extended Events capture, here is a likely candidate (among a few others). So instead of calling them Global Fields, which can sound a little too attractive, I prefer the name in parentheses: Actions. This way it is clear that they are different and that you should use them with caution. Selecting a particular Action's check box adds it to the Extended Event selected on the left side of the screen.
As shown above, clicking the Filter (Predicate) tab lets you see and control which events get captured. Also, obviously, you can add more lines in an easy-access manner. Each of the columns provides you with a drop-down list, except the final one where you're expected to enter a value. The Fields come from the event plus a selection of operating system and SQL Server Fields that you can filter on. The comparison operators are the standard set of equals, less than, and so on, divided up into int and int64. The one thing I'd add is that the more your first filter eliminates up front, the less load these will place on the system – eliminating all errors below severity 20 as the first criterion is a good example.
The last tab shows the Event Fields, fields that are unique to this event. Don’t misunderstand me: events have lots of fields in common (such as session_id and database_id, because these are common values within SQL Server), but each event has a preselected list of Fields that apply to it.
Mostly this is just a listing of the Fields and their data types for the selected Event. However, note the fields at the top of the list with a check box. By default, all Fields are included except a few that are more expensive to collect. You have to decide whether you need those Fields when you're setting up your Extended Events Session. Selecting the check box will include them in the Session.
Data Storage
And that’s it. You’ve now seen how to select a list of Extended Events and to configure those events with Actions, Filters, and Event Fields. Up to now I haven’t talked about where all this information goes. That’s the next page. Clicking the Data Storage page on the left side of the screen via the page listing there opens up a screen like this:
You can define a number of different Targets for your Extended Events Sessions. It really depends on how you’re trying to consume these events. The interesting thing is, if you just want to watch the Session live, you don’t actually have to designate a target here. I don’t want to try to describe what all these are for and what they can do; there’s better documentation for that.
But for most situations, the likely target will be the one selected, event_file. This puts all the output into a file. When you select this Target, you get several properties that you must define at the bottom of the screen. For example, the file name and location are naturally included. You also get to decide how large you want the files to be, if you want them to rollover as they’re filled, and, if they’re rolling over, what the maximum number of files ought to be. It’s a great way to capture information like query performance metrics so that you can later load them into tables and start running reports to identify the longest running query, for example.
Advanced features
The last page is the Advanced features page.
I won't go deep into this, except to say that here you have a lot of control over how much impact you will allow, or force, your Extended Events setup to have. By adjusting these settings, you can ensure that you lose no Events (and probably put a much higher load on your system) or run a very lossy process (with a lower impact). Look to Books Online and other resources for when and how to adjust these.
That's it! Done. Click OK, and you now have a new Session (or you've updated an existing one) – no T-SQL required. However, it is still possible to script your Session once it's created, to be used on other servers if applicable.
Time to watch SQL-telly
Now I can watch queries as they go by. All I have to do is right-click the Session and select Watch Live Data, and the Data Viewer window opens. What's more, you can use File > Open to find *.xel files and open them directly in the same viewer.
The window is split in two. At the top are all the events and the timestamp of when they occurred. At the bottom are the Fields for the Event selected in the window above. You can scroll through the data, and you'll see everything you need – no XML required.
Scrolling around to find the data you want can be a pain, so, if you like, you can right-click a Field in the lower window and select Show Column In Table from the context menu. Below, a few columns are displayed in the grid.
There's a bunch more functionality built into the Data Viewer. You can double-click a Field to open it in a new window, which is handy for viewing long T-SQL strings or XML output. If you're looking at either a stopped Session or a file, you can sort the grid by columns; you can't do that while watching live data. Best of all, and I really love this, you can toggle a bookmark on an event so that you can find your way back to that event quickly. OK, maybe that isn't the best part, but it's pretty good. You can also apply a filter based on a value in a Field to show only rows with that value. So, for example, if I only wanted to look at the error_reported Events from the example above, I could right-click that column where that is the value and select Filter On This Value from the context menu.
If you're looking at a Session, not a file, you can start and stop the Session, pick which windows you want shown, edit your filters, and do grouping and aggregation. Actually, that's pretty slick too. What I've done below is group by the Event name Field so that I'm seeing all of a particular event as a set.
From my small run of events I have three different types: error_reported with 18 Events, rpc_completed with 4, and sql_batch_completed with 41. I've expanded rpc_completed to show the individual calls. This does bring out one issue with Extended Events that I found a little problematic for gathering query metrics. Note that the batch_text Field is NULL for all the rpc_completed events. This is because the rpc_completed Event has a statement Field that is the equivalent of the batch_text Field in the sql_batch_completed Event. A slight pain, and something to be aware of. However, you're compensated by being able to get the object_name Field, which means you can immediately group all your stored procedure calls by the name of the procedure – no worries about trying to parse the T-SQL to remove parameters. That's a huge win.
A little extra feature is the ability to search in the results, as shown below.
You can define which field or fields you want to search through. In my case, I chose Table Columns. You can enter different criteria and even make use of wildcards and regular expressions. I left that off in my case so that I could find the literal string 'SELECT *'.
There’s still more that I haven’t covered, but you get the idea. With SQL Server 2012 you get the ability to do fully fledged data exploration through your Extended Event output.
Summary of performance monitoring
There’s much more to learn about Extended Events, and you’re probably still going to use a lot of T-SQL code when working with them. But to get started, you now have a GUI that has everything you need to set up, control, and maintain Extended Events. You also have a GUI that allows you to consume and report on the data collected from Extended Events. Many of the reasons people had in the past for not using Extended Events should now be eliminated.