|
|||||||||||||||||||||
|
|||||||||||||||||||||
|
Announcements
Chapters
Services
Feature Zones
|
IntroductionI've always liked tree structures. I have worked with them in many situations, and I have searched efficient ways to manage them in SQL Server. As the limit of 32 nested levels in the SQL Server (both 2000 and 2005 versions) impedes us to attack this issue recursively, we have to allot an overhead in storing and accessing nested structures in database tables. In this article, I will try to explain how to efficiently store a tree structure in a SQL Server table, access it easily from T-SQL and clients, and transform it in to a structure recognized by OLAP Dimensions in Microsoft Analysis Services. The question is how to efficiently and easily store and access tree structures, and how to transform them as in the pictures below:
Tree structure storageThe well-known method to store a tree in a relational table is to have a link between a node and its parent. So, the Using the last technique, we will have two supplementary columns in the table: SELECT C.ID, C.P_ID, C.NAME
FROM TREE C
INNER JOIN TREE P ON C.NLEFT BETWEEN P.NLEFT AND P.NRIGHT
WHERE P.ID = 1
The CREATE TABLE TREE (
ID int NOT NULL ,
P_ID int NULL ,
NAME varchar (100),
NLEFT int NOT NULL CONSTRAINT DF_TREE_NLEFT DEFAULT (-1),
NRIGHT int NOT NULL CONSTRAINT DF_TREE_NRIGHT DEFAULT (-1),
NLEVEL int NULL CONSTRAINT DF_TREE_NLEVEL DEFAULT (-1),
CONSTRAINT PK_TREE PRIMARY KEY CLUSTERED (ID)
)
I have added the The access will not be so difficult, but how can we set the appropriate values for I present only the trigger for insert action, but it can be done in the same manner for all update IF @P_ID IS NULL
BEGIN
SELECT @MAX_VALUE = ISNULL(MAX(NLEFT), 0) + 10000
FROM TREE WHERE P_ID IS NULL
SELECT @NLEVEL = 1
END
ELSE
BEGIN
SELECT @MAX_VALUE = ISNULL(MAX(NRIGHT), 0)
FROM TREE WHERE P_ID = @P_ID AND ID < @ID
SELECT @NLEVEL = ISNULL(NLEVEL, 0)
FROM TREE WHERE P_ID = @P_ID AND ID < @ID
IF @MAX_VALUE = 0
BEGIN
SELECT @MAX_VALUE = ISNULL(NLEFT, 0) FROM TREE WHERE ID = @P_ID
SELECT @NLEVEL = ISNULL(NLEVEL, 0) + 1 FROM TREE WHERE ID = @P_ID
END
END
UPDATE TREE SET NLEFT = @MAX_VALUE + 1,
NRIGHT = @MAX_VALUE + 2,
NLEVEL = @NLEVEL WHERE ID = @ID
After the values are stored, the trigger ensures that the corresponding ancestors have appropriate IF @P_ID IS NOT NULL
BEGIN
SELECT @ROOT_ID = ISNULL(P.ID, -1)
FROM TREE C
INNER JOIN TREE P ON C.NLEFT BETWEEN P.NLEFT AND P.NRIGHT
WHERE C.ID = @P_ID AND P.P_ID IS NULL
-- EXEC TREE_RECALC @ROOT_ID -- RECURSIVE APPROACH
EXEC TREE_RECALC_ITER @ROOT_ID -- ITERATIVE APPROACH
END
The recalculation or numbering algorithm is:
The algorithm is implemented in two manners: recursive and iterative. The recursive method is simple, and it is found in the SELECT @NCOUNT = COUNT(*), @NLEVEL2 = @NLEVEL + 1
FROM TREE
WHERE P_ID = @NPARENT
IF @NCOUNT > 0
BEGIN
SELECT @NNEWLEFT = @NLEFT + 1
SELECT @NTEMPID = 0
SELECT @NTEMPID = MIN(ID), @NCOUNTER = COUNT(*)
FROM TREE
WHERE P_ID = @NPARENT AND ID > @NTEMPID
WHILE (@NCOUNTER > 0)
BEGIN
EXEC TREE_RECALC_REC @NNEWLEFT,
@NTEMPID, @NLEVEL2, @NRIGHT OUTPUT
SELECT @NNEWLEFT = @NRIGHT + 1
SELECT @NTEMPID = MIN(ID), @NCOUNTER = COUNT(*)
FROM TREE
WHERE P_ID = @NPARENT AND ID > @NTEMPID
END
UPDATE TREE SET NLEFT = @NLEFT,
NRIGHT = @NRIGHT + 1, NLEVEL = @NLEVEL
WHERE ID = @NPARENT
SELECT @NRIGHT = @NRIGHT + 1
END
ELSE
BEGIN
SELECT @NRIGHT = @NLEFT + 1
UPDATE TREE SET NLEFT = @NLEFT,
NRIGHT = @NRIGHT,
NLEVEL = @NLEVEL WHERE ID = @NPARENT
END
As it is known, the 32 nested levels limit forces us to not store more than 32 levels on a hierarchy. The iterative manner solves the 32 nested levels limit issue. The workaround is to use a "stack" as it is explained in the Expanding Hierarchies MSDN article. The "stack" is a temporary table which stores all the nodes which belong to the same parent, starting with the root. The nodes are processed in a loop using a level variable, and when there is no node with the current level in the "stack", the level is decreased with 1 (the previous parent level is processed). The level variable takes values form 1 to WHILE @NLEVEL > 0
BEGIN
IF EXISTS(SELECT * FROM #TPC WHERE NLEVEL = @NLEVEL)
BEGIN
SELECT TOP 1 @ID = ID, @P_ID = P_ID,
@NAME = NAME FROM #TPC WHERE NLEVEL = @NLEVEL ORDER BY ID
INSERT INTO #T VALUES(@ID, @P_ID, @NAME, @NLEFT, @NRIGHT, @NLEVEL)
DELETE FROM #TPC WHERE NLEVEL = @NLEVEL AND ID = @ID
INSERT INTO #TPC SELECT ID, P_ID, NAME,
@NLEVEL + 1 FROM TREE WHERE P_ID = @ID
IF @@ROWCOUNT > 0
SET @NLEVEL = @NLEVEL + 1
IF EXISTS(SELECT ID FROM TREE WHERE P_ID = @ID)
BEGIN
SET @NLEFT = @NLEFT + 1
SET @NRIGHT = @NLEFT + 1
END
ELSE
BEGIN
IF @ID = (SELECT MAX(ID) FROM TREE WHERE
NLEVEL = @NLEVEL AND P_ID = @P_ID)
BEGIN
PRINT LTRIM(RTRIM(STR(@ID))) + ' ' +
@NAME + ' ' + LTRIM(RTRIM(STR(@NRIGHT)))
SET @NLEVEL2 = 1
TRUNCATE TABLE #TCP
INSERT INTO #TCP SELECT ID, P_ID, NAME,
@NLEVEL2 FROM #T WHERE ID = @P_ID
WHILE @NLEVEL2 > 0
BEGIN
IF EXISTS(SELECT * FROM #TCP WHERE NLEVEL = @NLEVEL2)
BEGIN
SELECT TOP 1 @ID2 = ID, @P_ID2 = P_ID,
@NAME2 = NAME FROM #TCP
WHERE NLEVEL = @NLEVEL2 ORDER BY ID
SET @NRIGHT = @NRIGHT + 1
UPDATE #T SET NRIGHT = @NRIGHT WHERE ID = @ID2
DELETE FROM #TCP WHERE NLEVEL = @NLEVEL2 AND ID = @ID2
INSERT INTO #TCP SELECT ID, P_ID, NAME,
@NLEVEL2 + 1 FROM #T WHERE ID = @P_ID2
IF @@ROWCOUNT > 0
SET @NLEVEL2 = @NLEVEL2 + 1
END
ELSE
BEGIN
SET @NLEVEL2 = @NLEVEL2 - 1
END
END
END
ELSE
BEGIN
SET @NLEFT = @NRIGHT + 1
SET @NRIGHT = @NLEFT + 1
END
END
END
ELSE
BEGIN
SET @NLEVEL = @NLEVEL - 1
SELECT @NLEFT = MAX(NRIGHT) + 1 FROM #T WHERE NLEVEL = @NLEVEL
SET @NRIGHT = @NLEFT + 1
END
END
Tree structure accessThe hierarchy can now be accessed easily using simple SQL statements:
The last XML output is not so useful for the clients. They probably need the underlying tree structure which SQL Server XML features can't obtain, and not this "fake" ID-parent ID implementation. So, the next idea will be to re-arrange the "fake" tree structure in a XML nested hierarchy. This will be done using the <?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates
select="//*[name() = 'node' and not(@p_id)]"/>
</xsl:template>
<xsl:template match="*">
<xsl:variable name="id" select="@id"/>
<xsl:element name="node">
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="//*[@p_id = $id]"/>
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
The
Tree to OLAP transformationSuppose that this tree structure will serve as a data source for an OLAP cube dimension. The dimensions and the cube can be created easily using Analysis Services. But what happens when the tree structure has changed, having one level more? Is it enough just to refresh the dimensions and the cube? The answer is no, the cube will not reflect the changes. Let's imagine that we have a data cube which has a
The The SET @SQL = N'INSERT INTO TREE_OLAP VALUES(@ID'
SET @I = 1
WHILE @I <= @MAX_LEVEL
BEGIN
IF @I = @NLEVEL
SET @SQL = @SQL + N', @NAME'
ELSE
SET @SQL = @SQL + N', ''<none>'''
SET @I = @I + 1
END
SET @SQL = @SQL + N')'
EXEC sp_executesql @SQL, N'@ID INT,
@NAME VARCHAR(100)', @ID = @ID, @NAME = @NAME
After that, for each ancestor, perform the update for the inserted record on the corresponding ancestor level column: SET @SQL = N'UPDATE TREE_OLAP SET N' +
LTRIM(RTRIM(STR(@NLEVEL2))) + ' = @NAME WHERE ID = @ID'
EXEC sp_executesql @SQL, N'@ID INT, @NAME VARCHAR(100)',
@ID = @ID, @NAME = @NAME2
Creating the Sales cube and dimensionsIf we add some records in the The time dimension is general, and implemented as it is in the Analysis Services "FoodMart 2000" sample database. The The cube data visualization will be:
The RefreshSalesCube applicationThe changes in the dimension and in the cube database can be done manually, using Analysis Services user interfaces, and programmatically using the DSO (Decision Support Objects – interop assembly) namespace for .NET framework 1.1, and AMO namespace for .NET framework 2.0. The
Updating the Sales cube and the TREE_OLAP dimensionLet's add a new record in the INSERT INTO TREE(ID, P_ID, NAME) VALUES(21, 20, 'MamaW')
The new record will have a correspondent in the fact table, in a day in 2006, let's say, '2006-07-18', with a value for DECLARE @TIME_ID INT
SELECT @TIME_ID = ID FROM TIME_BY_DAY WHERE T_DATE = '2006-07-18'
INSERT INTO SALES VALUES(21, @TIME_ID, 1000)
We can just re-create the EXEC TREE_2_OLAP 1
and run the RefreshSalesCube application. The new cube data visualisation will reflect the changes without a manual update:
ConclusionUsing the implementation on the OLTP database, and a few lines of code to re-create and re-process the cube, the structure transformation and data changing is transparent for the reporting clients which consume the cube data. The process can be automated and scheduled using SQL Server jobs, and the time and work performance can be improved. | ||||||||||||||||||||