Have you ever encountered a situation where you needed to analyze hierarchical data in SQL Server, only to find that your query fails due to circular references? Circular references occur when a group is a member of itself either directly or through inheritance. In this blog post, we will discuss how to identify and resolve circular references in SQL Server.
Let’s start by creating a simple table that represents the relationships between groups:
CREATE TABLE GroupHierarchy (
Child VARCHAR(255) NOT NULL,
Parent VARCHAR(255) NOT NULL
);
INSERT INTO GroupHierarchy (Child, Parent) VALUES ('a', 'b');
INSERT INTO GroupHierarchy (Child, Parent) VALUES ('b', 'c');
INSERT INTO GroupHierarchy (Child, Parent) VALUES ('c', 'd');
INSERT INTO GroupHierarchy (Child, Parent) VALUES ('e', 'b');
Now, we can use a common table expression (CTE) to generate the hierarchy of group memberships:
WITH GroupMembers (Child, ParentGroup, Level, HierarchyPath)
AS
(
-- Anchor member definition
SELECT g.Child, g.Parent, 0 AS Level, CONVERT(VARCHAR(MAX), g.Child + '->' + g.Parent) AS HierarchyPath
FROM GroupHierarchy AS g
WHERE Child NOT IN (SELECT Parent FROM GroupHierarchy)
UNION ALL
-- Recursive member definition
SELECT g.Child, g.Parent, Level + 1, HierarchyPath + '->' + g.Parent
FROM GroupHierarchy AS g
INNER JOIN GroupMembers AS gm
ON gm.ParentGroup = g.Child
)
SELECT LEFT(HierarchyPath, CHARINDEX('->', HierarchyPath) - 1) AS Child, ParentGroup, Level, HierarchyPath
FROM GroupMembers
OPTION (MAXRECURSION 5);
This query will give us the hierarchy of group memberships:
Child ParentGroup Level HierarchyPath
------ ------------ ------ --------------------
a b 0 a->b
e b 0 e->b
e c 1 e->b->c
e d 2 e->b->c->d
a c 1 a->b->c
a d 2 a->b->c->d
Now, let’s introduce a circular reference by adding a record where ‘d’ is the child and ‘b’ is the parent:
INSERT INTO GroupHierarchy (Child, Parent) VALUES ('d', 'b');
If we rerun the previous query, we will encounter an error message indicating that the maximum recursion level has been exceeded. This is because the circular reference creates an infinite loop in the hierarchy.
To identify the circular reference, we can modify the CTE by excluding records where the parent has already appeared in the hierarchy path:
WITH GroupMembers (Child, ParentGroup, Level, HierarchyPath)
AS
(
-- Anchor member definition
SELECT g.Child, g.Parent, 0 AS Level, CONVERT(VARCHAR(MAX), g.Child + '->' + g.Parent) AS HierarchyPath
FROM GroupHierarchy AS g
WHERE Child NOT IN (SELECT Parent FROM GroupHierarchy)
UNION ALL
-- Recursive member definition
SELECT g.Child, g.Parent, Level + 1, HierarchyPath + '->' + g.Parent
FROM GroupHierarchy AS g
INNER JOIN GroupMembers AS gm
ON gm.ParentGroup = g.Child
WHERE gm.HierarchyPath NOT LIKE '%->' + g.Parent + '->%'
)
SELECT LEFT(HierarchyPath, CHARINDEX('->', HierarchyPath) - 1) AS Child, ParentGroup, Level, HierarchyPath
FROM GroupMembers
ORDER BY Level DESC
OPTION (MAXRECURSION 5);
This modified query will give us the hierarchy excluding the circular reference:
Child ParentGroup Level HierarchyPath
------ ------------ ------ --------------------
e d 2 e->b->c->d
a d 2 a->b->c->d
e c 1 e->b->c
a c 1 a->b->c
a b 0 a->b
e b 0 e->b
By examining the hierarchy path, we can identify the circular reference. In this case, the pattern ‘b -> c -> d’ occurs twice, indicating that the ‘d -> b’ relationship is causing the circular reference.
Now that we have identified the problem record, we can correct it in the source data to resolve the circular reference.
In conclusion, circular references can be a challenge when working with hierarchical data in SQL Server. By using a CTE and excluding records with existing parent references, we can identify and resolve circular references. This allows us to analyze hierarchical data without encountering errors due to infinite loops.