Have you ever wanted to visualize the relationships between nodes in your SQL Server database? In this article, we will explore a generic stored procedure for SQL Server that utilizes R packages to create AT&T network diagrams. These diagrams provide a clear representation of the connections between nodes, allowing for better understanding and analysis of your database structure.
The Power of R Packages
One of the great things about the R packages used in this stored procedure is their ability to handle network hierarchies with multiple edges for each node. This is something that many similar R packages and SQL Server features struggle with. Even when dealing with thousands of nodes, these diagrams scale well and are still laid out nicely in most cases.
Let’s take a look at an example network diagram generated using this stored procedure:
This diagram showcases a machine-generated network topology consisting of 36 nodes. Each node is referenced by the nodes immediately left, above left, and below left. By selecting specific anchor nodes, we can focus on a subset of the network and include only the nodes that are directly referred to or referred to by the anchors.
Executing the Stored Procedure
To generate your own network diagrams, you can execute the provided stored procedure. Here is a simple example of how to use it:
USE tempdb;
GO
DECLARE @Node AS GraphNELNodeTableType;
DECLARE @NodeReferedTo AS GraphNELNodeRefersToTableType;
DECLARE @AnchorNodes GraphNELAnchorNodeTableType;
INSERT INTO @Node (NodeName, NodeLabel) SELECT '1','Kirk';
INSERT INTO @Node (NodeName, NodeLabel) SELECT '2','Spock';
INSERT INTO @Node (NodeName, NodeLabel) SELECT '3','Scotty';
INSERT INTO @NodeReferedTo SELECT '1','2';
INSERT INTO @NodeReferedTo SELECT '2','3';
INSERT INTO @NodeReferedTo SELECT '3','1';
INSERT INTO @AnchorNodes SELECT '1';
EXECUTE dbo.sp_GraphNELGene
@Node,
@NodeReferedTo,
@AnchorNodes,
@IncludeLabels = 1;
The stored procedure will generate and return an R script. You can run this script in R or Visual Studio to produce the desired network diagram.
Conclusion
The Bioconductor packages used in this stored procedure were originally developed for genome mapping in wet lab environments. However, their capabilities extend beyond that and can be useful for visualizing SQL Server network diagrams. The level of complexity, scaling, and support for different network types and relationships is truly impressive.
Limitations
While the R script works well in R Studio and Visual Studio, there are a few limitations to be aware of. Firstly, when the diagrams scale, there is limited space for text descriptions inside each node. Therefore, shorter labels are preferred. Additionally, parsing large R scripts in R and Visual Studio can be slow, although the execution itself is quick.
Furthermore, the R Server installed with SQL Server does not easily support the execution of the R scripts created by this procedure. To automate the process and publish the graphs, it is recommended to save the R script to a text file and execute it using a Windows batch file or PowerShell.
Nonfunctional Requirements
The stored procedure has been tested to generate a 1,000 node, 3,000 edge diagram with multiple anchor nodes in less than 5 minutes. The resulting PDF file can be found here.
Source Code and Example Execution
If you are interested in exploring the source code and example execution of this stored procedure, you can find it in the following files:
- 01 – Creates the GRAPHNEL table types and stored procedure in the tempdb database
- 02 – Creates the R scripts to produce the first two graphs in this post
Features
The stored procedure, sp_GraphNELGene
, accepts the following parameters:
- Nodes – Table variable with a list of Node Names and Labels
- Node Refers To – Table variable with a list of Node Names and the name of the Referred to Node
- Anchor Nodes – Table variable with a list of the Anchor Node Names
- Include Labels – Defaults to No, displays Node Label rather than Node Name in the graph
- Include Refers To – Defaults to Yes
- Include Referred To By – Defaults to Yes
- PDF File – Path and file name for the generated graph
Bioconductor References
If you are interested in learning more about Bioconductor and the R packages used in this stored procedure, you can find more information at the following links: