In an Enterprise Data Warehouse (EDW) environment, it is common to have exact copies of source systems. When a clinical system is upgraded, its database undergoes changes that are reflected in the EDW. This can cause many SSRS reports to break. To address this issue, a solution can be implemented to identify the database objects referenced by each report and match them to the actual objects in the database.
Step 1: Extract Report Catalog from Report Server
The first step is to extract the report data from the report server’s catalog table using SQL Server Integration Services (SSIS). The report catalog contains the RDL XML file for each report. By storing this data in the EDW, it becomes possible to use XML methods to extract the T-SQL queries from the reports.
Step 2: Extract Queries from SSRS Reports
Using SQL Server’s built-in XML capabilities, the T-SQL queries can be extracted from the RDL XML file. This can be done by using the XML nodes() method to return each DataSet in a report as a row of data. The XML value() method is then used to extract the source query from each DataSet’s XML.
Step 3: Extract Any String that Resembles a Database Object
To identify the database objects referenced in the queries, a regular expression can be used. The regular expression should look for strings that resemble a database object in the form of “database_name.schema_name.object_name”. This will capture many strings that may not actually be database objects, but it provides a starting point for further analysis.
Step 4: Resolve List of Strings to Actual Database Objects
Using the extracted list of potential database objects, an inner join can be performed with the system catalog views in the current database to match them to the actual objects. This can be done by comparing the schema name and object name from the potential objects to the schema and object names in the catalog views.
Conclusion
By following these steps, it is possible to extract a reasonably accurate list of database objects from SSRS reports. This can be useful for identifying reports that may be impacted by table changes, determining the most commonly used tables in reports, and identifying tables that have not been used recently. While there are limitations to this approach, such as resolving objects in multiple databases and handling complex queries, it provides a valuable starting point for understanding the objects referenced by reports.
Overall, using SQL Server Integration Services, basic SQL, and an open source component like RegExtractor, it is possible to extract and analyze database objects from SSRS reports. This approach can help improve the management and maintenance of reports in an EDW environment.
Technical Environment:
- Report Server and EDW Databases: SQL Server 2008 R2
- Microsoft Tools: SQL, SSIS, SSRS
- Open Source SSIS DFT Component: RegExtractor
Acknowledgements:
RegExtractor – open source SSIS DFT Component created by Eric Just. http://regextractor.codeplex.com/