Data professionals often need to import and export data in various formats when working with SQL Server. While SQL Server Integration Services (SSIS) is commonly used for these tasks, there are alternative methods available. In this article, we will explore the Rio package, developed by Thomas J. Leeper, which simplifies the data import and export process.
Environment Requirements
Before we begin, make sure you have the following:
- SQL Server environment (version: SQL Server 2019)
- R language support enabled in SQL Server
- SQL Server Management Studio
Data Import and Export using SQL Machine Learning R Scripts
R Script in SQL Server allows us to perform advanced analytics and predictive applications using the Microsoft R runtime. We can execute R code directly from SQL Server using the sp_execute_external_script stored procedure. The Rio package provides functions for importing and exporting data in various formats using R scripts.
To check if the Rio package is installed in your environment, you can run the following script:
EXECUTE sp_execute_external_script
@language = N'R',
@script = N'library(rio)
print(packageVersion("rio"))'If the Rio package is not installed, you can install it by opening the administrative R console from the path C:\Program Files\Microsoft SQL Server\MSSQL15.INST1\R_SERVICES\bin and running the following command:
install.packages("rio")Once the installation is complete, you can verify the package version by rerunning the stored procedure.
Importing a CSV File
Let’s start by downloading a sample CSV file from a web URL using the download.file() function:
download.file("http://bit.ly/BostonSnowfallCSV", "WinterSnowfalls.csv")The downloaded CSV file will be placed in the default R directory. We can use the import() function from the Rio package to read the CSV file:
Mydata <- rio::import("WinterSnowfalls.csv")
MydataYou can see the data in the R client console.
To import the CSV file using R script in SQL Server, you can use the following code:
EXECUTE sp_execute_external_script
@language = N'R',
@script = N'library(rio)
CSV <- import("C://Program Files//Microsoft SQL Server//MSSQL15.INST1//R_SERVICES//bin//WinterSnowfalls.csv")
OutputDataSet <- CSV'This script will import the data from the CSV file into SQL Server.
Converting CSVs into Excel Files
If you need to convert a CSV file into an Excel file, you can use the Rio package in SQL Server. Here’s an example:
CSV <- import("WinterSnowfalls.csv")
OutputDataSet <- export(CSV, "WinterSnowfalls.xlsx")To execute this R script from SQL Server, you can embed it into the sp_execute_external_script stored procedure and specify the path to save the Excel file:
EXECUTE sp_execute_external_script
@language = N'R',
@script = N'library(rio)
CSV <- import("C://Program Files//Microsoft SQL Server//MSSQL15.INST1//R_SERVICES//bin//WinterSnowfalls.csv")
OutputDataSet <- export(CSV, "C://temp//WinterSnowfalls.xlsx")'Importing an Excel File
The Rio package can also be used to import data from Excel files. Here’s an example:
Excel <- import("C://Temp//WinterSnowfalls.xlsx")
OutputDataSet <- ExcelThis script will import the data from the Excel file.
Importing JSON Files
The Rio package supports importing JSON files as well. Here’s an example:
JSON <- import("C://Temp//AdventureWorks.json")
OutputDataSet <- JSONYou can export the JSON data into CSV or Excel files as needed.
Importing Data from Compressed Files
The Rio package allows you to directly import data from compressed files without the need to extract them first. Here’s an example:
Excel <- import("C://Temp//AdventureWorks.zip")
OutputDataSet <- ExcelThis script will import the data from the compressed file.
Importing Data into SQL Server Tables
To import data directly into SQL Server tables, you need to create the table with the appropriate columns and data types. Here’s an example:
CREATE TABLE dataimportR (
[BusinessEntityID] varchar(100),
[Title] varchar(100),
[FirstName] varchar(100),
[MiddleName] varchar(100),
[LastName] varchar(100),
[Suffix] varchar(100),
[JobTitle] varchar(100),
[PhoneNumber] varchar(100),
[PhoneNumberType] varchar(100),
[EmailAddress] varchar(100),
[EmailPromotion] varchar(100),
[AddressLine1] varchar(100),
[AddressLine2] varchar(100),
[City] varchar(100),
[StateProvinceName] varchar(100),
[PostalCode] varchar(100),
[CountryRegionName] varchar(100),
[AdditionalContactInfo] varchar(100)
)You can then use the INSERT INTO statement before executing the sp_execute_external_script stored procedure to import the data into the SQL table:
INSERT INTO AdventureWorks2019.dbo.dataimportR
EXECUTE sp_execute_external_script
@language = N'R',
@script = N'library(rio)
Excel <- import("C://Temp//AdventureWorks.zip")
OutputDataSet <- Excel'This script will import the data directly into the specified SQL table.
Conclusion
In this article, we explored the Rio package, which simplifies the data import and export process in SQL Server using R scripts. We learned how to import data from various file formats, convert CSVs into Excel files, import JSON files, import data from compressed files, and import data into SQL Server tables. The Rio package provides a convenient and efficient way to work with different data formats in SQL Server.