As predictive modeling becomes mainstream, more and more analytics projects require capabilities that are available only in specialized tools such as R. One of the teams at CitiusTech had such a requirement: to use R’s features within Netezza. This post describes how that team went about invoking R AEs (Analytics Executables) through Netezza.
An R AE (Analytics Executable) allows us to use open source R functions within the Netezza environment. User-defined functions are written in R and can be invoked from a Netezza NZSQL query or a Netezza stored procedure.
The R AE code needs to be compiled and registered in the Netezza environment before it can be invoked from a Netezza NZSQL query or a Netezza stored procedure.
When an R UDA (User-defined aggregate) is invoked within the Netezza environment, Netezza searches for the registered R UDA, passes the parameters to it, waits for it to perform the necessary computation, and returns the results to the Netezza output window.
To call the R AE through the Netezza Aginity Workbench, follow these steps:
- Install open source R on Netezza (NPS)
- Also install the R package fitdistrplus on Netezza
o InstallOnSpus - optional argument; FALSE means the package is not installed on the SPUs
- Compile the R AE code with the compile_ae command
- Register the R AE code with the register_ae command, which registers the specified AE function in the given database
- Call the R AE function from a query (here, from a Netezza stored procedure), passing it input from the table Table_Name
- The R AE function (citius_fitBeta) calculates the alpha and beta values per disease, measure, subset and reporting period
- Calculate the mean and standard deviation from the alpha and beta values using mathematical functions in the Netezza stored procedure
o After calculating these values, insert them into the Netezza table Table_Name
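The mean and standard deviation step can be checked outside Netezza with a few lines of code. The sketch below is in Python purely to illustrate the arithmetic; it is not the team's stored procedure. It assumes that alpha and beta are the two shape parameters of a Beta distribution, which the function name citius_fitBeta suggests but the post does not state outright. The helper name beta_mean_sd and the sample values are hypothetical.

```python
import math


def beta_mean_sd(alpha: float, beta: float) -> tuple[float, float]:
    """Return (mean, standard deviation) of a Beta(alpha, beta) distribution.

    Assumption: alpha and beta are the shape parameters produced by the fit.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must both be positive")
    total = alpha + beta
    mean = alpha / total
    # Variance of a Beta distribution: ab / ((a + b)^2 * (a + b + 1))
    variance = (alpha * beta) / (total ** 2 * (total + 1))
    return mean, math.sqrt(variance)


# Hypothetical shape parameters, for illustration only
mean, sd = beta_mean_sd(2.0, 2.0)
print(mean, sd)
```

In a stored procedure, the same two formulas would be written with the database's own arithmetic and square-root functions, applied to the alpha and beta values returned for each row.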
As demonstrated above, it is quite straightforward to integrate R with Netezza. This enables utilizing R’s capabilities within an environment like Netezza.