SparkSQL integration with Leading Data Privacy platform
Customer
The customer is a leading personal data privacy and protection provider.
Its platform uses advanced machine learning and identity intelligence to help enterprises better protect their customer and employee data at petabyte scale.
It identifies all PII across structured, unstructured, cloud, and Big Data sources.
Requirement
The customer required a connector app to integrate their platform with Spark SQL. The connector app parses data from Spark SQL and normalizes it into the required format.
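The case study does not describe the required format, so the following is only a minimal sketch of what normalizing scanned column metadata might look like. The `ColumnNormalizer` class, its `Category` values, and the lowercase `database.table.column` naming are all assumptions made for illustration; only the Hive type names themselves are real.

```java
import java.util.Locale;

// Hypothetical sketch: the real target format is not given in the case study.
class ColumnNormalizer {

    // Assumed coarse categories for scanned column types.
    enum Category { TEXT, NUMERIC, BOOLEAN, TEMPORAL, BINARY, COMPLEX, UNKNOWN }

    // Assumed normalized record describing one scanned column.
    static final class NormalizedColumn {
        final String database;
        final String table;
        final String column;
        final Category category;

        NormalizedColumn(String database, String table, String column, Category category) {
            this.database = database;
            this.table = table;
            this.column = column;
            this.category = category;
        }

        String qualifiedName() {
            return database + "." + table + "." + column;
        }
    }

    // Maps a Hive type such as "decimal(10,2)" or "array<string>" to a category.
    static Category categorize(String hiveType) {
        String type = hiveType.trim().toLowerCase(Locale.ROOT);
        // Drop type parameters: "varchar(255)" -> "varchar", "map<string,int>" -> "map".
        int end = type.length();
        for (int i = 0; i < type.length(); i++) {
            char c = type.charAt(i);
            if (c == '(' || c == '<') {
                end = i;
                break;
            }
        }
        switch (type.substring(0, end).trim()) {
            case "string": case "varchar": case "char":
                return Category.TEXT;
            case "tinyint": case "smallint": case "int": case "integer":
            case "bigint": case "float": case "double": case "decimal": case "numeric":
                return Category.NUMERIC;
            case "boolean":
                return Category.BOOLEAN;
            case "date": case "timestamp":
                return Category.TEMPORAL;
            case "binary":
                return Category.BINARY;
            case "array": case "map": case "struct": case "uniontype":
                return Category.COMPLEX;
            default:
                return Category.UNKNOWN;
        }
    }

    // Trims and lowercases the names, then attaches the type category.
    static NormalizedColumn normalize(String database, String table,
                                      String column, String hiveType) {
        return new NormalizedColumn(
                clean(database), clean(table), clean(column), categorize(hiveType));
    }

    private static String clean(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }
}
```

Lowercasing the names is one plausible choice here because Hive treats table and column names as case-insensitive; a real connector would follow whatever format the target platform expects.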
Technology Solution
Spark SQL brings native support for SQL to Spark and streamlines the process of querying data stored both in RDDs (Spark’s distributed datasets) and in external sources. Spark SQL conveniently blurs the lines between RDDs and relational tables.
Sacumen developed the connector app in Java to integrate Spark SQL with Hive. Setting up and using the connector involves the following steps:
Set up the prerequisites
Install Hive and set up a metastore
Create databases and tables in Hive
Configure Hive settings using the hive-site.xml
Use the custom parameters in the DS form to test the connection
Scan the schemas and tables, and normalize the data
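The configuration step above can be illustrated with a small sketch that reads settings from `hive-site.xml` content, for example to find the metastore address before testing a connection. It assumes the standard `<configuration>`/`<property>` layout of that file and uses the `hive.metastore.uris` property as an example; the `HiveSiteReader` class and the sample host name are hypothetical and are not taken from the actual connector.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not the connector's real code.
class HiveSiteReader {

    // Parses hive-site.xml content into an ordered name -> value map.
    // Properties without both a <name> and a <value> are skipped.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = childText(property, "name");
                String value = childText(property, "value");
                if (name != null && value != null) {
                    props.put(name.trim(), value.trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse hive-site.xml content", e);
        }
    }

    // Returns the text of the first child element with the given tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList children = parent.getElementsByTagName(tag);
        return children.getLength() == 0 ? null : children.item(0).getTextContent();
    }

    public static void main(String[] args) {
        // Sample content with a made-up metastore host.
        String sample =
                "<configuration>"
              + "<property>"
              + "<name>hive.metastore.uris</name>"
              + "<value>thrift://metastore.example.com:9083</value>"
              + "</property>"
              + "</configuration>";
        Map<String, String> props = parse(sample);
        System.out.println("Metastore URI: " + props.get("hive.metastore.uris"));
    }
}
```

A connector could feed values read this way into its connection test, or let the parameters entered in the DS form override them; which approach the original app takes is not stated.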