Connect Ideata analytics application to Amazon Redshift

If you are using Ideata analytics for Spark and Redshift, you will need to connect to Amazon Redshift before you can start using the application. Follow this tutorial to see how you can connect to your Redshift instance from Ideata analytics.

Configure Redshift

When you log in to the application for the first time, or when your existing Redshift cluster is not reachable, the application prompts you to configure your Redshift cluster. Once you have your Redshift cluster details (follow these steps to create a new Redshift cluster), click the Configure Redshift button.

Redshift connection details

On the connection details page, provide the connection details for your Redshift cluster. The following details are required to complete the form:

  • Hostname or IP Address: Provide either the hostname or the IP address of your Redshift cluster.

  • Port: Specify the database port for your Redshift cluster. The default port for Redshift is 5439.

  • Username & Password: Provide the username and password of the Redshift user that you want Ideata to use to connect to your Redshift cluster.

  • AWS Access Key ID & AWS Access Key Secret: Specify the AWS access key ID and secret that allow Ideata to connect to your S3 location. Ideata uses Amazon S3 as temporary/staging storage before exporting the data to Amazon Redshift. We recommend that you use a dedicated temporary Amazon S3 bucket (with an object lifecycle configuration to ensure that temporary files are automatically deleted after a specified expiration period). Here is the policy that should be attached to the AWS credentials you are using:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:PutObject"
          ],
          "Resource": [
            "arn:aws:s3:::<your_bucketName>/<directory>/",
            "arn:aws:s3:::<your_bucketName>/<directory>/*"
          ]
        }
      ]
    }

  • Bucket Name: Specify the name of the Amazon S3 bucket that you want Ideata to use as the temporary location for storing data.

  • Database: Provide the name of your Redshift database.

  • Schema for Imports: Provide a schema in your Redshift cluster where Ideata will store your imported data. The schema must be created before connecting to Redshift.

  • SSL Connection: Select the check box for SSL connection if you want Ideata to connect to Redshift using an SSL connection.
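
The S3 policy and the recommended object lifecycle configuration from the list above can be generated for a specific bucket and directory rather than edited by hand. The following is a minimal sketch, not part of Ideata: the function names, the bucket name my-staging-bucket, the directory ideata-tmp, the rule ID, and the 7-day expiration are all hypothetical. The lifecycle dictionary follows the general shape of an S3 lifecycle configuration (for example, as passed to boto3's put_bucket_lifecycle_configuration), on the assumption that you apply it with your own AWS tooling.

```python
import json


def build_staging_policy(bucket: str, directory: str) -> dict:
    """Return the IAM policy shown above, filled in for one bucket/directory."""
    prefix = directory.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}/{prefix}/",
                    f"arn:aws:s3:::{bucket}/{prefix}/*",
                ],
            }
        ],
    }


def build_expiration_rule(directory: str, days: int) -> dict:
    """Return a lifecycle configuration that expires staged files after `days`."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    prefix = directory.strip("/") + "/"
    return {
        "Rules": [
            {
                "ID": "expire-staging-files",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical bucket and directory; replace with your own values.
    print(json.dumps(build_staging_policy("my-staging-bucket", "ideata-tmp"), indent=2))
    print(json.dumps(build_expiration_rule("ideata-tmp", 7), indent=2))
```

Scoping both the policy and the lifecycle rule to the same directory keeps the credentials limited to the staging area and ensures that only temporary files are expired.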

After you have provided these details, click the Connect button to complete your Redshift connection. The application will test and verify the connection to your Redshift cluster and then establish it. If the connection is successful, you will be able to start using Ideata for your data analysis.
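
If the connection test fails, it can help to check the form values and basic network reachability yourself before retrying. The sketch below is one way to do that with the Python standard library. It does not reproduce Ideata's own verification, and the class, field, and function names are hypothetical. It only confirms that no required field is empty and that a TCP connection to the cluster's host and port can be opened; it does not check credentials, the schema, or S3 access.

```python
import socket
from dataclasses import dataclass
from typing import List


@dataclass
class RedshiftSettings:
    """Hypothetical container mirroring the fields of the connection form."""
    host: str
    database: str
    username: str
    password: str
    aws_access_key_id: str
    aws_secret_access_key: str
    bucket: str
    schema: str
    port: int = 5439  # default Redshift port
    ssl: bool = False


REQUIRED_FIELDS = [
    "host", "database", "username", "password",
    "aws_access_key_id", "aws_secret_access_key", "bucket", "schema",
]


def missing_fields(settings: RedshiftSettings) -> List[str]:
    """Return the names of required fields that are empty or whitespace."""
    return [name for name in REQUIRED_FIELDS
            if not str(getattr(settings, name)).strip()]


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result from port_is_reachable usually points to a network-level cause, such as a wrong hostname or port, or a firewall or security group rule that blocks the machine you are testing from.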