Sections in this category

Connecting to Amazon Athena

  • Updated

Sisu allows you to connect to an Amazon Athena service.

 

Related Articles:

 

Step 1: Create an IAM Policy

In order for Sisu to query via the Athena service, it needs to be granted an IAM role that is capable of accessing the database. This custom role must be created within the AWS Organization hosting the Athena service. 

info_icon.png You will need the Account ID of the AWS Organization. If this is not known, please reference the instructions in the AWS Account ID documentation.

However, before creating the user role (which we’ll do in Step 2 below), a policy that grants the appropriate access capabilities must first be created. These five capabilities are:

  • Read access to the Glue data catalog
  • The ability to read the Athena service’s metadata, and to run queries using the Athena service
  • The ability to read from the S3 buckets that backs the Athena tables
  • The ability to write results to the S3 bucket used to store query results
  • The ability to list workgroups and data catalogs

The policy below is a template with five statements, each of which corresponds to one of the capabilities above. Placeholders enclosed in the brackets (< >) need to be filled in and provided with the relevant values.

{

    "Version": "2012-10-17",

    "Statement": [

        {

            "Sid": "GlueAccess",

            "Effect": "Allow",

            "Action": [

                "glue:GetDatabase",

                "glue:GetCatalogImportStatus",

                "glue:GetPartition",

                "glue:GetTables",

                "glue:GetPartitions",

                "glue:BatchGetPartition",

                "glue:GetDatabases",

                "glue:GetTable"

            ],

            "Resource": "*"

        },

        {

            "Sid": "AthenaDatabaseAccess",

            "Effect": "Allow",

            "Action": [

                "athena:GetTableMetadata",

                "athena:StartQueryExecution",

                "athena:GetQueryResultsStream",

                "athena:GetQueryResults",

                "athena:GetDatabase",

                "athena:GetDataCatalog",

                "athena:ListQueryExecutions",

                "athena:GetWorkGroup",

                "athena:StopQueryExecution",

                "athena:GetQueryExecution",

                "athena:BatchGetQueryExecution"

            ],

            "Resource": [

                "arn:aws:athena:<AWS REGION>:<ACCOUNT #>:workgroup/<WORKGROUP NAME>",

                "arn:aws:athena:<AWS REGION>:<ACCOUNT #>:datacatalog/<DATA CATALOG NAME>"

            ]

        },

        {

            "Sid": "SourceBucketReadAccess",

            "Effect": "Allow",

            "Action": [

                "s3:ListBucketMultipartUploads",

                "s3:ListBucket",

                "s3:ListMultipartUploadParts",

                "s3:GetObject",

                "s3:GetBucketLocation"

            ],

            "Resource": [

                "arn:aws:s3:::<BUCKET 1 NAME>",

                "arn:aws:s3:::<BUCKET 1 NAME>/*",

                "arn:aws:s3:::<BUCKET 2 NAME>",

                "arn:aws:s3:::<BUCKET 2 NAME>/*",

                "arn:aws:s3:::<BUCKET 3 NAME>",

                "arn:aws:s3:::<BUCKET 3 NAME>/*",

...

"arn:aws:s3:::<BUCKET N NAME>",

"arn:aws:s3:::<BUCKET N NAME>/*"

            ]

        },

        {

            "Sid": "DestBucketReadWriteAccess",

            "Effect": "Allow",

            "Action": [

                "s3:ListBucketMultipartUploads",

                "s3:ListBucket",

                "s3:ListMultipartUploadParts",

                "s3:PutObject",

                "s3:GetObject",

                "s3:AbortMultipartUpload",

                "s3:GetBucketLocation"

            ],

            "Resource": [

                "arn:aws:s3:::<RESULTS BUCKET NAME>",

                "arn:aws:s3:::<RESULTS BUCKET NAME>/*"

            ]

        },

        {

            "Sid": "AthenaListAccess",

            "Effect": "Allow",

            "Action": [

                "athena:ListDataCatalogs",

                "athena:ListWorkGroups"

            ],

            "Resource": "*"

        }

    ]

 

Step 2:  Grant an IAM Role in Athena

Once the IAM policy has been created, the next step involves creating a role that Sisu can assume. It is recommended to create this role via the AWS Console. The following information should be selected.

Type of Trusted Entity

“Another AWS Account”

ID

855250002930

Required External ID

Check

External ID

sisu-<your org name>-<a random 12-character alphanumeric string>

Role Name

sisu_integration_role

 

warning_icon.png IMPORTANT:  This is a required step, and the exact Role Name of “sisu_integration_role” must be used. If the role cannot be named “sisu_integration_role” due to organizational naming policies, please contact Sisu for assistance.

 

Once the trust relationship has been established, the Console will ask to attach a policy to the newly created role. Attach the policy created in the previous step. 

 

info_icon.png

The trust relationship must not require multi-factor authentication. Sisu does not currently support cross-account delegation with multi-factor authentication. This would be evident if the trust relationship has the following component in its conditions:

{"aws:MultiFactorAuthPresent": "true"}

 

In the console, increase the default “Maximum session duration” of the new role to “12 hours”.  This will ensure that long-running queries do not time out.  

 

Step 3:  Handling Encryption

If the buckets backing the Athena tables are using SSE-S3 for server-side encryption, no further configuration is required. 

If those buckets are using KMS, the role created in Step 2 above needs to be added as a Key User for the customer-managed key.

To add the IAM Role:

  1. Open KMS in the AWS console.
  2. Find the key being used for the encryption and select the Add option under Key Users

If you prefer to edit your key’s access policy directly, you can add the following two statements to the access policy (replacing <your account number> with your AWS account number):

{

  "Sid": "Allow use of the key",

  "Effect": "Allow",

  "Principal": {

    "AWS": [

      "arn:aws:iam::<your account number>:role/sisu_integration_role",

    ]

  },

  "Action": [

    "kms:Encrypt",

    "kms:Decrypt",

    "kms:ReEncrypt*",

    "kms:GenerateDataKey*",

    "kms:DescribeKey"

  ],

  "Resource": "*"

},

{

  "Sid": "Allow attachment of persistent resources",

  "Effect": "Allow",

  "Principal": {

    "AWS": [

      "arn:aws:iam::<your account number>:role/sisu_integration_role",

    ]

  },

  "Action": [

    "kms:CreateGrant",

    "kms:ListGrants",

    "kms:RevokeGrant"

  ],

  "Resource": "*",

  "Condition": {

    "Bool": {

      "kms:GrantIsForAWSResource": "true"

    }

  }

}

 

Step 4:  Add Athena Data Source in Sisu

To add Athena as a Data Source in Sisu:

1.  In the main Sisu menu at the top of the screen, click Data sources page, then click "+ New data source".

Data_tab_-_New_data_source.png

2.  Select “Athena” from the Select data source dropdown, and fill in the following information as shown in the image below.

info_icon.png

External AWS role ARN:  arn:aws:iam::<account #>:role/sisu_integration_role
S3 output location: s3://<bucket name>/<path to output location>


Add_Athena.png

tip_icon_-_small.png Check Restrict Access if you want to restrict this data source to specific users.

 

3.  Click Add.

 

warning_icon.png If you experience any issues, please contact Sisu at support@sisudata.com