Data Sharing & Cross-Account in Amazon Redshift
Introduction
Data sharing in Amazon Redshift lets Redshift clusters, in the same AWS account or in different accounts, read live data from one another securely without copying or moving it. This simplifies data collaboration and avoids the storage and pipeline overhead of maintaining duplicate copies.
Key Concepts
- Data Sharing: Enables sharing of database objects (schemas, tables, views, and user-defined functions) across different Redshift clusters.
- Cross-Account Access: Allows Redshift clusters in different AWS accounts to access shared data without duplicating it.
- Data Provider: The account that shares the data (called the producer in the Redshift documentation).
- Data Consumer: The account that receives read access to the shared data.
Step-by-Step Process
1. Enable Data Sharing
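Data sharing is set up from the data provider side. It is available on RA3 node types and Redshift Serverless, so confirm that both the provider and the consumer run on one of these. For cross-account sharing, both clusters must also be encrypted.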
2. Create a Data Share
CREATE DATASHARE my_data_share;
3. Add Database Objects to Data Share
ALTER DATASHARE my_data_share ADD SCHEMA public;
ALTER DATASHARE my_data_share ADD TABLE public.my_table;
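Adding tables one at a time becomes tedious for larger schemas. As a minimal sketch, assuming the same my_data_share datashare and public schema used above, the following variants share every existing table in the schema and opt in to tables created later. Whether sharing the whole schema is appropriate depends on what it contains.

```sql
-- Share all tables that currently exist in the schema.
-- The schema itself must already have been added to the datashare.
ALTER DATASHARE my_data_share ADD ALL TABLES IN SCHEMA public;

-- Optionally include tables created in this schema in the future.
-- Leave this off if new tables should be reviewed before they are shared.
ALTER DATASHARE my_data_share SET INCLUDENEW = TRUE FOR SCHEMA public;
```

Leaving INCLUDENEW off and adding tables explicitly is the more conservative choice when the schema may later hold data that should not be shared.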
4. Grant Access to Data Consumer
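GRANT USAGE ON DATASHARE my_data_share TO ACCOUNT 'data_consumer_account_id';
For cross-account sharing, the grant alone is not enough: the provider must also authorize the datashare for the consumer account (in the Redshift console or with the AWS CLI) before the consumer can use it.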
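Before handing off to the consumer, it can help to confirm what the datashare actually contains. The statements below are a sketch of that check, run on the provider cluster and assuming the my_data_share name from the earlier steps.

```sql
-- List the datashares visible on this cluster, both outbound and inbound.
SHOW DATASHARES;

-- List the objects (schemas, tables, and so on) included in one datashare.
DESC DATASHARE my_data_share;
```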
5. Access Shared Data in Consumer Account
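In the consumer account, an administrator first associates the authorized datashare with the consumer cluster. A database user then creates a local database from the datashare, identifying the provider by account ID and namespace:
CREATE DATABASE my_shared_db
FROM DATASHARE my_data_share OF ACCOUNT 'data_provider_account_id' NAMESPACE 'data_provider_namespace_id';
Optionally, create an external schema that points at a schema in the shared database, so that queries can use two-part names:
CREATE EXTERNAL SCHEMA my_external_schema
FROM REDSHIFT DATABASE 'my_shared_db' SCHEMA 'public';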
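Once the shared data is reachable on the consumer side, it is queried like any other read-only data. The sketch below assumes the consumer has created a local database from the datashare and named it my_shared_db; that name and the analyst_user user are hypothetical. The public.my_table object comes from the provider steps above.

```sql
-- Query a shared table with three-part notation: database.schema.table.
SELECT *
FROM my_shared_db.public.my_table
LIMIT 10;

-- Let a non-admin consumer user (hypothetical name) query the shared database.
GRANT USAGE ON DATABASE my_shared_db TO analyst_user;
```

Shared objects are read-only for the consumer, so writes still have to happen on the provider cluster.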
Best Practices
- Always use least privilege access when granting data share permissions.
- Regularly review and audit data share permissions.
- Consider performance implications and optimize table design for shared data.
- Encrypt clusters that hold sensitive data before sharing; cross-account data sharing requires both the provider and consumer clusters to be encrypted.
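The review and least-privilege practices above can be sketched with Redshift's system views and the revoke statements. This is an illustration run on the provider cluster, assuming the my_data_share datashare from the earlier steps; the account ID is a placeholder.

```sql
-- Which accounts and namespaces have been granted each datashare?
SELECT *
FROM svv_datashare_consumers;

-- Which objects does a given datashare expose?
SELECT *
FROM svv_datashare_objects
WHERE share_name = 'my_data_share';

-- Stop sharing a table that consumers no longer need.
ALTER DATASHARE my_data_share REMOVE TABLE public.my_table;

-- Withdraw a consumer account's access entirely (placeholder account ID).
REVOKE USAGE ON DATASHARE my_data_share FROM ACCOUNT 'data_consumer_account_id';
```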
FAQ
What is the maximum number of data shares allowed?
Each AWS account can have up to 100 data shares.
Can I share data from a Redshift RA3 cluster?
Yes, data sharing is supported on RA3 node types and on Amazon Redshift Serverless.
Is there any cost associated with data sharing?
Data sharing itself does not incur additional charges. The consumer pays for the compute it uses to query shared data, and data transfer costs may apply when data is shared across AWS Regions.