databricks-data-handling

Featured

Implement Delta Lake data management patterns, including GDPR compliance, PII handling, and data lifecycle management. Use when implementing data retention, handling GDPR requests, or managing data lifecycle in Delta Lake. Trigger with phrases like "databricks GDPR", "databricks PII", "databricks data retention", "databricks data lifecycle", "delete user data".

AI & Automation 2,266 stars 315 forks Updated today MIT

Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Databricks Data Handling

## Overview

Implement GDPR compliance, PII masking, data retention, and row-level security in Delta Lake with Unity Catalog. Covers data classification tagging, right-to-deletion workflows, automated retention enforcement, column-level masking functions, and subject access request (SAR) reporting.

## Prerequisites

- Unity Catalog enabled
- Understanding of data classification requirements (GDPR, CCPA, HIPAA)
- Admin access for tags and masking functions

## Instructions

### Step 1: Classify and Tag Data

Use Unity Catalog tags to classify tables and columns for automated compliance enforcement.

```sql
-- Tag tables with classification and retention
ALTER TABLE prod_catalog.silver.customers SET TAGS ('data_classification' = 'PII', 'retention_days' = '730');
ALTER TABLE prod_catalog.silver.orders SET TAGS ('data_classification' = 'CONFIDENTIAL', 'retention_days' = '365');
ALTER TABLE prod_catalog.gold.metrics SET TAGS ('data_classification' = 'INTERNAL', 'retention_days' = '1825');

-- Tag PII columns
ALTER TABLE prod_catalog.silver.customers ALTER COLUMN email SET TAGS ('pii_type' = 'email');
ALTER TABLE prod_catalog.silver.customers ALTER COLUMN phone SET TAGS ('pii_type' = 'phone');
ALTER TABLE prod_catalog.silver.customers ALTER COLUMN full_name SET TAGS ('pii_type' = 'name');
```

### Step 2: GDPR Right-to-Deletion

Delete all user data across PII-tagged tables with audit logging.

```python
from pyspark.sql import SparkSession
from datetim...
```
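
The tagging statements in Step 1 follow a regular pattern, so they can be generated from a small classification map instead of being written by hand for each table. The following is a minimal sketch in plain Python, not part of the original skill: the helper name `build_tag_statements` and its argument layout are hypothetical, and it only builds SQL strings in the same shape as the Step 1 statements. Running them against Unity Catalog (for example from a SQL editor or notebook) is left out.

```python
from typing import Dict, List, Mapping


def _literal(value: object) -> str:
    """Render a tag key or value as a single-quoted SQL string literal.

    Quotes and backslashes are rejected rather than escaped, to keep the
    sketch simple and avoid guessing at dialect-specific escaping rules.
    """
    text = str(value)
    if "'" in text or "\\" in text:
        raise ValueError(f"unsupported character in tag text: {text!r}")
    return f"'{text}'"


def _tag_list(tags: Mapping[str, object]) -> str:
    """Render {'k': 'v'} as "'k' = 'v'" pairs joined by commas."""
    return ", ".join(f"{_literal(k)} = {_literal(v)}" for k, v in tags.items())


def build_tag_statements(
    table: str,
    table_tags: Mapping[str, object],
    column_tags: Mapping[str, Mapping[str, object]],
) -> List[str]:
    """Hypothetical helper: build ALTER TABLE ... SET TAGS statements.

    `table` is a fully qualified name such as catalog.schema.table and is
    inserted as-is, so it should come from trusted configuration only.
    """
    statements: List[str] = []
    if table_tags:
        statements.append(f"ALTER TABLE {table} SET TAGS ({_tag_list(table_tags)});")
    for column, tags in column_tags.items():
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} SET TAGS ({_tag_list(tags)});"
        )
    return statements


# Example: reproduce part of the Step 1 tagging for the customers table.
customer_columns: Dict[str, Dict[str, str]] = {
    "email": {"pii_type": "email"},
    "phone": {"pii_type": "phone"},
    "full_name": {"pii_type": "name"},
}
for statement in build_tag_statements(
    "prod_catalog.silver.customers",
    {"data_classification": "PII", "retention_days": 730},
    customer_columns,
):
    print(statement)
```

Keeping the classification in one map means the same data could later drive retention or deletion jobs, though how the original skill wires those steps together is not shown in this excerpt.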

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

Data & Documents Listed

data-governance

Data lineage tracking, PII tagging, access control policies, data catalog metadata standards, retention policies, and audit logging for regulatory compliance. Use this skill whenever the company is subject to PDPA, GDPR, HIPAA, or any data privacy regulation, when an audit requires proof of who accesses what data, when PII fields need to be identified and classified in a dataset, when setting up column-level access control, or when building a data catalog. Also trigger when someone asks about data masking, anonymization, right-to-erasure workflows, role-based data access, or data lineage from source to BI tool. If the word "compliance", "audit", "PII", "sensitive data", or "regulation" appears, this skill should be active.

0 Updated 4 days ago
Methasit-Pun
AI & Automation Solid

supabase-data-handling

Implement GDPR/CCPA compliance with Supabase: RLS for data isolation, user deletion via auth.admin.deleteUser(), data export via SQL, PII column management, backup/restore workflows, and retention policies. Use when handling sensitive data, implementing right-to-deletion, configuring data retention, or auditing PII in Supabase database columns. Trigger: "supabase GDPR", "supabase data handling", "supabase PII", "supabase compliance", "supabase data retention", "supabase delete user", "supabase data export".

2,266 Updated today
jeremylongshore
AI & Automation Featured

snowflake-data-handling

Implement Snowflake data governance with masking policies, row access policies, tagging, and GDPR/CCPA compliance patterns. Use when handling PII, implementing column masking, configuring data classification, or ensuring compliance with privacy regulations in Snowflake. Trigger with phrases like "snowflake data governance", "snowflake masking", "snowflake PII", "snowflake GDPR", "snowflake row access policy", "snowflake tags".

2,266 Updated today
jeremylongshore
AI & Automation Featured

clickhouse-data-handling

Handle data lifecycle in ClickHouse — TTL expiration, data deletion (GDPR), column-level encryption, and audit logging with real ClickHouse SQL. Use when implementing data retention, GDPR deletion requests, or managing sensitive data in ClickHouse. Trigger: "clickhouse data retention", "clickhouse TTL", "clickhouse GDPR", "delete data clickhouse", "clickhouse data lifecycle", "clickhouse PII".

2,266 Updated today
jeremylongshore
AI & Automation Featured

gamma-data-handling

Handle data privacy, retention, and compliance for Gamma integrations. Use when implementing GDPR compliance, data retention policies, or managing user data within Gamma workflows. Trigger with phrases like "gamma data", "gamma privacy", "gamma GDPR", "gamma data retention", "gamma compliance".

2,266 Updated today
jeremylongshore