Partitioning the Notification Table
This document was created as part of Ticket #97.
This technical document outlines the implementation of partitioning for the ENP Notification table. It provides an overview of partitioning, a proposed approach for partitioning the Notification table, and key considerations for the ongoing maintenance and management of the partitioned structure.
Currently, in our Notification-API application, the Notification table stores a significant and growing volume of data, making queries slower and increasing the complexity of managing the data. With BLANK number of rows, this table is becoming a bottleneck, especially for queries involving time-based filters.
We'd like to avoid this bottleneck when building our ENP application by implementing partitioning on the Notification table.
Partitioning is a database optimization technique that divides a table into smaller, more manageable pieces (partitions) while preserving the logic of the table.
- Range Partitioning: Data is divided into ranges based on a column's value. (This is the proposed method for our use case).
- List Partitioning: Data is divided into partitions based on specific column values (category, type, color, etc.).
- Hash Partitioning: Data is distributed into partitions based on a hash function, which spreads rows evenly across partitions.
For our scenario, range partitioning by year using the created_at column is the most appropriate choice to support time-based queries efficiently.
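To make the yearly ranges concrete, here is a minimal Python sketch of how a created_at value would map to a partition. The helper name `yearly_partition` is hypothetical, and the half-open range (lower bound inclusive, upper bound exclusive) is an assumption chosen so that no timestamp can fall into two partitions.

```python
from datetime import datetime, timezone


def yearly_partition(created_at: datetime) -> tuple[str, datetime, datetime]:
    """Return (partition_name, lower_bound, upper_bound) for a timestamp.

    The range is half-open: lower_bound <= created_at < upper_bound.
    The notifications_<year> naming follows the partitions proposed here.
    """
    year = created_at.year
    # Reuse the timestamp's tzinfo so the bounds compare cleanly with it.
    lower = datetime(year, 1, 1, tzinfo=created_at.tzinfo)
    upper = datetime(year + 1, 1, 1, tzinfo=created_at.tzinfo)
    return f"notifications_{year}", lower, upper


# A row created in the last second of 2024 still belongs to the 2024 partition.
name, lower, upper = yearly_partition(
    datetime(2024, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
)
print(name, lower.isoformat(), upper.isoformat())
```

In practice the database performs this routing itself; the sketch only illustrates the boundary rule.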
We propose partitioning the Notification table by year using the created_at column as the partition key. Each partition will hold data for a specific year (2024, 2025, etc.). An event listener on the Notification table will attach partition tables to the base Notification table as they are generated.
This solution will:
- Reduce the volume of data scanned for time-based queries.
- Simplify data management by organizing data into logical yearly partitions.
- Eliminate the need for the NotificationHistory table (used in our Notification-API application).
Note: SQLAlchemy does not support partitioning directly. This will need to be implemented through...
1. Alter the table to use PARTITION BY RANGE on the created_at column.
2. **Create yearly partitions (notifications_2024, notifications_2025, etc.).**
3. Add event listeners to attach each newly created partition table to the base Notification table.
4. Update application code to ensure proper handling of partitioned data. Modify queries to include created_at filters to benefit from partition pruning.
5. Develop and apply migration scripts to implement partitioning.
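The DDL behind the first two steps could be generated as in the sketch below. It assumes PostgreSQL declarative partitioning syntax (this document does not name the database engine), and the function names, the `notifications` table name, and the `id` column are hypothetical placeholders rather than the real schema. A migration script could execute these strings as raw SQL.

```python
def parent_table_ddl(table: str = "notifications") -> str:
    """DDL for a parent table partitioned by range on created_at.

    Assumes PostgreSQL, where a primary key on a partitioned table
    must include the partition key column.
    """
    return (
        f"CREATE TABLE {table} (\n"
        "    id UUID NOT NULL,\n"
        "    created_at TIMESTAMP NOT NULL,\n"
        "    PRIMARY KEY (id, created_at)\n"
        f") PARTITION BY RANGE (created_at);"
    )


def yearly_partition_ddl(year: int, table: str = "notifications") -> str:
    """DDL for one yearly partition; the upper bound is exclusive."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table}_{year} PARTITION OF {table}\n"
        f"    FOR VALUES FROM ('{year}-01-01') TO ('{year + 1}-01-01');"
    )


print(parent_table_ddl())
for year in (2024, 2025):
    print(yearly_partition_ddl(year))
```

Using `IF NOT EXISTS` keeps the partition statements safe to re-run, which helps if the same script is later reused for scheduled partition creation.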
- Query Adjustments: Modify queries to include created_at filters to benefit from partition pruning.
- Partition Management: Automate the creation of new partitions for future years. Do we need a ticket for a script?
- Indexing: Optimize indexing strategy for the parent and child tables.
- Rollback Plan: Define a strategy to revert changes if issues arise during migration.
- Increased complexity in database maintenance and management.
- **Queries spanning multiple partitions could introduce overhead.**
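To reason about that overhead, the sketch below estimates which yearly partitions a time-bounded query would have to scan. The function name `partitions_scanned` and the query text are hypothetical, and the placeholder syntax (`:start`, `:end`) depends on the database driver in use. It assumes queries filter on a half-open window, start inclusive and end exclusive.

```python
from datetime import datetime

# Hypothetical query shape: both bounds on created_at let the planner prune.
PRUNABLE_QUERY = (
    "SELECT * FROM notifications "
    "WHERE created_at >= :start AND created_at < :end"
)


def partitions_scanned(start: datetime, end: datetime) -> list[str]:
    """Yearly partitions overlapping the half-open window [start, end)."""
    if end <= start:
        return []
    # An end that falls exactly on January 1 does not touch that year,
    # because the upper bound of the window is exclusive.
    year_start = datetime(end.year, 1, 1, tzinfo=end.tzinfo)
    last_year = end.year if end > year_start else end.year - 1
    return [f"notifications_{y}" for y in range(start.year, last_year + 1)]


# A one-month window stays in one partition; a window crossing New Year hits two.
print(partitions_scanned(datetime(2024, 3, 1), datetime(2024, 4, 1)))
print(partitions_scanned(datetime(2024, 12, 1), datetime(2025, 2, 1)))
```

A query with no created_at filter at all would touch every partition, which is the case the query adjustments are meant to avoid.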
Partitioning the Notification table is a critical step to ensure our system scales efficiently while maintaining performance. By organizing data into yearly partitions, we address current challenges and position the system for sustainable growth. Proper planning, testing, and monitoring will be essential to a successful implementation.
- Update queries to ensure they include date filters to benefit from partition pruning.
- Test queries to confirm expected behavior across partitions.
- Automate the creation of new partitions annually via scheduled tasks or cron jobs. Script?
- Monitor partition sizes and enforce any retention policies.
- Evaluate indexing needs for both the parent and child tables.
- Avoid excessive indexing to prevent write performance issues.
- Confirm that backup and restore processes are compatible with partitioned tables.
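For the annual automation item, a scheduled job first needs to decide which partitions are missing. The sketch below shows one possible planning step. The function name `missing_partition_years` and the `years_ahead` parameter are hypothetical, and the set of existing partition names is assumed to come from a catalog query that is not shown here.

```python
from datetime import date


def missing_partition_years(
    existing: set[str], today: date, years_ahead: int = 1
) -> list[int]:
    """Years from the current one through years_ahead that lack a partition.

    `existing` holds partition table names such as "notifications_2025".
    """
    wanted = range(today.year, today.year + years_ahead + 1)
    return [year for year in wanted if f"notifications_{year}" not in existing]


# Run in December 2025 with 2024 and 2025 present: only 2026 must be created.
existing = {"notifications_2024", "notifications_2025"}
print(missing_partition_years(existing, date(2025, 12, 1)))
```

Creating next year's partition ahead of time means rows written just after midnight on January 1 already have a partition to land in, even if the job runs late.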