{
"description": "Our team at Energy Solutions spent a year building a robust data ingestion and query pipeline using OpenSearch to provide centralized data to a distributed suite of applications. Along the way, we learned to question and rethink a lot of our relational database assumptions and take fuzzy search customization and accuracy to the next level. Meanwhile, we implemented Pydantic wrappers around JSON responses so we could continue to handle responses like native Python objects (along with other benefits we\u2019ll discuss). We addressed long-standing challenges, such as:\n\n- Improving the performance of the per-row create/update/delete paradigm (in one case, leading to a ~9x faster data ingest + load!)\n- Putting OpenSearch \u201caliases\u201d to work to help track current vs archival data\n- Improving search relevancy\n- Reducing PII exposure\n\nIn this presentation, we\u2019ll walk through the decisions that led us to move to an OpenSearch-based solution that works within a traditional Django framework, how we tackled advanced topics like token analysis, and how we put OpenSearch aliases to work. We\u2019ll also cover some of the cost-benefit equations, summarize our next phase of work in the project, and include real-time demonstrations of some concepts.\n\n### Background on Energy Efficiency\n\nUtilities, which are often regulated, are responsible for generating a consistent supply of energy for their consumers at a stable price. Efficiency incentive programs that manage demand help insulate utilities from the costs associated with purchasing raw materials (coal, gas, etc.) and help reduce the need to build new power plants that are expensive to build, staff, insure, and supply with consumable, non-renewable resources. \n\nUtilities are also usually required by law to spend part of the budget they collect through customer bills on measures that reduce the demand for energy in their territory. For instance, a utility might offer a $1,500 rebate (AKA incentive) for the sale of an industrial heat pump that is far more efficient than other industrial heat pumps, via an energy efficiency (EE) program. By incentivizing high efficiency equipment, utilities help move the market towards ever more efficient versions of equipment, thus locking in energy savings even after the EE programs end. In this way, we can help utilities and their customers save energy and move the needle away from climate change.\n\n#### Cosmos Project Background\nWhat Cosmos is (a service that gives our other projects access to larger datasets such as locations and equipment) and why we needed a new service.\n\n#### Motivations for moving from Postgres\n- Eliminate redundancy\n- Reduce PII storage\n- Improve load speed\n- Faster, more accurate fuzzy search\n\n#### Why OpenSearch?\n- Super-fast, built on Lucene\n- Open source but supported by AWS\n- Features like aggregation and tokenization are well-documented\n- Ranked results (hits sorted by _score)\n- Intuitive and flexible searching for addresses, equipment, etc.\n- Performance trade-off between ingesting data and searching data (optimize query performance at the expense of slower indexing)\n\n#### Helper Tools\n- opensearch-py \n- Kibana\n\n#### Cost Breakdown\n- AWS cost vs our own standalone\n\n#### API access\n- Not presenting raw output - transformed for readability, consistency \n- Data passed through Pydantic models \n- Strict access controls for PII \n- \u201cSearch Helper\u201d for non-tech staff\n\n#### Load and Query Demos\n- Comparison of a complex PG query vs equivalent OS query\n\n#### Django Integration\n- Pydantic models rather than Django models - consistency with old code \n- Validation, dot notation",
"language": "eng",
"recorded": "2025-09-09",
"related_urls": [
{
"label": "Conference Website",
"url": "https://2025.djangocon.us"
},
{
"label": "Talk Webpage",
"url": "https://2025.djangocon.us/talks/beyond-the-orm-from-postgres-to-opensearch/"
}
],
"speakers": [
"Andrew Mshar"
],
"thumbnail_url": "https://i.ytimg.com/vi/_iB7ET5rt7s/maxresdefault.jpg",
"title": "Beyond the ORM: from Postgres to OpenSearch",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=_iB7ET5rt7s"
}
]
}