[Website] add akvelon case study #34943

Open · wants to merge 16 commits into base: master
168 changes: 164 additions & 4 deletions website/www/site/content/en/case-studies/akvelon.md
@@ -1,8 +1,17 @@
---
title: "Akvelon"
icon: /images/logos/powered-by/akvelon.png
hasNav: true
cardDescription: "<p><a href='https://akvelon.com/' target='_blank' rel='noopener noreferrer'>Akvelon</a> is a software engineering company that helps start-ups, SMBs, and Fortune 500 companies unlock the full potential of cloud, data, and AI/ML to empower their strategic advantage. Akvelon team has deep expertise in integrating Apache Beam with diverse data processing ecosystems and is an enthusiastic Apache Beam community contributor.</p>"
title: "Secure and Interoperable Apache Beam Pipelines by Akvelon"
name: "Akvelon"
icon: "/images/logos/powered-by/akvelon.png"
category: "study"
cardTitle: "Secure and Interoperable Apache Beam Pipelines by Akvelon"
cardDescription: "To support data privacy and pipeline reusability at scale, Akvelon developed Beam-based solutions for Protegrity and a major North American credit reporting company, enabling tokenization with Dataflow Flex Templates. Akvelon also built a CDAP Connector to integrate CDAP plugins with Apache Beam, enabling plugin reuse and multi-runtime compatibility."
authorName: "Vitaly Terentyev"
coauthorName: "Ashley Pikle"
authorPosition: "Software Engineer @Akvelon"
coauthorPosition: "Director of AI Business Development @Akvelon"
authorImg: /images/case-study/akvelon/terentyev.png
coauthorImg: /images/case-study/akvelon/pikle.png
publishDate: 2024-11-25T00:12:00+00:00
Reviewer comment (Contributor): can we update the date here?
---
<!--
Licensed under the Apache License, Version 2.0 (the "License");
@@ -17,3 +26,154 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<div class="case-study-opinion">
<div class="case-study-opinion-img">
<img src="/images/logos/powered-by/akvelon.png"/>
</div>
<blockquote class="case-study-quote-block">
<p class="case-study-quote-text">
“To support data privacy and pipeline reusability at scale, Akvelon developed Beam-based solutions for Protegrity and a major North American credit reporting company, enabling tokenization with Dataflow Flex Templates. Akvelon also built a CDAP Connector to integrate CDAP plugins with Apache Beam, enabling plugin reuse and multi-runtime compatibility.”
</p>
<div class="case-study-quote-author">
<div class="case-study-quote-author-img">
<img src="/images/case-study/akvelon/pikle.png">
</div>
<div class="case-study-quote-author-info">
<div class="case-study-quote-author-name">
Ashley Pikle
</div>
<div class="case-study-quote-author-position">
Director of AI Business Development @Akvelon
</div>
</div>
</div>
</blockquote>
</div>
<div class="case-study-post">

# Secure and Interoperable Apache Beam Pipelines by Akvelon

## Background

To meet growing enterprise needs for secure, scalable, and interoperable data processing pipelines, **Akvelon** developed multiple Apache Beam-powered solutions tailored for real-world production environments:
- Data tokenization and detokenization capabilities for **Protegrity** and a leading North American credit reporting company
- A connector layer to integrate **CDAP** plugins into Apache Beam pipelines

By leveraging [Apache Beam](https://beam.apache.org/) and [Google Cloud Dataflow](https://cloud.google.com/products/dataflow?hl=en), Akvelon enabled its clients to achieve scalable data protection, regulatory compliance, and platform interoperability through reusable, open-source pipeline components.

## Use Case 1: Data Tokenization for Protegrity and a Leading Credit Reporting Company

### The Challenge

**Protegrity**, a leading enterprise data-security vendor, sought to enhance its data protection platform with scalable tokenization support for batch and streaming data. Their goal: allow customers such as a major North American credit reporting company to tokenize sensitive data using Google Cloud Dataflow. The solution needed to be fast, secure, reusable, and compliant with privacy regulations (e.g., HIPAA, GDPR).

### The Solution

Akvelon designed and implemented a **Dataflow Flex Template** using Apache Beam that allows users to tokenize and detokenize sensitive data within both batch and streaming pipelines.
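Conceptually, tokenization swaps a sensitive value for a surrogate that an authorized party can later reverse. A toy vault-based sketch of that round trip (illustrative only — Protegrity's actual engine, key management, and APIs are not shown in the source):

```python
import hashlib
import hmac


class ToyTokenVault:
    """Illustrative vault-based tokenizer.

    NOT Protegrity's engine or API -- just a sketch of the
    tokenize/detokenize round trip the Flex Template exposes.
    """

    def __init__(self, secret: bytes):
        self._secret = secret
        self._vault = {}  # token -> original value

    def tokenize(self, value: str) -> str:
        # Deterministic token: the same input always yields the same
        # surrogate, so joins on tokenized columns still work downstream.
        token = hmac.new(self._secret, value.encode(),
                         hashlib.sha256).hexdigest()[:16]
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Reversal requires access to the vault (i.e., authorization).
        return self._vault[token]


vault = ToyTokenVault(secret=b"demo-key")
token = vault.tokenize("123-45-6789")
assert token != "123-45-6789"
assert vault.detokenize(token) == "123-45-6789"
```

In a real deployment the vault and secret would live in the data-protection platform, not in pipeline memory; the sketch only shows the contract the pipeline relies on.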

<div class="post-scheme">
Reviewer comment (Contributor): minor: Beam Runner -> Dataflow Runner
<a href="/images/case-study/akvelon/diagram-01.png" target="_blank" title="Click to enlarge">
<img src="/images/case-study/akvelon/diagram-01.png" alt="Protegrity & Equifax Tokenization Pipeline">
</a>
</div>

### Key features
- **Seamless integration with Protegrity UDFs**, enabling native tokenization directly within Beam transforms without requiring external service orchestration
- **Support for multiple data formats** such as CSV, JSON, Parquet, allowing flexible deployment across diverse data pipelines
- **Stateful processing with `DoFn` and timers**, which improves streaming reliability and reduces overall pipeline latency
- **Full compatibility with Google Cloud Dataflow**, ensuring autoscaling, fault tolerance, and operational simplicity through managed Apache Beam execution

This design provided both Protegrity and its enterprise clients with a reusable, open-source architecture for scalable data privacy and processing.

### The Results
- Enabled data tokenization at scale for regulated industries
- Accelerated adoption of Dataflow templates across Protegrity’s customer base
- Delivered an open-source Flex Template that benefits the entire Apache Beam community
Reviewer comment (Contributor): Do we have the github link for this template?

<blockquote class="case-study-quote-block case-study-quote-wrapped">
<p class="case-study-quote-text">
In collaboration with Akvelon, Protegrity utilized a Dataflow Flex template that helps us enable customers to tokenize and detokenize streaming and batch data from a fully managed Google Cloud Dataflow service. We appreciate Akvelon’s support as a trusted partner with Google Cloud expertise.
</p>
<div class="case-study-quote-author">
<div class="case-study-quote-author-img">
<img src="/images/case-study/akvelon/chitnis.png">
</div>
<div class="case-study-quote-author-info">
<div class="case-study-quote-author-name">
Jay Chitnis
</div>
<div class="case-study-quote-author-position">
VP of Partners and Business Development @Protegrity
</div>
</div>
</div>
</blockquote>

## Use Case 2: CDAP Connector for Apache Beam

### The Challenge

**CDAP** had extensive plugin support for Spark but lacked native compatibility with Apache Beam. This limitation prevented organizations from reusing CDAP's rich ecosystem of data connectors (e.g., Salesforce, HubSpot, ServiceNow) within Beam-based pipelines, constraining cross-platform integration.

### The Solution

Akvelon engineered a **shim layer** (CDAP Connector) that bridges CDAP plugins with Apache Beam. This innovation enables CDAP source and sink plugins to operate seamlessly within Beam pipelines.

<div class="post-scheme">
<a href="/images/case-study/akvelon/diagram-02.png" target="_blank" title="Click to enlarge">
<img src="/images/case-study/akvelon/diagram-02.png" alt="CDAP Connector Integration with Apache Beam">
</a>
</div>

### Highlights

- Supports `StructuredRecord` format conversion to Beam schema (`BeamRow`)
- Enables CDAP plugins to run seamlessly in both Spark and Beam pipelines
- Facilitates integration testing across third-party data sources (e.g., Salesforce, Zendesk)
- Complies with Beam’s development and style guide for open-source contributions

The project included prototyping, test infrastructure, and Salesforce plugin pipelines to ensure robustness.
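The format bridge can be pictured as a small mapping step. A sketch with hypothetical field names, simplifying CDAP's `StructuredRecord` and Beam's `Row` to plain Python:

```python
from typing import Any, NamedTuple

# Beam's schema'd Rows are NamedTuple-like; this stands in for beam.Row.
SalesforceLead = NamedTuple("SalesforceLead", [("id", str), ("email", str)])


def record_to_row(record: dict, row_type: type) -> Any:
    """Map a CDAP-style StructuredRecord (field name -> value) onto a
    Beam-style Row, keeping only the fields the Row schema declares."""
    return row_type(**{field: record[field] for field in row_type._fields})


row = record_to_row({"id": "00Q1", "email": "lead@example.com"},
                    SalesforceLead)
```

The actual connector performs this conversion with CDAP's and Beam's real schema APIs; the sketch only shows the shape of the translation that makes one plugin usable from both runtimes.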

### The Results

- Made **CDAP plugins reusable in Beam pipelines**
Reviewer comment (Contributor): for both cases, adding quantifiable metrics, if available (e.g., percentage increase or number of customers, data volume), will be much better.

- Unlocked **interoperability** between Spark and Beam runtimes
- Enabled **rapid prototyping** and plug-and-play connector reuse for Google Cloud customers

## Technology Stack

- Apache Beam
- Google Cloud Dataflow
- Protegrity Data Protection Platform
- CDAP (Cloud Data Fusion)
- BigQuery
- Salesforce, Zendesk, HubSpot, ServiceNow plugins

## Final words

Akvelon’s contributions to Apache Beam-based solutions, from advanced tokenization for Protegrity and its enterprise customers to plugin interoperability through the CDAP Connector, demonstrate the value of open-source, cloud-native data engineering. By delivering reusable and secure components, Akvelon helps enterprises modernize and unify their data infrastructure.

## Watch the Solution in Action

[Architecture Walkthrough Video](https://www.youtube.com/watch?v=IQIzdfNIAHk)

## About Akvelon, Inc.

Akvelon guides enterprises through digital transformation on Google Cloud - applying deep expertise in data engineering, AI/ML, cloud infrastructure, and custom application development to design, deploy, and scale modern workloads.

At Akvelon, we’ve built a long-standing partnership with Google Cloud—helping software-driven organizations implement, migrate, modernize, automate, and optimize their systems while making the most of cloud technologies.

As a **Google Cloud Service** and **Build Partner**, we contribute actively to the ecosystem:
- Contributing code and guidance to **Apache Beam**—including Playground, Tour of Beam, and the Duet AI training set
- Improving project infrastructure and supporting the Apache Beam community—now with an official Apache Beam Committer on our team

Backed by deep expertise in data engineering, AI/ML, cloud architecture, and application development, our engineers deliver reusable, secure, and production-ready solutions on Google Cloud for enterprises worldwide.

- [Akvelon on Google Cloud](https://cloud.google.com/find-a-partner/partner/akvelon)
- [Akvelon Data and Analytics Accelerators](https://github.com/akvelon/DnA_accelerators)

{{< case_study_feedback "Akvelon" >}}

</div>
<div class="clear-nav"></div>
5 changes: 5 additions & 0 deletions website/www/site/data/en/quotes.yaml
@@ -16,6 +16,11 @@
logoUrl: images/logos/powered-by/linkedin.png
linkUrl: case-studies/linkedin/index.html
linkText: Learn more
- text: Akvelon built Beam-based solutions for Protegrity and a major North American credit reporting company, enabling tokenization with Dataflow Flex Templates and reducing infrastructure and deployment complexity.
Reviewer comment (Contributor): you can move this after the Accenture case.
icon: icons/quote-icon.svg
logoUrl: /images/logos/powered-by/akvelon.png
linkUrl: case-studies/akvelon/index.html
linkText: Learn more
- text: With Apache Beam, OCTO accelerated the migration of one of France’s largest grocery retailers to streaming processing for transactional data, achieving 5x reduced infrastructure costs and 4x improved performance.
icon: icons/quote-icon.svg
logoUrl: images/logos/powered-by/octo.png