[FEATURE]: Use Context Manager based rules for easy monkey patching with Existing source Code #94

Open
@SomanathSankaran

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

Currently, existing code has to be changed to accommodate the DQ rules and add additional validation.

Proposed Solution

Use a context manager and redirect reading and writing through it, so the existing business logic only needs to be wrapped in a with block.

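class DQS_ContextManager:
  def __init__(self, rule_tbl, name=None):
    self.name = name          # name of the DataFrame the rules apply to
    self.rule_tbl = rule_tbl  # table that holds the DQ rules

  def __enter__(self):
    return self

  def __exit__(self, exc_type, exc_value, exc_traceback):
    if exc_value is not None:
      # Business logic failed: skip the DQ run and let the exception propagate.
      print("there is an error in the business logic and we can't proceed further:", exc_value)
      return False
    print(f"running the quarantine for {self.name} !! hold tight...")
    # rule_generator and rule_executor are not defined in this snippet;
    # they are assumed to come from the DQ framework.
    rule_obj = rule_generator(self.rule_tbl, self.name)
    df_write_list = list(rule_obj.rule_df_tuple)
    rule_executor.rule_runner(df_write_list)
    return False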
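
For reference, the behaviour of `__exit__` that the class above relies on can be checked without Spark. This is only a sketch: `RecordingContext` is a hypothetical stand-in for `DQS_ContextManager`, and a boolean flag takes the place of the actual rule run. It shows that a falsy return value from `__exit__` lets an exception from the with block propagate, and that the DQ step only runs when the block finishes cleanly.

```python
class RecordingContext:
    """Hypothetical stand-in for DQS_ContextManager; records whether the checks ran."""

    def __init__(self):
        self.ran_checks = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if exc_type is not None:
            # A falsy return value tells Python to re-raise the exception.
            return False
        # The rule run would go here in the real context manager.
        self.ran_checks = True
        return False


# Body completes: the checks run.
ok = RecordingContext()
with ok:
    pass

# Body raises: the checks are skipped and the error reaches the caller.
failed = RecordingContext()
caught = None
try:
    with failed:
        raise ValueError("business logic failed")
except ValueError as err:
    caught = str(err)
```

Returning `True` from `__exit__` would swallow the exception instead, which is probably not wanted here because a failed job would then look successful.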

Output :
Existing code

#without dqs
df_raw=spark.table("raw_tbl_taxi")
#business logic df as of now
df=df_raw.drop_duplicates()
df.write.mode("overwrite").saveAsTable("curated_without_dqs")

With Context Manager

with DQS_ContextManager("rule_tbl", "df"):
  df_raw = spark.table("raw_tbl_taxi")
  # business logic df as of now
  df = df_raw.drop_duplicates()
  # df.write.saveAsTable("curated_without_dqs")
  # read and write handled inside the context manager

Cleaner read and write
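
One open point in the usage above is how the context manager gets hold of the DataFrame built inside the with block, since `__exit__` does not receive the block's local variables. Below is a minimal sketch of one option, assuming the body hands its result over explicitly. `dq_context`, `DQSession` and `submit` are hypothetical names, not part of any existing API, and plain lists of dicts stand in for Spark DataFrames so the example runs without a cluster.

```python
from contextlib import contextmanager
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]


class DQSession:
    """Hypothetical holder used by the with-body to hand over its result."""

    def __init__(self) -> None:
        self.result: List[Row] = []
        self.valid: List[Row] = []
        self.quarantined: List[Row] = []

    def submit(self, rows: List[Row]) -> None:
        self.result = list(rows)


@contextmanager
def dq_context(rules: List[Rule]):
    session = DQSession()
    # The business logic runs at the yield. If it raises, the exception
    # propagates from here and the rule run below is skipped.
    yield session
    for row in session.result:
        if all(rule(row) for rule in rules):
            session.valid.append(row)
        else:
            session.quarantined.append(row)


# Illustrative rule and data, made up for this example.
rules = [lambda r: r["fare"] is not None and r["fare"] >= 0]

with dq_context(rules) as dq:
    raw = [
        {"trip": 1, "fare": 12.5},
        {"trip": 2, "fare": -3.0},
        {"trip": 3, "fare": None},
    ]
    dq.submit(raw)  # replaces the explicit write in the existing code
```

In a Spark version, `submit` would take the DataFrame and the exit step would write the valid and quarantined rows to their target tables. The business logic would then change by one line, the write call, rather than not at all.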

