Omkar Madhav Jadhav
Data Engineer | Cloud & Data Platform Migration
LinkedIn Profile | HackerRank Profile

Introduction

A few lines about me

Hello, I'm Omkar Jadhav

A software engineer with a passion for designing and developing software.
With a Bachelor of Engineering in Computer Science and 5+ years of experience,
I bring strong communication and problem-solving skills.
I'm committed to going the extra mile to deliver my best work, and dedicated to being an ultimate problem solver.
Let's connect and collaborate!

My Work Experience

Join me as I share my journey with you!

My journey from Associate Software Engineer to Data Platform Engineer

Associate Software Engineer (Feb 2021 to May 2023)

Developed PySpark pipelines to extract daily billing data from Oracle and load validated datasets into Hive.
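
A minimal sketch of what such a pipeline can look like (the connection URL, table names, and validation rules below are illustrative assumptions, not the actual production job):

    # Hypothetical PySpark job: pull one day's billing rows from Oracle over JDBC,
    # apply a simple validation filter, and append the result to a Hive table.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("daily-billing-extract")
             .enableHiveSupport()
             .getOrCreate())

    # Placeholder connection details; the real job reads from the billing schema.
    billing = (spark.read.format("jdbc")
               .option("url", "jdbc:oracle:thin:@//db-host:1521/BILLDB")
               .option("dbtable", "(SELECT * FROM billing WHERE bill_date = TRUNC(SYSDATE)) t")
               .option("user", "etl_user")
               .option("password", "***")
               .option("driver", "oracle.jdbc.OracleDriver")
               .load())

    # Basic validation: require an account id and a non-negative amount.
    validated = billing.filter(F.col("account_id").isNotNull() & (F.col("amount") >= 0))

    # Load the validated dataset into a date-partitioned Hive table.
    (validated.withColumn("load_date", F.current_date())
     .write.mode("append")
     .partitionBy("load_date")
     .saveAsTable("analytics.billing_daily"))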

Implemented PL/SQL logic for telecom billing configuration and automated operational tasks using Unix shell scripts.

I have hands-on experience with: Oracle PL/SQL, Shell Scripting, Python

I have also worked on the DevOps side: we used Jenkins for automated builds and SonarQube for code quality checks

I have worked with Git for source control

I have also worked with Dynatrace for monitoring

Senior Software Engineer (May 2023 to Jul 2025)

- Built Spark SQL / PySpark pipelines to transform Oracle billing data and store optimized Parquet datasets for Hive analytics.

- Implemented validation logic and unit tests using pytest; optimized joins, partitions, and aggregations (a small pytest sketch follows this list).

- Created Jenkins CI/CD pipelines and managed Git-based version control for scheduled data jobs.
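
As referenced above, a small illustration of the testing approach; the summarize_billing transform and its column names are hypothetical stand-ins for the real pipeline code:

    # Hypothetical pytest unit test for a PySpark transform.
    import pytest
    from pyspark.sql import SparkSession, functions as F

    def summarize_billing(df):
        # Example transform: total billed amount per account.
        return df.groupBy("account_id").agg(F.sum("amount").alias("total_amount"))

    @pytest.fixture(scope="session")
    def spark():
        return (SparkSession.builder
                .master("local[2]")
                .appName("pipeline-tests")
                .getOrCreate())

    def test_summarize_billing(spark):
        df = spark.createDataFrame(
            [("A1", 10.0), ("A1", 5.0), ("A2", 7.5)],
            ["account_id", "amount"])
        result = {row["account_id"]: row["total_amount"]
                  for row in summarize_billing(df).collect()}
        assert result == {"A1": 15.0, "A2": 7.5}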

Data Platform Engineer (Jul 2025 to Present)

A] Cloud Migration & Operations

- Supported on-prem to AWS data warehouse migration; validated re-modeled datasets in Redshift populated via S3-based ingestion pipelines.

- Verified ingestion pipelines using ETL (Glue-based Python) and ELT (Redshift SQL) approaches triggered via Lambda and managed schedulers.
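
A hedged sketch of the Lambda trigger pattern (the S3 event wiring and the billing-ingest-job name are assumptions for illustration, not the production setup):

    # Hypothetical AWS Lambda handler: when a new object lands in S3,
    # start the Glue job that ingests it toward Redshift.
    import boto3

    glue = boto3.client("glue")

    def handler(event, context):
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            # Pass the new object's location to the ingestion job as an argument.
            glue.start_job_run(
                JobName="billing-ingest-job",
                Arguments={"--source_path": f"s3://{bucket}/{key}"},
            )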

B] Data Reconciliation & Validation

- Performed schema, data type, column count, record count, and sample data reconciliation between Oracle and AWS datasets using SQL and Python (Jupyter notebooks); a simplified count check is sketched after this list.

- Validated data quality checks, load-status audit tables, and supported downstream consumer cutover during migration.
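
A simplified, notebook-style sketch of the record count check referenced above (connection strings, driver choices, and the table list are placeholders; the real reconciliation also covered schema, data types, and sampled rows):

    # Hypothetical reconciliation: compare per-table record counts between
    # the Oracle source and the migrated Redshift copy.
    import pandas as pd
    from sqlalchemy import create_engine, text

    oracle = create_engine("oracle+oracledb://user:***@db-host:1521/?service_name=BILLDB")
    redshift = create_engine("redshift+psycopg2://user:***@cluster:5439/analytics")

    tables = ["billing", "accounts", "rate_plans"]  # illustrative list
    rows = []
    for t in tables:
        with oracle.connect() as conn:
            src = conn.execute(text(f"SELECT COUNT(*) FROM {t}")).scalar()
        with redshift.connect() as conn:
            tgt = conn.execute(text(f"SELECT COUNT(*) FROM {t}")).scalar()
        rows.append({"table": t, "oracle": src, "redshift": tgt, "match": src == tgt})

    print(pd.DataFrame(rows))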

C] Database Optimization & Decommissioning

- Executed Oracle database reorganization activities, reducing tablespace usage from 18.9 TB to 11.5 TB, delivering significant cost savings.

- Segregated tablespaces by schema to enable phased decommissioning as data feeds moved to AWS.
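
A rough sketch of the kind of sizing check that guides a reorganization like the one above (connection details are placeholders, and querying dba_segments requires DBA privileges):

    # Hypothetical helper: list the largest segments in the database to decide
    # which tables to move or shrink during tablespace reorganization.
    import oracledb

    conn = oracledb.connect(user="dba_user", password="***",
                            dsn="db-host:1521/BILLDB")
    sql = """
        SELECT tablespace_name, segment_name,
               ROUND(bytes / 1024 / 1024 / 1024, 2) AS gb
        FROM dba_segments
        ORDER BY bytes DESC
        FETCH FIRST 20 ROWS ONLY
    """
    with conn.cursor() as cur:
        for tablespace, segment, gb in cur.execute(sql):
            print(f"{tablespace:<20} {segment:<30} {gb:>8} GB")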

Intern

Software Developer Intern

I worked as a Python developer.

Involved in the planning and development of a Cloud Native Back-Up Automation Tool.

As part of this organization, I participated in various phases of software development, including requirement analysis, planning and design, development, testing, and deployment.

I was exposed to various technologies like Cloud, Python, Shell Scripting, and Git.

Portfolio

Explore some of my featured projects that showcase my skills and expertise:

Enterprise Data Platform Migration & Validation (AWS):

- Supported migration from on-prem Oracle to AWS-based data platform.

- Validated re-modeled datasets loaded into AWS Redshift via S3-based ingestion pipelines.

- Performed schema reconciliation, data type checks, record count validation, and sample data comparison.

- Queried Athena and Redshift using SQL and Python (Jupyter notebooks).

- Verified data quality checks and load-status audit tables.

- Coordinated with upstream and downstream teams during data cutover.

Key Outcome: Ensured accurate and reliable data availability during cloud migration and enabled safe decommissioning of legacy Oracle systems.

Technologies used: AWS S3, Redshift, Athena, Glue (ETL exposure), Lambda, Python, SQL, Jupyter

Oracle Database Optimization & Decommissioning:

- Executed Oracle database reorganization activities.

- Reduced tablespace usage from 18.9 TB to 11.5 TB, achieving significant cost savings.

- Segregated tablespaces by schema to enable phased decommissioning.

- Supported planning for the safe shutdown of legacy databases as feeds moved to AWS.

Key Outcome: Lowered infrastructure costs and reduced operational risk during cloud migration.

Technologies Used: Oracle Database, PL/SQL, SQL, UNIX

Spark-Based Telecom Billing Data Pipeline:

- Built Spark SQL / PySpark pipelines to transform billing and pricing data.

- Applied joins, filters, aggregations, and partitioning.

- Stored optimized datasets in Parquet format for Hive analytics.

- Implemented data validation logic and unit tests using pytest.

- Automated scheduled runs using Jenkins CI/CD pipelines.

Key Outcome: Delivered clean, analytics-ready datasets for reporting with improved performance and reliability.

Technologies Used: Apache Spark, PySpark, Spark SQL, Hive, Parquet, Python, Jenkins, Git

Project Saikat – GenAI-Based Secure Document Q&A:

- Built a private document Q&A system using Python and open-source LLMs.

- Enabled secure querying of sensitive internal documents.

- Focused on privacy, local execution, and access control (a rough sketch of the flow follows this project).

Key Outcome: Delivered a tool for quick issue resolution and improved learning.

Technologies Used: Python, LLMs, UNIX
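
As referenced above, a rough sketch of the retrieval-then-answer flow; the TF-IDF retriever and the local_llm_answer stub are illustrative stand-ins, and the actual tool's model choice and access-control layers are not shown:

    # Hypothetical skeleton of a private document Q&A flow: retrieve the most
    # relevant local passages, then prompt a locally hosted open-source LLM.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def retrieve(passages, question, k=3):
        vectorizer = TfidfVectorizer().fit(passages + [question])
        scores = cosine_similarity(vectorizer.transform([question]),
                                   vectorizer.transform(passages))[0]
        return [passages[i] for i in scores.argsort()[::-1][:k]]

    def local_llm_answer(prompt):
        # Placeholder: in the real tool this calls a locally hosted LLM,
        # keeping sensitive documents on-premises.
        return "<model response>"

    def answer(passages, question):
        context = "\n".join(retrieve(passages, question))
        prompt = f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
        return local_llm_answer(prompt)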

Project: Backup Automation Solution:

This was a POC; the aim of the tool was to automate backups of machines/servers, and it was developed as a cloud-native tool.

Technologies Used: AWS Cloud, Python

Skills

Technologies and tools I have used while building, supporting, and validating data platforms.

Data Engineering

Apache Spark, Spark SQL, PySpark

ETL / ELT Pipelines, Data Validation, Data Quality Checks

Hive, Parquet, Batch Data Processing

Databases

Oracle Database, Oracle PL/SQL

Stored Procedures, Packages, Anonymous Blocks

Schema Design, Data Reconciliation, Database Optimization

Programming & Scripting

Python (data processing, validation, automation)

Shell Scripting (Linux/UNIX automation)

SQL (Oracle, Redshift, Athena)

Cloud & Data Platforms (AWS)

Amazon S3 (data lake storage)

Amazon Redshift (data warehouse)

Amazon Athena (interactive querying)

AWS Glue (ETL framework exposure), AWS Lambda (pipeline triggers)

DevOps & Monitoring

Git (Version Control)

Jenkins (CI/CD for data pipelines)

SonarQube (Code Quality)

Dynatrace (Application & Infrastructure Monitoring)

Tools & Platforms

Netcracker Revenue Management (Telecom Billing)

Postman (API Testing)

Jupyter Notebook (Data Validation & Analysis)

Certification

During the COVID lockdown, I invested my time in learning various technologies.

Course | Topics Covered
AWS Fundamentals Specialization | Cloud native, cloud migration, security, serverless
Blockchain Basics Specialization | DApps, platforms, smart contracts
Introduction to Data Science | Introduction

Education

Here are my educational details.

Class | Year of Passing | Board/University | Percentage/CGPA | Grade/Class
BE | 2020 | SPPU | 7.4 CGPA | First Class
XII | 2016 | CBSE | 71.2% | Distinction
X | 2014 | CBSE | 8.4 CGPA | Distinction

Contact Details:

Here are my contact details

If you'd like to connect with me, please reach out below.

Please write to me at omkarjadhavwork@gmail.com for any queries; you can also connect with me on LinkedIn.