{"id":4108,"date":"2025-08-10T04:46:00","date_gmt":"2025-08-10T00:46:00","guid":{"rendered":"https:\/\/jobs.pactemployment.ae\/?candidate=krish-shanthaa"},"modified":"2025-08-10T05:00:53","modified_gmt":"2025-08-10T01:00:53","slug":"krish-shanthaa","status":"publish","type":"candidate","link":"https:\/\/jobs.pactemployment.ae\/?candidate=krish-shanthaa","title":{"rendered":"Veerabagu Krishnasamy"},"content":{"rendered":"","protected":false},"featured_media":0,"comment_status":"open","ping_status":"closed","template":"","candidate_category":[],"candidate_location":[40],"candidate_tag":[],"class_list":["post-4108","candidate","type-candidate","status-publish","hentry","candidate_location-canada"],"metas":{"_candidate_founded_date":"1981-03-15","_candidate_phone":"","_candidate_email":"krish.shanthaa@gmail.com","_candidate_gender":"Male","_candidate_age":"","_candidate_salary":"","_candidate_salary_type":"","_candidate_qualification":"","_candidate_experience_time":"","_candidate_category":[],"_candidate_languages":["English"],"_candidate_job_title":"Data Engineer","_candidate_tag":[],"_candidate_show_profile":"show","_candidate_profile_url":"","_candidate_socials":[{"network":"linkedin","url":"https:\/\/www.linkedin.com\/in\/krish-shan-15811a251\/"}],"_candidate_address":"","_candidate_location":{"40":"Canada"},"_candidate_map_location":{"address":"","latitude":"","longitude":""},"_candidate_cv_attachment":{"4109":"https:\/\/jobs.pactemployment.ae\/wp-content\/uploads\/wp-job-board-pro-uploads\/_candidate_cv_attachment\/2025\/08\/Krish_Azure_Data_Engineer.docx"},"_candidate_portfolio_photos":"","_candidate_video_url":"","_candidate_education":[{"title":"Diploma in Electronics and Communications Engineering","year":"1999"}],"_candidate_experience":[{"title":"Python Developer","start_date":"Nov 2022","end_date":"Oct 2024","company":"ScotiaBank","description":"    \u2022 Hands-on experience in developing data integration solutions using Azure Data Factory (ADF), Azure 
SQL Server, and Azure Logic App Service.\r\n    \u2022 Proficient in building and optimizing Data Flow Activities within ADF for efficient data transformations.\r\n    \u2022 Skilled in migrating pipelines across environments to support continuous deployment and integration efforts.\r\n    \u2022 Strong understanding of Azure SQL Database and SQL Data Warehouse, with the ability to translate logical models into physical implementations.\r\n    \u2022 Collaborated with Cloud and DevOps teams to manage IAM policies and role-based access control across various Azure services.\r\n    \u2022 Adept at gathering, analyzing, and documenting business requirements to align technical solutions with stakeholder goals.\r\n    \u2022 Designed and developed end-to-end ETL pipelines using ADF to extract, transform, and load data from on-premises and cloud sources into Azure Data Lake Storage Gen2.\r\n    \u2022 Managed and resolved production issues on Apache NiFi, ensuring uninterrupted data flow, swift troubleshooting, and minimal system downtime.\r\n    \u2022 Designed, developed, and optimized ETL pipelines for banking domains, including Trades, Orders, and Positions, leveraging pandas and PySpark for efficient data processing and analytics.\r\n    \u2022 Streamlined and standardized modules by adhering to industry standards using PEP 8 and Black, enhancing code readability and maintainability.\r\n    \u2022 Created comprehensive documentation using Google-style docstrings and Sphinx, enabling the generation of HTML pages for modules and improving developer onboarding and module integration.\r\n    \u2022 Identified and rectified redundant code snippets, significantly reducing technical debt and improving codebase efficiency.\r\n    \u2022 Led the migration of existing code from pandas to PySpark, enhancing scalability and performance of data processing tasks.\r\n    \u2022 Created expectations using Great Expectations and validated new data against them in real time to 
generate exception reports.\r\n    \u2022 Packaged applications using Docker and managed deployment over Kubernetes clusters, utilizing JFrog and Rancher for efficient container orchestration and version control.\r\n    \u2022 Implemented robust code versioning practices using Git and Bitbucket, employing a branching strategy to ensure smooth collaboration and codebase management.\r\n    \u2022 Utilized Jira for effective project management, tracking progress, managing tasks, and ensuring timely delivery of project milestones.\r\n    \u2022 Retrieved time series data from Oracle DB using cx_Oracle and pandas.\r\n    \u2022 Pre-processed data and engineered features using pandas.\r\n    \u2022 Trained anomaly detection models with scikit-learn's Isolation Forest and performed hyperparameter tuning.\r\n    \u2022 Logged model details with MLflow, saving the path and run ID in a config file.\r\n    \u2022 Designed and implemented GenAI-driven data pipelines, incorporating chunking, Retrieval-Augmented Generation (RAG) on vector databases, and Azure OpenAI\u2019s ChatCompletion API (GPT-3.5 Turbo, GPT-4, GPT-4o) for advanced insights and keyword extraction.\r\n    \u2022 Deployed scalable AI workflows on Azure, utilizing AKS and WebApps, and built RESTful APIs with Flask and FastAPI for seamless integration and performance optimization. 
\r\n    \u2022 Applied the logged MLflow model for real-time anomaly detection.\r\n    \u2022 Deployed the training and scoring pipelines (mlpipeline, scoring pipeline) on a Linux server, scheduled via crontab.\r\n    \u2022 Developed robust ETL pipelines in Databricks to ingest transactional data from Elasticsearch, transforming and storing it across bronze, silver, and gold Delta Lake layers for regulatory and financial reporting.\r\n    \u2022 Flattened deeply nested JSON structures from core banking systems and normalized them for downstream analytics using PySpark and Spark SQL in Databricks notebooks.\r\n    \u2022 Performed schema validation and data type standardization in silver tables, ensuring high data quality for sensitive banking metrics such as balances, risk scores, and KYC flags.\r\n    \u2022 Engineered gold-layer datasets with business-critical aggregations, joins, and time-series analyses to support fraud detection, compliance dashboards, and portfolio performance tracking.\r\n    \u2022 Automated and monitored ETL jobs using Databricks Workflows and integrated version control via Git to meet audit and change management requirements in a secure banking environment."},{"title":"Python Developer","start_date":"Aug 2020","end_date":"Nov 2022","company":"TD Bank","description":"    \u2022 Designed and developed Azure Data Factory (ADF) pipelines, configuring Linked Services and Azure Key Vault to securely connect with databases and flat files for data movement into Azure Data Lake Storage Gen2.\r\n    \u2022 Executed seamless data migration from on-premises virtual machines (VMs) to ADLS using ADF.\r\n    \u2022 Implemented initial data load transformations using Data Flow Activities, storing the processed data in ADLS Gen2 for further use.\r\n    \u2022 Scheduled and monitored ADF pipelines using time-based triggers, ensuring reliable and timely execution of workflows.\r\n    \u2022 Conducted hands-on migration of legacy 
on-prem applications to Azure Cloud, optimizing performance and scalability.\r\n    \u2022 Performed transformation and loading of curated data into Azure SQL Data Warehouse using ADF\u2019s Copy Activity.\r\n    \u2022 Developed Logic Apps to automate the ingestion of daily incremental updates (Excel files) from SharePoint into ADLS Gen2.\r\n    \u2022 Created and maintained incremental data pipelines in ADF to load daily updates from ADLS Gen2 into SQL Data Warehouse.\r\n    \u2022 Implemented automated email alerts for success and failure events at each pipeline activity level for proactive monitoring.\r\n    \u2022 Worked extensively with key ADF components such as Linked Services, Data Flows, Copy Activities, Lookup Activities, Source Connections, and Azure Data Lake Storage to deliver scalable data integration solutions.\r\n    \u2022 Developed and optimized cloud solutions using AWS components (EC2, EMR, Lambda) and automated tasks such as Hadoop job migration and data processing.\r\n    \u2022 Automated API integration, log rotation, and AWS services using Python and Shell scripts, and managed CI\/CD pipelines with Git and Jenkins.\r\n    \u2022 Managed all backend tasks, including RabbitMQ automation, API migration from Sybase to Oracle, and troubleshooting Python applications.\r\n    \u2022 Customized and deployed JupyterLab, Jupyter Notebook, and JupyterHub for Data Scientists, including package creation and versioning solutions.\r\n    \u2022 Leveraged Python modules for web crawling and optimized multi-threading for performance enhancement in various processes."},{"title":"Python Developer","start_date":"Mar 2017","end_date":"Apr 2020","company":"HCL Technologies Ltd","description":"Data Engineering:\r\n    \u2022 Collaborated directly with clients to gather, clarify, and document business and technical requirements for data integration solutions.\r\n    \u2022 Designed and developed robust ADF pipelines, Linked Services, and Datasets in version 2 to 
support complex data processing workflows.\r\n    \u2022 Engineered various pipelines tailored to specific business use cases, ensuring scalability and performance.\r\n    \u2022 Configured key Azure cloud services, including Azure Blob Storage and Azure SQL Database, to support data ingestion and storage.\r\n    \u2022 Implemented email notification workflows using Azure Logic Apps to alert stakeholders of pipeline execution outcomes.\r\n    \u2022 Scheduled ADF pipelines using time-based triggers to automate daily data loads and ensure data availability.\r\n    \u2022 Developed and maintained PySpark code to retrieve and process data from the refined data layer for downstream consumption.\r\n    \u2022 Actively participated in Agile ceremonies, including daily stand-ups, sprint planning, and backlog grooming sessions, to align development efforts with sprint goals.\r\n    \u2022 Delivered a solution for an insurance client, leveraging LLMs to extract insights at multiple levels of depth from claim data and generate CXO-level reports and executive summaries.\r\n\r\nSoftware Engineering:\r\n    \u2022 Engaged in all SDLC stages, including design, development, testing, and implementation.\r\n    \u2022 Re-engineered modules to enhance system efficiency and incorporate new features.\r\n    \u2022 Collaborated with stakeholders to gather requirements and create high-level and detailed design documents.\r\n    \u2022 Developed and deployed Python bug fixes for key applications used by customers and internal teams.\r\n    \u2022 Utilized JIRA for bug tracking and Git for version control and deployment.\r\n    \u2022 Implemented CI\/CD pipelines using Ansible playbooks with Jenkins and SonarQube.\r\n    \u2022 Developed applications in UNIX environments and utilized relevant commands.\r\n    \u2022 Created business decision graphs using Python\u2019s matplotlib library and maintained technical documentation.\r\n    \u2022 Worked with feature engineers on defect reproduction, 
troubleshooting, and root cause analysis.\r\n    \u2022 Conducted peer reviews of design and code and recommended cost-effective AWS solutions.\r\n    \u2022 Automated tasks using Crontab and participated in Agile and Scrum practices for project management."},{"title":"System Specialist","start_date":"Jun 2013","end_date":"Mar 2017","company":"Datapage Digital Services Pvt Ltd","description":"    \u2022 Directed end-to-end Wintel administration, including monitoring 43 Windows Domain Controllers using Microsoft SCOM for automated ticketing.\r\n    \u2022 Led troubleshooting for ADDS, DHCP, DNS, and client-related issues like printer and connectivity problems.\r\n    \u2022 Developed, configured, and managed group policies, redesigned Active Directory hierarchy, and created new policies for software deployment and settings.\r\n    \u2022 Identified risks and developed mitigation plans; implemented ITIL Service Management processes for incident, configuration, and change management.\r\n    \u2022 Trained and mentored the team, allocated tasks, and reported on performance indicators and value delivery.\r\n    \u2022 Managed the installation, maintenance, and upgrade of anti-virus, firewalls, WSUS Patch Management, and native ADS tools.\r\n    \u2022 Monitored server performance, applied test patches\/hot fixes, and ensured timely updates and virus definition updates.\r\n    \u2022 Coordinated with Symantec, production, development, and application teams for virus management, patch deployment, and firewall policies.\r\n    \u2022 Planned and deployed user policies on firewalls, configured leased lines, monitored network traffic, and reviewed network logs for misuse."}],"_candidate_skill":[{"title":"Python","percentage":"100"},{"title":"SQL","percentage":"80"},{"title":"PySpark","percentage":"95"},{"title":"Azure Databricks","percentage":"90"},{"title":"Azure Delta Lake","percentage":"80"},{"title":"Apache Spark","percentage":"90"},{"title":"Apache 
NiFi","percentage":""},{"title":"GitHub","percentage":""},{"title":"Agile Methodology","percentage":""},{"title":"ETL Pipeline","percentage":""}]},"_links":{"self":[{"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=\/wp\/v2\/candidate\/4108","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=\/wp\/v2\/candidate"}],"about":[{"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=\/wp\/v2\/types\/candidate"}],"replies":[{"embeddable":true,"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4108"}],"wp:attachment":[{"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4108"}],"wp:term":[{"taxonomy":"candidate_category","embeddable":true,"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=%2Fwp%2Fv2%2Fcandidate_category&post=4108"},{"taxonomy":"candidate_location","embeddable":true,"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=%2Fwp%2Fv2%2Fcandidate_location&post=4108"},{"taxonomy":"candidate_tag","embeddable":true,"href":"https:\/\/jobs.pactemployment.ae\/index.php?rest_route=%2Fwp%2Fv2%2Fcandidate_tag&post=4108"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}