Job Purpose:
In this role, you will design, build, and optimize the data engines that power DTB's intelligence. You will develop robust data pipelines, feature stores, model-serving systems, and scalable big-data platforms that enable advanced credit scoring, fraud detection, customer intelligence, and a wide range of machine-learning applications.
You will be at the heart of transforming DTB into a data-driven organization, ensuring that teams across the bank can rely on high-quality, trusted, and scalable data to drive smarter decisions, stronger governance, and innovative digital solutions. This is a high-impact role for a builder, a problem-solver, and a visionary ready to shape the future of data and AI at DTB.
Key Responsibilities:
Data Science & ML
Build and maintain ETL/ELT pipelines that feed modelling datasets from multiple banking systems (CBS, LMS, CRM, Cards, Mobile Banking, Bureau, Collections systems).
Develop automated data preparation workflows for credit scoring, fraud models, behavioural models, and IFRS9 modelling.
Create end-to-end ML pipelines integrating feature engineering, data validation, model deployment, and monitoring.
Manage and build other enterprise ETL workflows using tools such as Oracle Data Integrator (ODI) and Informatica (see the illustrative pipeline sketch below).
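For illustration, here is a minimal sketch of the kind of ELT step this block describes: joining core-banking transactions with bureau data into a modelling dataset. The paths, table layouts, and column names (customer_id, txn_date, amount) are hypothetical placeholders, not references to actual DTB systems.

```python
# Minimal ELT sketch, assuming a data lake laid out under hypothetical s3:// paths.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("credit-modelling-etl").getOrCreate()

txns = spark.read.parquet("s3://lake/raw/cbs_transactions/")    # core banking extract
bureau = spark.read.parquet("s3://lake/raw/bureau_snapshots/")  # credit bureau extract

# Aggregate 90-day transaction behaviour per customer.
features = (
    txns.filter(F.col("txn_date") >= F.date_sub(F.current_date(), 90))
        .groupBy("customer_id")
        .agg(
            F.count("*").alias("txn_count_90d"),
            F.sum("amount").alias("txn_amount_90d"),
        )
)

# Left-join bureau attributes and publish the curated modelling dataset.
dataset = features.join(bureau, "customer_id", "left")
dataset.write.mode("overwrite").parquet("s3://lake/curated/credit_modelling/")
```

In practice a job like this would be scheduled from an orchestrator such as Airflow and feed the feature store described further below.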
Big Data Platform Engineering
Develop scalable data-processing workflows using Spark, Hadoop, Kafka, Airflow, Flink, or similar.
Optimize large datasets (transactional, bureau, behavioural, logs) for modelling in batch and real-time environments.
Manage distributed computation and ensure reliability and fault tolerance.
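As a sketch of the real-time side, the snippet below reads events from Kafka with Spark Structured Streaming and relies on checkpointing for automatic recovery after failure, which is what fault tolerance means here. The broker address, topic name, and paths are assumptions, and the job needs the spark-sql-kafka connector on its classpath.

```python
# Fault-tolerant streaming sketch; broker, topic, and paths are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("realtime-txn-stream").getOrCreate()

stream = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "card-transactions")
         .load()
)

# Kafka delivers raw bytes; cast the payload to a string for downstream parsing.
parsed = stream.select(F.col("value").cast("string").alias("payload"))

# The checkpoint location lets Spark restart from the last committed offsets.
query = (
    parsed.writeStream.format("parquet")
          .option("path", "s3://lake/stream/txns/")
          .option("checkpointLocation", "s3://lake/checkpoints/txns/")
          .start()
)
query.awaitTermination()
```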
Feature Store & Data Assets Management
Design and maintain a centralized feature store for credit, fraud, marketing, and customer analytics models.
Ensure feature consistency between training and serving environments.
Implement versioning, lineage, documentation, and metadata management for data features.
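One common way to meet the training/serving consistency and versioning requirements above is to define each feature's logic exactly once and attach version metadata to it. The sketch below is framework-agnostic; every name in it is illustrative rather than part of any specific feature-store product.

```python
# Shared feature definition: the same function serves both training and scoring.
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class FeatureSpec:
    name: str
    version: str      # bumped on any logic change, supporting lineage/versioning
    description: str  # feeds documentation and metadata management

def days_since_last_txn(last_txn: date, as_of: date) -> int:
    """Called by the batch training pipeline AND the online serving layer,
    so the two environments cannot drift apart."""
    return (as_of - last_txn).days

REGISTRY = {
    "days_since_last_txn": FeatureSpec(
        name="days_since_last_txn",
        version="1.0.0",
        description="Days between a customer's last transaction and the scoring date.",
    ),
}

print(days_since_last_txn(date(2024, 1, 1), date(2024, 3, 1)))  # 60
```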
Model Deployment & MLOps
Collaborate with data scientists to deploy models using MLflow, Docker, Kubernetes, API gateways, and CI/CD pipelines.
Develop automated monitoring pipelines for model performance, drift detection, data quality, and explainability.
Ensure models operate efficiently in real-time decision engines and batch scoring environments.
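As an example of a drift check such a monitoring pipeline might run, the sketch below computes the Population Stability Index (PSI) between training-time and live score distributions. The bin count and the 0.1 alert threshold are common industry conventions, not DTB policy, and the score data is synthetic.

```python
# PSI drift check: compares a live score distribution against the training baseline.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    # Bin edges come from quantiles of the training (expected) distribution.
    cuts = np.quantile(expected, np.linspace(0, 1, bins + 1))
    cuts[0], cuts[-1] = -np.inf, np.inf
    e = np.histogram(expected, cuts)[0] / len(expected)
    a = np.histogram(actual, cuts)[0] / len(actual)
    # Clip to avoid log(0) for empty bins.
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(0)
train_scores = rng.beta(2, 5, 10_000)
live_scores = rng.beta(2.5, 5, 10_000)  # mildly shifted population
print(psi(train_scores, live_scores))    # values above ~0.1 commonly flag drift
```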
Data Quality & Governance
Implement robust data validation, profiling, anomaly detection, and reconciliation checks.
Work with Data Governance teams to ensure compliance with IFRS9, Basel, CBK, GDPR, and internal data standards.
Manage data lineage, cataloguing, and documentation to support audits and regulatory reviews.
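For a minimal flavour of the validation and reconciliation checks described above, written in plain pandas: the rule names, columns, and tolerance are illustrative assumptions.

```python
# Toy data-quality rules plus a source-vs-warehouse reconciliation check.
import pandas as pd

def validate_loans(df: pd.DataFrame) -> list[str]:
    issues = []
    if df["loan_id"].duplicated().any():
        issues.append("duplicate loan_id values")
    if df["outstanding_balance"].lt(0).any():
        issues.append("negative outstanding balances")
    if df["customer_id"].isna().any():
        issues.append("loans with missing customer_id")
    return issues

def reconcile(source_total: float, warehouse_total: float, tol: float = 0.01) -> bool:
    """Totals from the source system and the warehouse must agree within tol."""
    return abs(source_total - warehouse_total) <= tol

loans = pd.DataFrame({
    "loan_id": [1, 2, 2],
    "customer_id": ["A", None, "C"],
    "outstanding_balance": [1000.0, -50.0, 200.0],
})
print(validate_loans(loans))       # all three rules fire on this toy frame
print(reconcile(1150.0, 1150.005)) # True: within tolerance
```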
Collaboration & Stakeholder Support
Partner with Data Scientists, Risk, Credit, Fraud, Marketing, and BI teams to align data pipelines with business use cases.
Work with IT and Infrastructure teams on cluster performance, security, access controls, and SLA adherence.
Participate in sprint planning, architecture reviews, and model implementation committee sessions.
Performance Optimization
Improve the efficiency, scalability, and cost-effectiveness of ML workloads.
Optimize database queries, Spark jobs, Kafka streams, and storage systems.
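Two representative Spark optimizations of the kind this block covers: broadcasting a small dimension table so the join avoids shuffling the large fact table, and partitioning output by date so downstream readers prune files instead of scanning everything. Paths and column names are assumptions for the sketch.

```python
# Spark tuning sketch, assuming hypothetical lake paths and column names.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

txns = spark.read.parquet("s3://lake/curated/transactions/")  # large fact table
branches = spark.read.parquet("s3://lake/ref/branches/")      # small dimension

# Broadcast the small side so the join runs map-side with no shuffle of txns.
enriched = txns.join(F.broadcast(branches), "branch_code")

# Partition output by date so later jobs read only the days they need.
(enriched.repartition("txn_date")
         .write.mode("overwrite")
         .partitionBy("txn_date")
         .parquet("s3://lake/marts/txns_by_branch/"))
```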
Qualifications & Experience:
Strong academic foundation with a Bachelor's or Master's in Computer Science, Data Engineering, Data Science, Information Technology, or a related quantitative field.
3–7+ years of impactful, hands-on experience in data engineering, big-data processing, or building scalable ML infrastructure, ideally within fast-paced, data-driven environments.
Advanced programming capability, with strong proficiency in Python, SQL, and PySpark; experience with Scala is an added advantage.
Demonstrated expertise in modern data and ML platforms, including:
Big-data technologies: Spark, Hadoop, Kafka, Airflow
MLOps & containerization: MLflow, Docker, Kubernetes
CI/CD pipelines: GitLab, Jenkins, GitHub Actions
Cloud platforms: AWS, GCP, or Azure (highly preferred)
Experience working with banking systems, risk data, or credit-modelling datasets is a significant advantage that accelerates success in this role.
Key Competencies:
Strong understanding of data structures, distributed systems, and ML workflows.
Excellent problem-solving, debugging, and optimization skills.
Fast learner with the ability to adapt to new technologies.
High attention to detail, documentation discipline, and data governance awareness.
Strong collaboration and communication skills.