Summary
I have 13 years of experience in Data Science and Machine Learning, primarily in Banking. I bring deep, hands-on technical expertise, intellectual humility, and clear communication. I am looking to work on interesting problems as an individual contributor and/or a team lead.
Skillset
- Python (Pandas, Matplotlib, Scikit-learn, Flask, Kivy, Dask), Bash, Hadoop, Impala, Spark, Mercurial, MS Excel, JavaScript (ES6, VueJS), SAS (Advanced)
- Machine Learning Algorithms (Supervised and Unsupervised), Stochastic Simulations, Statistical Modeling (Logistic Regression, Linear Regression, Segmentation, Scorecards)
Work & Education
- Vice President, Fraud Analytics using Entity Linkage - Citicorp Services India Private Limited (2018 - Present)
- Vice President, Secured Lending Analytics - Fullerton India (2015 - 2018)
- Senior Analyst, Risk CoC - McKinsey and Company (2011 - 2014)
- Analyst, Global Modeling and Risk Analytics - TCS e-Serve / Citigroup (2009 - 2011)
- Chemical Engineering - Master's thesis in Stochastic Simulation - IIT Madras (2004 - 2009, GPA: 8.4/10)
Notable Projects
- Neural Networks for Synthetic ID Prediction: Extensive distributed hyperparameter tuning on top of an existing Application Fraud model pipeline to predict Synthetic ID fraud. Impact: Performance orthogonal to FICO, expected impact ~$3MM per year in loss reduction [Hadoop MapReduce / Keras / Pandas / Scikit-Learn]
- Fuzzy String Matching for False Unemployment Claims: Implemented the Jaro-Winkler algorithm in Python to identify spurious claims. Impact: 20M accounts identified and reviewed [Python / Hadoop MapReduce]
- Gradient Boosted Fraud Prediction and Detection: Combined multiple dimensions of data, strategies, and scorecards in a tuned Gradient Boosted Model to predict (Ex Ante) and detect (Ex Post) first- and third-party fraud. Impact: $10MM per year in loss reduction at 30%+ precision [PySpark / Hadoop MapReduce]
- Rule-based Application Fraud Reviews: Used inter-connectivity of accounts/customers to predict new application fraud and scrub out undetected fraud. Impact: >$1MM loss reduction per year at 40%+ precision [Pandas / Scikit-Learn]
- Naive Bayes Classifier with expert ratings for small business risk: Addressed a low-sample-size problem by using performance assigned by an expert committee. Impact: Model reduces acquisition risk by 50% [Pandas / Scikit-Learn / SAS]
- Statistical Model for unsecured loans: Built and deployed a model for digitally sourced unsecured loans on low-volume data with a sparse feature set. Impact: Improves approval rates by ~150% [SAS]
- Python GUI for Non-performing Loans: A proof-of-concept product to securitize (tranche) and price a portfolio of NPLs, with a graphical frontend and Pandas backend. Impact: Two large European banks onboarded [wxPython]