A Data Science practitioner passionate about solving real-world problems, with strong knowledge of Python, SQL, and R. Experienced in Data Acquisition, Statistical Analysis, Model Building (Machine Learning, Deep Learning, Time Series, NLP), and Deployment.
In a nutshell, it's been a roller coaster ride for a person who loves taking on challenges.
• Aggregating data from multiple sources and performing KYC for institutional customers and classifying users as Authorised Signatory or Agent. Tools involved: Teradata, Excel (VBA Macro), Qlik Sense.
• Mining critical banking information from large volumes of PDF documents using PyMuPDF, Tabula & OCR.
• Built an attrition model that predicts when the commercial customers of a banking giant are going to churn. Data was aggregated from various source systems into Teradata using SQL, and the model was built using Python and Pandas.
• Developed Python scripts that use NLP to find the best similarity match between company details stored in a Teradata database and companies present in external survey data. Achieved 91% accuracy and saved a lot of manual work.
• Finding the Customer Lifetime Value for an ecommerce client using Vertex AI and BigQuery. Recommended marketing actions by segmenting customers on the basis of RFM score (Recency, Frequency, Monetary). Used K-Means clustering to create the customer segments.
• Shortlisting the best resumes for a given job description. Techniques involved: extracting text from documents (using Textract, PyMuPDF, NLTK, OCR), filtering skills using NER, and finding the best match score using cosine similarity. The web app was built using Streamlit and deployed using Cloud Run and Cloud Build.
• Built an OOP-based NLP framework for the HCL CoE Team to accelerate the work on all NLP-based projects. The framework covers word embeddings, sentiment analysis, text search, POS tagging, classification, etc.
• Building chatbots for the healthcare industry using NLP and Deep Learning, with a focus on Conversational Artificial Intelligence (AI), Machine Learning (ML) models, and voice.
• Preparing a dataset from scratch for a healthcare domain problem through web scraping, annotation, and pre-processing.
• Implemented state-of-the-art Transformer models (BERT). A Docker image was built and the model was deployed on AWS using ECR and ECS (with Fargate).
• Designed a CNN-based model using TensorFlow that detects the level of blindness caused by diabetes.
• Developed a multi-label classification model using AWS SageMaker that detects several complications of diabetes. The model was deployed using AWS Lambda and API Gateway.
• Specially featured in Asentech’s monthly newsletter.
• Completed post-graduation with a GPA of 6.55/8.
• Achieved 3rd rank in Data Science batch.
• Product Marketing, User Acquisition, and Customer Retention for an app-based purchasing tool that helps small retailers with daily stock audits, inventory management, and preparing purchase order lists.
• FAQ Chatbot using Machine Learning models.
• Helps retailers with stock & financial projections based on their previous purchase history.
• Handled large datasets, performed exploratory analysis to identify trends. Generated reports, translated data into visualizations and built dashboards. Used: Python, SQL, Tableau.
• Built an ML-based fixed deposit (FD) propensity model for a banking client to identify the customers who are more likely to open an FD.
• Built and deployed a web app that identifies the customers who are eligible for a loan, so that the client can specifically target those customers.
• Designed data models, business process models, and conceptual models for clinical trials using ER Studio.
• Developed mappings and workflows in the Informatica Data Quality Tool for ETL purposes.
• Data Migration project: Historical and current data of 750 organizations was cleansed, validated, and migrated from a Payroll system to a modern Payroll+HCM system.
• Graduated with a GPA of 7.9/10.
• Achieved 2nd rank in Computer Science department in SEM-I.
From Data Preprocessing to Machine Learning, I have been honing my skills in the following set of tools.
A list of my hard-earned certifications.
January 2022
November 2023
November 2022
February 2023
June 2023
September 2023
Check out some of my blogs on Medium.
Check out the major topics of Linear Algebra, Calculus, Statistics, Probability.
We will see how to build an ML model, design the HTML/CSS page, and finally run it on localhost.
Learn how to deploy an ML-enabled app on a web host (using Amazon EC2) in 10 easy steps.
Here's a list of papers published at IEEE and other conferences.
Predicting the volatility of three equities listed on India's National Stock Exchange (NSE). Methods used: GARCH, GJR-GARCH, EGARCH, and LSTM.
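For readers unfamiliar with GARCH, here is a minimal sketch of the GARCH(1,1) conditional variance recursion. It is only an illustration, not the paper's code: the function name `garch11_variance`, the toy returns, and the parameter values are all assumptions, and in practice the parameters would be estimated from data rather than fixed by hand.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    Returns are assumed to be demeaned, so each return is its own shock.
    """
    if not returns:
        return []
    # Start the recursion at the mean squared return (a common simple choice).
    sigma2 = [sum(r * r for r in returns) / len(returns)]
    for r_prev in returns[:-1]:
        sigma2.append(omega + alpha * r_prev ** 2 + beta * sigma2[-1])
    return sigma2


# Toy daily returns and hand-picked parameters (hypothetical values).
returns = [0.01, -0.02, 0.015, -0.03, 0.005]
variances = garch11_variance(returns, omega=1e-5, alpha=0.1, beta=0.85)
```

Each variance depends on the previous squared return and the previous variance, which is how the model captures volatility clustering.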
This paper implements a complete movie recommendation system prototype. It covers genre-based recommendation, Pearson Correlation Coefficient, Cosine Similarity, KNN-based recommendation, Content-Based Filtering (using TF-IDF and SVD), Collaborative Filtering (using TF-IDF and SVD), and a recommendation system based on the Surprise library.
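As a rough illustration of the content-based idea, here is a minimal sketch that compares TF-IDF vectors by cosine similarity using only the standard library. It is not the paper's implementation: the toy movie descriptions and the helper names (`tfidf_vectors`, `cosine`, `most_similar`) are assumptions made for this example, and the smoothed IDF formula is one common variant among several.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using smoothed TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index, vectors):
    """Index of the document closest to vectors[index], excluding itself."""
    scores = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors) if i != index
    ]
    return max(scores)[1]


# Toy catalogue (hypothetical descriptions, not data from the paper).
movies = [
    "space crew explores a distant planet",
    "astronauts stranded on a distant planet in space",
    "romantic comedy set in paris",
]
vectors = tfidf_vectors(movies)
best = most_similar(0, vectors)  # index of the movie to recommend for movie 0
```

Here the first two descriptions share several terms, so the second movie is the one recommended for the first; the third shares no terms with it and scores zero.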
Built a robust model to predict stock prices of three different sectors. Six methods were used: the time series (TS) method is Holt-Winters Exponential Smoothing, the econometric method is ARIMA, the ML methods are Random Forest and MARS, and the DL methods are RNN and LSTM.
Coming soon.
Dunlop, B.T. Road, Kolkata - 700108
West Bengal, India
Have work opportunities? Need a freelancer for Data Science projects? Want to collaborate in hackathons? Please drop me an email. I'll get back to you soon.