About me
Hello, I'm Sarath Tharayil, a data scientist with expertise in analyzing complex datasets and developing machine learning models to extract meaningful insights.
My journey in data science began with a curiosity for how data can drive decision-making and a desire to create something meaningful from raw information. Over the years, I've honed my skills in statistical analysis, machine learning, and data visualization.
Experience
Sep 2024 - Present
- Developed a real-time recommendation system using Kafka, Faiss, and PyTorch, improving customer engagement by 30% for an e-commerce client.
- Built an end-to-end ML pipeline using Feast (feature store), MLflow, and Kubeflow, reducing model deployment time by 40%.
- Designed a RAG-based AI chatbot for a finance client, integrating LangChain and vector databases, enhancing query response accuracy by 20%.
- Fine-tuned LLaMA 2 for a legal-tech startup, achieving 15% higher accuracy in case-law retrieval compared to GPT-4-turbo.
- Engineered an automated supply chain forecasting model using a Temporal Fusion Transformer (TFT), reducing inventory costs by 25%.
- Implemented self-supervised learning for anomaly detection in fraud analytics, cutting false positives by 35%.
Sep 2020 - Sep 2022
- Led the development of a real-time data pipeline handling 1,000+ API calls daily, automating ingestion with PySpark, Apache Hive, and Airflow on AWS.
- Built ARIMA-LSTM ensemble models for price and demand forecasting, achieving 87% accuracy and deploying scalable inference via Docker & Kubernetes.
- Designed an anomaly detection system using Isolation Forests & Autoencoders, reducing anomalies by 65% with AWS Lambda for real-time fraud detection.
- Developed a feature engineering framework using wavelet transforms & Fourier analysis, improving forecast accuracy by 17%.
- Created Tableau dashboards integrating Flask APIs & AWS Redshift for real-time insights, optimizing query performance.
- Built a machine learning pipeline to diagnose supply chain disruptions, reducing shortages by 38% and preventing $1.5M daily losses.
- Developed a Bayesian marketing mix model (MMM), optimizing marketing spend and increasing ROI by 16%.
- Designed a reinforcement learning algorithm for dynamic inventory management, cutting stockouts by 25%.
- Built an RShiny simulator for MMM, boosting incremental revenue by 16%.
Some of my projects
Latest blog posts
March 15, 2023 • 8 min read
Machine Learning Trends to Watch in 2023
An overview of the most promising machine learning technologies and methodologies emerging this year.
February 2, 2023 • 6 min read
Data Visualization Best Practices for Effective Communication
Learn how to create impactful visualizations that clearly communicate your data insights.