Founder | Senior Data Scientist
2018 - PRESENT
Quasar Labs
- Consulted enterprise, SMB, and startup clients on the implementation of cutting-edge machine learning capabilities for a variety of use cases with the express goal of increasing performance and impact.
- Communicated with executives, senior managers, and teams of data scientists from over 20 companies and over 40 countries.
- Implemented convolutional neural networks (CNNs) in TensorFlow for object detection and recognition, including earthquake impact detection, receipt text detection, valve defect detection, and wear-and-tear detection.
- Built custom learners for revenue forecasting in retail using seasonal ARIMA and RNNs on 85GB of hourly sampled data. Deployed the models in a real-time production environment using Docker, Flask, AWS, PostgreSQL, and MySQL Server.
- Implemented optical character recognition (OCR) for automated receipt text extraction and classification using Google OCR, TensorFlow, Flask, and Keras.
- Developed an end-to-end training pipeline to predict user churn for a telecom client from the Bahamas. The architecture leveraged time-to-event RNNs and gradient-boosted decision trees.
Technologies: Data Analysis, Data Engineering, Computer Vision, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Spark Streaming, Flask, Sentiment Analysis, Technical Leadership, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, SQL, TensorFlow, R
Founder
2019 - 2020
DataZip
- Collected, processed, and controlled the distribution of auto dual dash-cam imagery and telematics data, as well as healthcare imagery.
- Built pipelines for cleaning, processing, classification, and anomaly detection applied to 1080p and 720p footage at 30fps.
- Synchronized the telematics and dash-cam video footage using audio recordings, Fast Fourier Transform (FFT) convolutions, de-noising, and signal processing techniques.
- Implemented image semantic segmentation, road object classification, identification of rapid decelerations/braking, and detection of near-misses and collisions.
- Managed client interactions, projects, and development.
Technologies: Amazon Web Services (AWS), Data Analysis, Data Engineering, Computer Vision, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Sentiment Analysis, Technical Leadership, Android, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, AWS, TensorFlow, OpenCV
Data Scientist | Natural Language Research Engineer
2019 - 2019
Tessian
- Developed language models and applied transfer learning, text analysis, classification and clustering, few-shot learning, embeddings, and attention-based RNNs across 100GB of email data.
- Pioneered techniques such as unsupervised data augmentation, weak supervision in Snorkel MeTaL, and multi-task learning for malicious data classification.
- Implemented end-to-end machine learning models in production, using TensorFlow, AWS S3 and Athena, and SageMaker on both CPU and GPU-based architectures.
- Proactively explored and analyzed the compatibility of string similarity matching using one-shot learning and siamese networks across multiple use cases.
- Implemented various codebase improvements, testing automation, parallelized processing, and documentation design.
Technologies: Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Spark Streaming, Flask, Sentiment Analysis, Technical Leadership, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, Amazon Athena, Amazon DynamoDB, Amazon S3 (AWS S3), Docker, Bash, TensorFlow
Data Scientist
2018 - 2018
Apsara Capital
- Led the development and implementation of the data analysis and research infrastructure.
- Developed the AWS S3, Lambda, EC2, and Docker orchestration for extracting, processing, and storing financial, economic, and market data from the Thomson Reuters Eikon API.
- Built an NLP language model using Snorkel and MeTaL for the analysis of earnings call transcripts.
- Created the technical analysis infrastructure using R and a set of 20 customizable technical indicators.
- Designed the codebase, automated the testing, handled the production integration, and generated and managed documentation.
Technologies: Quantitative Modeling, Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Sentiment Analysis, Technical Leadership, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, R, Amazon Kinesis Data Firehose, AWS Glue, Amazon Athena, Amazon S3 (AWS S3)
Data Scientist
2017 - 2018
Tracktics GmbH
- Analyzed time series data for motion classification and identification of activity bursts using CNNs, Bayesian models, and Monte Carlo simulations.
- Supported the development of the analytical pipeline and user segmentation capabilities using AWS S3, AWS Lambda, and EC2.
- Implemented data management and visualization with AWS SQS, S3, DynamoDB, Python, Pandas, and Bokeh.
- Developed a general motion analysis over triaxial accelerometer, gyroscope, and magnetometer data, in addition to GPS and video.
- Proactively researched sports analytics, documentation management, scrum integration, and agile methodologies.
Technologies: Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, JavaScript, Amazon Web Services (AWS), Django
Data Scientist | Analyst
2017 - 2018
PredictX
- Took the initiative and improved sales forecasting capabilities by more than 20% as part of an MVP for a retail client with 700 points of sale (POS). Used tree-based and linear models with 40TB+ of external variables such as weather, events, and client-specific metrics.
- Drove business decisions by researching, testing, and integrating various regression and classification-based models using Python Scikit-learn, TensorFlow, and Keras.
- Led the implementation of end-to-end ETL processes using Python, MySQL, PostgreSQL, and KNIME.
- Applied association rule mining with Neo4j Graph data representations for product recommendations in retail. Replicated results in production and supported the transition of the research initiative to a new market-ready product.
- Developed an insurance algorithm for seismic and flood risk computation using MCMC.
- Delivered codebase improvements via the use of in-memory processing with Spark and Hadoop.
Technologies: Quantitative Modeling, Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Sentiment Analysis, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, KNIME, TensorFlow, Neo4j, MySQL, JavaScript
Research Assistant
2016 - 2017
University of Glasgow, Urban Big Data Centre
- Started with no knowledge of machine learning and coding and ended up building an eCommerce recommender system that relied on RNNs and collaborative filtering to predict user-product relevance.
- Learned C# from scratch and developed an Android app with Xamarin, which aimed to collect sensitive data from mobile devices. Developed the solution end-to-end (front end, back end, and documentation) and paired it with a MySQL database for storage.
- Manipulated high-dimensional datasets of 120GB+ for feature creation using Python Pandas, PostgreSQL, RDDs on Hadoop DFS, and Spark. Visualized the data using Tableau, Stata, and LaTeX.
- Reviewed, replicated, and analyzed a variety of state-of-the-art research papers about recommender systems, information retrieval, and distributed systems.
- Used GPU and parallel computing, along with Spark and Hadoop, to model 100GB+ datasets in a research environment on an on-premises cluster.
Technologies: Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Sentiment Analysis, Android, Machine Learning, Exploratory Data Analysis, Statistical Analysis, STATA, LaTeX, Spark, Hadoop, Xamarin, Java, C#
Assistant Brand Manager
2015 - 2015
Procter & Gamble
- Led a competitive analysis initiative across nine SEE regions.
- Co-led a team of 5-10 people to launch Pampers Premium Care’s biggest innovation in the past five years and a Pampers UNICEF PR campaign across four SEE regions.
- Identified pricing gaps and researched and presented viable solutions to increase the company’s competitiveness in four SEE regions.
Technologies: Data Analysis, Data Science, Data Analytics, Statistical Data Analysis, Analytics, Marketing, Branding, Management, Microsoft Excel, Microsoft PowerPoint
Co-founder
2014 - 2015
Crowd Augur
- Designed the project to harness video gamers’ actions for augmenting data analysis algorithms. Started in collaboration with five McGill-based bioinformatics and computer science researchers.
- Took the initiative to secure meetings with top executives, which led to several partnership agreements and four qualified clients from healthcare and finance.
- Proposed and developed a unique business model for bringing more accurate data analysis to genomics and finance.
- Ranked 5th out of 150+ in McGill University’s Dobson Startup Cup.
Technologies: Data Analysis, Data Science, Statistical Data Analysis, Leadership, Research, Analysis, Microsoft PowerPoint, Strategy, Business Strategy, Microsoft Excel
Assistant Manager
2010 - 2014
Maximal Group
- Proposed, built, and promoted the company’s first online store using SEO, Google AdWords, and Analytics. This initiative led to a fourfold increase in new customer acquisition and a 13% increase in sales in the first three months.
- Took the initiative to propose and coordinate a Kaizen/Lean-inspired waste reduction program that contributed to a 30% leftover reduction.
- Managed suppliers and negotiated bulk purchases, which led to a 5% reduction in raw material costs.
Technologies: Data Analysis, Data Science, Statistical Data Analysis, Technical Leadership, Management, Lean, Warehouses, Statistical Modeling, WordPress, CSS, HTML, Python 3