I am a Data Scientist II at HealthNow NY in Buffalo, NY. My main interests include data mining, R programming, and problem solving using advanced analytical techniques from machine learning and applied statistics. I also enjoy developing software (e.g. R packages and Shiny web apps), exploring technologies and trends (e.g. AWS and Rcpp), and presenting data analysis in written, verbal, and visual form to large audiences. I’ve worked on a wide range of projects and provided extended support for teams in marketing, operations, sales, and healthcare. What really motivates me is the idea of identifying and solving big problems using my expertise in data science.
I began my career in 2015 doing investigative research for a small political news outlet in Columbia, SC. I mainly wrote R scripts to pull data from the web, perform statistical analyses, store data, and present findings. There was no one at the organization who could help me with this kind of work, so I worked hard and learned a lot quickly.
I then took a job as a Clinical Analyst in 2016. I worked on various investigative projects, applying data analysis to (relatively) large insurance claims data. I mostly did outlier analysis for fraudulent behavior detection and variations of A/B tests to measure impact. It was then that I learned about SQL, R programming with big data, Git, and professional communication.
In 2019, I became a Data Scientist. Since then, I’ve found myself doing quite a bit of text mining, software development, and classification modeling. I’ve been very interested recently in development technologies, web applications, high(er) performance computing, and time series forecasting.
Over the last 5 years, I’ve developed expertise in the R programming language. In my day-to-day work, I use R for database connections, HTTP requests, general-purpose programming, machine learning, text mining, data visualization, reproducible research, web app development, and package authoring.
In addition to R and the R ecosystem, I regularly use Git, GitLab, SQL Server, SQLite, and Jira. Occasionally, I use Python, Alteryx, and Tableau.
Breadth Versus Depth
One of the main drivers behind my career success has been my breadth-first approach. You could think of me as a “generalist.” Throughout my career, my focus has been on solving a wide range of problems for a wide range of audiences. For this reason, I forced myself to become skillful in many domains related to data analysis, statistics, machine learning, data visualization, programming, software development, presenting, and project management. Being skillful in many areas has helped me become a very fast learner.
Though I am more of a generalist, I naturally developed more expertise in some areas than others. R programming is one of those areas, as well as text mining, anomaly detection, and supervised classification.
Moving forward in my career, my focus is on gaining experience with production deployment strategies, application development for analytics, and high-performance computing. These skills, combined with my expertise in data analysis, will help me grow as a leader in the field of data science.