It has been a stressful week, and the primary source of the pain is understanding how the reliability and stability of code can make or break a process, even in data science. At the end of last week, one of our internal tools began experiencing difficulties. The problems started to stack one on top of another, in a cascading effect. Once I fixed one bug, another few would follow in what felt like a never-ending game of whack-a-mole.
In software and data science, the code we write often ends up inside an application. As that code is developed, it should be reliable and stable so that it doesn’t break the application. Reliability means being able to trust a process to perform consistently, while stability means resisting change and not giving way when change happens. But why are these two concepts critical to our code? …
When I started my job in 2019, I joined the team alongside another developer to clean up the codebase and stabilize it. The first three months consisted of the planning phase in which we read the code, determined the architecture moving forward, and set up the infrastructure that our team now runs on. Looking back, it was a vast undertaking that taught me so much about automation, code development, and collaboration. But the biggest lesson learned from this experience was the importance of clean code.
One resource I used heavily during this project was a software book that I reference often. If you haven’t read it yet, Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin is a great book. I’ve spoken about this book previously in an article discussing my top 3 recommendations for data scientists, and I still stand by it. Clean Code is a perfect book for anyone who writes code. This book helped me understand the codebase areas that could be better as I worked with my teammate to refactor 30+ repositories down to one software library. Our goal was to create a maintainable software library that was readable to allow any data scientist on the team to pick up the code and understand it. …
Another common question I see among data scientists is, “how do I transition into becoming a senior data scientist?” The status of senior data scientist appears to be very subjective and varies from company to company. There seems to be a wide variety of answers that detail what it means to be a senior data scientist. Here, I review and explore the title, as discussed by PayScale, Cleverism, Zippia, and KDnuggets.
Looking into what PayScale defines as a senior data scientist, it is clear that they expect Machine Learning, Python, and Big Data Analytics skills. These three areas correlate to individuals who are above average on the pay scale. …
If you’ve been following my journey, you will have noticed this isn’t my first time sharing my mentoring experiences. Last year I began to mentor younger individuals interested in data science or software engineering. Most engagements have been 100% remote, consisting of video calls and emails. Developing a trusted relationship in which you feel comfortable talking with someone about your career and life can be hard when you have not met the person face to face. As I began to mentor, I found it a balancing act: enough communication to keep the engagement going, but not so much that it overwhelms either party. These past seven months have taught me many lessons on developing such a relationship and producing a positive outcome. …
At the end of every week, I host a code review meeting. For 30 minutes, a small group of data scientists gets together to look at any open pull requests for our libraries. When my team first started opening pull requests and reviewing code, it felt like a chore. The reviews began well but soon turned into one person nagging the rest of the team to finish their reviews. As more and more pull requests started to come in, our team decided to host weekly code review meetings instead. These meetings have changed how we look at code. …
Let the code and automation work for you. If you often find yourself writing similar code over and over again, then it’s time to consider automating the task or analysis. Maybe you’re starting up similar-looking analyses to run overnight or comparing model results at each run with different input parameters, but are you doing these tasks effectively? I commonly look to improve upon two areas in my daily work: (1) standard code checks using CI/CD, and (2) performing analyses using automated jobs.
Working in data science, I have found that I am often writing code in many different notebooks that get shared with different data scientists across the team. If the tasks I am doing begin to seem repetitive, or others are doing something very similar, then I take a step back and look at the process. I look to see if I can utilize a CI/CD pipeline, a script, or an automated job to aid in the work, so I am doing more by doing less. …
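As a rough illustration of the second area, here is a minimal sketch of turning a repeated "run the model with different input parameters and compare" task into one small script. Everything named here is hypothetical: `run_analysis` is a made-up stand-in for a real model run, and the parameter names `learning_rate` and `depth` are assumptions for the example, not taken from any particular project.

```python
from itertools import product


def run_analysis(learning_rate, depth):
    """Hypothetical stand-in for one model run; returns a made-up score."""
    return round(1.0 - abs(learning_rate - 0.1) - 0.01 * abs(depth - 4), 4)


def sweep(param_grid, analysis):
    """Run `analysis` once per parameter combination and collect the results."""
    names = list(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append({**params, "score": analysis(**params)})
    # Sort best score first so the runs are easy to compare.
    return sorted(results, key=lambda row: row["score"], reverse=True)


if __name__ == "__main__":
    # Assumed example grid; swap in whatever parameters the real analysis takes.
    grid = {"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4]}
    for row in sweep(grid, run_analysis):
        print(row)
```

A script like this could then be handed to a scheduler or an automated job to run overnight, instead of re-running notebook cells by hand for each parameter combination.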
In 2019, I began working as a senior data scientist, and by 2020 I became a team lead for my group, Tools and Machine Learning Platforms. The transition to becoming a team lead was an exciting one, but many unexpected changes came. Before this position, I was not aware of what it meant to be a team lead or a product owner.
Transitioning from an individual contributor to a team lead and product owner came with many other tasks that needed to get done. These tasks can range from planning out work and road mapping to helping team members with issues as they arise. As I made the transition, I realized it is not easy to shift away from development work. I try to split my time 50% development and 50% everything else, but it doesn’t always work that way. …
When I started my career, I was the only woman engineer on my team and in the company where I was working. I was studying for my Master’s in Data Science while working in DevOps on servers and developing dashboards. The room I sat in consisted of almost 30 people; all men except me. As I learned, that environment was not one I could thrive and grow in. It was an environment that would keep me hidden and overworked, and stall my career growth. …
Having a mentor in data science can be beneficial to progress your career and your learning. When I began my role as a data scientist, I adopted two mentors. One was a senior person who had been there since the team’s conception, and the other was a technical fellow who was looking to mentor younger women. The first 6 to 8 months of my job, I spent learning from my mentors. I asked them tons of questions, threw ideas at them, and tried to understand what I wanted for my career progression. This past year, I have learned a lot from these two individuals about my career and learning. …
Job hunting is hard, especially if it is your first time. After a few jobs and several hundred applications, I wanted to share my experiences with you. Here I will discuss five lessons learned through job hunting and tips to help you in your search.
If you are looking for a job working with data, the job title can vary significantly: data analyst, data scientist, machine learning engineer, data engineer, and more. There are so many jobs that allow you to work with data. Focusing solely on job title can make it easy to miss opportunities that are out there. …