Vanessa Sabino started her career as a system analyst in 2000, and in 2010 she jumped at the opportunity to start working with Digital Analytics, which brought together her educational background in Business, Applied Mathematics, and Computer Science. She gained experience from Internet companies in Brazil before moving to Canada, where she is now a data analysis lead for Shopify, transforming data into Marketing insights.
Follow Peadar’s series of interviews with data scientists here.
What projects have you worked on that you wish you could go back to and do better?
Working as practitioner in a company, as opposed to consulting, means I always have the option of going back and improving past projects, as long as the time spent on this task can be justified. There are always new ideas to try and new libraries being published, so as a team lead I try to balance the time spent on higher priority tasks, which for my team currently is ETL work to improve our data warehouse, with exploratory analysis of our data sets and creating and improving models that add value to our business users.
What advice do you have to younger analytics professionals and in particular PhD students in the Sciences?
My advice is to not underestimate the importance of communication skills, which goes from listening, in order to understand exactly what the data means and the context in which it is used, to presenting your results in a way that demonstrates impact and resonates with your audience.
What do you wish you knew earlier about being a data scientist?
I wish I knew 20 years ago how to be a data scientist! When I was finishing high school and I had to decide what to do in university, I had some interest in Computer Science, but I had no idea what a career in that area would be like. The World Wide Web was just starting, and living in Brazil, I had the impression that all software developing companies were north of the Equator. So I decided to study Business, imagining I’d be able to spend my days using spreadsheets to optimize things. During the course I learned about data warehouses, business intelligence, statistics, data mining and decision science, but when it was over it was not clear how to get a job where I could apply this knowledge. I went to work on a IT consulting company, where I had the opportunity to improve my software developing skills, but I missed working with numbers, so after two years I left to start a new undergrad in Applied Mathematics, followed by a Masters in Computer Science. Then I continued working as a software developer, now in web companies, and that’s when I started learning about the vast amount of online behavior data they were collecting and the techniques being used to leverage its potential. “Data scientist” is a new name for something that covers many different traditional roles, and a better understanding of the related terms would have allowed me to make this career move sooner.
How do you respond when you hear the phrase ‘big data’?
I prefer to work closer to data analysis than to data engineering, so in an ideal world I’d have a small data set with a level of detail just right to summarize everything that I can extract from that data. Whatever size the data is, if someone is calling it big data it probably means that the tool they are using to manipulate it is no longer meeting certain expectations, and they are struggling with the technology in order to get their job done. I find it a little frustrating when you write correct code that should be able to transform a certain input to the desired output, but things don’t work as expected due to a lack of computing resources, which means you have to do extra work to get what you want. And the new solution only lasts until your data outgrows it again. But that’s just the way it is, and being in the boundary of what you can handle means you’ll be learning and growing in order to overcome the next challenges.
What is the most exciting thing about your field?
I’m excited about the opportunities to collaborate in a wide range of projects. Nowadays everyone wants to improve things with data informed decisions, so you get to apply your skills to many areas and you learn a lot in the process.
How do you go about framing a data problem – in particular, how do you avoid spending too long, how do you manage expectations etc. How do you know what is good enough?
I always like to start with simple proof of concepts and iterate from there, using feedback from stakeholders to identify where are the biggest gains so that I can pivot the project in the right direction. But the most important thing in this process is to constantly ask “why”, in particular when dealing with requests. This helps you validate the understanding of the problem and enables you to offer better alternatives that the business user might not be aware of when they make a request.
[bctt tweet=”The most important thing in this process is to constantly ask “why”’ – @bani #datascience”]
(image credit: Britany G, CC2.0)