Basic Statistics: At least a basic understanding of statistics is vital as a data scientist. An interviewer once told me that many of the people he interviewed couldn't even provide the correct definition of a p-value. You should be familiar with statistical tests, distributions, maximum likelihood estimators, etc. Think back to your basic stats class! As with machine learning, one of the more important aspects of your statistics knowledge will be understanding when different techniques are (or aren't) a valid approach. Statistics is important at all company types, but especially at data-driven companies where the product itself is not data-focused and stakeholders will depend on your help to make decisions and to design and evaluate experiments.

Machine Learning: If you're at a large company with huge amounts of data, or at a company where the product itself is especially data-driven, you'll probably want to be familiar with machine learning methods. This can mean things like k-nearest neighbors, random forests, ensemble methods: all of the machine learning buzzwords. It's true that a lot of these techniques can be implemented using R or Python libraries, so it's not necessarily a dealbreaker if you're not the world's leading expert on how the algorithms work. What matters more is understanding the broad strokes and knowing when it is appropriate to use each technique.
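
Since the p-value definition comes up above as something candidates often get wrong, here is a rough sketch that makes it concrete. A p-value is the probability, assuming the null hypothesis is true, of seeing a result at least as extreme as the one you actually observed. The example estimates that probability with a permutation test in plain Python. The function name `permutation_p_value`, the sample numbers, and the choice of a permutation test are all illustrative assumptions of this sketch, not something prescribed by the text.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    Null hypothesis: the group labels are irrelevant, so any reshuffling
    of the pooled values is as likely as the split we observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction: counts the observed split itself as one
    # permutation, so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up measurements for two hypothetical experiment arms.
control = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
variant = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2]

p = permutation_p_value(control, variant)
print(f"estimated p-value: {p:.4f}")
```

A small value here means that random relabeling rarely produces a gap as large as the observed one. It does not mean the null hypothesis has that probability of being true, which is the confusion the interviewer's anecdote points at.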