A Data Scientist requires expertise in several backgrounds:
- Machine Learning
- Statistics
- Programming (Python or R)
- Mathematics
- Databases
A Data Scientist must find patterns within the data. Before he/she can find the patterns, he/she must organize the data in a standard format.
Here is how a Data Scientist works:
- Ask the right questions - To understand the business problem.
- Explore and collect data - From database, web logs, customer feedback, etc.
- Extract the data - Transform the data to a standardized format.
- Clean the data - Remove erroneous values from the data.
- Find and replace missing values - Check for missing values and replace them with a suitable value (e.g. an average value)