An Introduction to Data Science

Anjali Pal
3 min read · Jul 24, 2020

Data science is a combination of statistics, mathematics, computer science and domain knowledge. It is an interdisciplinary field that uses algorithms, strategies and technical skills to extract information and draw previously unknown insights from structured or raw data. "Data science" is an umbrella term for framing a problem, collecting data, processing it, analysing it, interpreting the results and communicating the insights to stakeholders. These insights may then be used to make new policies, optimize business decisions, frame new strategies, evaluate earlier ones, discover patterns and more.

Steps involved in any data science project are:

Note: Data mining originally meant obtaining data and extracting useful information from it. Some people also use the term for just obtaining data, for example by scraping.

Never confuse data mining and data science.

Data mining is a technique, while data science is an area of study. Data science includes data mining. Data mining means detecting previously unknown trends in data. Apart from trends, it also includes predictive modelling and infographics.

The relation between Data Science, Data Analytics, Machine Learning and Artificial Intelligence

Artificial Intelligence is a field of computer science concerned with making machines simulate human intelligence, often by training them on data. Self-driving cars, obstacle detection in robots, Alexa, Google Assistant, Netflix’s recommendation engine etc. are all marvellous examples of applications of AI.

Now, to train the machines mentioned above, we need algorithms and data for them to be trained on. This data may be collected, for example, by IoT devices (sensors attached to the machine), and the algorithms are written to learn from it. This process is called Machine Learning and such algorithms are called ML algorithms. There are many ways to build one: for example, a neural network trained with backpropagation of error, with its parameters updated by Stochastic Gradient Descent.
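To make the idea of "learning from data" concrete, here is a minimal sketch of stochastic gradient descent fitting a straight line. It is an illustration only, not a production method: the function name sgd_fit_line, the learning rate, the number of epochs and the "sensor readings" are all assumptions invented for this example.

```python
import random


def sgd_fit_line(points, learning_rate=0.01, epochs=500, seed=0):
    """Fit y = w * x + b to (x, y) pairs with stochastic gradient descent.

    Hypothetical helper written for this article; the defaults are
    illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a new random order each epoch
        for x, y in data:
            error = (w * x + b) - y  # prediction minus target
            # Gradients of the squared error 0.5 * error**2 w.r.t. w and b
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


# Made-up readings that follow y = 2x + 1 exactly (assumed data).
readings = [(x, 2 * x + 1) for x in range(10)]
w, b = sgd_fit_line(readings)
print(f"learned slope={w:.3f}, intercept={b:.3f}")
```

With this noise-free data the learned slope and intercept should land close to 2 and 1. Real data is noisy, so the fit would only be approximate and the learning rate would usually need tuning.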

Deep learning is the use of neural networks with many layers. It is one of the most powerful machine learning techniques: it can learn from large, diverse data sets and often produces the best results on tasks such as image and speech recognition.

Data Analytics is a step involved in the data science process. Data analytics mostly involves using statistical knowledge to understand data and draw inferences from it.
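As a small illustration of the statistical side of analytics, the sketch below summarizes a list of numbers using Python's standard statistics module. The summarize helper and the daily_orders figures are hypothetical, made up purely for this example.

```python
import statistics


def summarize(values):
    """Return a few basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration; real analyses usually go further.
    """
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Made-up daily order counts; the last value is a deliberate outlier.
daily_orders = [12, 15, 11, 19, 14, 13, 40]
summary = summarize(daily_orders)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing the mean with the median is a simple way to spot skew: here the single large value pulls the mean above the median, which is the kind of inference an analyst would flag before any modelling.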

Now, data science is a broad term. Data science includes extracting data and gathering insights from it. It might also use machine learning algorithms in modelling to make better predictions.

So, these terms are all interrelated, but one must always remember the differences. Data science is like a country in which ML, deep learning, data analytics and data mining are provinces. They may or may not share boundaries, but their names can’t be used interchangeably.

Written by Anjali Pal

A data science enthusiast who believes that “It is a capital mistake to theorize before one has data”- Sherlock Holmes. Visit me at https://anjali001.github.io