Page 114 - Code & Click - 8
computations through the network is called Forward Propagation. The input and output layers of a
deep neural network are called Visible layers.
Another process, called Backpropagation, calculates the error in the network's predictions and then
moves backwards through the layers to work out how much each weight and bias contributed to that
error. An optimisation algorithm, such as gradient descent, uses this information to adjust the
weights and biases and so train the model. Together, forward propagation and backpropagation allow
a neural network to make predictions and correct for any errors accordingly.
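The two passes can be illustrated with a very small network. The following is only a minimal sketch, assuming a single hidden unit with a tanh activation, a squared-error loss, and plain gradient descent; the function names, the sample data, the starting weights, and the learning rate are all made up for illustration and do not come from any particular library.

```python
import math


def forward(x, params):
    """Forward propagation: input -> hidden unit (tanh) -> output."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)  # hidden layer activation
    y_hat = w2 * h + b2         # output layer (linear)
    return h, y_hat


def backward(x, y, h, y_hat, params):
    """Backpropagation: gradients of the squared error, output layer first."""
    _, _, w2, _ = params
    d_yhat = 2.0 * (y_hat - y)  # d(loss)/d(y_hat)
    d_w2 = d_yhat * h
    d_b2 = d_yhat
    d_h = d_yhat * w2           # error passed back to the hidden layer
    d_z = d_h * (1.0 - h * h)   # tanh'(z) = 1 - tanh(z)^2
    return [d_z * x, d_z, d_w2, d_b2]


def mean_loss(data, params):
    """Mean squared error of the network over the whole data set."""
    return sum((forward(x, params)[1] - y) ** 2 for x, y in data) / len(data)


def train(data, params, lr=0.05, epochs=200):
    """Gradient descent: repeat forward pass, backward pass, then update."""
    for _ in range(epochs):
        grads = [0.0] * 4
        for x, y in data:
            h, y_hat = forward(x, params)
            g = backward(x, y, h, y_hat, params)
            grads = [a + b / len(data) for a, b in zip(grads, g)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# Hypothetical training pairs (x, y), roughly following y = 0.8 * x.
data = [(-1.0, -0.8), (-0.5, -0.4), (0.0, 0.0), (0.5, 0.4), (1.0, 0.8)]
initial = [0.5, 0.0, 0.5, 0.0]  # w1, b1, w2, b2 (arbitrary starting values)

loss_before = mean_loss(data, initial)
trained = train(data, initial)
loss_after = mean_loss(data, trained)
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.4f}")
```

Here `backward` only computes the gradients, while the update inside `train` is the gradient descent step. Real networks apply the same idea to many units and layers at once.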
DATA SCIENCE AND DATA SCIENTISTS
Data science combines math and statistics, specialised programming, advanced analytics, artificial
intelligence, and machine learning with specific subject matter expertise to uncover actionable
insights hidden in an organisation’s data. These insights can be used to guide decision-making and
strategic planning.
As a result, it is no surprise that the role of the data scientist is one of the most coveted jobs
worldwide. Organisations increasingly rely on data scientists to interpret data and provide actionable
recommendations to improve business outcomes.
The Data Science Lifecycle
The Data Science Lifecycle involves various roles, tools, and processes, which enable analysts to
glean actionable insights. Typically, a data science project undergoes the following stages:
1. Data Ingestion: The lifecycle begins with data collection, gathering both raw structured and
unstructured data from all relevant sources using a variety of methods. Data sources can
include structured data, such as customer data, along with unstructured data, such as log files,
video, audio, pictures, the Internet of Things (IoT), and social media.
2. Data Storage and Data Processing: Since data can have different formats and structures,
companies need to consider different storage systems based on the type of data that needs to be
captured. This stage includes cleaning data, removing duplicate data, and transforming and
combining the data using ETL (extract, transform, load) jobs or other data integration technologies.
3. Data Analysis: Data scientists conduct an exploratory data analysis to examine biases, patterns,
ranges, and distributions of values within the data. This analysis also allows them to determine
the data’s relevance for use within modelling efforts for predictive analytics, machine learning,
and/or deep learning.
4. Communicate: Finally, insights are presented as reports and other data visualisations that
make the insights and their impact on business easier for business analysts and other decision-
makers to understand.
[Figure: Data Science Lifecycle, showing the stages Data Ingestion, Data Storage and Data
Processing, Data Analysis, and Communicate]
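The four stages above can be sketched end to end in a few lines of code. This is only an illustrative sketch using the Python standard library; the customer table, the log lines, and all function names are hypothetical, and a real project would normally rely on dedicated storage systems and data integration tools rather than in-memory strings.

```python
import csv
import io
import statistics

# Hypothetical structured data (note the duplicate row and the missing value).
CUSTOMER_CSV = """customer_id,region,monthly_spend
1,North,120.50
2,South,80.00
2,South,80.00
3,North,
4,East,200.25
"""

# Hypothetical unstructured data: raw log lines.
LOG_LINES = [
    "2024-01-05 customer=1 action=login",
    "2024-01-05 customer=4 action=purchase",
    "2024-01-06 customer=1 action=purchase",
]


def ingest():
    """Stage 1: collect structured rows and unstructured log lines."""
    rows = list(csv.DictReader(io.StringIO(CUSTOMER_CSV)))
    return rows, list(LOG_LINES)


def process(rows, logs):
    """Stage 2: clean, de-duplicate, transform, and combine (a tiny ETL step)."""
    purchases = {}
    for line in logs:
        fields = dict(part.split("=") for part in line.split()[1:])
        if fields["action"] == "purchase":
            cid = int(fields["customer"])
            purchases[cid] = purchases.get(cid, 0) + 1

    seen, cleaned = set(), []
    for row in rows:
        cid = int(row["customer_id"])
        if cid in seen or not row["monthly_spend"]:
            continue  # drop duplicates and rows with a missing spend value
        seen.add(cid)
        cleaned.append({
            "customer_id": cid,
            "region": row["region"],
            "monthly_spend": float(row["monthly_spend"]),
            "purchases": purchases.get(cid, 0),
        })
    return cleaned


def analyse(records):
    """Stage 3: a very small exploratory analysis of ranges and averages."""
    spend = [r["monthly_spend"] for r in records]
    return {
        "count": len(spend),
        "min": min(spend),
        "max": max(spend),
        "mean": statistics.mean(spend),
    }


def communicate(summary):
    """Stage 4: present the insights as a short plain-text report."""
    return (
        f"Customers analysed: {summary['count']}\n"
        f"Monthly spend range: {summary['min']:.2f} to {summary['max']:.2f}\n"
        f"Average monthly spend: {summary['mean']:.2f}"
    )


rows, logs = ingest()
records = process(rows, logs)
summary = analyse(records)
report = communicate(summary)
print(report)
```

Each function stands in for one stage, so the output of one stage is the input of the next, which mirrors the order of the lifecycle.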
Role of the Data Scientist
Data science is considered a discipline, while data scientists are the practitioners within that field.
Data scientists are not necessarily directly responsible for all the processes involved in the data
science lifecycle. It’s common for a data scientist to partner with machine learning engineers to scale
machine learning models.
In short, a data scientist must be able to:
• Know enough about the business to ask pertinent questions and identify business pain points.
• Use a wide range of tools and techniques for preparing and extracting data.