Page 114 - Code & Click - 8

computations through the network is called Forward Propagation. The input and output layers of a
            deep neural network are called Visible layers.
            Another process, called Backpropagation, calculates the errors in the predictions and then, using
            an algorithm such as gradient descent, adjusts the weights and biases of the function by moving
            backwards through the layers in order to train the model. Together, forward propagation and
            backpropagation allow a neural network to make predictions and correct for any errors accordingly.
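
            As a rough illustration of these two passes, the following Python sketch trains a very small
            network with a single hidden unit. The network shape, the sigmoid activation, the squared-error
            loss, the learning rate, the toy data, and every function name are assumptions chosen for
            illustration; none of them come from the text above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Forward propagation: input -> hidden unit -> output."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)   # hidden layer activation
    y_hat = w2 * h + b2        # output layer (linear)
    return h, y_hat

def backward(x, y, h, y_hat, params):
    """Backpropagation: gradients of the loss 0.5 * (y_hat - y) ** 2,
    computed from the output layer back towards the input."""
    w1, b1, w2, b2 = params
    d_out = y_hat - y                      # error at the output layer
    grad_w2 = d_out * h
    grad_b2 = d_out
    d_hidden = d_out * w2 * h * (1.0 - h)  # error passed back to the hidden layer
    grad_w1 = d_hidden * x
    grad_b1 = d_hidden
    return grad_w1, grad_b1, grad_w2, grad_b2

def train(data, params, lr=0.1, epochs=2000):
    """Gradient descent: move each weight and bias against its gradient."""
    for _ in range(epochs):
        for x, y in data:
            h, y_hat = forward(x, params)
            grads = backward(x, y, h, y_hat, params)
            params = tuple(p - lr * g for p, g in zip(params, grads))
    return params

def mean_loss(data, params):
    return sum(0.5 * (forward(x, params)[1] - y) ** 2 for x, y in data) / len(data)

# Toy data (assumed): the network should learn to map 0 -> 0 and 1 -> 1.
data = [(0.0, 0.0), (1.0, 1.0)]
initial = (0.5, 0.0, 0.5, 0.0)  # w1, b1, w2, b2
trained = train(data, initial)

print("loss before training:", mean_loss(data, initial))
print("loss after training: ", mean_loss(data, trained))
```

            Here forward() makes the prediction, backward() works out how much each weight and bias
            contributed to the error, and train() repeats both steps so that the error shrinks over time.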

            DATA SCIENCE AND DATA SCIENTISTS
            Data science combines math and statistics, specialised programming, advanced analytics, artificial
            intelligence, and machine learning with specific subject matter expertise to uncover actionable
            insights hidden in an organisation’s data. These insights can be used to guide decision-making and
            strategic planning.
            As a result, it is no surprise that the role of the data scientist is one of the most coveted jobs
            worldwide. Organisations are increasingly reliant on them to interpret data and provide actionable
            recommendations to improve business outcomes.

            The Data Science Lifecycle
            The Data Science Lifecycle involves various roles, tools, and processes, which enable analysts to
            glean actionable insights. Typically, a data science project undergoes the following stages:
              1.  Data Ingestion: The lifecycle begins with data collection. Both raw structured and
                  unstructured data are gathered from all relevant sources using a variety of methods.
                  Data sources can include structured data, such as customer data, along with unstructured
                  data, such as log files, video, audio, pictures, the Internet of Things (IoT), and
                  social media.

              2.  Data Storage and Data Processing: Since data can have different formats and structures,
                  companies need to consider different storage systems based on the type of data that
                  needs to be captured. This stage includes cleaning the data, removing duplicate data,
                  and transforming and combining the data using ETL (extract, transform, load) jobs or
                  other data integration technologies.

              3.  Data Analysis: Data scientists conduct an exploratory data analysis to examine biases, patterns,
                  ranges, and distributions of values within the data. This analysis also allows them to
                  determine the data’s relevance for use within modelling efforts for predictive analytics,
                  machine learning, and/or deep learning.

              4.  Communicate: Finally, insights are presented as reports and other data visualisations that
                  make the insights and their impact on business easier for business analysts and other
                  decision-makers to understand.

            [Figure: Data Science Lifecycle (Data Ingestion, Data Storage and Data Processing, Data Analysis,
            Communicate)]
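
            The cleaning step of stage 2 and the exploratory analysis of stage 3 can be sketched in a few
            lines of Python. This is only a minimal illustration: the customer records, the field names
            (customer_id, city, spend), and the functions clean() and summarise() are all hypothetical, and
            a real project would normally use dedicated ETL or data analysis tools.

```python
from statistics import mean, median

# Hypothetical raw customer records, as they might arrive after ingestion.
raw_records = [
    {"customer_id": 1, "city": " Delhi ", "spend": "1200"},
    {"customer_id": 2, "city": "mumbai", "spend": "850"},
    {"customer_id": 1, "city": " Delhi ", "spend": "1200"},  # duplicate record
    {"customer_id": 3, "city": "Pune", "spend": ""},         # missing value
    {"customer_id": 4, "city": "MUMBAI", "spend": "2300"},
]

def clean(records):
    """Stage 2 (sketch): remove duplicates, drop rows with missing spend,
    and transform the remaining fields into a consistent format."""
    seen = set()
    cleaned = []
    for row in records:
        if row["customer_id"] in seen:
            continue                      # skip duplicate customers
        seen.add(row["customer_id"])
        if row["spend"] == "":
            continue                      # skip rows with no spend value
        cleaned.append({
            "customer_id": row["customer_id"],
            "city": row["city"].strip().title(),  # "mumbai" -> "Mumbai"
            "spend": float(row["spend"]),         # text -> number
        })
    return cleaned

def summarise(records):
    """Stage 3 (sketch): range and central values of the spend column."""
    spends = [r["spend"] for r in records]
    return {
        "count": len(spends),
        "min": min(spends),
        "max": max(spends),
        "mean": mean(spends),
        "median": median(spends),
    }

cleaned = clean(raw_records)
summary = summarise(cleaned)
print(cleaned)
print(summary)
```

            The summary produced at the end is the kind of result that stage 4 would then present as a
            report or a chart for decision-makers.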

            Role of Data Scientist
            Data science is considered a discipline, while data scientists are the practitioners within that field.
            Data scientists are not necessarily directly responsible for all the processes involved in the data
            science lifecycle. It’s common for a data scientist to partner with machine learning engineers to scale
            machine learning models.
            In short, a data scientist must be able to:
               •   Know enough about the business to ask pertinent questions and identify business pain points.
               •   Use a wide range of tools and techniques for preparing and extracting data.

