Page 109 - Code & Click - 8

Step 1: Define the Problem
            Before any data is touched, extracted, cleaned, or analysed, it is important to understand the
            underlying entity and the project at hand. What are the goals the company is trying to achieve by
            mining data? What is their current business situation?
            Step 2: Identify Required Data
            Once the business problem has been clearly defined, it’s time to start thinking about data. This
            includes what sources are available, how the data will be secured and stored, how information
            will be gathered, and what the final outcome or analysis may look like. This step also involves
            thinking critically about the limits on data, storage, security, and collection, and assessing
            how these constraints will affect the data mining process.

            Step 3: Prepare and Pre-process
            During this stage, data is gathered, uploaded, extracted, or calculated. It is then cleaned, standardised,
            scrubbed for outliers, assessed for mistakes, and checked for reasonableness.
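
As a rough illustration of this stage, the sketch below cleans a small list of text values and then removes outliers. The sample values, the function names, and the outlier rule (distance from the median, measured in median absolute deviations) are assumptions chosen for this example; real projects pick rules that suit their own data.

```python
from statistics import median

# Hypothetical raw records: monthly spend per customer, typed in by hand.
raw_spend = ["120", " 135 ", "n/a", "128", "", "131", "9000", "126"]

def clean(values):
    """Convert text to numbers, skipping blanks and non-numeric entries."""
    cleaned = []
    for text in values:
        try:
            cleaned.append(float(text.strip()))
        except ValueError:
            continue  # entries such as "n/a" or "" cannot be converted
    return cleaned

def remove_outliers(values, k=5):
    """Keep values within k median absolute deviations of the median.

    This is one possible rule among many, used here only as an example.
    """
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    if mad == 0:
        return list(values)  # no spread to measure against, keep everything
    return [v for v in values if abs(v - mid) <= k * mad]

numbers = clean(raw_spend)
usable = remove_outliers(numbers)
print(usable)
```

Here the entries "n/a" and the empty string are dropped during cleaning, and the value 9000 is removed as an outlier because it lies far from the other values.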
            Step 4: Model the Data
            Data scientists use several types of data mining techniques to search for relationships, trends,
            associations, or sequential patterns.
            Step 5: Train and Test
            The data-centred aspect of data mining concludes by assessing the findings of the data model(s).
            The outcomes from the analysis may be aggregated, interpreted, and presented to decision-makers.

            Step 6: Verify and Deploy
            The data mining process concludes with management taking steps in response to the findings of the
            analysis. The company may decide that the information was not strong enough, or that the findings
            were not relevant enough, to justify changing course. Alternatively, the company may make
            strategic decisions based on the findings.

            Data Warehousing
            Warehousing is an important aspect of data mining. Warehousing means centralising a company's
            data in one database or program. With a data warehouse, an organisation may spin off segments
            of the data for specific users to analyse and use.
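
The idea of spinning off a segment of centralised data can be sketched with Python's built-in sqlite3 module. The in-memory database below only stands in for a real warehouse, and the table name, view name, and sample rows are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the central warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "pens", 120.0),
        ("South", "pens", 80.0),
        ("North", "books", 200.0),
        ("North", "pens", 30.0),
    ],
)

# A view exposes only the segment that one group of users needs.
conn.execute(
    "CREATE VIEW north_sales AS "
    "SELECT product, amount FROM sales WHERE region = 'North'"
)

# Users of the segment query the view, not the full table.
north_totals = conn.execute(
    "SELECT product, SUM(amount) FROM north_sales "
    "GROUP BY product ORDER BY product"
).fetchall()
print(north_totals)
```

All the data lives in one place, while the view gives the northern sales team a smaller slice to analyse.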
            Cloud data warehouse solutions use the space and power of a cloud provider to store data from
            multiple data sources. This allows smaller companies to leverage digital solutions for storage,
            security, and analytics.


            Data Mining Techniques
            Data mining uses algorithms and various techniques to convert large collections of data into useful
            output. The most popular types of data mining techniques include:

               •  Association Rules: Also referred to as Market Basket Analysis, this technique searches for
                  relationships between variables. This relationship in itself creates additional value within the
                  data set as it strives to link pieces of data.
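
A minimal sketch of market basket analysis is shown below. It counts how often pairs of items are bought together and reports two common measures: support (the share of all baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). The baskets, the function name, and the 0.4 support threshold are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_rules(baskets, min_support=0.4):
    """Return {(antecedent, consequent): (support, confidence)} for item pairs."""
    total = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue  # pair is bought together too rarely to be of interest
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules

rules = pair_rules(baskets)
for (first, second), (support, confidence) in sorted(rules.items()):
    print(f"{first} -> {second}: support={support:.2f}, confidence={confidence:.2f}")
```

In this sample, every basket that contains butter also contains bread, so the rule "butter -> bread" has a confidence of 1.0, while "bread -> butter" has a confidence of 0.75.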

