Page 153 - Computer - 8

Step 3: Prepare and Pre-process

          During this stage, data is gathered, uploaded, extracted, or calculated. It is then cleaned, standardised,
          scrubbed for outliers, assessed for mistakes, and checked for reasonableness. The data may also be
          checked for size at this stage, as an excessively large collection of information may unnecessarily
          slow computations and analysis.
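
          The cleaning, outlier scrubbing, and standardising described above can be sketched in a few lines
          of Python. This is only an illustration under stated assumptions: the readings are made up, the
          function names are hypothetical, and the interquartile-range rule is one common way to flag
          outliers, not a method prescribed by this chapter.

```python
import math
from statistics import mean, pstdev, quantiles


def clean(raw_values):
    """Keep only entries that parse as finite numbers (drops None, text, NaN)."""
    cleaned = []
    for value in raw_values:
        try:
            number = float(value)
        except (TypeError, ValueError):
            continue  # missing or unparseable entry
        if math.isfinite(number):
            cleaned.append(number)
    return cleaned


def scrub_outliers(values, k=1.5):
    """Drop values lying more than k interquartile ranges outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


def standardise(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    centre = mean(values)
    scale = pstdev(values)
    if scale == 0:
        return [0.0 for _ in values]
    return [(v - centre) / scale for v in values]


# Hypothetical raw readings containing a missing value, a text entry and an outlier.
raw = ["12.5", "13.1", None, "n/a", "12.9", "250.0", "13.4", "12.7"]
kept = scrub_outliers(clean(raw))
z_scores = standardise(kept)
print(kept)
```

          A size check could be added in the same style, for example by sampling rows when the cleaned list
          exceeds a chosen limit.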

          Step 4: Model the Data
          Data scientists use several types of data mining techniques to search for relationships, trends,
          associations, or sequential patterns. The data may also be fed into predictive models to assess how
          previous information may translate into future outcomes.
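
          As a rough illustration of a predictive model, the sketch below fits a straight line to past
          values by least squares and extends it one period ahead. The monthly sales figures are invented
          and deliberately tidy, and the names fit_line and predict are hypothetical; real projects would
          normally use richer models.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: month number and units sold in that month.
months = [1, 2, 3, 4, 5, 6]
units_sold = [12, 14, 16, 18, 20, 22]

model = fit_line(months, units_sold)
forecast_month_7 = predict(model, 7)
print(forecast_month_7)
```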

          Step 5: Train and Test
          The data-centred aspect of data mining concludes by assessing the findings of the data model(s). The
          outcomes from the analysis may be aggregated, interpreted, and presented to decision-makers. In this
          step, organisations can choose to make decisions based on the findings.
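
          One common way to assess a model's findings is to hold back part of the data, build the model on
          the rest, and measure how well it does on the held-back part. The sketch below assumes a very
          simple rule (a single cut-off on one number) and invented customer records; the function names
          are hypothetical.

```python
import random


def split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def learn_threshold(train_rows):
    """Place the cut-off midway between the average value of each class."""
    positives = [x for x, label in train_rows if label]
    negatives = [x for x, label in train_rows if not label]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def accuracy(threshold, rows):
    """Share of rows where the rule 'value above threshold' matches the label."""
    correct = sum((x > threshold) == label for x, label in rows)
    return correct / len(rows)


# Hypothetical records: (amount spent, whether the customer came back).
records = [
    (10, False), (13, False), (16, False), (19, False),
    (50, True), (53, True), (56, True), (59, True),
]

train_rows, test_rows = split(records)
threshold = learn_threshold(train_rows)
test_accuracy = accuracy(threshold, test_rows)
print(test_accuracy)
```

          Because the two groups in this made-up data are far apart, the rule scores perfectly on the test
          rows; real data is rarely this clean, which is why the held-back check matters.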

          Step 6: Verify and Deploy
          The data mining process concludes with management taking steps in response to the findings of the
          analysis. The company may decide that the information is not strong enough, or the findings not
          relevant enough, to justify changing course. Alternatively, it may make strategic decisions based on
          the findings. In either case, management reviews the ultimate impact on the business and starts new
          data mining cycles by identifying new business problems or opportunities.

          Data Warehousing and Mining Software
          Data mining programs analyse relationships and patterns in data based on what users request. To
          illustrate this, imagine a restaurant that wants to use data mining to determine when it should offer
          certain specials. The data miner looks at the information collected about customers and creates classes
          based on when customers visit and what they order.
          In other cases, data miners find clusters of information based on logical relationships or look at
          associations and sequential patterns to draw conclusions about trends in consumer behaviour.
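
          The restaurant example can be sketched as follows: each visit is placed in a class according to
          the time of day, and the orders within each class are counted. The visit records, the hour
          boundaries, and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def daypart(hour):
    """Assign an hour of the day (0-23) to a meal period (assumed boundaries)."""
    if hour < 11:
        return "breakfast"
    if hour < 16:
        return "lunch"
    return "dinner"


def classify_visits(visits):
    """Count how often each item is ordered within each meal period."""
    orders_by_part = defaultdict(Counter)
    for hour, item in visits:
        orders_by_part[daypart(hour)][item] += 1
    return orders_by_part


# Hypothetical visit log: (hour of visit, item ordered).
visits = [
    (8, "coffee"), (9, "coffee"), (9, "pancakes"),
    (12, "burger"), (13, "burger"), (13, "salad"),
    (19, "pasta"), (20, "pasta"), (20, "burger"),
]

classes = classify_visits(visits)
top_items = {part: counts.most_common(1)[0][0] for part, counts in classes.items()}
print(top_items)
```

          With counts like these, the restaurant could, for instance, schedule a special for an item during
          the period in which it sells least.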
          Warehousing is an important aspect of data mining. Warehousing means centralising a company’s
          data in one database or program. With a data warehouse, an organisation may spin off segments of
          the data for specific users to analyse and use. In other cases, analysts may start with the data
          they want and create a data warehouse based on those specifications.
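
          A very small sketch of this idea, using Python's built-in sqlite3 module as a stand-in for a real
          warehouse: sales from two sources are loaded into one central table, and a view spins off the
          segment that one group of users needs. The table, view, and column names and the figures are
          hypothetical.

```python
import sqlite3


def build_warehouse(store_sales, online_sales):
    """Centralise two sales sources in one in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (source TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES ('store', ?, ?)", store_sales)
    conn.executemany("INSERT INTO sales VALUES ('online', ?, ?)", online_sales)
    # Spin off a segment: a view holding only what the northern team needs.
    conn.execute(
        "CREATE VIEW north_sales AS "
        "SELECT source, amount FROM sales WHERE region = 'north'"
    )
    conn.commit()
    return conn


# Hypothetical source data: (region, amount).
store_sales = [("north", 120.0), ("south", 80.0)]
online_sales = [("north", 45.5), ("south", 60.0)]

warehouse = build_warehouse(store_sales, online_sales)
total_rows = warehouse.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
north_total = warehouse.execute("SELECT SUM(amount) FROM north_sales").fetchone()[0]
print(total_rows, north_total)
```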
          Cloud data warehouse solutions use the space and power of a cloud provider to store data from
          multiple data sources. This allows smaller companies to leverage digital solutions for storage, security,
          and analytics.

          Data Mining Techniques
          Data mining uses algorithms and various techniques to convert large collections of data into useful
          output. The most popular types of data mining techniques include:
             •  Association Rules: Also referred to as Market Basket Analysis, this technique searches for
                relationships between variables. Linking pieces of data in this way adds value to the data
                set in itself. For example, association rules would search a company’s sales history to see
                which products are most commonly purchased together; with this information, stores can
                plan, promote, and forecast accordingly.
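
          A simple way to see association rules at work is to count how often pairs of products appear in
          the same basket. The sketch below computes, for each frequent pair, its support (the share of all
          baskets containing both items) and its confidence (the share of baskets with the first item that
          also contain the second). The baskets are invented, the function name is hypothetical, and only
          pairs are considered, which is a simplification of full market basket analysis.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return {(a, b): (support, confidence)} for rules 'a implies b' over item pairs."""
    total = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue  # pair bought together too rarely
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules


# Hypothetical sales history: one list of products per purchase.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter", "jam"],
]

rules = pair_rules(baskets)
for (first, second), (support, confidence) in sorted(rules.items()):
    print(f"{first} -> {second}: support {support:.2f}, confidence {confidence:.2f}")
```

          In this made-up history every basket with milk also contains bread, so the rule from milk to bread
          has the highest confidence; a store might use such a rule when deciding which products to promote
          together.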

