Page 153 - Computer - 8
Step 3: Prepare and Pre-process
During this stage, data is gathered, uploaded, extracted, or calculated. It is then cleaned, standardised,
scrubbed for outliers, assessed for mistakes, and checked for reasonableness. The data may also be
checked for size, as an excessively large collection of information may unnecessarily slow
computations and analysis.
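The cleaning, outlier-scrubbing, and standardising described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `preprocess`, the sample readings, and the cut-off of 2 standard deviations are assumptions made for the example, not part of any prescribed method.

```python
from statistics import mean, stdev

def preprocess(values, z_cutoff=2.0):
    """Clean a list of raw readings, drop outliers, and standardise the rest."""
    # Cleaning: skip missing entries and anything that cannot be read as a number.
    cleaned = []
    for v in values:
        try:
            cleaned.append(float(v))
        except (TypeError, ValueError):
            continue
    # Outlier scrubbing: keep values within z_cutoff standard deviations of the mean.
    mu, sigma = mean(cleaned), stdev(cleaned)
    kept = [v for v in cleaned if sigma == 0 or abs(v - mu) / sigma <= z_cutoff]
    # Standardising: rescale the kept values to zero mean and unit standard deviation.
    mu2, sigma2 = mean(kept), stdev(kept)
    return [(v - mu2) / sigma2 if sigma2 else 0.0 for v in kept]

# Hypothetical raw readings with a missing value, a text entry, and one extreme value.
raw = [12, "15", None, 14, "n/a", 13, 500, 16, 11, 14, 15, 13]
prepared = preprocess(raw)
```

Here the missing and non-numeric entries are skipped, the reading of 500 is removed as an outlier, and the remaining nine values are rescaled. In practice the cut-off would be chosen to suit the data.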
Step 4: Model the Data
Data scientists use several types of data mining techniques to search for relationships, trends,
associations, or sequential patterns. The data may also be fed into predictive models to assess how
previous information may translate into future outcomes.
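As a simple illustration of a predictive model, the sketch below fits a straight line to past figures by least squares and extends it one period ahead. The monthly sales figures and the helper name `fit_line` are made up for the example; real models are usually more elaborate.

```python
def fit_line(xs, ys):
    """Return the slope and intercept of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales recorded for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
# Use the fitted trend to estimate sales for month 6.
forecast_month_6 = slope * 6 + intercept
```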
Step 5: Train and Test
The data-centered aspect of data mining concludes by assessing the findings of the data model(s). The
outcomes from the analysis may be aggregated, interpreted, and presented to decision-makers. In this
step, organisations can choose to make decisions based on the findings.
Step 6: Verify and Deploy
The data mining process concludes with management taking steps in response to the findings of the
analysis. The company may decide that the information was not strong enough, or that the findings
were not relevant enough, to justify a change of course. Alternatively, the company may make strategic
decisions based on the findings. In either case, management reviews the ultimate impact on the
business and starts new data mining cycles by identifying new business problems or opportunities.
Data Warehousing and Mining Software
Data mining programs analyse relationships and patterns in data based on what users request. To
illustrate this, imagine a restaurant that wants to use data mining to determine when it should offer
certain specials. The data miner looks at the information collected about customers and creates classes
based on when customers visit and what they order.
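The restaurant example can be sketched as follows. The order log, the time-of-day boundaries, and the function names are all hypothetical; the sketch only shows the idea of grouping visits into classes and finding the most-ordered item in each class.

```python
from collections import Counter, defaultdict

# Hypothetical order log: (hour of visit on a 24-hour clock, item ordered).
orders = [
    (12, "soup"), (12, "sandwich"), (13, "soup"), (13, "soup"),
    (19, "pasta"), (19, "pasta"), (20, "steak"), (20, "pasta"),
    (15, "coffee"),
]

def visit_class(hour):
    """Assign a visit to an illustrative time-of-day class."""
    if 11 <= hour < 15:
        return "lunch"
    if 18 <= hour < 22:
        return "dinner"
    return "off-peak"

def top_item_per_class(order_log):
    """Count the items ordered in each class and return the most common one."""
    counts = defaultdict(Counter)
    for hour, item in order_log:
        counts[visit_class(hour)][item] += 1
    return {cls: c.most_common(1)[0][0] for cls, c in counts.items()}

specials = top_item_per_class(orders)
```

With this sample log, soup is the most-ordered item at lunch and pasta at dinner, which could guide the restaurant in deciding which specials to offer and when.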
In other cases, data miners find clusters of information based on logical relationships or look at
associations and sequential patterns to draw conclusions about trends in consumer behaviour.
Warehousing is an important aspect of data mining. Warehousing means centralising a company's
data in one database or program. With a data warehouse, an organisation may spin off segments of
the data for specific users to analyse and use. In other cases, analysts may start with the data
they want and create a data warehouse based on those specifications.
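A very small sketch of this idea, using Python's built-in `sqlite3` module, is given below. The two source lists, the table name `sales`, and the view name `online_sales` are assumptions made for illustration; an actual data warehouse is far larger and is normally built with dedicated software.

```python
import sqlite3

# Hypothetical source systems, represented here as plain lists of rows.
pos_sales = [("2024-01-05", "north", 120.0), ("2024-01-06", "south", 80.0)]
web_sales = [("2024-01-05", "online", 200.0), ("2024-01-07", "online", 50.0)]

# Centralise both sources in one in-memory database.
wh = sqlite3.connect(":memory:")
wh.execute(
    "CREATE TABLE sales (sale_date TEXT, channel TEXT, amount REAL, source TEXT)"
)
wh.executemany("INSERT INTO sales VALUES (?, ?, ?, 'pos')", pos_sales)
wh.executemany("INSERT INTO sales VALUES (?, ?, ?, 'web')", web_sales)

# Spin off a segment of the data for one group of users as a view.
wh.execute(
    "CREATE VIEW online_sales AS "
    "SELECT sale_date, amount FROM sales WHERE source = 'web'"
)

total_all = wh.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
online_rows = wh.execute(
    "SELECT sale_date, amount FROM online_sales ORDER BY sale_date"
).fetchall()
```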
Cloud data warehouse solutions use the space and power of a cloud provider to store data from
multiple data sources. This allows smaller companies to leverage digital solutions for storage, security,
and analytics.
Data Mining Techniques
Data mining uses algorithms and various techniques to convert large collections of data into useful
output. The most popular types of data mining techniques include:
• Association Rules: Also referred to as Market Basket Analysis, this technique searches for
relationships between variables. Linking pieces of data in this way adds value to the data set
itself. For example, association rules would search a company’s sales history to see which
products are most commonly purchased together; with this information, stores can plan,
promote, and forecast accordingly.
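
The association rules technique described above can be sketched for pairs of products. The baskets, the function name `pair_rules`, and the minimum support of 0.3 are assumptions for this example. Support is the share of all baskets that contain both items, and confidence is the share of baskets containing the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales history: each set is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]

def pair_rules(baskets, min_support=0.3):
    """Return (if_item, then_item, support, confidence) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is bought together too rarely to be of interest
        # Each frequent pair gives two rules, one in each direction.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Strongest rules first: by confidence, then support, then item names.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))

rules = pair_rules(transactions)
```

In this sample, every basket containing butter also contains bread, so the rule "butter, then bread" has the highest confidence.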