Skip Headers
Oracle® Retail Data Model Reference
10g Release 2 (10.2)

Part Number E10084-04
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
PDF · Mobi · ePub

6 Data Mining Models in Oracle Retail Data Model

This chapter provides reference information about the data mining models that are provided with Oracle Retail Data Model if you do choose to install the Data Mining Option. "About Data Mining in Oracle Retail Data Model" provides overview information. The rest of this chapter describes the following data mining models in more detail:

Each model topic provides the following types of information: a description of the model, examples of desired rules, a discussion of what the discovered rules explain, a discussion of what the model mines, a list of the target variables, a list of columns included in the target views, and sample reports.

Note:

For instructions on setting up and loading the data mining source, and executing the data mining models, see "Populating the Data Mining Component" in Oracle Retail Data Model Operations Guide.

About Data Mining in Oracle Retail Data Model

Oracle Retail Data Model includes data mining packages. The data mining portion of Oracle Retail Data Model consists of source tables that are populated by detail data for use by the data mining packages. This data is organized in a specific way to be compatible with the data mining modules so they can properly analyze and mine the data. Data mining packages pull in the source data and feed it into the data mining packages, and populate the target tables with the results. The data in the target tables are presented by the OBIEE reports.

Tip:

Changed or new data models are not supported by Oracle Retail Data Model. Consequently, do not change the data models that are defined and delivered with Oracle Retail Data Model, but, instead, copy a delivered data model to create a new one.

As outlined in Table 6-1, the Oracle Retail Data Model data mining models are of three types: Adaptive Bayes Network (ABN), Decision Tree (DT) and Apriori Association (APASS).

See also:

"Physical Data Model of the Data Mining Component" and "Data Mining ETL Packages" on page 5-28.

Associate Basket Analysis Model

This model addresses the business problem of building a profile of associates to explain their basket KPIs, such as Total Baskets, Average Basket Value, and other statistics.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Associate Basket Analysis Model Report

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
SALARY ELIGIBLITY is (N)
AND
EMPLOYEE_TYPE is (TEMPORARY)
THEN
NUMBER OF BASKETS IS THE HIGHEST

Desired Rules Example 2

IF
SALARY ELIGIBLITY is (N)
AND
EMPLOYEE_TYPE is (TEMPORARY)
THEN
NUMBER OF BASKETS IS THE LOWEST

What the Discovered Rules for the Associate Basket Analysis Model Report Explain

The discovered rules provide correlations between the basket KPIs and associate (employee) attributes.

What the Associate Basket Analysis Model Mines

This analysis identifies which key attributes of an associate influence his or her number of baskets sold, average basket value, and size. This model mines the various attributes of associates. It takes the binned variables one at a time for the Total Basket Count, Average Basket Value, and Average Basket Size as the target variable of an Adaptive Bayes Network (ABN) and Decision Tree (DT) model with a single feature and discovers rules described in terms of associate attributes.

Target Variables for the Associate Basket Analysis Model

The rules are designed to be generated monthly. Therefore, nine ABN and nine DT models are created every month across all the associates using the following variables as targets in this order:

  1. Total Basket Count Quartile (TBCQR)

  2. Total Basket Count Quintile (TBCQN)

  3. Total Basket Count Decile (TBCDE)

  4. Average Basket Value Quartile (ABVQR)

  5. Average Basket Value Quintile (ABVQN)

  6. Average Basket Value Decile (ABVDE)

  7. Average Basket Size Quartile (ABSQR)

  8. Average Basket Size Quintile (ABSQN)

  9. Average Basket Size Decile (ABSDE)

Note:

Associates are grouped into N-Tiles according to their sales performance figures.

Source Variables for the Associate Basket Analysis Model

The following attributes of associates are identified from the Oracle Retail Data Model data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Year

  • Month

  • Employee Id

  • Designation Name

  • Designation Title

  • Designation Level

  • Nationality

  • Gender

  • Marital Status

  • Age

  • Net Income

  • Demographics Code

  • Title

  • Total Months of Job

  • Employee Type

  • Correspondence Language

  • Disability Indicator

  • Rehire Recommendation Indicator

  • HR Based Salary Eligibility Indicator

  • Overtime Hours Salary Eligibility Indicator

  • Commission Eligibility Indicator

  • SPIFF Allowed Flag

  • Total Hours Worked

  • Total Overtime Hours

  • Total Basket Count Quartile (Target)

  • Total Basket Count Quintile (Target)

  • Total Basket Count Decile (Target)

  • Average Basket Value Quartile (Target)

  • Average Basket Value Quintile (Target)

  • Average Basket Value Decile (Target)

  • Average Basket Size Quartile (Target)

  • Average Basket Size Quintile (Target)

  • Average Basket Size Decile (Target)

Columns Included in the Target Views of the Associate Basket Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and are also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type

Associate Loss Analysis Model

This model addresses the business problem of correlating associate characteristics to shrink and theft.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Associate Loss Analysis Model Report

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
DESIGNATION is (TEMPORARY CASHIER)
AND
COMMISSION_ELIGIBILITY is (Y)
THEN
SHRINK TO SALES RATIO IS THE HIGHEST

Desired Rules Example 2

IF
DESIGNATION is (TEMPORARY CASHIER)
AND
COMMISSION_ELIGIBILITY is (N)
THEN
SHRINK TO SALES RATIO IS THE LOWEST

What the Discovered Rules of the Associate Loss Analysis Model Explain

The discovered rules explain the reasons for the shrink and theft associated with an Associate.

What the Associate Loss Analysis Model Mines

This model mines the Total Shrink Count, Total Shrink Amount, Shrink as a percentage of Sales, Total Theft Count, Total Theft Amount and Theft as a percentage of Sales of individual associates to identify which of their key attributes influence their shrinkage and theft.

This model takes the binned variables one at a time for Total Shrinkage and Theft Count or Value or percentage of Sales as the target variable of an Adaptive Bayes Network (ABN) and Decision Tree (DT) model and discovers rules described in terms of associate attributes.

Target Variables for the Associate Loss Analysis Model

The rule are designed to be generated monthly. Therefore, eighteen ABN models and eighteen DT models are created every month across all the associates using the following variables as targets in this order:

  1. Total Shrink Count Quartile (TSCQR)

  2. Total Shrink Count Quintile (TSCQN)

  3. Total Shrink Count Decile (TSCDE)

  4. Total Shrink Amount Quartile (TSAQR)

  5. Total Shrink Amount Quintile (TSAQN)

  6. Total Shrink Amount Decile (TSADE)

  7. Shrink as a percentage of Sales Quartile (STSQR)

  8. Shrink as a percentage of Sales Quintile (STSQN)

  9. Shrink as a percentage of Sales Decile (STSDE)

  10. Total Theft Count Quartile (TTCQR)

  11. Total Theft Count Quintile (TTCQN)

  12. Total Theft Count Decile (TTCDE)

  13. Total Theft Amount Quartile (TTAQR)

  14. Total Theft Amount Quintile (TTAQN)

  15. Total Theft Amount Decile (TTADE)

  16. Theft as a percentage of Sales Quartile (TTSQR)

  17. Theft as a percentage of Sales Quintile (TTSQN)

  18. Theft as a percentage of Sales Decile (TTSDE)

Source Variables for the Associate Loss Analysis Model

The following attributes of associates are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Year

  • Month

  • Employee Id

  • Designation Name

  • Designation Title

  • Designation Level

  • Nationality

  • Gender

  • Marital Status

  • Age

  • Net Income

  • Demographics Code

  • Title

  • Total Months of Job

  • Employee Type

  • Correspondence Language

  • Disability Indicator

  • Rehire Recommendation Indicator

  • HR Based Salary Eligibility Indicator

  • Overtime Hours Salary Eligibility Indicator

  • Commission Eligibility Indicator

  • SPIFF Allowed Flag

  • Total Hours Worked

  • Total Overtime Hours

  • Total Shrink Count Quartile (Target)

  • Total Shrink Count Quintile (Target)

  • Total Shrink Count Decile (Target)

  • Total Shrink Amount Quartile (Target)

  • Total Shrink Amount Quintile (Target)

  • Total Shrink Amount Decile (Target)

  • Shrink as a percentage of Sales Quartile (Target)

  • Shrink as a percentage of Sales Quintile (Target)

  • Shrink as a percentage of Sales Decile (Target)

  • Total Theft Count Quartile (Target)

  • Total Theft Count Quintile (Target)

  • Total Theft Count Decile (Target)

  • Total Theft Amount Quartile (Target)

  • Total Theft Amount Quintile (Target)

  • Total Theft Amount Decile (Target)

  • Theft as a percentage of Sales Quartile (Target)

  • Theft as a percentage of Sales Quintile (Target)

  • Theft as a percentage of Sales Decile (Target)

Columns Included in the Target Views of the Associate Loss Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

The Model Signature Target View outlines the attribute structure of the model (built using an ABN or DT algorithm). The Model Signature Target View contains the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type

Associate Sales Analysis Model

This model addresses the business problem of profiling associate characteristics to sales, cost, and profit patterns.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Associate Sales Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
ASSOCIATE IS NOT ELIGIBLE FOR SPIFF
AND
ASSOCIATE IS ELIGIBLE FOR SALARY
AND
ASSOCIATE IS NOT ELIGIBLE FOR COMMISSION
THEN
ASSOCIATE PROFIT IS THE LOWEST

Desired Rules Example 2

IF
ASSOCIATE IS ELIGIBLE FOR SPIFF
AND
ASSOCIATE IS NOT ELIGIBLE FOR SALARY
AND
ASSOCIATE IS NOT ELIGIBLE FOR COMMISSION
THEN
ASSOCIATE PROFIT IS THE HIGHEST

What the Discovered Rules of the Associate Sales Analysis Model Explain

The discovered rules provide correlations between associate (employee) characteristics and their sales, cost, and profit profiles.

What the Associate Sales Analysis Model Mines

This model mines the various attributes of associates. It takes the binned variables one at a time for Sales, Costs, and Profits as the target variable of an Adaptive Bayes Network (ABN) and Decision Tree (DT) model with a single feature and discovers rules described in terms of associate attributes.

Target Variables for the Associate Sales Analysis Model

The rules are designed to be generated monthly. Therefore, nine ABN models and nine DT models are created every month across all the associates using the following variables as targets in this order:

  1. Sales Amount Quartile (SAQR)

  2. Sales Amount Quintile (SAQN)

  3. Sales Amount Decile (SADE)

  4. Cost Amount Quartile (CAQR)

  5. Cost Amount Quintile (CAQN)

  6. Cost Amount Decile (CADE)

  7. Profit Amount Quartile (PADR)

  8. Profit Amount Quintile (PAQN)

  9. Profit Amount Decile (PADE)

Source Variables for the Associate Sales Analysis Model

The following attributes of associates are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Year

  • Month

  • Employee Id

  • Designation Name

  • Designation Title

  • Designation Level

  • Nationality

  • Gender

  • Marital Status

  • Age

  • Net Income

  • Demographics Code

  • Title

  • Total Months of Job

  • Employee Type

  • Correspondence Language

  • Disability Indicator

  • Rehire Recommendation Indicator

  • HR Based Salary Eligibility Indicator

  • Overtime Hours Salary Eligibility Indicator

  • Commission Eligibility Indicator

  • SPIFF Allowed Flag

  • Total Hours Worked

  • Total Overtime Hours

Columns Included in the Target Views of the Associate Sales Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Year (PK)

  • Analysis Desc

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type

Customer Category Mix Analysis Model

This model addresses the business problem of discovering product categories that are frequently bought by customers. The model is used to understand the Categories purchased by a Customer in a typical transaction in terms of the components like the Categories in the Basket, Target Category in a Basket and additional information like Basket Significance (Sales Value), Target Category Significance (Sales Value) which are generated from regular Customer Transactional data.

Using Oracle Data Mining, the KPIs are modeled with the APRIORI algorithm utilised by the Association Rules model. The model type used for Association Rules with Apriori Algorithm is APASS. This model type is an example of Unclassified Learning since the Categories (or Target Category) which constitute the Category Basket are not inferred or guided (as part of data preparation) but are generated by the model itself.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Customer Category Mix Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
CUSTOMER HAS BOUGHT 'BABY', 'GRAB AND GO'
THEN
CUSTOMER IS LIKELY TO BUY 'PACKAGED BEVERAGES' (11 Support: 36%, Probability:56%)

Category Basket Significance of ('BABY', 'GRAB AND GO', 'PACKAGED BEVERAGES') is 45% of Sales Value => The Sales from the 3 categories in Category Basket ('BABY', 'GRAB AND GO', 'PACKAGED BEVERAGES') account for 45% of the Total Sales across all categories in that particular store.

The Category Basket Significance (Sales Value) KPI allows us to filter out Rules which may be insignificant from a Basket Sales Value perspective.

Target Category Significance of ('PACKAGED BEVERAGES') is 60% of the Basket Sales Value => The Sales from the Target Category ('PACKAGED BEVERAGES') account for 60% of the Total Sales from the Category Basket ('BABY', 'GRAB AND GO', 'PACKAGED BEVERAGES') in that particular store.

The Target Category Significance (Sales Value) KPI allows us to filter out Rules determining insignificant Customer Purchases (insignificant Target Category). In other words, it helps us to extract Rules which relate to singificant Customer Purchases, where the Target Category is significant within the Basket of Categories (from a Sales Value perspective). Identifying this information is useful from a campaign/promotion/upsell perspective.

Desired Rules Example 2

IF
CUSTOMER HAS BOUGHT 'FLORAL', 'PHARMACY', 'HOT FOODS'
THEN
CUSTOMER IS LIKELY TO BUY 'BABY' (Support: 36%, Probability: 62%)'

What the Discovered Rules for the Customer Category Mix Analysis Model Explain

The discovered rules explain customer behavior and buying patterns regarding various product categories. They help indicate the groups of product categories that sell well together, for example:

  • The chances of a customer buying a BABY product increases from 11% to 62% if that customer purchases FLORAL and HOT FOODS products. This Rule is 28% significant in terms of Basket Value (Sales Value - all 3 categories in Basket) and The Target Category (BEAUTY) significance is about 75% of the Basket.

  • The chances of a customer buying a BEAUTY product is 34% if that customer has purchased a product from categories 'HEALTH', 'MAGAZINES' and 'PHARMACY'. This Rule is about 66% significant in terms of Basket Value (Sales Value of all 4 categories in Basket) and The Target Category (BEAUTY) significance is about 45% of the Basket.

  • BEAUTY products are very rarely sold together (about 3% of the time) with LIQUOR and AUTOMOTIVE PRODUCTS.

The significance of a rule can be measured in terms of support and confidence and a host of additional supporting measures; for example, Basket Significance (Value), Target Category Signifcance (Value), Basket Significance (Customers), Target Category Signifcance (Customers), Basket Significance (Transactions), Target Category Signifcance (Transactions), and so on..

What the Customer Category Mix Analysis Model Mines

This model mines the monthly purchases of individual customers and discovers rules about the categories that are frequently bought in groups by customers.

Target Variables for the Customer Category Mix Analysis Model

The purchase patterns are designed to be generated monthly for each individual store. Therefore, the APASS models are created every month for each store.

Source Variables for the Customer Category Mix Analysis Model

The following item attributes are the source variables:

  • Case Id Alt (PK)

  • Store ID

  • Year

  • Month

  • ID

  • Name

  • Value

Note:

If the mining must be performed at multiple levels, such as category, subcategory, item, there may be multiple source tables.

Columns Included in the Target Views for the Customer Category Mix Analysis Model Report

The mined patterns and rules are visible through the target view CUST_CATEGORY_MIX_APASS_RULES with the following columns and can be displayed in an OBIEE report:

  • STORE_ID - Store ID

  • YEAR - Year

  • MONTH - Month

  • MODEL_NAME - Model Name (CCM_MDL_APASS_<Store_ID>)

  • MODEL_TYPE - Model Type (APASS)

  • MODEL_TYPE_DESC - Model Description (Apriori Association)

  • ANALYSIS_NAME - Analysis Name (CUST_CATEGORY_MIX)

  • ANALYSIS_DESC - Analysis Description (Customer Category Mix Analysis)

  • RULE_ID - Rule Id .. IF (antecedent) THEN (consequent) END

  • RULE_ANTECEDENT_ITEMS - List of Categories making up the IF part of the Category Basket (1 upto max as specified in settings while building the model)

  • RULE_CONSEQUENT_ITEMS - List of Categories making up the THEN part of the Category Basket (usually 1 category)

  • RULE_SUPPORT - Support (number of cases in the input dataset which pertain to the current basket)

  • RULE_CONFIDENCE - Confidence (Probability of the THEN part of the Rule coming true based on input dataset)

  • RULE_DISPLAY_ORDER - Default or recommended display order of the rules. Critical for DT models. Not critical for APASS models.

  • RULE_LENGTH - Number of Categories in the antecedent (IF) part of the Rule

  • BSKT_CTGRY_COUNT - Number of categories in the Category Basket (includes antecedent and consequent)

  • BSKT_ALL_SLS_VAL - Sales Value (total) for all categories in the Category Basket

  • BSKT_ALL_SLS_UNITS - Sales Units (total) for all categories in the Category Basket

  • BSKT_ALL_TRX_COUNT - Transaction Count for all categories in the Category Basket

  • BSKT_ALL_CUST_COUNT - Customer Count for all categories in the Category Basket

  • BSKT_ALL_SLS_VAL_SIGN - Significance of Category Basket within Store per Sales Value ....... ratio of Basket Sales Value to Store wide Sales Value (all categories)

  • BSKT_ALL_SLS_UNITS_SIGN - Significance of Category Basket within Store per Sales Units ....... ratio of Basket Sales Units to Store wide Sales Units (all categories)

  • BSKT_ALL_TRX_COUNT_SIGN - Significance of Category Basket within Store per Transaction Count.. ratio of Basket Transaction Count to Store wide Transaction Count (all categories)

  • BSKT_ALL_CUST_COUNT_SIGN - Significance of Category Basket within Store per Customer Count......ratio of Basket Customer Count to Store wide Customer Count (all categories)

  • BSKT_ALL_AVG_SLS_VAL_TRX - Avg Sales Value per Transaction (within Category Basket)

  • BSKT_ALL_AVG_SLS_VAL_CUST - Avg Sales Value per Customer (within Category Basket)

  • BSKT_ALL_AVG_SLS_UNITS_TRX - Avg Sales Units per Transaction (within Category Basket)

  • BSKT_ALL_AVG_SLS_UNITS_CUST - Avg Sales Units per Customer (within Category Basket)

  • BSKT_TGT_SLS_VAL - Sales Value of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_SLS_UNITS - Sales Units of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_TRX_COUNT - Transaction Count of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_CUST_COUNT - Customer Count of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_SLS_VAL_SIGN - Significance of the Target Category within Category Basket per Sales Value ..... ratio of Target Category Sales Value to Basket Sales Value (all categories in Basket)

  • BSKT_TGT_SLS_UNITS_SIGN - Significance of the Target Category within Category Basket per Sales Units ..... ratio of Target Category Sales Units to Basket Sales Units (all categories in Basket)

  • BSKT_TGT_TRX_COUNT_SIGN - Significance of the Target Category within Category Basket per Transaction Count ..... ratio of Target Category Transaction Count to Basket Transaction Count (all categories in Basket)

  • BSKT_TGT_CUST_COUNT_SIGN - Significance of the Target Category within Category Basket per Customer Count ..... ratio of Target Category Transaction Count to Basket Customer Count (all categories in Basket)

  • BSKT_TGT_AVG_SLS_VAL_TRX - Avg Sales Value per Transaction (Target Category .. consequent category /THEN part of RULE)

  • BSKT_TGT_AVG_SLS_VAL_CUST - Avg Sales Value per Customer (Target Category .. consequent category /THEN part of RULE)

  • BSKT_TGT_AVG_SLS_UNITS_TRX - Avg Sales Units per Transaction (Target Category .. consequent category /THEN part of RULE)

  • BSKT_TGT_AVG_SLS_UNITS_CUST - Avg Sales Units per Customer (Target Category .. consequent category /THEN part of RULE)

  • STR_ALL_SLS_VAL - Sales Value (total) for all categories in Store

  • STR_ALL_SLS_UNITS - Sales Units (total) for all categories in Store

  • STR_ALL_TRX_COUNT - Transaction Count for all categories in Store

  • STR_ALL_CUST_COUNT - Customer Count for all categories in Store

Customer Loyalty Analysis Model

This model addresses the business problem of discovering the impact of customer characteristics on customers' loyalty to a store.

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Customer Loyalty Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
YEARS OF RESIDENCE IS (8 - 10)
AND
HOUSEHOLD_SIZE IS (3+)
THEN
CUSTOMER IS GROUP A

Desired Rules Example 2

IF
YEARS OF RESIDENCE IS (1 ñ 3)
AND
HOUSEHOLD_SIZE IS (LESS THAN 3)
THEN
CUSTOMER IS GROUP E

What the Discovered Rules for the Customer Loyalty Analysis Model Explain

The discovered rules help explain the loyalty of a customer.

What the Customer Loyalty Analysis Model Mines

This model mines the Customer and Account demographic characteristics of Customers to identify the key attribute influencing the Customer Loyalty scores (RFMP Category Value).

Target Variable for the Customer Loyalty Analysis Model

The rules are designed to be generated monthly. Therefore, one ABN model and one DT model is created every month across all stores combined using the Customer Loyalty variable as the target.

Source Variables for the Customer Loyalty Analysis Model

The following attributes of customers are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Year

  • Month

  • Customer Number

  • Registered as Gift Receiver

  • Registered as Gift Giver

  • Customer Occasion Type This Month

  • Campaign This Month

  • Membership Account Type Code (None if the customer does not have any account; the last used account if the customer has multiple accounts)

  • Life-To-Date Points

  • Available Points

  • Customer Account Type (None if the customer does not have any account; the last used account if the customer has multiple accounts)

  • Customer Identity Required Indicator

  • Customer Identity Type Name

  • Customer Group Code (None if the customer does not belong to any group)

  • Age

  • Marital Status

  • Gender

  • Income

  • Race

  • Education

  • Profession

  • Household Size

  • Years of Residence

  • Demography Group Name

  • Customer County or District

  • Customer City

  • Customer State

  • Customer Country

  • Customer World Region

Loyalty Categories for the Customer Loyalty Analysis Model

The RFMP algorithms provide functionality to group customers into quartiles, deciles, and quintiles. Each customer falls into one of the following five loyalty categories based on the RFMP quintile he or she belongs to in a particular month:

  • Group A (RFMP Quintile 5)

  • Group B (RFMP Quintile 4)

  • Group C (RFMP Quintile 3)

  • Group D (RFMP Quintile 2)

  • Group E (RFMP Quintile 1)

Note:

The definition of each of the loyalty types as well as the number of loyalty types may vary with each implementation.

Columns Included in the Target Views for the Customer Loyalty Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type

Frequent Shopper Category Mix Analysis Model

This model addresses the business problem of finding product categories that are frequently bought by frequent shoppers. Finding these product categories can help in optimizing Merchandising and Category Mix options that relate to Store Layout, Display and Frontage, Promotional Campaigns, Co-branding, and others.

The model is used to understand the Categories purchased by a Frequent Shopper in a typical transaction in terms of the components like the Categories in the Basket, Target Category in a Basket and additional information like Basket Significance (Sales Value), Target Category Significance (Sales Value) which are generated from regular Customer Transactional data.

Using Oracle Data Mining, the KPIs are modeled with the APRIORI algorithm utilised by the Association Rules model. The model type used for Association Rules with Apriori Algorithm is APASS. This model type is an example of Unclassified Learning since the Categories (or Target Category) which constitute the Category Basket are not inferred or guided (as part of data preparation) but are generated by the model itself.

Examples of Desired Rules for the Frequent Shopper Category Mix Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
CUSTOMER HAS BOUGHT 'FILM ACCESSORIES'
THEN
CUSTOMER IS LIKELY TO BUY 'BEAUTY' [Support - 80%, Confidence - 70%]

Desired Rules Example 2

IF
CUSTOMER HAS BOUGHT 'FILM ACCESSORIES'
AND
CUSTOMER HAS BOUGHT 'MAGAZINE'
AND
CUSTOMER HAS BOUGHT 'PET'
THEN
CUSTOMER IS LIKELY TO BUY 'BEAUTY' [Support – 48%, Confidence - 82%]

Category Basket Significance of ('FILM ACCESSORIES', 'MAGAZINES', PET', 'BEAUTY') is 62% of Sales Value => The Sales from the 4 categories in Category Basket ('FILM ACCESSORIES', 'MAGAZINES', PET', 'BEAUTY') account for 62% of the Total Sales across all categories in that particular store.

The Category Basket Significance (Sales Value) KPI allows us to filter out Rules which may be insignificant from a Basket Sales Value perspective.

Target Category Significance of (BEAUTY') is 70% of the Basket Sales Value => The Sales from the Target Category ('BEAUTY') account for 70% of the Total Sales from the Category Basket ('FILM ACCESSORIES', 'MAGAZINES', PET', 'BEAUTY') in that particular store.

The Target Category Significance (Sales Value) KPI allows us to filter out Rules determining insignificant Customer Purchases (insignificant Target Category). In other words, it helps us to extract Rules which relate to singificant Customer Purchases, where the Target Category is significant within the Basket of Categories (from a Sales Value perspective). Identifying this information can be useful from a campaign/promotion/upsell perspective.

Desired Rules Example 3

IF
CUSTOMER HAS BOUGHT 'BAKERY'
THEN
CUSTOMER IS LIKELY TO BUY 'BEAUTY' [Support – 36%, Confidence - 90%]

What the Discovered Rules for the Frequent Shopper Category Mix Analysis Model Explain

The discovered rules help explain purchase patterns of frequent shoppers, for example:

  • The chances of a frequent shopper buying a BEAUTY product increases from 30% to 70% if he or she purchases a FILM ACCESSORIES product. It further increases to 74% if the frequent shopper buys a MAGAZINES product and to 82% if he or she also buys a PET product.

  • The chances of a customer buying a BEAUTY product is 82% if he or she has purchased a product from categories 'FILM ACCESSORIES', 'MAGAZINES' and 'PET'. This Rule is about 62% significant in terms of Basket Value (Sales Value of all 4 categories in Basket) and The Target Category (BEAUTY) significance is about 70% of the Basket.

  • BEAUTY products are very rarely (about 3% of the time) sold with BAKERY products.

The significance of a rule can be measured in terms of support and confidence and a host of additional supporting measures like Basket Significance (Value), Target Category Signifcance (Value), Basket Significance (Customers), Target Category Signifcance (Customers), Basket Significance (Transactions), Target Category Signifcance (Transactions), and so on.

What the Frequent Shopper Category Mix Analysis Model Mines

This model mines the monthly purchases of individual frequent shoppers and discovers rules about the categories that are frequently bought in groups by frequent shoppers.

Target Variable for the Frequent Shopper Category Mix Analysis Model

The purchase patterns are designed to be generated monthly for each individual store. Therefore, APASS models are created every month for each store.

Source Variables for the Frequent Shopper Category Mix Analysis Model

The following item attributes are the source variables:

  • Case Id Alt (PK)

  • Store ID

  • Year

  • Month

  • ID

  • Name

  • Value

Note:

If the mining must be performed at multiple levels, such as category, subcategory, item, and others, there may be multiple source tables.

Columns Included in the Target Views of the Frequent Shopper Category Mix Analysis Model Report

The mined patterns and rules are visible through the target view FS_CATEGORY_MIX_APASS_RULES with the following columns and can be displayed in an OBIEE report:

  • STORE_ID - Store ID

  • YEAR - Year

  • MONTH - Month

  • MODEL_NAME - Model Name (CCM_MDL_APASS_<Store_ID>)

  • MODEL_TYPE - Model Type (APASS)

  • MODEL_TYPE_DESC - Model Description (Apriori Association)

  • ANALYSIS_NAME - Analysis Name (CUST_CATEGORY_MIX)

  • ANALYSIS_DESC - Analysis Description (Customer Category Mix Analysis)

  • RULE_ID - Rule Id .. IF (antecedent) THEN (consequent) END

  • RULE_ANTECEDENT_ITEMS - List of Categories making up the IF part of the Category Basket (1 upto max as specified in settings while building the model)

  • RULE_CONSEQUENT_ITEMS - List of Categories making up the THEN part of the Category Basket (usually 1 category)

  • RULE_SUPPORT - Support (number of cases in the input dataset which pertain to the current basket)

  • RULE_CONFIDENCE - Confidence (Probability of the THEN part of the Rule coming true based on input dataset)

  • RULE_DISPLAY_ORDER - Default or recommended display order of the rules. Critical for DT models. Not critical for APASS models.

  • RULE_LENGTH - Number of Categories in the antecedent (IF) part of the Rule

  • BSKT_CTGRY_COUNT - Number of categories in the Category Basket (includes antecedent and consequent)

  • BSKT_ALL_SLS_VAL - Sales Value (total) for all categories in the Category Basket

  • BSKT_ALL_SLS_UNITS - Sales Units (total) for all categories in the Category Basket

  • BSKT_ALL_TRX_COUNT - Transaction Count for all categories in the Category Basket

  • BSKT_ALL_CUST_COUNT - Customer Count for all categories in the Category Basket

  • BSKT_ALL_SLS_VAL_SIGN - Significance of Category Basket within Store per Sales Value ....... ratio of Basket Sales Value to Store wide Sales Value (all categories)

  • BSKT_ALL_SLS_UNITS_SIGN - Significance of Category Basket within Store per Sales Units ....... ratio of Basket Sales Units to Store wide Sales Units (all categories)

  • BSKT_ALL_TRX_COUNT_SIGN - Significance of Category Basket within Store per Transaction Count.. ratio of Basket Transaction Count to Store wide Transaction Count (all categories)

  • BSKT_ALL_CUST_COUNT_SIGN - Significance of Category Basket within Store per Customer Count......ratio of Basket Customer Count to Store wide Customer Count (all categories)

  • BSKT_ALL_AVG_SLS_VAL_TRX - Avg Sales Value per Transaction (within Category Basket)

  • BSKT_ALL_AVG_SLS_VAL_CUST - Avg Sales Value per Customer (within Category Basket)

  • BSKT_ALL_AVG_SLS_UNITS_TRX - Avg Sales Units per Transaction (within Category Basket)

  • BSKT_ALL_AVG_SLS_UNITS_CUST - Avg Sales Units per Customer (within Category Basket)

  • BSKT_TGT_SLS_VAL - Sales Value of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_SLS_UNITS - Sales Units of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_TRX_COUNT - Transaction Count of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_CUST_COUNT - Customer Count of Target Category (consequent category /THEN part of RULE)

  • BSKT_TGT_SLS_VAL_SIGN - Significance of the Target Category within Category Basket per Sales Value ..... ratio of Target Category Sales Value to Basket Sales Value (all categories in Basket)

  • BSKT_TGT_SLS_UNITS_SIGN - Significance of the Target Category within Category Basket per Sales Units ..... ratio of Target Category Sales Units to Basket Sales Units (all categories in Basket)

  • BSKT_TGT_TRX_COUNT_SIGN - Significance of the Target Category within Category Basket per Transaction Count ..... ratio of Target Category Transaction Count to Basket Transaction Count (all categories in Basket)

  • BSKT_TGT_CUST_COUNT_SIGN - Significance of the Target Category within Category Basket per Customer Count ..... ratio of Target Category Transaction Count to Basket Customer Count (all categories in Basket)

  • BSKT_TGT_AVG_SLS_VAL_TRX - Avg Sales Value per Transaction (Target Category .. consequent category /THEN part of RULE)

  • BSKT_TGT_AVG_SLS_VAL_CUST - Avg Sales Value per Customer (Target Category .. consequent category /THEN part of RULE)

  • BSKT_TGT_AVG_SLS_UNITS_TRX - Avg Sales Units per Transaction (Target Category .. consequent category /THEN part of RULE)

  • BSKT_TGT_AVG_SLS_UNITS_CUST - Avg Sales Units per Customer (Target Category .. consequent category /THEN part of RULE)

  • STR_ALL_SLS_VAL - Sales Value (total) for all categories in Store

  • STR_ALL_SLS_UNITS - Sales Units (total) for all categories in Store

  • STR_ALL_TRX_COUNT - Transaction Count for all categories in Store

  • STR_ALL_CUST_COUNT - Customer Count for all categories in Store

Item Basket Analysis Model

This model addresses the business problem of identifying the extent to which item (product) characteristics influence the items' sales KPIs.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Item Basket Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
FOOD STAMP is allowed
AND
ENVIRONMENT TYPE CODE is (NORMAL)
AND
Item Basket Analysis Report
Data Mining Reports 6-21
FREQUENT_SHOPPER_POINTS (800 - 999)
THEN
AVERAGE BASKET VALUE is (HIGHEST)

Desired Rules Example 2

IF
FOOD STAMP is NOT allowed
AND
ENVIRONMENT TYPE CODE is (REFRIGERATED)
AND
FREQUENT_SHOPPER_POINTS (800 - 999)
THEN
AVERAGE BASKET VALUE is (LOWEST)

What the Discovered Rules for the Item Basket Analysis Model Explain

The discovered rules draw the profile of items that have the extreme values of the target KPI. For example, the Examples of Desired Rules for this model discover the profiles of items showing extreme average basket values.

What the Item Basket Analysis Model Mines

This model identifies which key attributes of an item influence the number of baskets sold, average basket value, and size in a particular store. This model mines the various attributes of items. It takes the binned variables one at a time for Total Basket Count, Average Basket Value, and Average Basket Size as the target variable of an ABN model and DT model with a single feature and discovers rules described in terms of item characteristics.

Target Variable for the Item Basket Analysis Model

The rules are designed to be generated monthly for each individual store. Therefore, nine ABN and nine DT models are created every month for each store using the following variables as targets in this order:

  1. Total Basket Count Quartile (TBCQR)

  2. Total Basket Count Quintile (TBCQN)

  3. Total Basket Count Decile (TBCDE)

  4. Average Basket Value Quartile (ABVQR)

  5. Average Basket Value Quintile (ABVQN)

  6. Average Basket Value Decile (ABVDE)

  7. Average Basket Size Quartile (ABSQR)

  8. Average Basket Size Quintile (ABSQN)

  9. Average Basket Size Decile (ABSDE)

Source Variables for the Item Basket Analysis Model

The following item attributes are identified from the data warehouse tables as source variables for the ABN and DT models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Store ID

  • Year

  • Month

  • Item ID

  • Brand Name

  • Category Name

  • Department Name

  • Customer Pickup Type Code

  • Discount Indicator

  • Environment Type Code

  • Hazardous Material Type Code

  • Perishable Indicator

  • Kit Set Code

  • Order Collection Code

  • Price Audit Flag

  • Sale Weight or Unit Count Code

  • Security Required Type Code

  • Sell Unit Landed Cost Amount

  • Sell Unit Last Received Base Cost Amount

  • Sell Unit Last Received Net Cost Amount

  • Item Sale Unit Price Amount

  • Shrink Flag

  • Substitute Identified Indicator

  • Swell Flag

  • Item Usage Code

  • Vendor Item Number

  • Max Shipping Capability

  • Min Order Quantity

  • Sale Unit per Packet Unit Count

  • Shipping Capability Units

  • Store Order Allowed Flag

  • Store Receipt Allowed Flag

  • Style Description

  • Terms Code

  • Vendor Number

  • Vendor Class Code

  • Buy Status Indicator

  • Credit Limit Offered

  • Inform Government Indicator

  • Vendor Number of Years in Business

  • Pay Status Indicator

  • Competitor Retail Item Name

  • Competitor Name

  • Competitor Item Local Advertising Flag

  • Competitor Item On Promotion Flag

  • Competitor Item Promotion Store Coupon Indicator

  • Competitor Sale Unit Price Amount

  • Allow Coupon Multiply Indicator

  • Allow Food Stamp Indicator

  • Coupon Restricted Indicator

  • Electronic Coupon Flag

  • Employee Discount Allowed Flag

  • Frequent Shopper Points

  • Frequent Shopper Points Eligibility Indicator

  • Give Away Flag

  • Item Tender Restriction Group Code

  • Manufacturer

  • Manufacturer Family Code

  • Maximum Sale Unit Count

  • Price Entry Required Flag

  • Prohibit Repeat Key Flag

  • Prohibit Return Flag

  • Selling Status Code

  • Visual Verify Price Flag

  • Weight Entry Required Flag

Columns Included in the Target Views for the Item Basket Analysis Model Report

The mined patterns and rules are visible through in a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Store ID (PK)

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type

Item POS Loss Analysis Model

This model addresses the business problem of building a profile of item (product) characteristics regarding POS losses.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

See:

Oracle Retail Data Model Operations Guide for a sample report based on this model.

Examples of Desired Rules for the Item POS Loss Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
EMP_DISCOUNT_ALLOWED_IND IS 'Y'
AND
SCANNED_PER_UNITS_SOLD IS (0.8-1)
THEN
THEFT_AMOUNT_TO_SALES_AMOUNT_QUARTILE IS THE HIGHEST

Desired Rules Example 2

IF
SCANNED_PER_UNITS_SOLD IS (0.8-1)
AND
DEPARTMENT_NAME In ('BEVERAGE', 'CIGARETTES')
THEN
SHRINK_AMOUNT_TO_SALES_AMOUNT_QUARTILE IS THE HIGHEST

What the Discovered Rules for the Item POS Loss Analysis Model Explain

The discovered rules describe correlations between item shrinkage and item characteristics.

What the Item POS Loss Analysis Model Mines

This model mines the POS transactions along with the item attributes to identify their impact on Total Shrink Count, Total Shrink Amount, Shrink as a percentage of Sales, Total Theft Count, Total Theft Amount, and Theft as a percentage of Sales.

Target Variables for the Item POS Loss Analysis Model

The rules are designed to be generated monthly for each individual store. Therefore, eighteen ABN and eighteen DT models are created every month for each of the stores using the following variables as targets in this order:

  1. Total Shrink Count Quartile (TSCQR)

  2. Total Shrink Count Quintile (TSCQN)

  3. Total Shrink Count Decile (TSCDE)

  4. Total Shrink Amount Quartile (TSAQR)

  5. Total Shrink Amount Quintile (TSAQN)

  6. Total Shrink Amount Decile (TSADE)

  7. Shrink as a percentage of Sales Quartile (STSQR)

  8. Shrink as a percentage of Sales Quintile (STSQN)

  9. Shrink as a percentage of Sales Decile (STSDE)

  10. Total Theft Count Quartile (TTCQR)

  11. Total Theft Count Quintile (TTCQN)

  12. Total Theft Count Decile (TTCDE)

  13. Total Theft Amount Quartile (TTAQR)

  14. Total Theft Amount Quintile (TTAQN)

  15. Total Theft Amount Decile (TTADE)

  16. Theft as a percentage of Sales Quartile (TTSQR)

  17. Theft as a percentage of Sales Quintile (TTSQN)

  18. Theft as a percentage of Sales Decile (TTSDE)

Source Variables for the Item POS Loss Analysis Model

The following attributes of POS and items are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Store

  • Year

  • Month

  • Item ID

  • Brand Name

  • Category Name

  • Department Name

  • Customer Pickup Type Code

  • Discount Indicator

  • Hazardous Material Type Code

  • Perishable Indicator

  • Kit Set Code

  • Order Collection Code

  • Price Audit Flag

  • Sale Weight or Unit Count Code

  • Security Required Type Code

  • Sell Unit Landed Cost Amount

  • Sell Unit Last Received Base Cost Amount

  • Sell Unit Last Received Net Cost Amount

  • Item Sale Unit Price Amount

  • Shrink Flag

  • Substitute Identified Indicator

  • Swell Flag

  • Item Usage Code

  • Vendor Item Number

  • Max Shipping Capability

  • Min Order Quantity

  • Sale Unit per Packet Unit Count

  • Shipping Capability Units

  • Store Order Allowed Flag

  • Store Receipt Allowed Flag

  • Style Description

  • Terms Code

  • Vendor Number

  • Vendor Class Code

  • Buy Status Indicator

  • Credit Limit Offered

  • Inform Government Indicator

  • Vendor Number of Years in Business

  • Pay Status Indicator

  • Competitor Retail Item Name

  • Competitor Name

  • Competitor Item Local Advertising Flag

  • Competitor Item On Promotion Flag

  • Competitor Item Promotion Store Coupon Indicator

  • Competitor Sale Unit Price Amount

  • Allow Coupon Multiply Indicator

  • Allow Food Stamp Indicator

  • Coupon Restricted Indicator

  • Electronic Coupon Flag

  • Employee Discount Allowed Flag

  • Frequent Shopper Points

  • Frequent Shopper Points Eligibility Indicator

  • Give Away Flag

  • Item Tender Restriction Group Code

  • Manufacturer

  • Manufacturer Family Code

  • Maximum Sale Unit Count

  • Price Entry Required Flag

  • Prohibit Repeat Key Flag

  • Prohibit Return Flag

  • Selling Status Code

  • Visual Verify Price Flag

  • Weight Entry Required Flag

Retail Transaction Attributes

  • Total Number of Retail Transactions For Item

  • Total Amount of Retail Transactions For Item

  • Average Amount Per Retail Transaction For Item

  • Number of Distinct Currency Used For Item

  • Total Units Sold For Item

  • Average Units Sold Per Retail Transaction For Item

  • Total Idle Interval For Item (This attribute is the sum of idle intervals of all transactions that contain this Item)

  • Average Idle Interval Per Retail Transaction For Item

  • Total Ring Interval For Item (This attribute is the sum of ring intervals of all transactions that contain this item)

  • Average Ring Interval Per Retail Transaction For Item

  • Total Tender Interval For Item (This attribute is the sum of tender intervals of all transactions that contain this item)

  • Average Tender Interval Per Retail Transaction For Item

  • Total Lock Interval For Item (This attribute is the sum of lock intervals before or after all transactions that contain this item)

  • Average Lock Interval Per Retail Transaction For Item

  • Total Line Items Scanned For Item (This attribute is the total number of times this item is scanned)

  • Average Line Items Scanned Per Units Sold For Item

  • Total Line Items Keyed For Item (This attribute is the total number of times this item is keyed)

  • Average Line Items Keyed Per Units Sold

  • Total Key Department Count For Item (This attribute is the total number of times this item is keyed by the department)

  • Average Key Department Count Per Units Sold

  • Total Service Charge For Item

  • Average Service Charge Per Retail Transaction For Item

  • Total Tax Amount For Item

  • Average Tax Amount Per Retail Transaction For Item

  • Total Number of Voided Transactions For Item

  • Average Number of Voided Transactions Per Retail Transaction For Item

  • Total Amount of Voided Transactions For Item

  • Average Amount of Voided Transactions Per Retail Transaction For Item

  • Average Amount of Voided Transaction as Percentage of Total Retail Transaction Amount For Item

  • Total Number of Discount Line Items For Item

  • Average Number of Discount Line Items Per Retail Transaction For Item

  • Total Amount of Discount Line Items For Item

  • Average Amount of Discount Line Items Per Retail Transaction For Item

  • Average Amount of Discount Line Items as Percentage of Total Retail Transaction Amount For Item

  • Total Number of Return Line Items For Item

  • Average Number of Return Line Items Per Retail Transaction For Item

  • Total Amount of Return Line Items For Item

  • Average Amount of Return Line Items Per Retail Transaction For Item

  • Average Amount of Return Line Items as Percentage of Total Retail Transaction Amount For Item

  • Total Number of Miscellaneous Fee Line Items For Item

  • Average Number of Miscellaneous Fee Line Items Per Retail Transaction For Item

  • Total Amount of Miscellaneous Fee Line Items For Item

  • Average Amount of Miscellaneous Fee Line Items Per Retail Transaction For Item

  • Average Amount of Miscellaneous Fee Line Items as Percentage of Total Retail Transaction Amount For Item

  • Total Number of Promotional Line Items For Item

  • Average Number of Promotional Line Items Per Retail Transaction For Item

  • Total Amount of Promotional Line Items For Item

  • Average Amount of Promotional Line Items Per Retail Transaction For Item

  • Average Amount of Promotional Line Items as Percentage of Total Retail Transaction Amount For Item

  • Total Number of Deposit Redemption Line Items For Item

  • Average Number of Deposit Redemption Line Items Per Retail Transaction For Item

  • Total Amount of Deposit Redemption Line Items For Item

  • Average Amount of Deposit Redemption Line Items Per Retail Transaction For Item

  • Average Amount of Deposit Redemption Line Items as Percentage of Total Retail Transaction Amount For Item

Control Transaction Attributes

  • Total Tax Exempt Transaction Count For Item

  • Average Tax Exempt Transaction Count Per Retail Transaction For Item

  • Tax Exempt Total Amount For Item

  • Average Tax Exempt Amount Per Retail Transaction For Item

  • Tax Exempt Total Amount as a Percentage of Total Retail Transaction Amount For

  • Item

  • Total Number of Store Coupons For Item

  • Average Number of Store Coupons Per Retail Transaction For Item

  • Average Number of Store Coupons Per Retail Transaction For Item

  • Total Amount of Store Coupons For Item

  • Average Amount of Store Coupons Per Retail Transaction For Item

  • Average Amount of Store Coupons as Percentage of Total Retail Transaction

  • Amount For Item

  • Total Markdown Count For Item

  • Average Markdown Count per Retail Transaction For Item

  • Markdown Total Amount For Item

  • Average Markdown Amount Per Retail Transaction For Item

  • Average Markdown Amount as a Percentage of Total Retail Transaction Amount

  • For Item

  • Total Employee Discount Count For Item

  • Average Employee Discount Per Retail Transaction For Item

  • Total Employee Discount Amount For Item

  • Average Employee Discount Amount Per Retail Transaction For Item

  • Average Employee Discount Amount as a Percentage of Retail Transaction

  • Amount For Item

  • Total Weighed Line Item Count For Item

  • Average Weighed Line Item Count Per Retail Transaction For Item

  • Total Weighed Line Item Amount For Item

  • Average Weighed Line Item Amount Per Retail Transaction For Item

  • Average Weighed Line Item Amount as a Percentage of Total Retail Transaction

  • Amount For Item

  • Total Layaway Payments Collected Count For Item

  • Average Layaway Payments Collected Count Per Retail Transaction For Item

  • Total Layaway Payments Collected Amount For Item

  • Average Layaway Payments Collected Amount Per Retail Transaction For Item

  • Average Layaway Payments Collected Amount as a Percentage of Total Retail

  • Transaction Amount For Item

  • Total Container Deposit Count For Item

  • Average Container Deposit Count Per Retail Transaction For Item

  • Total Container Deposit Amount For Item

  • Average Container Deposit Amount Per Retail Transaction For Item

  • Average Container Deposit Amount as a Percentage of Total Retail Transaction

  • Amount For Item

  • Total Redeemed Container Deposit Count For Item

  • Average Redeemed Container Deposit Count Per Retail Transaction For Item

  • Total Redeemed Container Deposit Amount For Item

  • Average Redeemed Container Deposit Amount Per Retail Transaction For Item

  • Average Redeemed Container Deposit Amount as a Percentage of Total Retail

  • Transaction Amount For Item

  • Total Cash Tender Count For Item

  • Average Cash Tender Count Per Retail Transaction For Item

  • Total Cash Tender Amount For Item

  • Average Cash Tender Amount Per Retail Transaction For Item

  • Average Cash Tender Amount as a Percentage of Total Retail Transaction Amount

  • For Item

  • Total Check Tender Count For Item

  • Average Check Tender Count Per Retail Transaction For Item

  • Total Check Tender Amount For Item

  • Average Check Tender Amount Per Retail Transaction For Item

  • Average Check Tender Amount as a Percentage of Total Retail Transaction

  • Amount For Item

  • Total Credit Card Tender Count For Item

  • Average Credit Card Tender Count Per Retail Transaction For Item

  • Total Credit Card Tender Amount For Item

  • Average Credit Card Tender Amount Per Retail Transaction For Item

  • Average Credit Card Tender Amount as a Percentage of Total Retail Transaction

  • Amount For Item

  • Total Debit Card Tender Count For Item

  • Average Debit Card Tender Count Per Retail Transaction For Item

  • Total Debit Card Tender Amount For Item

  • Average Debit Card Tender Amount Per Retail Transaction For Item

  • Average Debit Card Tender Amount as a Percentage of Total Retail Transaction

  • Amount For Item

  • Total Customer Account Tender Count For Item

  • Average Customer Account Tender Count Per Retail Transaction For Item

  • Total Customer Account Tender Amount For Item

  • Average Customer Account Tender Amount Per Retail Transaction

  • Average Customer Account Tender Amount as a Percentage of Total Retail

  • Transaction Amount

  • Total Gift Certificate Tender Count For Item

  • Average Gift Certificate Tender Count Per Retail Transaction For Item

  • Total Gift Certificate Tender Amount For Item

  • Average Gift Certificate Tender Amount Per Retail Transaction For Item

  • Amount For Item

  • Total Coupon Tender Count For Item

  • Average Coupon Tender Count Per Retail Transaction For Item

  • Total Coupon Tender Amount For Item

  • Average Coupon Tender Amount Per Retail Transaction For Item

  • Average Coupon Tender Amount as a Percentage of Total Retail Transaction

  • Amount For Item

Columns Included in the Target Views of the Item POS Loss Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Store ID (PK)

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

    Attribute Type

POS Flow Analysis Model

This model addresses the business problem of detecting patterns in the flow of items, transactions, and amount across individual points of sale during different time periods.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Examples of Desired Rules for the POS Flow Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
NUMBER_OF_HOUSEHOLDS IS<3000
AND
STORE LOCATION TYPE IS 'Free Standing'
AND
PER CAPITA INCOME IS<3000
THEN
Total Return Items Count Quartile IS THE LOWEST

Desired Rules Example 2

IF
NUMBER_OF_HOUSEHOLDS IS>5000
AND
AVERAGE_DRIVE_TIME_MIN IS<15
AND
(STORE_USAGE IS 'Store Within a Store' OR STORE_USAGE IS 'Department')
AND
STORE_OPEN_HOURS IS>=16
THEN
Total Sale Transactions Count Quartile IS THE HIGHEST

What the Discovered Rules for the POS Flow Analysis Model Explain

The discovered rules describe the influence of trade area demographic characteristics and store characteristics on the amount of POS traffic at individual workstations over different periods throughout the day.

What the POS Flow Analysis Model Mines

This model mines the various attributes of Store and Location. It takes the binned variables one at a time for Transaction Type, Transaction Count, and Transaction Amount (Sales) as the target variable of an ABN model and DT model with a single feature and discovers rules described in terms of Store, Location, and State Demographic attributes.

Target Variables for the POS Flow Analysis Model

The rules are generated from the historical data across all time periods and store workstations. A total of twenty-seven ABN and twenty-seven DT models with a single feature are created using the following variables as targets in this order:

  1. Total Sale Transactions Count Quartile (TSTCQR)

  2. Total Sale Transactions Count Quintile (TSTCQN)

  3. Total Sale Transactions Count Decile (TSTCDE)

  4. Total Return Transactions Count Quartile (TRTCQR)

  5. Total Return Transactions Count Quintile (TRTCQN)

  6. Total Return Transactions Count Decile (TRTCDE)

  7. Total Void Transactions Count Quartile (TVTCQR)

  8. Total Void Transactions Count Quintile (TVTCQN)

  9. Total Void Transactions Count Decile (TVTCDE)

  10. Total Sale Items Count Quartile (TSICQR)

  11. Total Sale Items Count Quintile (TSICQN)

  12. Total Sale Items Count Decile (TSICDE)

  13. Total Return Items Count Quartile (TRICQR)

  14. Total Return Items Count Quintile (TRICQN)

  15. Total Return Items Count Decile (TRICDE)

  16. Total Void Items Count Quartile (TVICQR)

  17. Total Void Items Count Quintile (TVICQN)

  18. Total Void Items Count Decile (TVICDE)

  19. Total Sale Amount Quartile (TSAQR)

  20. Total Sale Amount Quintile (TSAQN)

  21. Total Sale Amount Decile (TSADE)

  22. Total Return Amount Quartile (TRAQR)

  23. Total Return Amount Quintile (TRAQN)

  24. Total Return Amount Decile (TRADE)

  25. Total Void Amount Quartile (TVAQR)

  26. Total Void Amount Quintile (TVAQN)

  27. Total Void Amount Decile (TVADE)

Source Variables for the POS Flow Analysis Model

The following attributes of store workstation and time periods are identified from the data warehouse tables as source variables for the ABN and DT models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Store Workstation ID

  • Workstation Name

  • POS Type

  • Equipment Type

  • Manufacturer Name

  • Terminal Model Number

  • Terminal Type (POS register, Goods receipt terminal, and others)

  • Outside Indicator

  • Store Name

  • Store Manager Name

  • Store Usage (Store, Store within a store, Department, Kiosk, and others)

  • Store Status (Under construction, New, and others)

  • Total Open Hours

  • Store Location Type (Free standing, Shopping Center, CBD, SBD, NBD, and others)

  • Primary Trade Area Code

  • Trade Area Coverage

  • Market Area Code

  • Market Area Type (Urban, Suburban, Rural, and others)

  • Market Area Population

  • Pull Factor

  • Total Commuter Population

  • Peak Season Population

  • Tourist Population

  • Average Drive Time

  • Number of Households

  • Average Household Size

  • Average Family Size

  • Per Capita Income

  • Average Number of Vehicles per Household

  • Shopping Center Type (Strip Center, Mall, and others)

  • Store Concept (Convenience, General Merchandise, Fashion oriented, and others)

  • Terrain (Mountain, Inland, Desert)

  • Total Built-up Area

  • Total Super Built-up Area

  • Number of Functional Months

  • Usable Area

  • Inventory Area

  • Selling Area

  • New Store Indicator

  • Store Price Index

  • Number of Levels of Floors

  • Number of Window Displays

  • Area of Window Displays

  • Fitting Rooms Available

  • Number of External Signs

  • Rest Rooms Available

  • Type Of Parking

  • Distance to Nearest Cross

  • Distance from Market Area Center

  • Store County or District

  • Store City

  • Store State or Province

  • State Population

  • State Sales

  • Store Country

  • Store World Region

  • Date Time Key

  • Hour of Day (1-24)

  • Calendar Day of Week (Sun-Sat)

  • Calendar Day of Month (1-30 or 31)

  • Holiday Indicator

  • Weekend Indicator

  • Calendar Week Number

  • Fiscal Week Number

  • Advertising Week Number

  • Planning Week Number

  • Calendar Half Month Number

  • Fiscal Half Month Number

  • Calendar Month Number

  • Fiscal Month Number

  • Advertising Period Number

  • Planning Period Number

  • Calendar Quarter Number

  • Fiscal Quarter Number

  • Advertising Quarter Number

  • Planning Quarter Number

  • Calendar Half Year Number

  • Fiscal Half Year Number

  • Calendar Year Number

  • Fiscal Year Number

  • Advertising Year Number

  • Planning Year Number

Columns Included in the Target Views of the POS Flow Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type

Store Loss Analysis Model

This model addresses the business problem of building a profile of organization (store) characteristics regarding shrinkage.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Examples of Desired Rules for the Store Loss Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
STORE IS NEW
and
NUMBER OF WINDOW DISPLAYS IS (4 - 5)
and
STORE DEPARTMENT IS (RETURN)
THEN
STORE THEFT AMOUNT IS THE HIGHEST

Desired Rules Example 2

IF
STORE IS NEW
and
NUMBER OF WINDOW DISPLAYS IS (4 - 5)
and
STORE DEPARTMENT IS (GIFT)
THEN
STORE THEFT AMOUNT IS THE LOWEST

What the Discovered Rules for the Store Loss Analysis ModelExplain

The discovered rules describe correlations between shrinkage and store characteristics.

What the Store Loss Analysis Model Mines

This analysis identifies the extent to which key store characteristics influence shrinkage and theft. This model mines the various attributes of stores. It takes the binned variables one at a time for Shrink and Thefts as the target variable of an ABN model and DT model with a single feature and discovers rules described in terms of store attributes.

Target Variables for the Store Loss Analysis Model

The rules are designed to be generated monthly. Therefore, eighteen ABN and eighteen DT models are created every month across all stores using the following variables as targets in this order:

  1. Total Shrink Count Quartile (TSCQR)

  2. Total Shrink Count Quintile (TSCQN)

  3. Total Shrink Count Decile (TSCDE)

  4. Total Shrink Amount Quartile (TSAQR)

  5. Total Shrink Amount Quintile (TSAQN)

  6. Total Shrink Amount Decile (TSADE)

  7. Shrink as a percentage of Sales Quartile (STSQR)

  8. Shrink as a percentage of Sales Quintile (STSQN)

  9. Shrink as a percentage of Sales Decile (STSDE)

  10. Total Theft Count Quartile (TTCQR)

  11. Total Theft Count Quintile (TTCQN)

  12. Total Theft Count Decile (TTCDE)

  13. Total Theft Amount Quartile (TTAQR)

  14. Total Theft Amount Quintile (TTAQN)

  15. Total Theft Amount Decile (TTADE)

  16. Theft as a percentage of Sales Quartile (TTSQR)

  17. Theft as a percentage of Sales Quintile (TTSQN)

  18. Theft as a percentage of Sales Decile (TTSDE)

Source Variables for the Store Loss Analysis Model

The following attributes of associates are identified from the data warehouse tables as source variables for the ABN and DT models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

  • Case Id Alt (PK)

  • Year

  • Month

  • Store ID

  • Store Name

  • Store Manager Name

  • Store Usage (Store, Store within a store, Department, Kiosk, and others)

  • Store Status (Under construction, New, and others)

  • Total Open Hours

  • Store Location Type (Free standing, Shopping Center, CBD, SBD, NBD, and others)

  • Primary Trade Area Code

  • Trade Area Coverage

  • Market Area Code

  • Market Area Type (Urban, Suburban, Rural, and others)

  • Market Area Population

  • Pull Factor

  • Total Commuter Population

  • Peak Season Population

  • Tourist Population

  • Average Drive Time

  • Number of Households

  • Average Household Size

  • Average Family Size

  • Per Capita Income

  • Average Number of Vehicles per Household

  • Shopping Center Type (Strip Center, Mall, and others)

  • Store Concept (Convenience, General Merchandise, Fashion oriented, and others)

  • Terrain (Mountain, Inland, Desert)

  • Total Built-up Area

  • Total Super Built-up Area

  • Number of Functional Months

  • Usable Area

  • Inventory Area

  • Selling Area

  • New Store Indicator

  • Store Price Index

  • Number of Levels of Floors

  • Number of Window Displays

  • Area of Window Displays

  • Fitting Rooms Available

  • Number of External Signs

  • Rest Rooms Available

  • Type Of Parking

  • Distance to Nearest Cross

  • Distance from Market Area Center

  • Store County or District

  • Store City

  • Store State or Province

  • State Population

  • State Sales

  • Store Country

  • Store World Region

Columns Included in the Target Views for the Store Loss Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Rule ID (PK)

  • Performance Measure

  • Measure Value

  • Associate Profile

  • Prediction Count

  • Record Count

  • Support

  • Confidence

  • Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

  • Analysis Name (PK)

  • Analysis Desc

  • Year (PK)

  • Month (PK)

  • Model Type

  • Model Type Desc

  • Model Name (PK)

  • Performance Measure

  • Attribute Name (PK)

  • Attribute Type