Skip to Content
Previous

ABC Analysis with SAP HANA PAL

By akula86

Using SAP HANA Predictive Analytics Library for ABC Analysis

You will learn

You will use the SAP HANA Predictive Analytics Library to create and run an algorithm to do analysis of data in separate groups (also known as “ABC” analysis). For that you will use SAP HANA multi-tenant database container (MDC) instance in SAP HCP. All tenant databases in the same system share the same system resources (memory and CPU cores) but each tenant database is fully isolated with its own database users, catalog, repository, persistence (data files and log files) and services. And now you can have HANA MDC instance in your HCP Trial account.

Details


Step 1: Check or create your SAP HANA MDC instance in HCP

In your SAP HCP Trial account’s cockpit go to Databases & Schemas under Persistence folder and check if you have database of the system type HANA MDC (<trial>) already may be from some earlier work. If you have, then please go to step 2 of this tutorial.

Databases and Schemas

If you do not have HANA MDC instance, then follow these steps to create the instance and to assign required roles to SYSTEM user. This will take about 10 minutes.

In your SAP HCP Cockpit go to Databases & Schemas and click New.

Databases and Schemas - New

Choose the HANA MDC (<trial>) as the Database System. Pick the Database ID for your instance - it should be all lowercase, only letters and digits.

Choose a password for the user SYSTEM. Please note it down, as you are the only one who is going to use it, and there is no way to reset it if lost.

New SAP HANA MDC definition

Now your SAP HANA MDC instance is being built, and this will take 6-10 minutes. You should see a message Database started successfully on the Events screen for the instance.

MDC Events

Now two more steps need to be done to give your SYSTEM user administration and developer roles. Never give the same user administration and developer roles in real-life productive scenarios! ;-)

Go to the Overview page of your db instance and click on Administration Tools: SAP HANA Cockpit.

Open HANA

Logon as user SYSTEM with the password you provided during the instance creation.

Logon as SYSTEM user

First time you logon as a SYSTEM the information screen pops up saying your user has no administration authorizations yet. Just click OK on this and Continue on the next pop up.

Confirm authorization assignment

Once the SAP HANA Database Administration screen is open, click on Manage Roles and Users to open Security Workbench.

Open Roles and Users

SAP HANA Web-based Development Workbench: Security opens in the new browser window. Expand Users, click once on SYSTEM user, and click green plus icon to add roles to the user.

Edit SYSTEM user

Search for roles containing develop in names, select all of them and click OK to add them to Granted Roles.

Select Developer roles

Press Ctrl+S or click Save icon. Now your SYSTEM user has developer roles assigned. Again, never do this in productive systems!

Close Security Workbench and SAP HANA Database Administration screens and move to the next step.

Please log in to access the content
Step 2: Open SAP HANA Web-based Development Workbench

The database should be in the state STARTED. Click on SAP HANA Web-based Development Workbench.

Database Overview screen

Please log in to access the content
Step 3: Open the Catalog tool

On the SAP HANA Web-based Development Workbench screen click Catalog to open Catalog tool, which allows you to create, edit, execute and manage SQL catalog artifacts in the SAP HANA database

SAP HANA Web-based Development Workbench

Then in the Catalog tool click on SQL icon to open a new SQL Console. Logon using your developer user. For brevity, you can use the user SYSTEM, but ideally a separate user should be used.

Type the following code and run it using the Run icon or pressing F8 on the keyboard:

SELECT * FROM "SYS"."AFL_FUNCTIONS" WHERE AREA_NAME = 'AFLPAL'
ORDER BY FUNCTION_NAME;

If Predictive Analytics Library (PAL) is installed in the SAP HANA instance, then the query will return the list of PAL predictive analytics algorithms available in the library. You should be able to find ABC among PAL functions and this is the one you are going to use in this tutorial.

SAP HANA SQL Console

Please log in to access the content
Step 4: Introduction to PAL and ABC algorithm

The Predictive Analysis Library (PAL) defines functions that can be called from within SQLScript procedures to perform analytic algorithms. PAL includes classic and universal predictive analysis algorithms in different data-mining categories, like Clustering, Classification, Regression, Association, Social Network Analysis etc.

ABC Analysis algorithm is used to classify objects (such as customers or products) based on a particular measure (such as revenue or profit). It suggests that inventories of an organization are not of equal value, thus can be grouped into three categories (A, B, and C) by their estimated importance. “A” items are very important for an organization. “B” items are of medium importance. “C” items are of the least importance.

An example of ABC classification is as follows:

  • “A” items – 20% of the items (customers) accounts for 70% of the revenue.
  • “B” items – 30% of the items (customers) accounts for 20% of the revenue.
  • “C” items – 50% of the items (customers) accounts for 10% of the revenue.

In this tutorial you will classify customers into ABC groups accordingly to the revenue they generate based on some dummy data.

Please log in to access the content
Step 5: Create a schema and load input data

First create a new schema TUTORIAL_PAL_ABC and a table with some dummy data. As in the previous step just paste this code into SQL Editor and hit Run.

--SET SCHEMA FOR THE TUTORIAL
CREATE SCHEMA "TUTORIAL_PAL_ABC";
SET SCHEMA "TUTORIAL_PAL_ABC";

CREATE COLUMN TABLE DUMMYDATA (
  accountnr varchar(11) default NULL,
  location varchar(100) default NULL,
  total_amount double default NULL
);

You should see the TUTORIAL_PAL_ABC schema and the DUMMYDATA table created in the Catalog Explorer on the left:

Catalog Explorer

Next load 100 records with randomly generated data from the file pal_abc_data.txt - This is actually an SQL file, where you find INSERT statements like below. Copy all of them into SQL Editor and execute.

-- Load data from pal_abc_data.txt file, which contains INSERT statements like: 
-- INSERT INTO DUMMYDATA (accountnr,location,total_amount) VALUES ('51670856799','Sao Tome and Principe',640);
-- INSERT INTO DUMMYDATA (accountnr,location,total_amount) VALUES ('19947726599','Ukraine',270);
-- and so on...

Commit inserts and verify all records have been loaded.

COMMIT;
SELECT COUNT(*) FROM "TUTORIAL_PAL_ABC"."DUMMYDATA";

Results of data load

Now create a view that will be the input to the ABC algorithm.

CREATE VIEW "TUTORIAL_PAL_ABC"."V_ABC_INPUT" as (SELECT ACCOUNTNR as SOMEOBJECT, TOTAL_AMOUNT as SOMEAMOUNT from "DUMMYDATA");

SELECT COUNT(*) FROM "TUTORIAL_PAL_ABC"."V_ABC_INPUT";
--Should return the 100 records
Please log in to access the content
Step 6: Create the wrap procedure for ABC PAL function

To use PAL functions, you must do the following:

  • Generate a procedure that wraps the PAL function with your specific data structures,
  • Call the procedure, for example, from an SQLScript procedure.

You can find more details in SAP HANA Predictive Analysis Library (PAL) Reference document at http://help.sap.com/hana_platform.

Use the following code to generate the wrap procedure:

 --Prepare Table Types
 DROP TYPE PAL_ABC_DATA_VIEW_T;
 CREATE TYPE PAL_ABC_DATA_VIEW_T AS TABLE ("SOMEOBJECT" NVARCHAR(80), "SOMEAMOUNT" DOUBLE);  
 DROP TYPE PAL_CONTROL_T;
 CREATE TYPE PAL_CONTROL_T AS TABLE("Name" VARCHAR(100), "intArgs" INT, "doubleArgs" DOUBLE, "strArgs" VARCHAR(100));  
 DROP TYPE PAL_ABC_RESULT_T;
 CREATE TYPE PAL_ABC_RESULT_T AS TABLE("ABC" NVARCHAR(10), "SOMEOBJECT" NVARCHAR(80));

 --Create the proc using the wrapper
 --Define and populate parameters table
 DROP TABLE PAL_ABC_PDATA_TBL;
 CREATE COLUMN TABLE PAL_ABC_PDATA_TBL(
 "POSITION" INT,
 "SCHEMA_NAME" NVARCHAR(256),
 "TYPE_NAME" NVARCHAR(256),
 "PARAMETER_TYPE" VARCHAR(7)
 );
 INSERT INTO PAL_ABC_PDATA_TBL VALUES (1,'TUTORIAL_PAL_ABC','PAL_ABC_DATA_VIEW_T', 'in');  
 INSERT INTO PAL_ABC_PDATA_TBL VALUES (2,'TUTORIAL_PAL_ABC','PAL_CONTROL_T', 'in');  
 INSERT INTO PAL_ABC_PDATA_TBL VALUES (3,'TUTORIAL_PAL_ABC','PAL_ABC_RESULT_T','out');

 --Call the wrapper procedure to generate our ABC calculation procedure
 CALL SYS.AFLLANG_WRAPPER_PROCEDURE_DROP('TUTORIAL_PAL_ABC', 'PAL_ABC_CALCULATE');
 CALL SYS.AFLLANG_WRAPPER_PROCEDURE_CREATE ('AFLPAL', 'ABC', 'TUTORIAL_PAL_ABC', 'PAL_ABC_CALCULATE', "TUTORIAL_PAL_ABC"."PAL_ABC_PDATA_TBL");

You should see following objects created in the TUTORIAL_PAL_ABC schema:

Catalog Explorer

Please log in to access the content
Step 7: Run the wrapper procedure and check the results

Now execute the procedure with the desired allocation of A, B, and C groups and desired nr of parallel calculation threads. In this example you will have 70%-20%-10% distribution and 5 threads.

--Prepare and execute the procedure with ABC Analysis
--Prepare parameters table
DROP TABLE "TUTORIAL_PAL_ABC"."PAL_CONTROL_TBL";
CREATE COLUMN TABLE "TUTORIAL_PAL_ABC"."PAL_CONTROL_TBL"
("Name" NVARCHAR(100) NOT NULL,
 "intArgs" INTEGER,
 "doubleArgs" DECIMAL(15, 2),
 "strArgs" NVARCHAR(100),
 PRIMARY KEY ("Name"));

TRUNCATE TABLE "TUTORIAL_PAL_ABC"."PAL_CONTROL_TBL";
INSERT INTO "TUTORIAL_PAL_ABC"."PAL_CONTROL_TBL"
SELECT 'THREAD_NUMBER',5,null,null from DUMMY UNION ALL
SELECT 'PERCENT_A',null,0.70,null from DUMMY UNION ALL
SELECT 'PERCENT_B',null,0.20,null from DUMMY UNION ALL
SELECT 'PERCENT_C',null,0.10,null from DUMMY;

--Execute the procedure
DROP TABLE "TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL";
CREATE COLUMN TABLE "TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL"
("ABC" NVARCHAR(10) NOT NULL,
 "SOMEOBJECT" NVARCHAR(100) NOT NULL,
 PRIMARY KEY ("ABC", "SOMEOBJECT"));
TRUNCATE TABLE "TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL";

CALL "TUTORIAL_PAL_ABC"."PAL_ABC_CALCULATE"("TUTORIAL_PAL_ABC"."V_ABC_INPUT","TUTORIAL_PAL_ABC"."PAL_CONTROL_TBL","TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL") WITH OVERVIEW;

As a result the table "TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL" should be populated with the results of the ABC Analysis with account numbers assigned to particular A, B, or C group based on their revenue.

Please log in to access the content
Step 8: Check your results

Check the results. Firstly, check what group each account has been assigned to.

SELECT "ABC" AS ABC, T."SOMEOBJECT" AS COMPANY, "SOMEAMOUNT" AS GROSS_AMOUNT
FROM "TUTORIAL_PAL_ABC"."V_ABC_INPUT" AS T INNER JOIN "TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL" AS R ON T."SOMEOBJECT" = R."SOMEOBJECT"
ORDER BY 1, 3 desc;

ABC Groups assigned to accounts

Secondly, using SAP HANA’s SQL window functions you can calculate percentages of the total revenue falling into particular groups and compare to desired 70-20-10 distribution.

sql SELECT "ABC" AS ABC, count(*) AS NR_OF_ACCOUNTS, sum("SOMEAMOUNT") AS GROSS_AMOUNT, round(sum("SOMEAMOUNT") / (sum(sum("SOMEAMOUNT") ) over ()),2) AS PT_SHARE FROM "TUTORIAL_PAL_ABC"."V_ABC_INPUT" AS T INNER JOIN "TUTORIAL_PAL_ABC"."PAL_ABC_RESULT_TBL" AS R ON T."SOMEOBJECT" = R."SOMEOBJECT" GROUP BY "ABC" ORDER BY 1;

ABC Groups calculation verification

In real-world distribution you would expect group A has a smaller count, but this is randomly generated data.

You have run your first PAL algorithm.

Final note: the SAP HANA MDC instance is yours and you can use it as well outside of this tutorial. Just remember that it is stopped automatically after 12 hours, and then deleted if not re-started during 7 days. You will get email notifications before the instance is scheduled for deletion, so that you can go to your SAP HCP Trial account to start this instance and to prevent it from being removed.

And yes, use it to learn SAP HANA programming skills!!

Please log in to access the content

Next Steps

Updated 09/24/2016

Time to Complete

20 Min.

Beginner

Tags

Prerequisites

Next Steps

Next
Back to top