This page describes my journey of creating a Qlik Sense app for WPP.
REDCap REST Connection
This was the easiest connection I had to create. All I needed was my project's API token, which I had requested earlier and copied from REDCap.
Also, make sure to add the following to Query Headers:
Content-Type application/x-www-form-urlencoded
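For reference, a minimal form-encoded body for exporting records looks roughly like this (token, content, format, and type are standard REDCap API parameters; treat it as a sketch, since the exact parameters depend on what you export):
token=[your_project_api_token]&content=record&format=json&type=flat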
Blackbaud Grantmaking API Call
API Key
Getting any Blackbaud Grantmaking (BBGM) API call to work was a long, complex journey.
WPP purchased an API key from Blackbaud, and we started building the REST connection from Qlik Sense.
Bearer Token
Based on the BBGM API documentation:
Blackbaud Grantmaking API Documentation
Every call requires a Bearer token (more on Bearer tokens here: Bearer Authentication). The bearer token is generated by calling the me-auth POST endpoint:
https://api.microedge.com/auth/token/me-auth
The request body looks like this:
{
"userId": [userId_based_on_key_file],
"privateKey": [actual_key_based_on_key_file]
}
The connection is created through the REST connector, with the following header:
Content-Type application/json
The response is in JSON format and looks like this:
{
"authenticated": true,
"token": "a_very_long_jibbrish_string",
"tokenExpires": 1579725270,
"clientId": "wpp"
}
The bearer is the long string value of "token", and it expires half an hour after it has been created.
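Assuming "tokenExpires" is a Unix epoch timestamp (its value suggests so), it can be converted to a readable time in Qlik script, where 25569 is the Qlik serial date for 1970-01-01 (this also presumes "tokenExpires" is selected in the SQL SELECT shown later):
LET vTokenExpires = Timestamp(25569 + peek('tokenExpires',0,'RestConnectorMasterTable')/86400);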
Requests Endpoint
Then I created a second REST connection, where I used the Bearer token. The endpoint is a POST call to getRequests:
https://api.microedge.com/goapi/request/getRequests
and I retrieve only approved requests. This is the request body:
{
"criteria":{
"disposition":"Approved"
}
}
with the following Query headers:
Authorization Bearer [token_retrieved_from_previous_step]
Content-Type application/json
Very Important Notes:
- The header value is entered as the word "Bearer", a space, and then the actual retrieved token.
- Every time we want to change something in this connection or do an "Insert Script", we need to replace the old token with a fresh one.
Issues I faced and solutions
Token expiry
The first issue we faced was that, because the token expires, we were unable to schedule the load.
The connections had to be manipulated manually:
RestConnectorMasterTable:
SQL SELECT "token", "__KEY_root"
FROM JSON (wrap on) "root" PK "__KEY_root";
And the bearer was retrieved out of that and assigned to a variable, with the "Bearer " prefix prepended as noted above:
LET vBBGMBearer = 'Bearer ' & peek('token',0,'RestConnectorMasterTable');
Now I was ready to run the second call. I used the "Insert Script" functionality from "Select Data" and ran it using "With Connection":
With Connection ( HTTPHEADER "Authorization" "$(vBBGMBearer)" );
And I was the happiest person on earth when it ran successfully.
Pagination Limitation
Soon I realized that I was not getting all the data. After going through the API documentation, I learned that each call retrieves only 20 records.
I could send a page number to get more data, but the response had no "nextURL" to plug into the connection's pagination settings, so I had to set pagination up manually.
That meant manipulating the request body in my script and sending it along with the call. After hours of experimentation, I got it working. Note that every double quote in the body is doubled, because that is how a literal quote is escaped inside the double-quoted BODY string of a With Connection clause:
Let vBody = '{"criteria":{"disposition":"Approved"},"perPage":' & '$(vRowCountLimit)' & ',"page":' & '$(vPageNumber)' & '}';
Let vBody = replace(vBody, '"', chr(34)&chr(34));
And I was able to send it through the "With Connection" clause. The ending of my SQL SELECT script now looked like this:
With Connection ( HTTPHEADER "Authorization" "$(vBBGMBearer)", BODY "$(vBody)" );
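For example, with hypothetical values of 100 for vRowCountLimit and 2 for vPageNumber, vBody ends up holding:
{""criteria"":{""disposition"":""Approved""},""perPage"":100,""page"":2}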
The Loop!
Now that I had one successful run, I had to wrap the calls in a loop to get all the records, which number far more than 20.
And in order to capture all the data, I had to concatenate each round onto the existing data:
Do while ...
Concatenate('RequestsTable')
. . .
Loop;
STORE RequestsTable INTO [Lib://WPP_QVDs/WPP_BBGM.QVD] (qvd);
Drop Table RequestsTable;
And in order to avoid an infinite loop, I applied exit logic:
Exit do when (('$(vNewRecords)' = 0) or '$(vPageNumber)' = '$(vCallLimit)');
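Putting the pieces together, the loop skeleton looked roughly like this (a simplified sketch: the initial values are assumptions, and vNewRecords stands in for whatever bookkeeping counts the records returned by the last call):
Let vPageNumber = 0;
Let vRowCountLimit = 100;  // assumed page size
Let vCallLimit = 500;      // assumed safety cap on the number of API calls
Do
LET vPageNumber = $(vPageNumber) + 1;
Let vBody = '{"criteria":{"disposition":"Approved"},"perPage":' & '$(vRowCountLimit)' & ',"page":' & '$(vPageNumber)' & '}';
Let vBody = replace(vBody, '"', chr(34)&chr(34));
// ... SQL SELECT ... With Connection ( HTTPHEADER "Authorization" "$(vBBGMBearer)", BODY "$(vBody)" );
// ... Concatenate the new rows onto RequestsTable and compute vNewRecords ...
Exit do when (('$(vNewRecords)' = 0) or '$(vPageNumber)' = '$(vCallLimit)');
Loop;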
Messy Data Dumps
The data is retrieved in JSON format.
A surprise hit me when I noticed that Qlik Sense does not handle nested JSON gracefully.
That is why the load creates multiple tables: for every sub-select, one table is created.
Selecting just what I needed did not go well. Some data was stored horizontally and some vertically, which ended up giving me more than one row per record:
Set Analysis
The first thing that came to mind was using "Set Analysis".
After trying it on a different table for the OAC/PERC Amount, I noticed that set analysis overrides user selections, so I had to abandon that solution.
I did keep it for the OAC/PERC Amount, however, which is a separate load that runs after this one.
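For illustration, the kind of set analysis expression I mean looks like this (a sketch; [allocationAmount] is used as the measure here, but the measure field is whichever amount the chart sums):
Sum({<[table_id] = {6}>} [allocationAmount])
The {< >} modifier pins [table_id] to 6 regardless of what the user has selected, which is exactly why set analysis overrides user selections.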
Setting Expressions with If Statement
At this point I thought that if I created an if statement for each table, the proper values would show.
I tried setting up a filter for "Fund", which should include only one of these two values:
- Oversight and Advisory Committee
- Partnership Education and Research Committee
So, I set the following if statement on my table:
=If(([table_id] = 6), [Fund])
And the data still did not line up; there was also an empty row for the missing classification type:
However, this is an issue with tables only, and not so much with charts that show a single dimension and exclude null values (including null values causes the same issue as in tables):
Having said that, excluding null values here hides genuinely missing values as well. For example, in the following chart, one record should have a "missing" value, which is unfortunately completely hidden when nulls are excluded and unrecognizable when nulls are included:
No Composite Key issue
Why does the data behave like that? Here is a simple diagram of how the associations look behind the scenes:
The "codings" table links to the classification tables (codeTable, level1, level2, ...) through "__KEY_codings". The "codings" table also links to "requests" through "__KEY_requests".
The issue is that the classification tables have no other means of knowing about requests. If, for example, there were a composite key made out of both __KEY_codings and __KEY_requests, passed down to the classification tables, they would know about it.
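A rough sketch of what passing such a composite key down could look like, using a mapping table to look up the request key for each coding (hypothetical: MapCodingsToRequests and %KeyCodingsRequests are made-up names, and this is the fix the model lacks rather than something I implemented):
MapCodingsToRequests:
MAPPING LOAD [__KEY_codings], [__FK_codings]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codings]);

[codeTable]:
LOAD [id_u3],
[description_u1],
[__FK_codeTable] & '|' & ApplyMap('MapCodingsToRequests', [__FK_codeTable]) AS [%KeyCodingsRequests]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codeTable]);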
As things actually stand, when we choose [id_u3] = 6 from codeTable, which represents the "Fund" and "Subfund" classifications at the same time, and then connect it to requests, the table fills with twice the data: once with Fund present and Subfund absent, and once the other way around. It also loads two rows for Subfund, since there are in general two funds, but leaves one of them blank. This is a very confusing data structure.
In order to achieve what I was looking for, one of the options was to have multiple loads of classification tables.
Multiple Loads
So, the only other option to get everything lined up, which I had been trying hard to avoid so far, was to load the nested data multiple times, once per classification.
Here is a list of all classifications and their associated table_id:
- 4 Program Area
- 6 Fund
- 14 Internal Program
- 15 Gender
- 16 Age Group
- 17 Ethnicity
- 18 Population Served
- 20 Geographical Area Served
- 102 Research categories
- 105 OAC Primary Categories
- 106 Request Area Served County
- 107 Fund:Subfund
- 109 PERC Primary Categories
- 115 Use of Human Subjects
- 116 Use of Animal Subjects
- 117 Use of Biological Materials
- 118 Use of Radioactive Materials
- 119 Social Determinants of Health
- 121 Five-Year Plan Goals Addressed
- 123 Urban/Rural
Altogether 20 classifications at the time of this writing.
First, I limited the table to 6 (Fund) to see if it would work at all:
[codeTable]:
LOAD [id_u3],
[description_u1],
[__FK_codeTable] AS [__KEY_codings]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codeTable]) and [id_u3] = 6;
and it didn't:
Then I did an inner join between the classification tables and the "codings" table, and that worked:
Now I needed to load the classification tables multiple times. Since the tables were already loaded and inner joined (even under a different name), the new loads returned nothing. I had to use a different key name, but changing the key name caused the same issue as before, when we tried to bring the tables into the view through an if statement:
So, I could only do one inner join at a time to the "codings" table, which holds all the keys. The reason is that when you inner join, Qlik merges all the joined tables into one.
So, the following script:
[codings]:
LOAD [id_u9],
[allocationPercentage],
[allocationAmount],
[isPrimary],
[__KEY_codings],
[__FK_codings] AS [__KEY_requests]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codings]);

inner Join
[codeTable]:
LOAD [id_u3],
[__FK_codeTable] AS [__KEY_codings]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codeTable]) and [id_u3] = 6;

inner Join
[level1]:
LOAD [id_u4],
[code_u1] as [Fund Code],
[description_u2] as [Fund],
[__FK_level1] AS [__KEY_codings]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_level1]);
This takes all the keys from the "codings" table and all the non-key fields from the other two, and combines them together as shown below:
This meant I had to create one codings table per classification, in addition to the other classification tables. And THAT worked:
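In other words, the working pattern became one block like the following per classification (a rough sketch for Fund; the renamed fields such as [id_u9_fund] and [__KEY_codings_fund] are my naming, chosen so the per-classification copies do not auto-link to each other, and the real script may differ in detail):
[codingsFund]:
LOAD [id_u9] AS [id_u9_fund],
[__KEY_codings] AS [__KEY_codings_fund],
[__FK_codings] AS [__KEY_requests]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codings]);

inner Join ([codingsFund])
LOAD [__FK_codeTable] AS [__KEY_codings_fund]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codeTable]) and [id_u3] = 6;

inner Join ([codingsFund])
LOAD [description_u2] AS [Fund],
[__FK_level1] AS [__KEY_codings_fund]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_level1]);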
Duplicate Key issue
After the load with sample data (20 records, i.e. one API call) was finally successful, I ran a full data load and everything went back to being messy.
After hours of struggle, I realized that every time a page of data was retrieved, the keys restarted at 0: page 1 produced keys 0 through 19, and so did page 2. This created multiple records with the same key, and since the key was the means of linking the tables, the data linked improperly.
Manipulating the data manually
I had to change the key manually in order to get proper linking. So I tried injecting the API call's page number into the script, and it would not work.
Here are some examples of what I tried:
'$(vPageNumber)'
"$(vPageNumber)"
@vPageNumber='$(vPageNumber)'
chr(34)&$(vPageNumber)'&chr(34)
With all the different combinations you see above, nothing would come back. I even tried hard-coding a value just to test what the issue was, and even the hard-coded value would not come back.
Finally, after a substantial amount of research, I realized that I cannot mix Qlik syntax into the SQL SELECT script, but I can use it in a LOAD statement, since LOAD uses Qlik syntax.
In order to do that, I had to move the load inside the loop, and it worked.
Also, since the load takes a long time, I included the token generator call inside the loop as well.
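Conceptually, the fix inside the loop looks like this: the LOAD rewrites every key field by appending the current page number, so keys from different pages can no longer collide (a sketch showing only the codings keys; the real script applies this to every key field):
[codings]:
LOAD [id_u9],
[__KEY_codings] & '-' & '$(vPageNumber)' AS [__KEY_codings],
[__FK_codings] & '-' & '$(vPageNumber)' AS [__KEY_requests]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codings]);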
Race condition
After a couple of runs where I got to see successful results and started cleaning up, something new came up: my API calls were failing for lack of authorization! It would not occur in debug mode, which pointed me toward a race condition.
A race condition is when different parts of an application compete for resources and, as a result, one piece runs too soon. That was happening here.
So, after token generation, I set a condition to check whether the token had a value yet; if not, the script keeps going with the old token.
LET vTempBearer = peek('token',0,'RestConnectorMasterTable');
LET vTempBearer = '$(vTempBearer)';
If vTempBearer <> '' Then
LET vBBGMBearer = 'Bearer ' & '$(vTempBearer)';
End If
Exit do when '$(vBBGMBearer)' = '';
I also added an exit condition in case an empty token came back.
QVD or No QVD, That is the Question!
So, I put the entire script in the load app and soon realized that in Qlik a QVD can hold only one table. That meant I could not load the entire data set in the load app and hand-pick data in the visualization app.
I ended up moving the entire BBGM API script over to the visualization app without saving anything in any QVD.
That is the end of the long journey of calling the BBGM API from Qlik.
The issue of Multiple Tables
Looking at the back end, I noticed multiple instances of the same table, one instance per API call:
The solution I tried was to concatenate them onto the already existing data. Qlik has a limitation that two prefixes cannot be used on one LOAD; in my case, I could not use Inner Join and Concatenate(Table) together.
I ended up using Concatenate only on the tables I did not force an Inner Join on. That also fixed the multiple-table issue for the tables joined with them.
//Create necessary tables for concatenation in the loop.
//If we don't create them here, first run will fail.
organization: LOAD 0 as id AutoGenerate 0;
requests: LOAD 0 as id_u11 AutoGenerate 0;
codingsFund: LOAD 0 as id_u9_fund AutoGenerate 0;
codingsSubfund: LOAD 0 as id_u9_subfund AutoGenerate 0;
...
Concatenate([organization]) LOAD
This, however, did not fix the tables that were not part of any join:
The main tables were fixed by putting square brackets around their names, but then something unexpected happened with the "requests" concatenation.
With concatenation applied, the number of rows ballooned to an unreasonable size (more than 10 digits),
and the data was also inaccurate:
The sync created multiple rows, and in the end I had to leave concatenation off these tables.
Token didn't take new values
The multiple-tables issue caused another problem. When the token was retrieved, it was loaded into a table, and I read it from that table and used it. On the next run, however, Qlik automatically created a new table with a new name, so the old table name was no longer valid.
I forced the page number into the table name and read the token based on that, which fixed the issue:
if vPageNumber > 0 then
LET vSuffix = '-'&'$(vPageNumber)';
end if
Let RestConnectorMasterTable = 'RestConnectorMasterTable'&'$(vSuffix)';
LIB CONNECT TO ...;
'$(RestConnectorMasterTable)':
SQL SELECT "token", "__KEY_root"
FROM JSON (wrap on) "root" PK "__KEY_root";
// Read the token into a variable to be used in the next query/api call
LET vTempBearer = peek('token',0,'$(RestConnectorMasterTable)');
Drop Table '$(RestConnectorMasterTable)';
Forced concatenation issue
After cleaning up all the concatenation code and running the load, I noticed that RequestsTable, the table that collects the data returned by the SQL SELECT statement before the LOAD statements run, forces concatenation onto itself. This caused the script to run for a very long time and mostly fail. I had to create an empty RequestsTable before the loop and wipe its existing data before each run.
RequestsTable: LOAD 0 as ID AutoGenerate 0;

// START Requests API Call and add the returned results to previous ones
LIB CONNECT TO 'WPP Blackbaud Grantmaking (ad_ahmadijooban)';
RequestsTable:
SQL SELECT
Separating Load from Visualization
To do that, one giant load was created inside the loop, and all the API calls moved into the load app. The result was saved in one QVD.
The version of the load that breaks the table into smaller ones with links and joins stayed inside the visualization app.
This way, concatenating all the tables into one big table to save in one QVD was possible without any issues, mostly because no linking was happening yet.
Join vs Keep
Replacing the joins with Keep didn't work. It only loaded data from the first set of classification tables joined to the "codings" table, and it didn't line up the data properly either:
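For reference, the Keep variant I tried looked roughly like this (same shape as the join version, with the prefix swapped):
[codings]:
LOAD [id_u9], [__KEY_codings], [__FK_codings] AS [__KEY_requests]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codings]);

inner Keep ([codings])
LOAD [id_u3], [__FK_codeTable] AS [__KEY_codings]
RESIDENT RestConnectorMasterTable
WHERE NOT IsNull([__FK_codeTable]) and [id_u3] = 6;

Unlike Join, Keep filters the two tables against each other but stores them separately, so the classification fields are never combined into one table the way the inner joins combine them.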