The QlikView Incremental Load GOTCHA!

Qlik touts the Incremental Load paradigm as a way to improve the performance of your load scripts.  The idea with an incremental load is to load from a (usually non-QVD) data source and then CONCATENATE that table with a load from a QVD, but filtering the concatenating (QVD) data with the filter:

WHERE NOT EXISTS (fieldName)

N.B. EXISTS() also takes a second parameter (an expression) but for this blog post, I’m ignoring it.

The idea is that – for each row in the concatenating QVD data set – if the value for fieldName of that row already exists in the (non-QVD) data set.  ‘EXISTS’ will return TRUE and, therefore, ‘NOT EXISTS’ will be FALSE and that row in the QVD data set will not be concatenated with the (non-QVD) data set.

In this way, only those rows in the QVD data that are not in the (non-QVD) data will be added to the latter.  Which is what we want.

However, things are not so easy.  The Qlik Help states:

EXISTS() determines whether a specific field value has already been loaded into the field in the load script… The field must exist in the data loaded so far by the script.

I’ve emphasised ‘so far’ as the practical impact of these two words cannot be understated.  It appears that the check whether the value of fieldName exists, is done repeatedly – after each row from the concatenating QVD file is loaded.  This means:

If you have more than one row in your (concatenating) QVD data that has the same value for fieldName, only the first row of that QVD will be concatenated and the remaining rows (having the same value for fieldName) will not be concatenated (as there is now a row in the table having that value for fieldname, so ‘NOT EXISTS’ will return FALSE and those rows will not be concatenated.

The way around this, of course, is to make sure that the values for fieldName are unique in both data sets.  This will usually mean you need to create a composite key from more than one field as, mentioned above, EXISTS only accepts a single fieldName parameter.

For another explanation, with an example, see this post:

https://community.qlik.com/docs/DOC-7020