DataStage Range Lookup Failure – What Gives?
I just spent the majority of my morning trying to figure out why in the world my range lookup kept failing. Hopefully this will save you some time in troubleshooting your range lookup failures. To cut right to the chase, my resolution involved sorting my reference source prior to doing the lookup. Adding a sort stage with the appropriate sort keys prior to doing the lookup solved my problem. I probably learned this in a training at some point, but I just didn’t retain it. So I learned it again the hard way.
For those who’d like to see a bit more of the details involved, feel free to have a look below.
Scenario:
I have a list of funding transactions in my data pipeline that lack some information, namely, the student’s major at the time he/she was funded. They key phrase there is “at the time he/she was funded”.
To assist in supplementing my fact record (the funding transaction) with this student demographic information, I have constructed a snapshot table in the data warehouse that captures a student’s major as a sort of “slowly changing dimension”. Each time a change in the student’s major is detected, new records are inserted and effective and expiration dates are adjusted. I’d like to supplement my funding record with this student major information by doing a lookup. Here is the information I have to work with, just to be clear:
My source data pipeline starts with my funding transactions. It contains
- Student ID (Key)
- Transaction Date (will help me with my lookup)
-
Other columns that aren’t relevant to this particular task
-
-
Student ID (Key)
- Effective Date of the student’s major (will help with lookup)
-
Expiration Date of the student’s major (will help with lookup)
- Student’s major (the information I want to use to supplement my fact row)
Range Lookup (My [failing] Setup):
To accomplish my goal of supplementing the fact record with the student’s major, I needed to do a range lookup. Below are sample screenshots of how I was attempting to set things up:
View of the job design surface:
View of Lookup Stage:
Notice that I use StudentID as a key so that I match the funding row to a particular student. Now, to get his/her major at the time they were funded, I do a range lookup on the DateFunded column…
View of Range Lookup on DateFunded:
After compiling the job and running it, I would receive this error message:
Lookup_StudentFieldOfStudy,0: Job aborting due to lookup failure on link: List_StudentFieldOfStudy
Problem Solved:
As stated in the introduction, to fix this problem, I simply added a sort stage prior to the lookup stage on the link serving as my reference (that is, on the link containing the value I’m trying to look up, which in my case was the student’s major/field of study). Here are some screen shots of how my job looked after I fixed it:
View of the job design surface:
View of Sort Stage:
An important note about the sort stage. I needed to include two sort keys – one to sort the students, and another to sort the effective dates of each student’s major(s).
Everything else in the job remained exactly the same. The results were a successfully running job with no lookup failures.
-
- Effective Date of the student’s major (will help with lookup)
- Transaction Date (will help me with my lookup)