November 1, 2004
Last week I discussed creating SQL using SQL, and promised
to show a specific application of this. Today we'll use this method
to scan a database to search for a particular string.
This actually came up again for me last week. I needed to find
all tables containing rows for a certain employee where either EMPL_RCD
or BENEFIT_RCD_NBR is not 0. This requires finding all tables containing
an EMPLID column plus either an EMPL_RCD column or a BENEFIT_RCD_NBR column,
and then searching for rows meeting the criteria.
I decided to search the PSRECFIELD and PSRECDEFN tables. PSRECFIELD gives
us the column (field) names, and PSRECDEFN allows us to select only
real tables and ignore views, derived records, etc. (This is Oracle syntax.)
SELECT 'UNION SELECT
''PS_' || F.RECNAME || ''','''
|| F.FIELDNAME || ''',COUNT(*) FROM PS_'
|| ' WHERE EMPLID = ''12345'' AND '
|| F.FIELDNAME || ' <> 0 HAVING COUNT(*) > 0'
FROM PSRECFIELD F
, PSRECDEFN R
WHERE F.FIELDNAME IN ('EMPL_RCD','BENEFIT_RCD_NBR')
AND R.RECNAME = F.RECNAME
AND R.RECTYPE = 0
WHERE RECNAME = F.RECNAME
AND FIELDNAME = 'EMPLID'
ORDER BY F.RECNAME
Running this SQL yielded the following:
UNION SELECT 'PS_ABSENCE_HIST','EMPL_RCD',COUNT(*)
FROM PS_ABSENCE_HIST WHERE EMPLID = '12345'
AND EMPL_RCD <> 0 HAVING COUNT(*) > 0
...plus about 830 more similar lines. Changing the "UNION SELECT" to
just "SELECT" on the first line results in one massive SQL
statement. In our Oracle database, containing about 12,000
employees, this ran in only about 20 seconds and produced
PS_ADDL_PAY_DATA EMPL_RCD 2
PS_ADDL_PAY_EFFDT EMPL_RCD 2
PS_ADDL_PAY_ERNCD EMPL_RCD 2
PS_BAS_PARTIC BENEFIT_RCD_NBR 2
PS_BAS_PARTIC_COST BENEFIT_RCD_NBR 144
PS_BAS_PARTIC_DPND BENEFIT_RCD_NBR 4
PS_BAS_PARTIC_ELIG BENEFIT_RCD_NBR 4
PS_BAS_PARTIC_OPTN BENEFIT_RCD_NBR 172
PS_BAS_PARTIC_PLAN BENEFIT_RCD_NBR 40
PS_BENEFIT_PARTIC EMPL_RCD 19
PS_BEN_PROG_PARTIC EMPL_RCD 1
PS_DISABILITY_BEN EMPL_RCD 2
PS_FSA_BENEFIT EMPL_RCD 2
PS_HEALTH_BENEFIT EMPL_RCD 8
PS_SAVINGS_PLAN EMPL_RCD 1
I took this information and created another script to retrieve
the actual data:
SELECT * FROM PS_ADDL_PAY_DATA WHERE EMPLID = '12345' AND EMPL_RCD <> 0
Instead of creating this last script manually, I could have
had the previous script generate it. But that would have been
a case of writing SQL to generate SQL that generates SQL. It's
possible but at some point it becomes, well, too much to think about!
[I did this in a followup
written later: see the January 31, 2005
There are two very important caveats related to this. First,
I ignored subrecords. These are useful during development but
are a big annoyance when you are trying to use the PSRECDEFN table.
One way around this difficulty is to query the database's internal
tables instead of PSRECDEFN and PSRECFIELD. Another way is to
expand the subrecords. (We would probably want to use SQR for that.)
In this particular case, I did not need to deal with subrecords.
A second important point is that you might not know the names
of all of the fields that can store a particular type of data.
For example, I was looking for an employee ID in field EMPLID.
But there are other fields that store employee IDs--SUPERVISOR_ID,
for example. Again, in this particular case, I was not
interested in any fields other than EMPLID.
I'll deal with these items in future posts.
Until next time...