We are running into the following issue with DB2 V8.1.3 on AIX 5:
We have a sample table
create table t ( c1 int, c2 varchar(20))
populated with some records:
db2 => select * from t
C1 C2
----------- --------------------
1 AAA
2 BBB
3 CCC
3 record(s) selected.
Here's an external java UDF that simply returns the argument string
and also writes it to a file on the database server:
import java.io.*;
public class FileLogWriter
{
public static String flw( String filename, String str ) throws
IOException
{
BufferedWriter wr = new BufferedWriter( new
FileWriter(file name,true));
wr.write(str);
wr.newLine();
wr.close();
return str;
}
}
CREATE FUNCTION flw(filename varchar(512), str varchar(2000))
RETURNS VARCHAR(2000)
SPECIFIC flw
LANGUAGE JAVA
PARAMETER STYLE JAVA
FENCED
THREADSAFE
DETERMINISTIC
NO SQL
EXTERNAL NAME 'FileLogWriter. flw'
@
Then we create the following view on the table t:
db2 => create view v1(c1,c3) as select c1, flw('/tmp/flw.log',c2) c3
from t
DB20000I The SQL command completed successfully.
and execute the following SQL using the view:
db2 => select c1 from v1
C1
-----------
1
2
3
3 record(s) selected.
No surprises here, but a big surprise is found in the flw.log file:
[root@aix-00 tmp]# cat XXX.log
AAA
BBB
CCC
It turns DB2 needlessly executed the UDF, ie flw(c2), in order to get
the record set on column c1 that the UDF does not refer at all!
Why DB2 optimizer desides to materialize the result set on all the
view's colums (c1 and c3 in this case) instead of just simply
rewriting the SQL to select c1 from t?
Well...it's not an issue in that basic case, but in production the
table like "t" is going to be very large and the real UDF like "flw"
implements a complex logic (writing to an external file will be taken
off from the production version of the flw UDF) so it would cause SQL
performance issues.
Any idea?
Thanks,
-Eugene
We have a sample table
create table t ( c1 int, c2 varchar(20))
populated with some records:
db2 => select * from t
C1 C2
----------- --------------------
1 AAA
2 BBB
3 CCC
3 record(s) selected.
Here's an external java UDF that simply returns the argument string
and also writes it to a file on the database server:
import java.io.*;
public class FileLogWriter
{
public static String flw( String filename, String str ) throws
IOException
{
BufferedWriter wr = new BufferedWriter( new
FileWriter(file name,true));
wr.write(str);
wr.newLine();
wr.close();
return str;
}
}
CREATE FUNCTION flw(filename varchar(512), str varchar(2000))
RETURNS VARCHAR(2000)
SPECIFIC flw
LANGUAGE JAVA
PARAMETER STYLE JAVA
FENCED
THREADSAFE
DETERMINISTIC
NO SQL
EXTERNAL NAME 'FileLogWriter. flw'
@
Then we create the following view on the table t:
db2 => create view v1(c1,c3) as select c1, flw('/tmp/flw.log',c2) c3
from t
DB20000I The SQL command completed successfully.
and execute the following SQL using the view:
db2 => select c1 from v1
C1
-----------
1
2
3
3 record(s) selected.
No surprises here, but a big surprise is found in the flw.log file:
[root@aix-00 tmp]# cat XXX.log
AAA
BBB
CCC
It turns DB2 needlessly executed the UDF, ie flw(c2), in order to get
the record set on column c1 that the UDF does not refer at all!
Why DB2 optimizer desides to materialize the result set on all the
view's colums (c1 and c3 in this case) instead of just simply
rewriting the SQL to select c1 from t?
Well...it's not an issue in that basic case, but in production the
table like "t" is going to be very large and the real UDF like "flw"
implements a complex logic (writing to an external file will be taken
off from the production version of the flw UDF) so it would cause SQL
performance issues.
Any idea?
Thanks,
-Eugene
Comment