Oracle PL/SQL language pitfalls
PL/SQL and Ada are both block structured languages where each block can be divided into roughly three sections. The declaration for specifying block-local variables, sub-program units, records etc. The code section containing the algorithmic code and an exception section where exceptions from the current block can be caught and handled. Further language features like package specifications, object oriented programming etc. will not be discussed in this post as they are not required to demonstrate any of our problems. Single line comments in Ada and PL/SQL start with double dash (--), multi-line comments will not be used by us.
An example PL/SQL block:
<<example_block>> -- optional label
declare
-- variable declarations
-- sub program units (functions/procedures)
begin
-- code
exception
when exception_name then -- handle a named exception
handle_exception();
when others then -- catch all exceptions handler
handle_all_exceptions();
end example_block;
The usage of 'end example_block' is only valid if the block was <<labeled>> otherwise just use 'end;'.Since we got our basics down, we can start with our first problem.
Weak compile time checks
The longer I work on large legacy code bases the more I appreciate errors caught during compilation.
In the following example, a
VARCHAR2
variable is declared with a length equal to one. You can think about the VARCHAR2
type as string. In the declaration section a constant literal 'abc'
is assigned to this string which exceeds it's capacity.
set serveroutput ON
DECLARE
PROCEDURE test_literal IS
v_string VARCHAR2(1) := 'abc';
BEGIN
dbms_output.put_line( v_string );
END;
BEGIN
dbms_output.put_line('Before Call');
test_literal();
dbms_output.put_line('After Call');
END;
/
From the above code, it is obvious that there is a bug which can be determined on compilation - sadly PL/SQL compiles the code without any warnings.
We have declared a local procedure in this block to demonstrate that the issue really arises on runtime and not during compilation.
Save the code to
literal_string.sql
and run it in a SQL Plus session.
SQL> @C:\blog\literal_string.sql
Before Call
ORA-06502: PL/SQL: numeric or value error: character buffer too small
ORA-06512: at line 4
ORA-06512: at line 10
As you probably guessed, this resulted in a run-time exception being raised (ORA-06502). The same problem extends to numbers exceeding the defined precision (ie.
NUMBER(2)
assigned 100 as a value). Since the value is literal it would be nice if the language at least reported it during compilation. Let's see how a similar example behaves in Ada. with Ada.Text_IO;
procedure Literal_String
is
v_string : String (1 .. 1) := "abc";
begin
Ada.Text_IO.Put_Line ( v_string );
end Literal_String;
Save the file to
literal_string.adb
and compile with gnatmake -gnato literal_string.adb
.
mulander@bunkier_mysli:~/code/blog/plsql-pitfalls$ gnatmake -gnato literal_string.adb
gcc -c -gnato literal_string.adb
literal_string.adb:5:33: warning: wrong length for array of subtype of "Standard.String" defined at line 5
literal_string.adb:5:33: warning: "Constraint_Error" will be raised at run time
gnatbind -x literal_string.ali
gnatlink literal_string.ali
The
-gnato
compiler flag enables run-time overflow checks but it doesn't affect this example.The compiler nicely warned us that the length of the assigned string exceeds the allowed range for the variable declared at line 5. The compiler even informed us which exception will be raised on run-time. Adding a
-we
flag would treat all warnings as errors and wouldn't even produce an executable. We didn't supply the flag so let's run the program.
mulander@bunkier_mysli:~/code/blog/plsql-pitfalls$ ./literal_string
raised CONSTRAINT_ERROR : literal_string.adb:5 length check failed
mulander@bunkier_mysli:~/code/blog/plsql-pitfalls$
We got our runtime exception - exactly the one we were warned about. This may seem trivial but trust me - things like this can sometimes save hours spent on handling a runtime error on a production system.
Ada also has a nice syntax for string literal declarations that infers the length from the assignment.
with Ada.Text_IO;
procedure Literal_String_2
is
v_string : String := "abc";
begin
Ada.Text_IO.Put_Line ( v_string );
end Literal_String_2;
Save this to
literal_string_2.adb
and compile with gnatmake -gnato literal_string_2.adb
. I'll skip the output as the program behaves correctly and no exceptions are raised.What happens if we assign a literal exceeding the inferred length?
with Ada.Text_IO;
procedure Literal_String_3
is
v_string : String := "abc";
begin
v_String := "abcd";
Ada.Text_IO.Put_Line ( v_string );
end Literal_String_3;
Save the content to
literal_string_3.adb
and compile with gnatmake -gnato literal_string_3.adb
.
mulander@bunkier_mysli:~/code/blog/plsql-pitfalls$ gnatmake -gnato literal_string_3.adb
gcc -c -gnato literal_string_3.adb
literal_string_3.adb:7:15: warning: wrong length for array of subtype of "Standard.String" defined at line 5
literal_string_3.adb:7:15: warning: "Constraint_Error" will be raised at run time
gnatbind -x literal_string_3.ali
gnatlink literal_string_3.ali
Again we will skip the output. The proper
Constraint_Error
exception is raised like the compiler promised. You can't imagine how happy I am seeing this behavior. At this point we can move on to our second problem.Implicit type conversions
Implicit type conversions are often hard to track down and can lead to significant problems during application run-time. One of the most severe problems caused by them are hard to predict choices when overloaded functions/procedures are involved - this time it's not I what I want to focus on. The specific bug I encountered is really interesting because of the innocent code it appears in and an exception that could be avoided with a more strict compilation phase.
set serveroutput on
DECLARE
n_first NUMBER := 1;
n_second NUMBER := 2;
BEGIN
dbms_output.put_line('Before faulty code');
dbms_output.put_line('n_first + n_second ' || n_first + n_second );
dbms_output.put_line('After faulty code');
END;
/
Imagine for a moment that the middle output statement is a debug log added and left by a programmer in an important part of the system. Let's run the code in SQL Plus:
SQL> @C:\blog\concat.sql
Before faulty code
ORA-06502: PL/SQL: numeric or value error: character to number conversion error
ORA-06512: at line 7
Why did this code result in a run-time error? My guess is operator precedence rules and implicit type conversions. From the linked documentation page we can see that +, - and || (addition, subtraction and concatenation operators) have equal precedence and as stated in the document such operators are applied in no particular order. So what happened?
The following line:
dbms_output.put_line('n_first + n_second ' || n_first + n_second );
Can be further simplified to:
procedure_call( string_type CONCAT number_type ADD number_type )
We know that before the function call the operators will be evaluated so we can drop the call:
string_type CONCAT number_type ADD number_type
Now, since both concatenation and addition are of equal precedence we can start with an assumption that concatenation was performed first (leftmost operation). So number_type was converted to a string and concatenated to our first string_type.
string_type ADD number_type
Now, we are left with an addition operation of a string and a number. Since adding a number to a string doesn't make any sense for the compiler an implicit string to number conversion happens. This operation fails with an obvious run-time exception. A really nasty bug hard to find in such an innocent statement.
So how does Ada handle this problem? Easy, no implicit type conversions.
with Ada.Text_IO;
procedure Concat is
n_first : Integer := 1;
n_second: Integer := 2;
begin
Ada.Text_IO.Put_Line ( "n_first + n_second" & n_first + n_second );
end Concat;
Save the file to
concat.adb
and compile with gnatmake -gnato concat.adb
mulander@bunkier_mysli:~/code/blog/plsql-pitfalls$ gnatmake -gnato concat.adb
gcc -c -gnato concat.adb
concat.adb:7:47: invalid operand types for operator "&"
gnatmake: "concat.adb" compilation error
The file doesn't even compile. The concatenation operator
&
expects a string as it's left and right value. The language will not attempt any implicit conversions even in a simple case like string_type CONCAT number_type. If you want to achieve concatenation of numbers with strings in Ada you need to explicitly cast them:
with Ada.Text_IO;
procedure Concat is
n_first : Integer := 1;
n_second: Integer := 2;
begin
Ada.Text_IO.Put_Line ( "n_first + n_second"
& Integer'Image(n_first + n_second) );
end Concat;
This time the code compiles without errors and runs with no exceptions raised. It's true that mitigating the problem in PL/SQL is also simple. All one needs to do is to surround the addition of variables with additional parentheses. I still prefer being informed on compilation - even with just a warning. PL/SQL type conversion problems are not limited to simple cases like this but extend to built-in functions. The
TO_DATE
function when converting a string to a date type can result in an exception being raised which only depends on two variable. The programmer forgetting to specify the string format and the national language support database level variable set to a value which expects a string in a different format than in the provided variable. Bugs like this are annoying to find.We are done with problems that can be compared to Ada directly. The following sections will only relate to PL/SQL.
Too_Many_Rows exception
Generally the problem with the too_many_rows exception is the moment it's raised. It's best to start with an example:
set serveroutput on
DECLARE
n_some_value NUMBER := -1e10;
BEGIN
BEGIN
-- The following rowset will be returned by
-- this select statement:
-- 1
-- 2
-- This will raise a too_many_rows exception
SELECT LEVEL
INTO n_some_value
FROM dual
CONNECT BY LEVEL <= 2
;
EXCEPTION
WHEN too_many_rows THEN
dbms_output.put_line( 'too_many_rows exception' );
NULL; -- Typical wrong way to handle this case
END;
-- So what's the value of n_some_value?
dbms_output.put_line( 'n_some_value='|| n_some_value );
END;
/
n_some_value
is being initialized to a constant -1e10
then a select into statement performs a query which always returns two rows of data and tries to save it into n_some_value
. Raising the too_many_rows
exception is of course correct in this case. The problem is at which point is it raised? Try to guess the value of n_some_value
printed out at the end of the code and compare it to the actual result:
SQL> @C:\blog\too_many_rows.sql
too_many_rows exception
n_some_value=1
PL/SQL procedure successfully completed
You probably expected to see
n_some_value=-10000000000
on the output right? We got 1 instead. What happened here is the query got executed, the first row was fetched and stored in the variable n_some_value
. Since there is more data on the cursor (select into is automatically converted by the database into a cursor fetch) the code is attempting to fetch the second row and in a select into statement only single row queries are allowed so the too_many_rows
exception gets raised. Since our variable was set during the first fetch it no longer holds the initialization value of -1e10
. This is an easy bug to fix (reinitialize the variable in the exception handler) but it's hard to track down if the error handling code was incorrect or missed this reinitialization in the first place.Code optimization
The title of this section might be a bit misleading. I love the fact that the compiler tries hard to optimize away unnecessary code - the problem is that it's hard to sometimes guess what could get optimized away and in specific use cases this can change actual program behavior leading to Heisenbugs.
Please keep in mind, that the only time I hit this bug was a really bad programming approach so the language can't really be blamed for making a wrong decision here but still at least it could be a bit more predictable.
SET serveroutput ON
DECLARE
v_date_string VARCHAR2(8) := '20110229'; -- only 28 days in feb 2011
b_valid_date NUMBER := 1;
invalid_date EXCEPTION;
-- introducing ORA-01839 exception by name
pragma exception_init(invalid_date,-01839);
BEGIN
BEGIN
IF TO_DATE ( v_date_string, 'yyyymmdd' ) =
TO_DATE ( v_date_string, 'yyyymmdd' )
THEN
NULL; -- do nothing
--dbms_output.put_line('oops');
END IF;
EXCEPTION
WHEN invalid_date THEN
b_valid_date := 0;
END;
-- an important block of logic executed only if the date was valid
IF b_valid_date = 1
THEN
dbms_output.put_line( '... executing ...' );
ELSE
dbms_output.put_line( '+++ ERROR - invalid date +++' );
END IF;
END;
/
Save this snippet to
opt_away_simple.sql
and run it in SQL Plus.
SQL> @C:\blog\opt_away_simple.sql
... executing ...
PL/SQL procedure successfully completed
SQL>
In this and all of the following snippets we expect to see
+++ ERROR - invalid date +++
as the output of the program. Here is the convoluted logic that was supposed to guarantee date validation. The oracle TO_DATE(string,format)
function can convert a string in date format to a date type. If the date matches the format but the date is incorrect (ie. 29 Feb 2011) an ORA-01839 exception will be raised which stands for date not valid for month specified. In the exception handler a control variable is set to 0 marking that the date was not valid. Now try to uncomment the put_line statement inside the IF
control structure. Running with the uncommented code will give us the expected behavior. See how nasty that would be to debug?To the point. Why is the code behaving this way? The PL/SQL compiler decided that the whole begin/exception/end block containing the if statement and exception handling code is not needed hence it was optimized away during compilation. The code literally does not exist after compilation. Here is my guess why:
The
TO_DATE()
function has no side effects. It doesn't modify any tables, doesn't return any values that get stored to a variable and doesn't insert any rows to tables. The only implicit side effect it has is a possible exception being raised if the provided date string is incorrect. The condition in the if statement also has no side effects except for the possible exception from TO_DATE()
and at both sides of the equality comparison the input parameters to TO_DATE()
are exactly the same. Since the function is deterministic (always returns the same output for a given set of input parameters) for the compiler this look like code that always evaluates to truth. The code section for the if statement is NULL
which stands for a do nothing operation. In this case the whole if is a do nothing operation so the compiler decides to remove it. Since there is no longer any code that can raise the invalid_date exception the control flag will not be set hence our unexpected behavior.Now here is the really funny part. Let's modify the code to use a boolean variable instead of a number. No other changes to the code and all of the above still remains valid (my reasoning for optimizing away the code).
SET serveroutput ON
DECLARE
v_date_string VARCHAR2(8) := '20110229'; -- only 28 days in feb 2011
b_valid_date BOOLEAN := TRUE;
invalid_date EXCEPTION;
-- introducing ORA-01839 exception by name
PRAGMA exception_init(invalid_date,-01839);
BEGIN
BEGIN
IF TO_DATE( v_date_string, 'yyyymmdd' ) =
TO_DATE( v_date_string, 'yyyymmdd' )
THEN
NULL; -- do nothing
--dbms_output.put_line('oops');
END IF;
EXCEPTION
WHEN invalid_date THEN
b_valid_date := FALSE;
END;
-- an important block of logic executed only if the date was valid
IF b_valid_date
THEN
dbms_output.put_line( '... executing ...' );
ELSE
dbms_output.put_line( '+++ ERROR - invalid date +++' );
END IF;
END;
/
Save this snippet to
opt_away_bool.sql
and run it in SQL Plus.
SQL> @C:\blog\opt_away_bool.sql
+++ ERROR - invalid date +++
PL/SQL procedure successfully completed
SQL>
Now this is interesting. Changing just the type of our control variable made the code behave correctly - I really have no idea why in this case the code was not optimized away like when we used a number.
Here is an even more interesting example. Let's change the if statement checking the control variable from
IF b_valid_date
to IF b_valid_date = TRUE
. SET serveroutput ON
DECLARE
v_date_string VARCHAR2(8) := '20110229'; -- only 28 days in feb 2011
b_valid_date BOOLEAN := TRUE;
invalid_date exception;
-- introducing ORA-01839 exception by name
pragma exception_init(invalid_date,-01839);
BEGIN
BEGIN
IF TO_DATE( v_date_string, 'yyyymmdd' ) =
TO_DATE( v_date_string, 'yyyymmdd' )
THEN
NULL; -- do nothing
--dbms_output.put_line('oops');
END IF;
EXCEPTION
WHEN invalid_date THEN
b_valid_date := FALSE;
END;
-- an important block of logic executed only if the date was valid
IF b_valid_date = TRUE
THEN
dbms_output.put_line( '... executing ...' );
ELSE
dbms_output.put_line( '+++ ERROR - invalid date +++' );
END IF;
END;
/
Save this snippet to
opt_away_bool_bad.sql
and run it in SQL Plus.
SQL> @C:\blog\opt_away_bool_bad.sql
... executing ...
PL/SQL procedure successfully completed
SQL>
We are back to our incorrect behavior just with a single line change outside the scope of the optimized away block. Sadly I am not aware of any utilities that could help us review the decisions made by the compiler. The last two snippets will remain a mystery for some time.
Summary
I really despised the language at first but as time passed by I really started to like large parts of it. PL/SQL is not a bad language but it's a shame that even though in large part it was based on Ada, the language designers decided to skip some of it's features - especially the really strong type system. I can imagine that some of the gripes with the language remain in order to maintain backwards compatibility and that new language features are slowly being added in. I have a final suggestions for everyone working with a language that isn't considered the cool kid in the industry/community: try to understand the inner workings of the tool you use - it may bring new interest to an otherwise mundane task.