Difference between revisions of "Secure programming"

From Lazarus wiki

Revision as of 12:16, 8 April 2005

General Info

When developing a program, it is likely that it will interact with the user in some way, even if that means only reading files in the system and presenting the data.

At schools and universities, a person who starts writing programs learns how to receive input, and teachers usually say “assume that the data you receive is valid”. That is where the problems begin.

From the moment a program receives input, we cannot trust any of it that we do not control.

Reading from a file is reading untrusted input, and so is reading user input or accepting input from a network.

Why can't I trust an input?

In order to understand why an input is dangerous, we first need to understand what an input is.

An input can come from a keystroke, a mouse movement or a mouse button click, or from many other sources such as a data stream or even system functions.

The type of input does not matter, because the user can give us wrong input, either intentionally or by mistake. You cannot control this input, mainly because you cannot guess what it will be.

The user might provide empty (NULL) “data”, an out-of-range number, more characters than we expected, or even an attempt to change the address of the variable that receives the input. We simply cannot know what the user is going to provide.

Any “unsafe” handling of user input can disclose vital information that the user must not receive, allow modification of data that the user could not change any other way, or even break the program itself.

What type of problems can we expect?

For almost every type of bug you will probably find a type of attack, so instead of listing them all I will give a short list of very common ones.

The most common attack types are:

Buffer Overflow

When the given data overflows the amount of memory that was allocated for it:

var
iNums : array of integer;
....
SetLength (iNums, 10);
FillChar (iNums[-1], 100, #0); ....
for i := -10 to 10 do
readln (iNums[i]);
....

In this example the dynamic array iNums has room for only 10 numbers, yet we write 21 numbers into it.

If the user tries to get arbitrary code executed through one of these writes, he or she may succeed, because we went outside the buffer that was given to us. That is a buffer overflow.

DoS Attack

Denial of Service is not only a network problem; it can exist in many other forms:

procedure Recurse;
begin
while (True) do
begin
Recurse;
end;
end;

This procedure will run until the system has no more stack memory to give it, and will cause the system to stop responding, or even crash. Although some systems, like Linux, will try to give you the ability to stop the program, doing so will take a lot of your time.

Please note that this is only a static example, but it is a DoS attack on any system that runs the code.


Another well-known cause of DoS is failing to free system resources such as memory, sockets, file descriptors, etc.

For example:

 ...
begin
while (True) do
begin
Getmem (OurPtr, 10);
OurPtr := Something;
end;
end.

This example shows a memory allocation (Getmem is like the C malloc), but the pointer is overwritten and the memory is never freed once we are done using it.
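A minimal sketch of the opposite habit: every GetMem is paired with a FreeMem inside a try..finally block, so the memory is returned even if an exception is raised in between. The names UseBuffer and BufSize are assumptions made for this example only.

```pascal
program FreeMemDemo;
{$mode objfpc}{$H+}

const
  BufSize = 10;   // hypothetical size, same as in the leaking example

// Allocates a buffer, uses it, and always gives it back.
// Returns the sum of the bytes written, only to make the work observable.
function UseBuffer : Integer;
var
  OurPtr : PByte;
  i      : Integer;
begin
  Result := 0;
  GetMem (OurPtr, BufSize);
  try
    for i := 0 to BufSize - 1 do
      begin
        OurPtr[i] := Byte (i);
        Result    := Result + OurPtr[i];
      end;
  finally
    FreeMem (OurPtr, BufSize);   // runs even if an exception is raised above
  end;
end;

var
  Pass : Integer;
begin
  // Many passes, but never more than one buffer allocated at a time
  for Pass := 1 to 1000 do
    UseBuffer;
  writeln ('Sum of one buffer: ', UseBuffer);
end.
```

Because the release sits in the finally part, the loop can run as long as it likes without the memory use growing.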


Injections

When the user gives us input and we work on it directly without sanitizing it, the user can insert SQL fragments or code (such as script code or machine code) that cause our program to delete records or tables, send the user restricted data such as the database or table structure, the database user and password, or the content of a directory or file, or even execute a program on the computer.

An SQL injection example:

User Input:
  Please enter your name: a' OR '1'='1
Inside the code:
  ...
write ('Please enter your name: ');
readln (sName);
// #39 is the single quote character (#32 would only be a space)
Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=' + #39 + sName + #39);
...

This addition to the SQL statement adds a new “WHERE” rule to our query, which can lead to data traversal or other problems that we are not always able to detect.
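To make the effect visible, here is a small sketch that only builds and prints the statement text, without touching a database. It assumes the name is wrapped in single quote characters (#39) and uses a' OR '1'='1 as the hostile input; BuildQuery is a name invented for this example.

```pascal
program InjectionDemo;
{$mode objfpc}{$H+}

// Builds the query the unsafe way: the input is pasted between two quotes.
function BuildQuery (const sName : string) : string;
begin
  Result := 'SELECT Password FROM tblUsers WHERE Name=' + #39 + sName + #39;
end;

begin
  // A harmless name gives the query we intended
  writeln (BuildQuery ('john'));
  // In a Pascal literal two quotes ('') stand for one, so this is: a' OR '1'='1
  writeln (BuildQuery ('a'' OR ''1''=''1'));
end.
```

The second line printed ends in WHERE Name='a' OR '1'='1', a condition that is true for every row of the table.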

Myths and Assumptions

Many security issues exist because programmers ignore important warnings and information given by the compiler, and because they think that their program does not contain any problem that someone could take advantage of.

Here are some examples of this type of problem:

Myths:

  • Security by Obscurity - When no one knows about a problem, no one can take advantage of it.
  • Secure programming language - There are languages, such as Perl, that many people think are safe from buffer overflows and other vulnerabilities, but believing that does not make it so.
  • Hashed passwords are secure - A file that contains a hashed password is not secure. A hash only goes one way, so you cannot retrieve the original data from it, but that alone does not make the file secure.
  • Nothing can break my program.

Assumptions:

  • The QA team will find and fix my bugs.
  • The user will not harm my program and its data.
  • My program will be used only for its original purpose.
  • All exceptions can remain unhandled.

Explanation

Now that we know some of the problems we can encounter when developing programs, we should learn how to fix them. All of the problems we saw above come down to two causes: assumptions and careless programming. To learn how to fix them, we first need to learn to think in a different way than we are used to.

Overflow

To fix overflows of data, such as buffers and other types of input, we first need to identify the type of data we are working with.

Buffer overflow

If we return to our example of:

var

  iNums : array of integer;

  ....

  SetLength (iNums, 10);

  FillChar (iNums[-1], 100, #0);
  ....

  for i := -10 to 10 do

     readln (iNums[i]);

  ....

Here a range of memory was overwritten by our values, without even checking whether the index is valid.

With dynamic/open arrays in Pascal we can find out the limits of the allocated memory. So all we need to do is check whether an index is too small or too big for our buffer, and restrict what we accept to the size we want.

So the example should be changed into:

var

  iNums : array of integer;

  ....

  SetLength (iNums, 10);

  FillChar (iNums[Low (iNums)], Length (iNums) * SizeOf (Integer), #0);
  ....

  for i := Low (iNums) to High (iNums) do

     readln (iNums[i]);

  ....
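When the index comes from somewhere else (a file, the user, a calculation), the same Low/High limits can be used to refuse a bad index before it touches memory. This is only a sketch; StoreAt is a helper name made up for the example.

```pascal
program BoundsCheckDemo;
{$mode objfpc}{$H+}

var
  iNums : array of Integer;

// Stores Value at Index only when Index is inside the allocated array.
// Returns False, and writes nothing, for any other index.
function StoreAt (Index, Value : Integer) : Boolean;
begin
  Result := (Index >= Low (iNums)) and (Index <= High (iNums));
  if Result then
    iNums[Index] := Value;
end;

begin
  SetLength (iNums, 10);                               // valid indexes are 0..9
  writeln ('Index  9 accepted: ', StoreAt (9, 42));    // inside the array
  writeln ('Index 10 accepted: ', StoreAt (10, 42));   // one past the end
  writeln ('Index -1 accepted: ', StoreAt (-1, 42));   // before the start
end.
```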

But wait, something is still not right!

readln will accept an unlimited number of characters, and no one promises us that the input will be an Integer, or even in a range we can handle.

Number Overflow

A string in Pascal is a pure array (hrmm hrmm.. not really, at least not in FPC, but let's pretend it is for a second, OK?), so readln knows its limits and will not try to overflow the range we gave that type. Numbers are not the same.

Numbers have limits: a computer has limits of many kinds regarding memory and numbers. It can give only a “small” amount of memory to each number (floating point or integer). And many times we do not need a large range of numbers (a boolean variable, for example, usually needs only two values).

In the above example we may get a number overflow, which will cause a range check error or give us the wrong number (carry flag remainder issues... I'm not going to explain them here), and we also have a DoS effect, because our program will halt at that point.

So what can we do at that point?

First of all, we may wish to read into a string variable whose length is that of the largest number plus 1 (for the minus sign), or we can create our own readln procedure/function that specializes in the Integer type.

For the first option we can do the following (copied from the FPC documentation):

Program Example74;

{ Program to demonstrate the Val function. }
Var I, Code : Integer;

begin
  Val (ParamStr (1),I,Code);
  If Code<>0 then 
    Writeln ('Error at position ',code,' : ',Paramstr(1)[Code])
  else
    Writeln ('Value : ',I);  
end.

Here we see how to convert a string into a number with very easy error handling. The function StrToInt may also do the trick, but then we need to catch an exception to deal with any error.
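A short sketch of the StrToInt route, together with TryStrToInt from the SysUtils unit, which reports failure through its return value instead of an exception. ReadInRange and the limits 0..99 are assumptions for this example. In objfpc mode Integer is 32 bits wide, so a longer number is rejected by the conversion itself.

```pascal
program StrToIntDemo;
{$mode objfpc}{$H+}

uses SysUtils;

// Converts S and accepts it only when it lies in MinVal..MaxVal.
// On failure Value is set to 0 and False is returned.
function ReadInRange (const S : string; MinVal, MaxVal : Integer;
                      out Value : Integer) : Boolean;
begin
  Result := TryStrToInt (S, Value) and (Value >= MinVal) and (Value <= MaxVal);
  if not Result then
    Value := 0;
end;

var
  N : Integer;
begin
  // StrToInt reports bad input by raising EConvertError
  try
    N := StrToInt ('12abc');
    writeln ('Value : ', N);
  except
    on EConvertError do
      writeln ('Not a number');
  end;

  if ReadInRange ('42', 0, 99, N) then
    writeln ('Accepted : ', N);
  if not ReadInRange ('99999999999', 0, 99, N) then
    writeln ('Rejected: does not fit in an Integer');
  if not ReadInRange ('150', 0, 99, N) then
    writeln ('Rejected: outside 0..99');
end.
```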


Here is a small example of a readln-like procedure for integer numbers.

program MyReadln;
uses CRT;

procedure MyIntReadLn (var Param : Integer; ParamLength : Integer);
var
  Line  : string; 
  ch    : char;
  Error : Integer;
  
begin
  Line  := '';
  
  repeat
    ch := readkey;
    if (Length (Line) <> ParamLength) then
     begin
      if (ch in ['0'..'9']) then
       begin
         Line := Line + ch;
         write (ch);
       end
      else
      if (ch = '-') and (Length (Line) = 0) then
       begin
         Line := '-';
         write (ch);
       end;
      end;
      
    if (ch = #8) and (Length(Line) <> 0) then // backspace
     begin
      Line := copy (Line, 1, Length (Line) -1);
      gotoxy (WhereX -1, WhereY);
      write (' ');
      gotoxy (WhereX -1, WhereY);
     end;
  until (ch = #13);

  val (Line, Param, Error);

  if (Error <> 0) then
    Param := 0;

 writeln;
end;

var
 Num : Integer;

begin
  write ('Number: ');
  MyIntReadLn (Num, 2);
  writeln ('The number is: ', Num);
end.
 

Please note that you can make it even better and more efficient if you wish. This is only a very small example of how to do it.

What are the security risks in overflows?

Overflow of memory can allow arbitrary CPU code to be executed, so users may run whatever code they wish, and the program cannot stop them.

The execution becomes possible by overwriting the value that is later loaded into EIP, the x86 register (not a flag) that points to the next instruction the CPU will execute.

Denial of Service

Denial of service is one of the hardest types of attacks to prevent. The reasons are:

  • The denial of service can be carried out even without exploiting any bug, for example with the “ping” program.
  • Every system resource is a possible target for denial of service: opening sockets, reading files, or just not freeing allocated memory when you “do not need” the memory anymore.
  • Removal of files, such as a kernel module, can cause a big problem.
  • Lack of configuration or wrong configuration can cause a denial of service as well.
  • Too many permissions, or a lack of them.
  • Almost any type of exploit can result in a denial of service.

So as you can see, a denial of service can be almost anything that stops us from doing our work as we wish, whether because of exploitation, buggy code, or just a program that hogs the system resources.

In the above example (of the denial of service):

procedure Recurse;
begin
  while (True) do
    begin
      Recurse;
    end;
end; 

I also created a stack overflow (another type of buffer overflow), which made the program need more and more memory in order to continue executing the code.
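One simple way to keep recursion from exhausting the stack is to carry the current depth along and refuse to go deeper than a fixed limit. The limit of 1000 and the Depth parameter are assumptions chosen for this sketch, not values from the original example.

```pascal
program DepthLimitDemo;
{$mode objfpc}{$H+}

const
  MaxDepth = 1000;   // hypothetical limit, chosen only for this sketch

// Recurses until MaxDepth is reached, then unwinds.
// Returns the depth at which the recursion stopped.
function Recurse (Depth : Integer) : Integer;
begin
  if Depth >= MaxDepth then
    Result := Depth                  // refuse to go any deeper
  else
    Result := Recurse (Depth + 1);
end;

begin
  writeln ('Stopped at depth ', Recurse (0));
end.
```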

Any system resource that is available to the program can be abused by not returning it to the system when the program “does not need it anymore”. Holding on to system resources such as memory or sockets takes away from other programs the ability to perform some of their actions. In that situation most programs will stop their execution and report an error, and some will hang and keep waiting for the system resources.

Please note that some abuse of system resources exists because of a programming bug, such as waiting for a 150k buffer while the data actually sent is only 2 bytes; while the program is still waiting for the rest of that 150k buffer, a new request for a 150k buffer is made, and so on, until the system is not able to answer any requests anymore (this is a known type of attack).

A good workaround for this bug is to limit how many partially filled buffers can be allocated at one time and, if a buffer is still not full after a “timeout”, to free it completely. But doing that can itself cause a denial of service, because the communication will be cut off at some point, and a slow connection can lose data.
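The bookkeeping behind such a workaround can be sketched as a small table of pending requests. Everything here is hypothetical: the cap of 4, the 30 second timeout, and the names TryReserve and ExpireOld. Time is passed in as a plain number of seconds so that the example stays deterministic, and real buffers are left out.

```pascal
program PendingBuffersDemo;
{$mode objfpc}{$H+}

const
  MaxPending = 4;     // hypothetical cap on partially filled buffers
  TimeoutSec = 30;    // hypothetical timeout in seconds

type
  TPending = record
    InUse   : Boolean;
    Started : Integer;   // time the request arrived, in seconds
  end;

var
  Pending : array [1..MaxPending] of TPending;   // globals start zeroed

// Reserves a slot for a new request; fails when the cap is reached.
function TryReserve (Tick : Integer) : Boolean;
var
  i : Integer;
begin
  Result := False;
  for i := Low (Pending) to High (Pending) do
    if not Pending[i].InUse then
      begin
        Pending[i].InUse   := True;
        Pending[i].Started := Tick;
        Result := True;
        Exit;
      end;
end;

// Releases every slot whose buffer was not completed in time.
// Returns how many slots were released.
function ExpireOld (Tick : Integer) : Integer;
var
  i : Integer;
begin
  Result := 0;
  for i := Low (Pending) to High (Pending) do
    if Pending[i].InUse and (Tick - Pending[i].Started >= TimeoutSec) then
      begin
        Pending[i].InUse := False;
        Inc (Result);
      end;
end;

var
  k, Accepted : Integer;
begin
  Accepted := 0;
  for k := 1 to 10 do            // ten requests arrive at the same moment
    if TryReserve (0) then
      Inc (Accepted);
  writeln ('Accepted at t=0  : ', Accepted);
  writeln ('Expired at t=30  : ', ExpireOld (30));
  writeln ('Accepted at t=31 : ', TryReserve (31));
end.
```

The cap keeps memory use bounded, at the price described above: a slow but honest client whose buffer expires loses its data.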

Injection

There are many ways to inject code into our programs. As we saw in the example above:

User Input:
  Please enter your name: a' OR '1'='1
Inside the code:
  ... 
  write ('Please enter your name: '); 
  readln (sName); 
  // #39 is the single quote character (#32 would only be a space)
  Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=' + #39 + sName + #39); 
  ...

The injection occurs when we do not filter our input (sanitize is the more professional word :)), and do not check that we received exactly the type of input we are looking for.

For example, we could check whether sName contains spaces and, if so, stop processing the rest of the variable. The reason is very simple: the name should be only one word, and for us a word is made of letters, maybe the tick sign (') and maybe an underscore (_), and that's it. If we meet a digit, our word is over (unless we wish to use “hacker” like language, or allow the use of numbers).

This type of structure can be checked in many ways. The least effective one, but one that is widely used, is the following:

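type
 TCharset = set of Char; // assumed declaration; the original text does not show it

function ValidVar (const S : AnsiString; AllowChars : TCharset) : Boolean;
var
 i : Integer;
begin
 i      := 0;
 Result := True;
 
 // i < Length (S) keeps S [i] from reading past the end of S after inc (i)
 While (Result) and (i < Length (S)) do
  begin
     inc (i);
     Result := S [i] in AllowChars;
  end;
end;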

The function returns true if every character of S belongs to the set given in AllowChars. Please note that this function is only a proof of concept and may need more work before real use.
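A usage sketch for a check of this kind. To keep the example self-contained it defines its own variant, OnlyAllowedChars, together with the TCharSet type and the NameChars constant; all three names are assumptions made for this example, and unlike the proof of concept this variant also rejects an empty string.

```pascal
program ValidVarDemo;
{$mode objfpc}{$H+}

type
  TCharSet = set of Char;

// True when S is not empty and every character of S is in AllowChars.
function OnlyAllowedChars (const S : AnsiString;
                           const AllowChars : TCharSet) : Boolean;
var
  i : Integer;
begin
  Result := Length (S) > 0;
  i := 1;
  while Result and (i <= Length (S)) do
    begin
      Result := S[i] in AllowChars;
      Inc (i);
    end;
end;

const
  NameChars : TCharSet = ['a'..'z', 'A'..'Z', '_'];
begin
  writeln (OnlyAllowedChars ('john_doe', NameChars));          // letters and _ only
  writeln (OnlyAllowedChars ('a'' OR ''1''=''1', NameChars));  // quotes, spaces, digits
  writeln (OnlyAllowedChars ('', NameChars));                  // empty input
end.
```

Only the first call returns True; the injection string fails at its first quote character.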

Another good way to do the same is to use a regular expression such as the following (this is a proof of concept in the Perl language only; at the time of writing, FPC does not have a fully supported regular expression engine that allows modifying strings):

$sName =~ s/[^a-z0-9\_\']//gi;

The regular expression removes all invalid characters from the string and leaves us with a purged string. Please note that, as far as I know, this regular expression will also work in ereg engines with minimal adjustments (the g flag instructs Perl to replace every match rather than only the first; i makes the match case-insensitive).
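The same purge can be written in plain Pascal without any regular expression engine, by copying only the characters we accept. This is a sketch of an equivalent, not a translation the original author provided; StripInvalid is a name made up for it.

```pascal
program StripDemo;
{$mode objfpc}{$H+}

// Returns S with every character outside a-z, A-Z, 0-9, _ and ' removed,
// which is what the substitution with the g and i flags does.
function StripInvalid (const S : string) : string;
var
  i : Integer;
begin
  Result := '';
  for i := 1 to Length (S) do
    if S[i] in ['a'..'z', 'A'..'Z', '0'..'9', '_', ''''] then
      Result := Result + S[i];
end;

begin
  // The semicolon and the spaces are dropped, the rest is kept
  writeln (StripInvalid ('john; DROP TABLE tblUsers'));
end.
```

Note that the tick sign survives the purge, so a string cleaned this way still has to be escaped before it goes into a query.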

Now that we know our input is valid, we need to look at how the variable content is used. If it goes into a database, a CGI script, or anything else that has its own syntax, we must escape the content.

There are many ways to escape this type of content. Let's assume for now that the content is going into a database query. First of all, we must make sure that escaping will not push the content over the length limits of our database fields. If it does, the result can range from data loss to denial of service or buffer overflow problems (a respectable database will usually truncate the data, sometimes at a bad location).
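A sketch of manual escaping combined with the length check. It doubles every single quote, which is how standard SQL writes a quote inside a string literal, and then compares the escaped length with the field width. The width of 20 and the names EscapeQuotes and NameFieldLen are assumptions for this example; a real database may need further characters escaped.

```pascal
program EscapeDemo;
{$mode objfpc}{$H+}

uses SysUtils;

const
  NameFieldLen = 20;   // hypothetical width of the Name column

// Doubles every single quote so that it is read as data, not as a delimiter.
function EscapeQuotes (const S : string) : string;
begin
  Result := StringReplace (S, '''', '''''', [rfReplaceAll]);
end;

var
  Raw, Escaped : string;
begin
  Raw     := 'O''Brien';              // the text O'Brien
  Escaped := EscapeQuotes (Raw);      // becomes O''Brien

  // The escaped text is never shorter than the raw text,
  // so checking it is the stricter of the two possible checks.
  if Length (Escaped) <= NameFieldLen then
    writeln ('SELECT Password FROM tblUsers WHERE Name=''' + Escaped + '''')
  else
    writeln ('Input rejected: too long');
end.
```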

Once we have made sure that we stay within our limits, we can move on. To escape the content we can use several approaches. A less debugging-friendly but reliable way of escaping correctly is to use the parameters technique:

Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=?');
Query1.Parameters.Add (sName);
if (Query1.Execute) then
 ...

This technique allows the database engine to escape the parameter so that we can use the content without any problems from illegal characters. The downside is that we cannot debug the final query: we cannot see how the content of sName was embedded in the SQL statement, so we cannot check whether the resulting query was correct.
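One partial remedy for the debugging problem is to log the statement text and the parameter values side by side before executing. The sketch below does only the logging and does not talk to a database; LogQuery is a hypothetical helper, not part of any database component.

```pascal
program LogQueryDemo;
{$mode objfpc}{$H+}

// Prints the statement text, then each parameter value on its own line.
// The brackets make leading and trailing spaces in a value visible.
procedure LogQuery (const SQLText : string; const Params : array of string);
var
  i : Integer;
begin
  writeln ('SQL     : ', SQLText);
  for i := Low (Params) to High (Params) do
    writeln ('Param ', i + 1, ' : [', Params[i], ']');
end;

begin
  LogQuery ('SELECT Password FROM tblUsers WHERE Name=?',
            ['a'' OR ''1''=''1']);
end.
```

The log still does not show the final query as the engine sees it, but it does show exactly what was sent, which is usually enough to spot a wrong value.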