Secure programming

From Lazarus wiki
Revision as of 15:19, 24 July 2011 by BigChimp (talk | contribs) (Typos, grammar, clarification, fixed anomalous quotes)


Foreword

This wiki page is an attempt to teach a different approach in how to create software. The page uses very simple examples to show that many problems can be taken advantage of in order to create a security attack on a computer, a program or on an entire system.

Please note that this document is only a start at teaching how to write better and somewhat more secure code; it does not attempt to be a complete guide on how to do so. In fact, it is only a brief overview of how we need to look at our code and programs, and how to avoid many of the common problems out there.

Please remember that this document is about educating for better coding, not a guide for hacking and cracking programs.

General Info

When developing a program, it is likely that it will interact with the user in some way, even if that means only reading files in the system and presenting the data.

At schools and universities, people who start writing programs learn how to receive input, and teachers usually tell them to "assume that the data you receive is valid". That is where the problems begin:

We cannot trust any input that we do not control, because its contents are unknown and might exploit a vulnerability in our software.

Reading from a file is reading untrusted input, and so is reading user input or accepting input from a network, for example.

Why can't I trust input?

In order to understand why input is dangerous, we first need to understand what input is.

Input can come from a keystroke, a mouse movement or a mouse button click, or from reading and accepting information in many other ways, such as a data stream or even system functions. In fact, anything your program gets from the outside is input.

The type of input does not matter, because the user or another system can give us wrong input, whether intentionally or by mistake. You cannot control this input, mainly because you cannot guess what it will be.

The user might provide empty (NULL) "data", a number outside our expected range, more characters than we expected, or even an attempt to change the address of the variable that accepts the input. We simply cannot know what the user is going to provide.

Any "unsafe" handling of input can lead to disclosure of critical information that the user is not authorized to see, modification of data that the user is not permitted to change, corruption of data, or even a crash of the program itself.

What type of problems can we expect ?

Almost every type of bug has a matching type of attack. Rather than describe all of them, I will give a short list of very common ones.

The most common attack types are:

Buffer Overflow

When data written to a buffer exceeds the amount of memory that was allocated for it:

var
  iNums : array [0..9] of integer;
....
  FillChar (iNums[-1], 100, #0);  // starts before the array and clears 100 bytes
....
  for i := -10 to 10 do           // 21 indexes for an array of 10 elements
    readln (iNums[i]);
....

In this example the static array iNums can hold only 10 numbers, yet the loop tries to store 21 numbers in it.

Please note that while the compiler might warn in simple cases, it won't in more elaborate forms.

If the user controls the data that is written to the buffer, they can supply values that will be interpreted as machine code instructions and that end up outside our buffer. The computer could then execute this code instead of the code that should have been there. That is a buffer overflow attack.
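As a rough illustration of catching a bad index when the compiler cannot see it, the sketch below turns on FPC's run-time range checking with {$R+}. The program name and the StoreNumber helper are made up for this example, and it assumes the SysUtils unit is used so that the range check failure arrives as an ERangeError exception.

```pascal
program RangeCheckSketch;
{$mode objfpc}
{$R+}  // enable run-time range checking for array indexes

uses
  SysUtils;  // converts the range check run-time error into ERangeError

var
  iNums : array [0..9] of Integer;

// Hypothetical helper: the index arrives at run time, so the compiler
// cannot tell at compile time whether it is inside the array.
function StoreNumber (Index, Value : Integer) : Boolean;
begin
  Result := True;
  try
    iNums[Index] := Value;
  except
    on ERangeError do
      Result := False;  // the write was refused, nothing was overwritten
  end;
end;

begin
  writeln ('Index 5 accepted: ', StoreNumber (5, 42));
  writeln ('Index 10 accepted: ', StoreNumber (10, 42));
end.
```

Range checking costs some speed and only reports the problem; validating the index before using it is still the better habit.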

DoS Attack

Denial of Service (DoS) is not only a network problem; it can arise in many other ways:

procedure Recurse;
begin
  while (True) do
    begin
      Recurse;
    end;
end;

This procedure runs until the system is out of resources, because every recursion allocates more stack memory. It can cause the system to stop responding, or even to crash. Although some systems, such as Linux, will try to let you stop the program, doing so can take a lot of your time.

Please note that this is only a static example (no input is involved), yet it already is a DoS attack on the system that runs this code.


Another well-known cause of DoS is failing to free system resources such as memory, sockets and file descriptors.

For example:

...
begin
  while (True) do
    begin
      Getmem (OurPtr, 10);
      OurPtr := Something;  // the address of the allocated block is lost here
    end;
end.

This example allocates memory (Getmem is like C's malloc: it reserves memory for use), but each pass through the loop overwrites the pointer without freeing the block, so the memory is never returned to the system.

Injections

When the user gives us input and we work on it directly without sanitizing it, the user can embed SQL fragments or code (script code or machine code, for example) that make our program perform some action: delete records or tables, send the user restricted data (database or table structure, database user and password, contents of a directory or file), or even execute a program on the computer.

An SQL injection example:

User Input:
  Please enter your name: a' OR '1'='1
Inside the code:
  ...
  write ('Please enter your name: ');
  readln (sName);
  Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name='#39 + sName + #39);
  ...

Because the input is pasted into the query unchanged, the statement passed to the database becomes SELECT Password FROM tblUsers WHERE Name='a' OR '1'='1'. The added condition is always true, so the query matches every user and the attacker gains unauthorized access to the program.

Myths and Assumptions

Many security issues exist because important warnings and information given by the compiler are ignored, and because we believe that our program does not contain any exploitable problem.

Here are some examples of this type of problem:

Myths:

  • Security by obscurity - When no one knows about a problem, no one can take advantage of it; e.g. using an obscure column name for storing passwords in your database.
  • Secure programming language - There are languages, such as Perl, that many people think are safe from buffer overflows and other vulnerabilities, while that is not true.
  • A hashed password is secure - A file that contains hashed passwords is not secure just because they are hashed. A hash works in one direction only, so the original data cannot be computed back from it, but an attacker who obtains the file can still guess passwords by brute force or with precomputed (rainbow) tables. That is why password hashes need a salt and multiple hash rounds.
  • Nothing can break my program - Believing you're the only programmer in the world who writes faultless code is probably a bit optimistic. Maybe you're lucky and you just write code that doesn't work right without exploitable security vulnerabilities...

Assumptions:

  • The QA team will find and fix my security bugs.
  • The user will not attack my program and its data.
  • My program will be used only for its original use.
  • All exceptions can remain unhandled.

Solutions

Now that we know some of the problems we can encounter when developing programs, we should learn how to fix them. All of the problems we saw above fall into two categories: assumptions and lack of careful programming. To learn how to fix them, we first need to learn to think in a different way than we have up to now.

Overflow

To fix overflows of data, such as buffers and other types of input, we first need to identify the type of data we work with.

Buffer overflow

If we return to our example:

var
  iNums : array [0..9] of integer;
....
  FillChar (iNums[-1], 100, #0);
....
  for i := -10 to 10 do
    readln (iNums[i]);
....

We see here a range that was overflowed by our values, without even checking whether the index is valid.

In Pascal we can always find out the limits of an array, whether it is static, dynamic or open: Low and High return its first and last index, and SizeOf returns the size in bytes of a static array. So all we need to do is check whether the data is too small or too big for our buffer, and limit what we accept to the size we want it to be.

So the example should be changed into:

var
  iNums : array [0..9] of integer;
....
  FillChar (iNums, SizeOf (iNums), 0);  // clear exactly the bytes the array occupies
....
  for i := Low (iNums) to High (iNums) do
    readln (iNums[i]);
....

But wait, something is still not right!

The readln will accept an unlimited number of characters, and nothing guarantees that the input will be an Integer, or even a value in the range we can handle.

Number Overflow

A string in Pascal is basically an array (well, not really, at least not in FPC, but let's pretend it is for a second, OK?), so readln knows the limits of the string type and will not try to overflow the range we gave it. Numbers are not the same.

Numbers have limits, just as a computer has limits of many kinds regarding memory and numbers. It can give only a "small" amount of memory to each number (floating point and integer numbers alike). Often we do not even need a large range of numbers (a Boolean variable, for example, usually needs only two values).

In the above example, input that does not fit in an Integer can cause a range check error or give us the wrong number (carry flag issues that I am not going to explain here). We also have a DoS effect, because our program will halt at that point.

So what can we do about it?

First of all, we can read the input into a string variable whose length is that of the largest number plus one (for the minus sign), or we can create our own readln procedure/function that specializes in the Integer type.

For the first option we can do the following (copied from the FPC documentation):

Program Example74;

{ Program to demonstrate the Val function. }
Var I, Code : Integer;

begin
  Val (ParamStr (1),I,Code);
  If Code<>0 then 
    Writeln ('Error at position ',code,' : ',Paramstr(1)[Code])
  else
    Writeln ('Value : ',I);  
end.

Here we see how to convert a string into an integer with very easy error handling. The function StrToInt can also do the trick, but then we need to catch an exception whenever an error occurs.

Here is a small example of a readln-like procedure for integer numbers.
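A middle road between Val and StrToInt is TryStrToInt from the SysUtils unit, which reports failure through its return value instead of an exception. The sketch below combines it with a range check; ParseBoundedInt and the 0..99 limits are hypothetical names and values chosen for this illustration only.

```pascal
program BoundedIntSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: True only if S is a whole number and the value
// lies within MinValue..MaxValue.
function ParseBoundedInt (const S : string; MinValue, MaxValue : Integer;
                          out Value : Integer) : Boolean;
begin
  Result := TryStrToInt (S, Value) and
            (Value >= MinValue) and (Value <= MaxValue);
end;

var
  Num : Integer;

begin
  if ParseBoundedInt (ParamStr (1), 0, 99, Num) then
    writeln ('Value : ', Num)
  else
    writeln ('Rejected: not a number between 0 and 99');
end.
```

Checking the range right after the conversion keeps the rest of the program free to assume that Num is usable.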

program MyReadln;
uses CRT;

procedure MyIntReadLn (var Param : Integer; ParamLength : Integer);
var
  Line  : string; 
  ch    : char;
  Error : Integer;
  
begin
  Line  := '';
  
  repeat
    ch := readkey;
    if (Length (Line) <> ParamLength) then
     begin
      if (ch in ['0'..'9']) then
       begin
         Line := Line + ch;
         write (ch);
       end
      else
      if (ch = '-') and (Length (Line) = 0) then
       begin
         Line := '-';
         write (ch);
       end;
      end;
      
    if (ch = #8) and (Length(Line) <> 0) then // backspace
     begin
      Line := copy (Line, 1, Length (Line) -1);
      gotoxy (WhereX -1, WhereY);
      write (' ');
      gotoxy (WhereX -1, WhereY);
     end;
  until (ch = #13);

  val (Line, Param, Error);

  if (Error <> 0) then
    Param := 0;

 writeln;
end;

var
 Num : Integer;

begin
  write ('Number: ');
  MyIntReadLn (Num, 2);
  writeln ('The number is: ', Num);
end.
 

Please note that you can make it even better and more efficient if you wish. This is only a very small example of how to do it.

What are the security risks of overflows?

A memory overflow can allow arbitrary machine code to be executed, so attackers may run whatever code they wish, and our program cannot stop them.

Denial of Service

Denial of Service is one of the hardest types of attack to prevent. The reasons are:

  • A denial of service can be carried out without exploiting any bug, for example with the "ping" program.
  • Every system resource is a possible target for a denial of service: opening sockets, reading files, or just not freeing allocated memory when you "do not need" it anymore.
  • Removal of files, such as a kernel module, can cause a big problem.
  • A missing or wrong configuration can cause a denial of service as well.
  • Too many permissions, or a lack of them.
  • Almost any type of exploit can result in a denial of service.

So as you can see, a denial of service can be almost anything that stops us from doing our work as we wish, whether because of an exploit, buggy code, or just a program that hogs system resources.

In the above example (of the denial of service):

procedure Recurse;
begin
  while (True) do
    begin
      Recurse;
    end;
end; 

This code also creates a stack overflow (another type of buffer overflow): every call needs more stack memory, until the computer cannot provide the memory needed to continue executing the code.
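One possible way to keep recursion from exhausting the stack is to carry an explicit depth counter and refuse to go deeper than a fixed limit. This is only a sketch: the MaxDepth constant, its value and the changed signature of Recurse are assumptions made for the illustration.

```pascal
program DepthLimitSketch;
{$mode objfpc}

const
  MaxDepth = 1000;  // assumed limit, tune it for the real workload

// Hypothetical variant of Recurse that stops at MaxDepth and
// returns the depth that was actually reached.
function Recurse (Depth : Integer) : Integer;
begin
  if Depth >= MaxDepth then
    Exit (Depth);  // stop here instead of exhausting the stack
  Result := Recurse (Depth + 1);
end;

begin
  writeln ('Stopped at depth ', Recurse (0));
end.
```

When the depth depends on user input, such as nested data in a file, a limit like this turns a crash into an error that the program can report.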

Any system resource that is available to the program can be abused by not returning it to the system when the program "does not need it anymore". Holding on to system resources such as memory or sockets takes away from other programs the ability to perform some of their actions. Most programs will then stop their execution and report an error, and some will hang and keep waiting for the system resources.
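For memory, a common habit is to pair every allocation with its release in a try..finally block, so the block is returned even when the work in between fails. A minimal sketch, assuming the buffer is only needed inside the loop body (the FillChar call stands in for the real work):

```pascal
program FreeMemSketch;
{$mode objfpc}

var
  OurPtr : PByte;
  Pass   : Integer;

begin
  for Pass := 1 to 1000 do
    begin
      GetMem (OurPtr, 10);
      try
        FillChar (OurPtr^, 10, 0);  // stand-in for the real work on the buffer
      finally
        FreeMem (OurPtr);           // runs even if the work raises an exception
      end;
    end;
  writeln ('All blocks were released');
end.
```

The same pattern applies to other resources, for example closing a file or a socket in the finally part.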

Please note that some abuse of system resources exists because of a programming bug. For example, a program waits for 150k of data to fill a buffer while only 2 bytes actually arrive; while it is still waiting, a new request for a 150k buffer is made, and so on, until the system is not able to answer any request anymore (this is a known type of attack).

A good workaround for this bug is to limit how many partially filled buffers can be allocated at one time, and to free a buffer completely if it is still not full after a "timeout". But even that can cause a denial of service, because the communication will stop at some point anyway, and a slow connection can lose data.

Injection

There are many ways to inject code into our programs. As we saw in the example above:

User Input:
  Please enter your name: a' OR '1'='1
Inside the code:
  ...
  write ('Please enter your name: ');
  readln (sName);
  Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name='#39 + sName + #39);
  ...

The injection occurs when we do not filter our input (sanitize is the more professional word :)), and we do not check that we received the exact type of input that we are looking for.

For example, we could check whether sName contains spaces and, if so, reject it without checking the rest of the variable. The reason is very simple: the name should be only one word, and for us a word consists of letters, maybe the tick sign (') and maybe the underscore (_), and that is all. If we meet a digit, our word is over (unless we wish to allow "hacker"-like language, or the use of numbers in general).

This type of structure can be checked in many ways. A less efficient one, but one that is in wide use, is the following:

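// TCharset is assumed to be declared elsewhere as: type TCharset = set of Char;
// Result requires {$mode objfpc} or {$mode delphi}.
function ValidVar (const S : AnsiString; AllowChars : TCharset) : Boolean;
var
 i : Integer;
begin
 i      := 0;
 Result := True;

 // i < Length (S) keeps S [i] inside the string after the increment
 While (Result) and (i < Length (S)) do
  begin
     inc (i);
     Result := S [i] in AllowChars;
  end;
end;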

The function returns True if every character of S is in the set given by AllowChars. Please note that this function is only a proof of concept and may need more work before it can be used in production.
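To show how such a check might be used, here is a small self-contained sketch. IsValidName, TNameChars and the NameChars constant are hypothetical names for this example; the allowed set follows the "letters, tick and underscore" rule described above, and unlike ValidVar this variant also rejects an empty string.

```pascal
program ValidNameSketch;
{$mode objfpc}{$H+}

type
  TNameChars = set of Char;

const
  // Assumed whitelist: letters, the tick and the underscore.
  NameChars : TNameChars = ['a'..'z', 'A'..'Z', '''', '_'];

// Hypothetical helper: True if S is not empty and contains only
// characters from Allowed.
function IsValidName (const S : string; const Allowed : TNameChars) : Boolean;
var
  i : Integer;
begin
  Result := S <> '';
  for i := 1 to Length (S) do
    if not (S[i] in Allowed) then
      Exit (False);
end;

begin
  writeln (IsValidName ('alice', NameChars));             // only letters
  writeln (IsValidName ('a'' OR ''1''=''1', NameChars));  // spaces, digits, '='
end.
```

Because the tick is still on the whitelist, a name that passes this check must still be escaped before it goes into a query.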

Another good way to do the same is to use a regular expression like the following (this is a proof of concept only, written in Perl; at the time of writing, FPC did not have a fully supported regular expression engine that allows modifying strings):

$sName =~ s/[^a-z0-9\_\']//gi;

The regular expression removes all invalid characters from the string and returns only the purged string. Please note that, as far as I know, this regular expression will also work in ereg engines with minimal adjustments (the g flag instructs Perl to replace all the matching patterns found; i makes the match case insensitive).

Now that we know that our input is valid, we need to look at how the content of the variable is used. If the content is going into a database, a CGI script, or anything else that has its own syntax, we must escape the content.

There are many ways to escape this type of content. Let's assume for now that the content is going into a database query. First of all we must make sure that the escaped content will not exceed the length limits of our database fields. If it does, the result can range from data loss to denial of service or buffer overflow problems (a respectable database will usually truncate the data, and sometimes not in a good location).

After we have made sure that we stay within our limits, we can go on to the escaping itself, for which there are several approaches. A less debugging-friendly way, but a sure way to escape correctly, is to use the parameters technique:

Query1.SQL.Add ('SELECT Password FROM tblUsers WHERE Name=?');
Query1.Parameters.Add (sName);
if (Query1.Execute) then
 ...

This technique lets the database engine handle the parameter, so that we can use the content without any problems with illegal characters. The downside is that we cannot easily debug the outcome of the query: we cannot see how the content of sName is embedded in the SQL statement, so we cannot check by eye whether the final query was correct.
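The fragment above is generic. As one concrete possibility, the sketch below uses FPC's SQLdb components with an in-memory SQLite database. It assumes the sqldb and sqlite3conn units and the SQLite client library are available; the table contents and the program name are invented for the illustration.

```pascal
program ParamQuerySketch;
{$mode objfpc}{$H+}

uses
  sqldb, sqlite3conn;

var
  Conn  : TSQLite3Connection;
  Trans : TSQLTransaction;
  Query : TSQLQuery;

begin
  Conn  := TSQLite3Connection.Create (nil);
  Trans := TSQLTransaction.Create (nil);
  Query := TSQLQuery.Create (nil);
  try
    Conn.DatabaseName := ':memory:';  // throw-away database for the demo
    Trans.DataBase    := Conn;
    Conn.Transaction  := Trans;
    Query.DataBase    := Conn;
    Query.Transaction := Trans;
    Conn.Open;
    Trans.StartTransaction;

    // Invented sample data.
    Conn.ExecuteDirect ('CREATE TABLE tblUsers (Name TEXT, Password TEXT)');
    Conn.ExecuteDirect ('INSERT INTO tblUsers VALUES (''alice'', ''secret'')');

    // The value travels separately from the SQL text, so the ticks in it
    // cannot change the structure of the statement.
    Query.SQL.Text := 'SELECT Password FROM tblUsers WHERE Name = :Name';
    Query.Params.ParamByName ('Name').AsString := 'a'' OR ''1''=''1';
    Query.Open;
    if Query.EOF then
      writeln ('No user matched')
    else
      writeln ('A user matched');
    Query.Close;
    Trans.Rollback;
  finally
    Query.Free;
    Trans.Free;
    Conn.Free;
  end;
end.
```

With the parameter in place the injection string is compared as an ordinary name, so the query should find no matching user.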

Usually the only escaping we need to do before using a string in a database is to escape the tick (') character (although some databases may have problems with more characters than ticks). So all we should do is represent ticks in a way that will not affect the database engine: a backslash before the tick (\'), every single tick doubled into two ticks (''), or even another character that replaces the ticks in the query and is replaced back when we show the data to the user.
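For the "double every tick" approach, the SysUtils unit already has QuotedStr, which wraps a string in ticks and doubles every tick inside it. A minimal sketch follows; whether doubling alone is enough depends on the database in use, as noted above.

```pascal
program QuoteSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  sName : string;

begin
  sName := 'a'' OR ''1''=''1';  // the injection attempt: a' OR '1'='1
  // QuotedStr adds the surrounding ticks and doubles the embedded ones.
  writeln ('SELECT Password FROM tblUsers WHERE Name=' + QuotedStr (sName));
end.
```

The printed statement is SELECT Password FROM tblUsers WHERE Name='a'' OR ''1''=''1', in which the whole input is a single string literal.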

Myths and Assumptions

One of the biggest problems with myths and assumptions is that we start to lose the ability to write efficient code. We all need to remember that there is not a single program without bugs. But that is also an assumption :) although this assumption has never been broken.

Beyond The Document

While in this document I gave short (yes, I know that's an understatement ;)) examples and information on how to create better code, there are many issues that I did not touch. Among them are user privileges for the execution of programs, system rootkits and other problems that our code needs to take into consideration (environment variables are only one example).

Please read more of the resources out there on security issues such as buffer overflows, denial of service and SQL injection.