Defensive programming

Defensive programming

Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software in spite of unforeseeable usage of said software. The idea can be viewed as reducing or eliminating the prospect of Murphy's Law having effect. Defensive programming techniques are used especially when a piece of software could be misused mischievously or inadvertently to catastrophic effect.

Defensive programming is an approach to improve software and source code, in terms of:

  • General quality - Reducing the number of software bugs and problems.
  • Making the source code comprehensible - the source code should be readable and understandable so it is approved in a code audit.
  • Making the software behave in a predictable manner despite unexpected input or user actions.

Contents

  • 1 Secure Programming
  • 2 Some Defensive programming techniques
    • 2.1 Reduce source code complexity
    • 2.2 Source code reviews
    • 2.3 Software testing
    • 2.4 Intelligent source code reuse
      • 2.4.1 The legacy problems
    • 2.5 Secure Input / Output Handling
    • 2.6 Canonicalization
    • 2.7 Principle of least privilege
    • 2.8 Low tolerance against "potential" bugs
    • 2.9 Other techniques
  • 3 References
  • 4 See also
  • 5 External links

Secure Programming

Defensive programming is sometimes referred to as secure programming by computer scientists who state this approach minimizes bugs[citation needed]. Software bugs can be potentially used by a cracker for a code injection, denial-of-service attack or other attack.

A difference between defensive programming and normal practices is that few assumptions are made by the programmer, who attempts to handle all possible error states. In short, the programmer never assumes a particular function call or library will work as advertised, and so handles it in the code. An example follows:

int low_quality_programming(char *input){
char str[1000+1]; // one more for the null character
strcpy(str, input); // copy input
...
}

The function will crash when the input is over 1000 characters. Many mainstream programmers may not feel that this is a problem because "Surely no one will enter that long of an input!". A programmer practicing defensive programming would not allow the bug, because if the application contains a known bug, Murphy's Law dictates that the bug will occur in use. This particular bug demonstrates a vulnerability which enables buffer overflow exploits. Here is a solution to this example:

int high_quality_programming(char *input){
char str[1000+1]; // one more for the null character
strncpy(str, input, 1000); // copy input, only copy a maximum length of 1000 characters
str[1000] = '\0'; // add terminating null character
...
}

Some Defensive programming techniques

Here are some defensive programming techniques suggested by some leading computer scientists[citation needed] to avoid creating security problems and software bugs[citation needed] These computer scientists state that while this process can improve general quality of code, it not sufficient to ensure security[citation needed]. See the articles computer insecurity and secure computing for more information.

Defensive software programming principles described by leading proponents[citation needed] include:

Reduce source code complexity

Never make code more complex than necessary. Complexity breeds bugs, including security problems. This goal can conflict with the goal of writing programs that can recover from any error and handle any user input. Handling all unexpected occurrences in a program requires the programmer to add extra code, which may also contain bugs.

Source code reviews

A source code review is where someone other than the original author performs a code audit. A do-it-yourself security audit is insufficient: the review must be made by a non-author, just as when writing a book, it must be proofread by someone other than the author.

Simply making the code available for others to read (see Free software or Open Source Definition) is insufficient: there is no guarantee that the code will ever be looked at, let alone that it will be rigorously reviewed.

Software testing

Software testing should include both whether the software works as intended, and what is supposed to happen when deliberately bad input is supplied.

Testing tools can capture keystrokes associated with normal operations, then the captured keystroke strings can be copied and edited to try out all permutations of combinations, then extended for later tests after any modifications. Proponents of key logging state[citation needed] that programmers who use this method should make sure that the people whose keystrokes are being captured are aware of this, and for what purpose, to avoid accusations of privacy violation.

Intelligent source code reuse

If possible, reuse code instead of writing from scratch. The idea is to capture the benefits of well written and well tested source code, instead of creating unnecessary bugs.

However, re-using code is not always the best way to go forward, particularly when business logic is involved. Reuse in this case may cause serious business process bugs.

The legacy problems

Before reusing old source code, libraries, APIs, configurations and so forth, it must be considered if the old work is valid for reuse, or if it is likely to be prone to legacy problems.

Legacy problems are problems inherent when old designs are expected to work with today's requirements, especially when the old designs were not developed or tested with those requirements in mind.

Many software products have experienced problems with old legacy source code, for example:

  • Legacy code may not have been designed under a Defensive programming initiative, and might therefore be of much lower quality than newly designed source code.
  • Legacy code may have been written and tested under conditions which no longer apply. The old quality assurance tests may have no validity any more. Example 1: legacy code may have been designed for ASCII input but now the input is UTF-8. Example 2: legacy code may have been compiled and tested on 32-bit architectures, but when compiled on 64-bit architectures new arithmetic problems may occur (e.g. invalid signedness tests, invalid type casts, etc.). Example 3: legacy code may have been targeted for offline machines, but becomes vulnerable once network connectivity is added.
  • Legacy code is not written with new problems in mind. For example, source code written about 1990 is likely to be prone to many Code injection vulnerabilities, because most such problems were not widely understood at that time.

Notable examples of the legacy problem:

  • BIND 9, presented by Paul Vixie and David Conrad as "BINDv9 is a complete rewrite", "Security was a key consideration in design" [1], naming security, robustness, scalability and new protocols as key concerns for rewriting old legacy code.
  • Microsoft Windows suffered from "the" Windows Metafile vulnerability and other exploits related to the WMF format. Microsoft Security Response Center describes the WMF-features as "Around 1990, WMF support was added... This was a different time in the security landscape... were all completely trusted" *, not being developed under the security initiatives at Microsoft.
  • Oracle is combating legacy problems, such as old source code written without addressing concerns of SQL Injection and privilege escalation, resulting in many security vulnerabilities which has taken time to fix and also generated incomplete fixes. This has given rise to heavy criticism from security experts such as David Litchfield, Alexander Kornbrust, Cesar Cerrudo (1,2,3). An additional criticism is that default installations (largely a legacy from old versions) are not aligned with their own security recommendations, such as Oracle Database Security Checklist, which is hard to amend as many applications require the less secure legacy settings to function correctly.

Secure Input / Output Handling

Managing input is a difficult problem, which is detailed in Secure input and output handling.

Canonicalization

Crackers are likely to invent new kinds of representations of incorrect data.

For example, if you checked if a requested file is not "/etc/passwd", a cracker might pass another variant of this file name, like "/etc/./passwd".

To avoid bugs due to non-canonical input, employ Canonicalization API's.

Principle of least privilege

Employ Principle of least privilege. Avoid having software running in a privileged mode if possible;

  • Never make UNIX programs setuid unless you're really sure they are secure.
  • Never make Windows programs run as Local System service unless you're really sure they are secure.
  • Don't grant more permissions than necessary to large user groups or public/everyone.
  • Don't grant more permissions than necessary to small user groups or specific users.
  • Prefer granting permissions to small user groups or specific users, rather than granting permissions to large user groups or public/everyone. It is better if a few users have high permissions, rather than many users having high permissions.

Low tolerance against "potential" bugs

Assume that code constructs that appear to be problem prone (similar to known vulnerabilities, etc.) are bugs and potential security flaws. The basic rule of thumb is: "I'm not aware of all types of security exploits. I must protect against those I do know of and then I must be proactive!".

Other techniques

  • One of the most common problems is unchecked use of constant-size structures and functions for dynamic-size data (the buffer overflow problem). This is especially common for string data in C. C library functions like gets should never be used since the maximum size of the input buffer is not passed as an argument. C library functions like scanf can be used safely, but require the programmer to take care with the selection of safe format strings, by sanitising it before using it.
  • Encrypt/authenticate all important data transmitted over networks. Do not attempt to implement your own encryption scheme, but use a proven one instead.
  • All data is important until proven otherwise.
  • All data is tainted until proved otherwise.
  • All code is insecure until proven otherwise.
    • You cannot prove the security of any code in userland, or, more canonically: "never trust the client".
  • If data are checked for correctness, verify that they are correct, not that they are incorrect.
  • Design by Contract
    • Design by contract uses preconditions, postconditions and invariants to ensure that provided data (and the state of the program as a whole) is sanitized. This allows code to document its assumptions and make them safely. This may involve checking arguments to a function or method for validity before executing the body of the function. After the body of a function, doing a check of object state (in Object-oriented programming languages) or other held data and the return value before exits (break/return/throw/error code) is also wise.
  • Assertions
    • Within functions, you may want to check that you are not referencing something that is not valid (i.e., null) and that array lengths are valid before referencing elements, especially on all temporary/local instantiations. A good heuristic is to not trust the libraries you did not write either. So any time you call them, check what you get back from them. It often helps to create a small library of "asserting" and "checking" functions to do this along with a logger so you can trace your path and reduce the need for extensive debugging cycles in the first place. With the advent of logging libraries and Aspect Oriented Programming, many of the tedious aspects of defensive programming are mitigated.
  • Prefer exceptions to return codes
    • Generally speaking, it is preferable to throw intelligible exception messages that enforce part of your API contract and guide the client programmer instead of returning values that a client programmer is likely to be unprepared for and hence minimize their complaints and increase robustness and security of your software.

No comments: