An Introduction to Programming Paradigms

A Simpler Program

“If I had more time, I would have written a shorter letter.” This witticism, sometimes attributed to Blaise Pascal, is surprising because we assume that, in writing, more time translates to more words and greater length.

There exists a similar paradox in programming. Programs with great complexity, with many moving parts and interdependent components, seem initially impressive. However, the ability to translate a real-world problem into a simple or elegant solution requires deep understanding. When writing code, therefore, we might say, “If I had more time, I would have written a simpler program.”

Unlike a bridge, a road, an office building, or some other physical construction, the only limiting factor when writing a program is cognition—the time, attention, and understanding of the programmer. For this reason, when programming, complexity is always the enemy. The more complex a program becomes, the harder it is to work with and reason about. Managing complexity is, arguably, a programmer’s main concern.

A major source of complexity in a program is “state”: basically, what a program has to keep track of as it moves forward through time. If several parts of your program refer to, or change, the same variable, the program can easily get into an untenable state. If you’ve ever wondered why turning your computer off and on again solves so many problems, the answer is state. When you reset the state of your machine’s memory, you also clear out any ongoing state-based conflicts.
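As a minimal sketch of how shared state can cause trouble, consider two functions that both change the same variable. The names here (inventory, sell_apple, restock) are hypothetical and are not part of this tutorial's example code.

```python
# Hypothetical example: two functions share one piece of state.
inventory = {"apples": 3}


def sell_apple():
    """Remove one apple from the shared inventory."""
    inventory["apples"] -= 1


def restock():
    """Reset the shared inventory, discarding whatever was there."""
    inventory["apples"] = 10


sell_apple()
restock()
sell_apple()

# The final value depends on the exact order of the calls above.
print(inventory["apples"])
```

Swapping the order of the calls changes the result, and nothing in either function's definition warns us that the two interact.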

So how do programmers deal with complexity? There are many general approaches that reduce complexity in a program or make it more manageable, and arguably most of the major innovations in programming are concerned with this question. Here, though, we’ll discuss some broad approaches to managing complexity and, specifically, state. These approaches are called programming paradigms.

Programming Paradigms

A programming paradigm is a philosophy, style, or general approach to writing code. Most definitions of the term are so broad as to be fairly useless—the term tends to make more sense when discussing specific paradigms. Here, we’ll be comparing three specific paradigms: imperative, functional, and object-oriented.

If you’ve done programming in Python or C, you’ve used imperative programming. Imperative programming defines the solution to a problem as a series of steps—first do this, then do that, then do the next thing, and so on. The computer steps through each line of code, executing it and moving on to the next step. Programs written in the imperative style often resemble recipes—first crack the eggs, then mix in the flour, then add water. Imperative programs often change the state of the program on each line, assigning new variables and referring to or changing old ones. Though intuitive for solving small problems, imperative programs quickly become unmanageable as they become larger.

Functional programs deal with state by avoiding it as much as possible. Solutions are defined as a series of functions that pass values to one another, leading to a series of transformations. Parts of the program dealing with state, if any, tend to be isolated. A goal of functional programming is predictability, based primarily on the fact that functions, if given a certain input, should always return the same output.

Object-oriented programming deals with state by designating certain functions that operate specifically on that state. Objects in object-oriented programming are a combination of state, or data, with functions that work specifically on that data. Rather than isolate state from the rest of the program, the object-oriented approach allows only certain parts of the program to operate on certain pieces of data.

In short:

  • Imperative programs have no special way of dealing with state, and tend to transform state frequently.
  • Functional programs tend to avoid and isolate state.
  • Object-oriented programs couple state with functions that work on that state.

So what do these paradigms look like in practice? Let’s write a program that examines a string of characters and answers a few questions about it. First, we’ll write the program in the imperative style. Then we’ll rewrite it in the functional style and, finally, the object-oriented style.

Our Problem

Imagine that we have a block of text in the form of a string and we want to perform a basic analysis on it. We want to know:

  1. How many words are in the string—a word count.
  2. How many occurrences of certain words there are—a concordance.
  3. What words, and how many, start with a certain letter.

In the process of looking for these answers, we’ll have to clean the string by removing commas and periods and making it lowercase. We’ll also have to split our string up into a list of words, a process known as “tokenization.”

Running the Examples

The examples in this tutorial are written in Python, a multiparadigm programming language. You can download the code as a Zip file or view the repository on GitHub. The examples have been tested with Python 3.6 and 2.7.

To run the code, open the terminal on macOS or the Command Prompt on Windows and navigate to the folder containing the .py files. Run the program with

python <file-name>

where <file-name> is imperative.py, functional.py, or oo.py.

If you’re having difficulty running the examples, work through the Programming Historian’s instructions for getting started with Python. You may also wish to look at our Python tutorial for the GC Digital Research Institute.

An Imperative Solution

Remember that the imperative solution defines a series of steps to go through to solve the problem, like a recipe. Our first step is to define the data we’ll be working with—in the real world we’d probably be pulling this text in from another source, but here we’ll create a short block of text right in the program.

original_text = "Everything should be built top-down, except the first time."

Then let’s define the characters we want to remove in a list:

unwanted_characters = [',', '.']

It’s possible to loop through a string just like we loop through a list. Let’s do that and build a new string without the unwanted characters:

# Start with an empty string that we can add characters to.
string_without_punctuation = ''

for character in original_text:
    if character not in unwanted_characters:
        string_without_punctuation += character

Let’s make the string all lower case:

string_lower_case = string_without_punctuation.lower()

and split the string into a list:

word_list = string_lower_case.split()

Now that we have our list, let’s start answering questions. First, how many words are there in total?

word_list_length = len(word_list)
print("Total words:", word_list_length)

Now let’s find out how many occurrences of a certain word there are. Let’s try searching for “except.”

word_to_search = 'except'
word_match_counter = 0

for word in word_list:
    if word == word_to_search:
        word_match_counter += 1

print('Number of occurrences of word match:', word_match_counter)

Finally, we need to get all the words that start with a certain character, and then check how many of those words there are:

match_character = 'e'
words_beginning_with_character = []

for word in word_list:
    if word[0] == match_character:
        words_beginning_with_character.append(word)

print("Words beginning with character:", words_beginning_with_character)
print("Number of words beginning with character:",
      len(words_beginning_with_character))

When run, our program will print out the information we want about our short piece of text. Our full imperative script can be viewed here.

Problems with an Imperative Approach

You may already be seeing some issues with the imperative approach. First, our code is pretty messy. The script does a bunch of things, and we don’t know which part of the script is dedicated to which functionality. Second, it’s not easily reusable. If we try to do another analysis, we’ll be changing variables or copying and pasting code, which violates the programming principle of DRY (don’t repeat yourself). Third, if we need to change the program, there are many parts that are dependent on other parts, which means that one change is likely to require a bunch of other changes to accommodate it.

In the next section, we’ll take a look at a more functional approach to analyzing our string.

Introducing Functional Programming

Functional programming is a programming paradigm that solves problems by moving data from function to function, resulting in a series of transformations. Functional programming recognizes that a source of complexity is state: once variables can be reassigned, neither the designer of the program nor the program itself can be sure what state a variable is in when it’s used. In pure functional programming, values are never changed after they are defined, and everything is done by passing values between functions. In Python, functional programming frequently means keeping variable assignment within functions and not defining variables outside functions. Python also supports a more thoroughly functional style, and a version of our program that uses advanced functional concepts is provided at the end of the next section.
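To make the idea of predictability concrete, here is a small sketch contrasting a function that depends on outside state with one that does not. The function names are made up for illustration.

```python
# Hypothetical example contrasting an impure and a pure function.
running_total = 0


def add_to_total(amount):
    """Impure: reads and changes a variable defined outside the function."""
    global running_total
    running_total += amount
    return running_total


def add(total, amount):
    """Pure: the result depends only on the arguments."""
    return total + amount


first_call = add_to_total(5)
second_call = add_to_total(5)   # same argument, different result

pure_first = add(0, 5)
pure_second = add(0, 5)         # same arguments, same result

print(first_call, second_call)
print(pure_first, pure_second)
```

The impure function returns 5 and then 10 for the same argument, while the pure function returns 5 both times.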

Let’s try rewriting our analysis in a more functional style.

A Functional Solution

Let’s first move into a function the part of our script that removes unwanted characters:

def remove_characters(string, unwanted_character_list):
    "Takes unwanted characters as a list and removes them from a string."
    out_string = ''

    for character in string:
        if character not in unwanted_character_list:
            out_string += character

    return out_string

Notice that there is a string after the function definition that tells us what the function does. This is called a “docstring”—it’s used to document our functions and can help us automatically build documentation for our program after we’ve finished writing it, assuming we want to publish the code and share it with others.

Our function does much the same as the imperative code, but serves a few purposes. First, it marks out the purpose of this section of code, making it easier to know what each line of code is for. Second, it can be reused in other parts of the program. Third, it makes the transformation that it’s performing clearer—if we know what comes in and what comes out of each function, it’s easier to reason about our program and the transformations it’s making.
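To see the reuse in practice, here is a short sketch that repeats the definition above so that it can be run on its own. The second sample string is made up for illustration.

```python
def remove_characters(string, unwanted_character_list):
    "Takes unwanted characters as a list and removes them from a string."
    out_string = ''

    for character in string:
        if character not in unwanted_character_list:
            out_string += character

    return out_string


# The same function works on any string and any list of characters.
print(remove_characters("top-down, except the first time.", [',', '.']))
print(remove_characters("a-b-c", ['-']))
```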

Let’s replicate the rest of our imperative program as functions:

def clean_string(string):
    "Process and clean a string for tokenization."
    string_without_punctuation = remove_characters(string, ['.', ','])
    string_lower_case = string_without_punctuation.lower()

    return string_lower_case


def tokenize(string, preprocess=False):
    """Make string into list of words.

    If preprocess is True, clean the string first."""
    if preprocess:
        string = clean_string(string)

    word_list = string.split()

    return word_list


def count_word_occurances(word_list, word_to_match):
    """Return the number of occurrences of word_to_match in word_list."""
    word_match_counter = 0

    for word in word_list:
        if word == word_to_match:
            word_match_counter += 1

    return word_match_counter


def words_matching_first_character(word_list, match_character):
    """Return the words in word_list that begin with match_character."""
    words_beginning_with_character = []

    for word in word_list:
        if word[0] == match_character:
            words_beginning_with_character.append(word)

    return words_beginning_with_character

These functions don’t make assumptions about what string or list of words will be processed, so they can easily be reused. Notice how the clean_string function incorporates the remove_characters function. If we were to add more steps to processing our string, such as removing extra whitespace or deleting commonly used words such as articles, we could just add those steps into the clean_string function. Because we’ve written other parts of the code based on function outputs, rather than specific variables, we can make changes that are less likely to break the code as long as we make sure our functions return the same kind of output.

Notice that our tokenize function takes an argument that looks like this: preprocess=False. This means that, if we leave off that argument, the tokenize function won’t clean the string first; that’s our default option. But we can also choose to pass True as an argument, which means our string will be processed with our clean_string function before it is split into a list of words.
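The following sketch shows the default argument at work. To keep it short and runnable on its own, clean_string is replaced here by a condensed stand-in that uses str.replace, which is an assumption of this sketch rather than the tutorial's version.

```python
def clean_string(string):
    "Condensed stand-in for the tutorial's clean_string function."
    return string.replace('.', '').replace(',', '').lower()


def tokenize(string, preprocess=False):
    "Make string into list of words, cleaning it first if preprocess is True."
    if preprocess:
        string = clean_string(string)
    return string.split()


text = "Top-down, except the first time."

raw_tokens = tokenize(text)                      # preprocess defaults to False
clean_tokens = tokenize(text, True)              # positional argument
same_tokens = tokenize(text, preprocess=True)    # keyword argument

print(raw_tokens)
print(clean_tokens)
```

Without preprocessing, the punctuation and capital letters stay attached to the tokens ('Top-down,' and 'time.'). Passing the argument by keyword, as in the last call, has the same effect as passing True by position and is often easier to read.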

Finally, let’s use our functions to create output similar to our imperative code:

if __name__ == '__main__':
    original_text = "Everything should be built top-down, except the first time."
    tokens = tokenize(original_text, True)

    print("Total words:", len(tokens))

    print('Number of occurrences of word match:',
          count_word_occurances(tokens, 'except'))

    print("Words beginning with character:",
          words_matching_first_character(tokens, 'e'))

    print("Number of words beginning with character:",
          len(words_matching_first_character(tokens, 'e')))

The if __name__ == '__main__': portion of the code means that the code after it will only be run if our script is run directly. If it’s imported like a library, this part won’t run. Mostly, this code just calls the functions we’ve defined in order to print some results.

You can review our full functional code here. Also note that, while this code is more functional than our imperative code, it does not use much in the way of advanced functional techniques. A version of the program using more abstract functional concepts such as closures and higher-order functions can be found here.
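For a rough idea of what closures and higher-order functions might look like when applied to this problem, here is one possible sketch. It is not the linked version of the program, and make_remover and compose are hypothetical helpers written for this illustration.

```python
from functools import reduce


def make_remover(unwanted_characters):
    """Return a closure: a function that remembers which characters to drop."""
    def remover(string):
        return ''.join(c for c in string if c not in unwanted_characters)
    return remover


def compose(*functions):
    """Higher-order function: chain functions so each output feeds the next."""
    return lambda value: reduce(lambda acc, f: f(acc), functions, value)


# Build the whole pipeline out of small functions.
tokenize = compose(make_remover(['.', ',']), str.lower, str.split)

tokens = tokenize("Everything should be built top-down, except the first time.")

print("Total words:", len(tokens))
print("Number of occurrences of word match:",
      len(list(filter(lambda word: word == 'except', tokens))))
print("Words beginning with character:",
      list(filter(lambda word: word.startswith('e'), tokens)))
```

Here the pipeline itself is a value built from other functions, and no intermediate variables are needed to carry the string from one step to the next.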

Next, we’ll be looking at object-oriented programming, which takes a very different approach to managing state in solving problems.

Introducing Object-Oriented Programming

Object-oriented programming, or OOP, doesn’t try to avoid state like functional programming does. Instead, it combines functions and state into something called an object. An object in object-oriented programming is a container for some data and functions that work on that data.

If functional programming is about keeping data and functions separate, object-oriented programming is about keeping them together.

An Object-Oriented Solution

In our functional solution, we created a set of functions that could interoperate and could pass values between them. Our object-oriented solution will create two object classes, one that processes text given to it as a string and another that manipulates a list of tokens.

Our processor will have two “methods.” A method is a special term for a function inside an object. It will also have one “attribute,” which is a special term for a variable defined inside an object. The attribute will be our original text string. The methods will clean the string (making it lower case and without periods and commas) and tokenize it (turn it into a list of words), respectively.

Our other class, the TokenManipulator, has an attribute that is a list of words. It also has methods that operate on the list of words, such as counting matches and returning a list of the words that begin with a certain letter.

Let’s take a look at our two class definitions, which serve as templates for our objects:

class StringProcessor(object):
    def __init__(self, string):
        """Create a StringProcessor object. When creating, takes a string."""
        self.string = string

    def clean(self, string):
        """Remove periods and commas from string and make it lower case."""
        out_string = ''
        unwanted_character_list = ['.', ',']

        for character in string:
            if character not in unwanted_character_list:
                out_string += character

        out_string = out_string.lower()

        return out_string

    def tokenize(self):
        """Return a TokenManipulator object
        and pass it a list of tokens from our string."""

        cleaned_string = self.clean(self.string)
        tokens = cleaned_string.split()
        return TokenManipulator(tokens)


class TokenManipulator(object):
    def __init__(self, tokens):
        """Create the TokenManipulator object.
        When creating the object, we need to give it a list of tokens."""

        self.tokens = tokens


    def length(self):
        """Return the number of tokens in the token list."""
        return len(self.tokens)

    def count_match(self, match_string):
        """Count the words in the tokens list that match match_string."""

        word_match_counter = 0

        for word in self.tokens:
            if word == match_string:
                word_match_counter += 1

        return word_match_counter

    def match_first_character(self, match_character):
        """Return the tokens that begin with match_character."""
        words_beginning_with_character = []

        for word in self.tokens:
            if word[0] == match_character:
                words_beginning_with_character.append(word)

        return words_beginning_with_character

Each class has an __init__ method that is run whenever an object is created from the class. This method usually defines attributes inside the object based on arguments that are passed when the object is created. When you see the word self in a class, that refers to the object itself. self.string is an attribute, that is, a variable defined inside the object.
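A stripped-down sketch may help show how __init__ and self behave. The Greeter class below is hypothetical and exists only for this illustration.

```python
class Greeter(object):
    """Hypothetical class used only to illustrate __init__ and self."""

    def __init__(self, name):
        # Runs once for each new object; stores the argument as an attribute.
        self.name = name

    def greet(self):
        # self refers to whichever object the method was called on.
        return "Hello, " + self.name


first = Greeter("Ada")
second = Greeter("Grace")

print(first.greet())
print(second.greet())
```

Each object keeps its own copy of the name attribute, so the two calls to greet return different strings even though they run the same method.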

Let’s use our classes to create objects that will answer questions about our string:

if __name__ == '__main__':
    original_text = "Everything should be built top-down, except the first time."

    processor = StringProcessor(original_text)

    tokens = processor.tokenize()

    print("Total words:",
          tokens.length())

    print('Number of occurrences of word match:',
          tokens.count_match('except'))

    print("Words beginning with character:",
          tokens.match_first_character('e'))

    print("Number of words beginning with character:",
          len(tokens.match_first_character('e')))

In the object-oriented model, state is contained inside the object. Our StringProcessor object contains an attribute that represents a string. Our TokenManipulator object contains an attribute that represents a list of words. These objects also have methods, or functions defined inside them, that work on those attributes. Our tokenize method, for example, reads the string attribute inside a StringProcessor object and passes it to the clean method.

The idea behind object-oriented programming is not to isolate state, but instead to couple it closely with the parts of the program that work on it. This hiding of state from other parts of the program by coupling it closely with operations that work on it is called “encapsulation.”
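One way encapsulation is commonly expressed in Python is sketched below. The WordList class is hypothetical, and note that Python relies on a naming convention (a leading underscore) rather than enforcing the restriction.

```python
class WordList(object):
    """Hypothetical class illustrating encapsulation by convention."""

    def __init__(self, words):
        # The leading underscore signals "internal; use the methods instead".
        # Python does not enforce this; it is a convention.
        self._words = list(words)

    def add(self, word):
        """The designated way to change the list: always store lower case."""
        self._words.append(word.lower())

    def count(self):
        """Return the number of words currently stored."""
        return len(self._words)


words = WordList(["top-down", "except"])
words.add("FIRST")

print(words.count())
```

Because every change goes through the add method, the rest of the program can rely on the stored words being lower case without inspecting the list directly.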

Which Paradigm to Choose?

Some programming languages, like Python or JavaScript, are multiparadigm, and do not strongly favor imperative, functional, or object-oriented styles. In these languages, a programmer can code exclusively in one paradigm or mix and match paradigms. Other languages, like Smalltalk (object-oriented) or Haskell (functional), strongly favor or mandate the use of a particular style.

The choice of which programming paradigm to use is a highly subjective one. Many programmers strongly favor certain paradigms and stick to them whenever possible. Others choose a particular paradigm based on the problem before them. Some programmers claim that object-oriented code is particularly suited to creating graphical user interfaces, for example, while functional programming makes more sense for programs that require a high level of reliability. In general, if your program is inherently state-based, such as a game or user interface, consider object-oriented programming. If your problem can be addressed by a series of transformations or as messages being passed around by parts of a system, reach for functional programming. These guidelines are highly situational, however.

One important consideration when comparing functional and object-oriented programming is concurrency. As more cores are added to CPUs, one way to make programs faster is to break up problems so that they can be executed in parallel. Programming for concurrency or parallelism opens up a host of issues, many of which, for various reasons, are better dealt with in the functional style. Though object-oriented programming has been the dominant paradigm since the 1980s, and to a great extent still is, interest in functional programming has grown since the late 2000s. As concurrency and parallelism continue to become more critical, functional programming is likely to become more widely used over time.
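As a rough sketch of why functions without shared state are easier to run concurrently, the example below hands independent calls to a pool of workers using Python 3's concurrent.futures module. The count_words function and the sample texts are made up for illustration, and the sketch shows only the structure: in standard Python, threads do not by themselves speed up this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    """Pure function: depends only on its argument and changes nothing else."""
    return len(text.replace('.', '').replace(',', '').lower().split())


# Made-up sample texts for illustration.
texts = [
    "Everything should be built top-down, except the first time.",
    "Simple is better than complex.",
    "Readability counts.",
]

# Each call is independent, so the calls can be handed to separate workers
# without any coordination between them.
with ThreadPoolExecutor(max_workers=3) as executor:
    counts = list(executor.map(count_words, texts))

print(counts)
```

Since count_words never touches a shared variable, no locking is needed, and the results come back in the same order as the inputs.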

A Few Notes

There are a number of other programming paradigms besides imperative, functional, and object-oriented. Logic programming, for example, defines a program in terms of a set of formal propositions.

Programming paradigm is a loosely defined concept, and many paradigms overlap one another. For example, declarative programming encompasses functional and logic programming, and is defined mostly in contrast to imperative programming. Ultimately, a programming paradigm is simply a big idea in programming, one that completely changes how we think about and write programs. Paradigms that were frequently discussed in the mid-20th century, such as structured programming, are rarely discussed today because their tenets have been taken up by the majority of programmers and programming languages.

Learning More

If you’re interested in functional or object-oriented programming and want to learn more, you may wish to read up on some of these concepts.

Functional Concepts

Higher-order functions – Functions that take other functions as arguments or return functions as results. The ability to create higher-order functions leads to some interesting techniques.
Closures – Functions that remember variables from the scope in which they were created. A common use is a function that, when you give it some information and call it, returns another function, allowing you to create specialized functions on demand.
Referential transparency / functional purity – The idea that, when given a certain set of arguments, a function should always give the same output, no matter the context.
Recursion – In programming, the ability of a function to be defined in terms of itself. A frequently-used technique in functional languages.
Immutability – The idea that a variable’s value must not change after it is set.
MapReduce – A technique for parallel computation based loosely on two functional programming concepts, map and reduce.
Lambda calculus – A formal mathematical and computational system based entirely on functions and substitutions.
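A few of these concepts can be sketched briefly in Python. The example below is illustrative only; it uses Python's built-in map and functools.reduce, which are the ideas MapReduce is loosely named after, not the MapReduce technique itself.

```python
from functools import reduce


def count_tokens(tokens):
    """Recursion: the count of a sequence is 1 plus the count of the rest."""
    if not tokens:
        return 0
    return 1 + count_tokens(tokens[1:])


# Immutability: a tuple cannot be changed after it is created.
words = ("everything", "should", "be", "built")

# map transforms each item; reduce combines the results into one value.
lengths = list(map(len, words))
total_characters = reduce(lambda total, n: total + n, lengths, 0)

print(count_tokens(words))
print(lengths)
print(total_characters)
```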

Functional Languages

Lisp – A family of languages with a minimal, tree-like syntax. The idea of Lisp is based on the lambda calculus, and these languages encourage, but don’t mandate, functional programming.
Haskell – This programming language enforces functional purity and has static type checking, allowing for the creation of highly reliable programs.
OCaml – Functional language based on the older ML functional language. Like Haskell, has static type checking. Allows use of other paradigms, such as object-oriented.

Object-Oriented Concepts

Class – An abstraction or template from which objects are created.
Instantiation – Creating an object based on a class.
Attribute – A variable defined in a class or object. Also called fields, members, or class variables.
Encapsulation – In object-oriented programming, requiring that access to attributes of an object go through designated methods. A technique for hiding parts of your program from other parts of your program.
Method – A function inside a class or object.
Polymorphism – Basically, the idea that different types or objects with different functionality will have the same interface.
Inheritance – The ability of a more specific class to copy certain methods or attributes from a more general class. The Cat class, for example, might inherit from the Animal class, meaning that Cat objects will have access to the methods and attributes in the Animal class.
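Several of these concepts can be seen together in a short sketch. It builds on the Cat and Animal example mentioned above; the Dog class, the speak method, and the animal names are additions made up for this illustration.

```python
class Animal(object):
    """General class; Cat and Dog below inherit from it."""

    def __init__(self, name):
        self.name = name

    def speak(self):
        return self.name + " makes a sound"


class Cat(Animal):
    # Inherits __init__ from Animal and overrides speak.
    def speak(self):
        return self.name + " says meow"


class Dog(Animal):
    # Inherits everything from Animal unchanged.
    pass


# Instantiation: creating objects from classes.
animals = [Cat("Tom"), Dog("Rex")]

# Polymorphism: the same call works on different kinds of objects.
for animal in animals:
    print(animal.speak())
```

The loop does not need to know which class each object belongs to; it relies only on every object having a speak method.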

Object-Oriented Languages

Smalltalk – One of the earliest and most influential object-oriented languages. Highly focused on exploring a live environment of objects in memory. Not widely used today, but worth learning if interested in object-oriented programming.
Ruby – A readable, high-level, object-oriented language often used for web development in conjunction with the Ruby on Rails framework. Emphasizes practicality and usability.
Java – A primarily object-oriented language that runs in a specialized environment, the Java Virtual Machine. A large, popular, and relatively verbose language often used in corporations and taught in computer science courses. Also used for Android development.