Object-Oriented Programming - Part 1 Data Access and Encapsulation

I will be writing a series on some general OOP (Object-Oriented Programming) concepts. This is the first part.

This series assumes you understand the syntax and semantics of classical OOP concepts such as classes and inheritance already. Because I use PHP most often, and because it will allow me to show a few examples in a non-OOP style, I will give the examples in PHP. But the general concepts apply to most modern general-purpose programming languages. The principles we'll follow also apply to C++, Java, C#, Objective-C, Ruby, Python, and even JavaScript (in some strange ways, and most accessibly in ES6), among many others.

In this first part, we'll examine some of the original motivation behind OOP, and how it relates to property/method access modifiers (public, private, and protected in most languages).

One of the primary motivations behind OOP originally was to separate related data and the functions (i.e. methods) that act on them from other unrelated data/functions. To see why, imagine building an entire application with only variables and functions, and no objects to group them.

This brings up a special point about classes. If I define a variable (global or in some other scope), that variable can only have one value in that scope. But when you define a class, you are defining a template for a new set of variables to be created. It's similar in some ways to how functions work at a meta-level (except that a function is a template of behavior).

The alternative would be to define separate variables for each member property and new functions for each method, all of which are visible in the scope where I defined them.

Let's give a quick example:

$product1Price    = 2.00;
$product1Title    = 'bacon';
$product1Qty      = 2;
$product1Discount = 0;

$product2Price    = 14.00;
$product2Title    = 'steak';
$product2Qty      = 1;
$product2Discount = 10;

function getProductsTotal ()
{
  global $product1Price;
  global $product1Qty;
  global $product1Discount;

  global $product2Price;
  global $product2Qty;
  global $product2Discount;

  return (($product1Price * $product1Qty) - $product1Discount)
    + (($product2Price * $product2Qty) - $product2Discount);
}

$total = getProductsTotal();

There are obviously some ways I could make this more flexible with arrays and by adding parameters to the function, but I'm just illustrating a point. The data and the function that acts on it have no separation from the rest of the application. I could write another function somewhere else that totally screws up my day:

function randomOtherFunction ()
{
  global $product1Price;

  // do something useful

  $product1Price = 12.00;
}

Of course, this would be a stupid thing to do. But in a real application, it becomes a lot to keep track of. Other completely unrelated parts of your application could be changing your data without you knowing about it. What we want to do is isolate data from the rest of the application, then define only specific ways it can be accessed or changed, all in the same place. We call this encapsulation.
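
As a minimal sketch of that idea (the PricedItem class and its method are hypothetical, not part of the examples in this article), here is the price from above moved behind a private property. In PHP 7 and later, writing to a private property from outside the class throws an Error, so a stray assignment like the one in randomOtherFunction() fails immediately instead of silently changing the data:

```php
<?php

class PricedItem
{
  // Only code inside this class can read or write $price.
  private $price;

  public function __construct ($price)
  {
    $this->price = $price;
  }

  // The one sanctioned way to read the price.
  public function getPrice ()
  {
    return $this->price;
  }
}

$item = new PricedItem(2.00);

try {
  // The equivalent of randomOtherFunction() overwriting the price.
  $item->price = 12.00;
  $blocked = false;
} catch (Error $e) {
  // PHP refuses the write because $price is private.
  $blocked = true;
}
```

After this runs, $blocked is true and $item->getPrice() still returns the original price.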

Most languages give us three access modifiers to enforce this: public, protected, and private.

PUBLIC

Public access should be used only for methods you absolutely want to expose to the outside world (outside of the class). This is called the class's interface. It's best to expose as little as possible, because the more you expose, the more ways other parts of your code become dependent on (or "tightly-coupled" with) changes within this isolated part of your application. If you make a change, you're going to have to look for and reflect that change in more places.

We don't want to expose any properties of the class publicly because, even if our class uses and accesses data a certain way right now, we can't rely on that staying the same in the future. By defining methods that define HOW data is accessed and changed, instead of opening publicly WHAT data is accessed and changed, we retain the ability to change implementation details inside the class without breaking compatibility with the rest of the application (which means fewer bugs to produce and track down).

There is an exception to this. Some languages, e.g. C#, let you define custom getter/setter methods that run when a public property is accessed. In these cases, I don't see a functional difference, only a stylistic one.
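
PHP's closest long-standing equivalent is the pair of magic methods __get() and __set(), which run when code outside the class touches an inaccessible property. Here is a rough sketch of how that can look (the Temperature class and its celsius property are made up for illustration):

```php
<?php

class Temperature
{
  private $celsius = 0.0;

  // Called when outside code reads an inaccessible property.
  public function __get ($name)
  {
    if ($name === 'celsius') {
      return $this->celsius;
    }

    throw new LogicException("No such property: $name");
  }

  // Called when outside code writes an inaccessible property.
  public function __set ($name, $value)
  {
    if ($name !== 'celsius') {
      throw new LogicException("No such property: $name");
    }

    if ($value < -273.15) {
      throw new InvalidArgumentException('Below absolute zero');
    }

    $this->celsius = (float) $value;
  }
}

$t = new Temperature();
$t->celsius = 21;        // looks like a property write, runs __set()
$reading = $t->celsius;  // looks like a property read, runs __get()
```

To the calling code this looks like plain property access, but the class still controls how the value is validated and stored.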

PROTECTED

Protected access can be used to remove access to a property or method from outside of the class, while retaining access for classes which inherit from it.

I try to be strict about when I make a member property protected. To a much smaller degree, a protected property has the same problem as a public one: we're giving other parts of our application direct access to modify our data.

My general rule of thumb is this: if an object is a service (it performs a pre-specified task, preferably without retaining state) or I know the chances of the value changing are almost zero (in some languages you can actually define them as const or immutable), I'm okay with using a protected property. Outside of that, private is the way to go.
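
In PHP 7.1 and later, class constants accept visibility modifiers, so one way to express the "almost never changes" case is a protected constant on a stateless service. This is only a sketch; the TaxCalculator classes and the rates are hypothetical:

```php
<?php

class TaxCalculator
{
  // Visible to subclasses, but it can never be reassigned at runtime.
  protected const RATE = 0.08;

  public function taxFor ($amount)
  {
    // static:: picks up the subclass's constant if it redefines RATE.
    return $amount * static::RATE;
  }
}

class ReducedTaxCalculator extends TaxCalculator
{
  protected const RATE = 0.05;
}

$standardTax = (new TaxCalculator())->taxFor(100.00);
$reducedTax  = (new ReducedTaxCalculator())->taxFor(100.00);
```

A subclass can supply its own rate, but no code anywhere can change the value on a live object, which removes most of the risk that comes with a protected property.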

When it comes to protected methods, I'm quite a bit more liberal. If I think there's a decent chance a subclass could usefully override the method, and I'm pretty sure it's a method that isn't going to be changed or removed often, I'll generally give it protected access.

PRIVATE

Private access should really be your default for properties and methods unless you see that it meets some of the criteria for a public or protected member. It's much easier to increase access to a method/property than it is to decrease it. Use private for implementation details you don't want exposed anywhere else in your application.

Let's go to an example:

class Product
{
  private $price;
  private $title;
  private $qty;
  private $discount;

  public function __construct ($price, $title, $qty, $discount)
  {
    $this->price    = $price;
    $this->title    = $title;
    $this->qty      = $qty;
    $this->discount = $discount;
  }

  public function getPrice ()
  {
    return $this->price;
  }

  public function getTitle ()
  {
    return $this->title;
  }

  public function getQty ()
  {
    return $this->qty;
  }

  public function calculateTotalCost ()
  {
    $baseCost  = $this->getPrice() * $this->getQty();
    $totalCost = $baseCost - $this->calculateDiscountTotal();

    return $totalCost;
  }

  public function getDiscount ()
  {
    return $this->discount;
  }

  protected function calculateDiscountTotal ()
  {
    return $this->getDiscount();
  }
}

Here we defined a class that represents a single product. It has a method that determines what the total cost of the product should be. Right now, it multiplies the price by the quantity and subtracts the discount from that total as a flat amount.

Let's extend the class and give the discount different behavior.

class PercentDiscountProduct extends Product
{
  protected function calculateDiscountTotal ()
  {
    $baseCost       = $this->getPrice() * $this->getQty();
    $discountFactor = $this->getDiscount() / 100;
    $totalDiscount  = $baseCost * $discountFactor;

    return $totalDiscount;
  }
}

We've created a new class here that will calculate the discount by taking it as a percentage off of the product total. By defining calculateDiscountTotal() as protected, we let the subclass override its behavior. And because the rest of our application doesn't need to call it, we made it protected rather than public.

All member properties are private, and are accessed through getter methods. There are no setter methods, which essentially makes this object immutable. The properties of the product cannot be changed after it has been constructed. That reduces the number of code paths we need to look through to detect bugs, because once the Product has been instantiated, we know it will not change. We will come back to this concept in a later article.

The key thing to realize here is that now we can create objects of type Product and PercentDiscountProduct and pass them around our application, treating them exactly the same (because they share the same interface), even though they calculate their totals differently. They appear interchangeable to the rest of the application, regardless of their internal implementation.

$bacon = new Product(
  2.00,
  'bacon',
  2,
  2
);

$steak = new PercentDiscountProduct(
  14.00,
  'steak',
  1,
  10
);

$products = [
  $bacon,
  $steak,
];

$total = 0.00;

foreach ($products as $product) {
  $total += $product->calculateTotalCost();
}
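
Working through the numbers: the bacon costs (2.00 * 2) - 2 = 2.00, the steak costs (14.00 * 1) minus 10% of 14.00, which is 12.60, so the total comes to roughly 14.60. A type declaration can make the interchangeability explicit. The sketch below repeats condensed copies of the two classes so it runs on its own; sumTotals() is a hypothetical helper, not something defined earlier in this article:

```php
<?php

class Product
{
  private $price;
  private $title;
  private $qty;
  private $discount;

  public function __construct ($price, $title, $qty, $discount)
  {
    $this->price    = $price;
    $this->title    = $title;
    $this->qty      = $qty;
    $this->discount = $discount;
  }

  public function getPrice ()    { return $this->price; }
  public function getQty ()      { return $this->qty; }
  public function getDiscount () { return $this->discount; }

  public function calculateTotalCost ()
  {
    $baseCost = $this->getPrice() * $this->getQty();

    return $baseCost - $this->calculateDiscountTotal();
  }

  // Flat amount off the total.
  protected function calculateDiscountTotal ()
  {
    return $this->getDiscount();
  }
}

class PercentDiscountProduct extends Product
{
  // Percentage off the total.
  protected function calculateDiscountTotal ()
  {
    $baseCost = $this->getPrice() * $this->getQty();

    return $baseCost * ($this->getDiscount() / 100);
  }
}

// Hypothetical helper: the Product type declaration accepts subclasses too.
function sumTotals (Product ...$products)
{
  $total = 0.00;

  foreach ($products as $product) {
    $total += $product->calculateTotalCost();
  }

  return $total;
}

$total = sumTotals(
  new Product(2.00, 'bacon', 2, 2),
  new PercentDiscountProduct(14.00, 'steak', 1, 10)
);
```

sumTotals() never needs to know which kind of discount a product uses. In PHP 7 and later, passing anything that is not a Product or one of its subclasses raises a TypeError.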