Chapter 2

Essential Knowledge: Hardware

In order to implement efficient computer programs, it’s essential to understand the basic hardware structure of computers. In this chapter we examine the hardware components of a typical computer (CPU, memory, storage, GPU, etc.) focusing on issues that are relevant for software development and algorithm design. We also explore concepts like binary representations of numbers and strings, assembly language, multiprocessors, and the memory hierarchy.

Note: The examples below are abridged; the book contains more details.

Rounding
Overflow
Underflow
Comparing Real Numbers
The Log-Scale Trick
Use of Assembly Language in High-Level Languages

Rounding

Rounding occurs when a real number x cannot be precisely matched to a fixed- or a floating- point representation. The result of rounding is typically either the nearest representation of x or a truncated version of x obtained by dividing and dropping the remainder. Here’s an example in R:

# Round 0.333.. to a nearby floating-point
print(1 / 3, digits = 22) # print 22 digits of fp(1/3)
## 0.3333333333333333148296

Overflow

Overflow occurs when a number has a big absolute value that’s outside the range of possible binary representations. In the case of integers, this occurs when the number of bits in the representation is too small to represent the corresponding number. When overflow occurs, the number is replaced by either the closest binary representation or is marked as an overflow and considered unavailable. Here’s an example in R:

print(10 ^ 100)   # no overflow
print(10 ^ 500)   # overflow, marked by Inf value 
## 1e+100
## Inf

Underflow

Underflow occurs when the number x is closer to 0 than any of the possible binary representations. In most cases, the number x is then replaced by zero. An underflow example using R code appears below:

print(10 ^ -200, digits = 22) # roundoff, but no underflow
print(10 ^ -400, digits = 22) # underflow
## 9.999999999999999821003e-201
## 0

Comparing Real Numbers

Comparing whether or not two real numbers are identical is problematic due to rounding errors. See the below C++ example for an illustration:

// { autofold
#include <iostream>
#include <math.h>

using namespace std;
// }

int main() {
  cout << (sqrt(3) * sqrt(3) - 3) << endl
       << (sqrt(3) * sqrt(3) == 3) << endl
       << (fabs(sqrt(3) * sqrt(3) - 3) < 0.0000001) << endl;
}

The Log-Scale Trick

A common trick for avoiding overflow and underflow is to work in a logarithmic scale, rather than the original scale:

# The R code below graphs log(x) as a function of x
curve(log, from = 0, to = 1000, xlab = 'x', 
      ylab = 'log(x)', main = 'The logarithm function')

print(3 ^ -800, digits = 22) # underflow
print(log(3 ^ -800), digits = 22) # log of underflow
print(-800 * log(3), digits = 22) # avoiding underflow using the log-scale

# Addition, when using the log-scale, replaces multiplication
print(3 ^ -600 * 3 ^ -100 * 3 ^ 150, digits = 22) # underflow
# Avoid underflow and return results in the log-scale
print(log(3) * (-600 - 100 + 150), digits = 22)
# Avoid underflow and return results in original scale
print(exp(log(3) * (-600 - 100 + 150)), digits = 22)

Use of Assembly Language in High-Level Languages

This example simply executes the RDTSC assembly instruction, which reads the CPU’s timestamp counter, then stores the higher 32 bits into the EDX register and the lower ones into the EAX register. The value of EAX is then assigned to the timeStampCounter variable in C++ and printed out:

// { autofold
#include <iostream>

using namespace std;
// }

int main() {
  uint timeStampCounter = 0;
  asm("rdtsc": "=a" (timeStampCounter));
  std::cout << timeStampCounter << std::endl;
  return 0;
}