A function (from Latin functus "perform") (= mapping = map = transformation = operation) from set A to set B is a relation between A and B that pairs each member of A with exactly one member of B; is a relation in which no two ordered pairs have the same first member. Formally:

\begin{equation*} \forall x \in A, \forall y, z \in B, (x, y) \in f \wedge (x, z) \in f \rightarrow y = z \end{equation*}

For example, both the equation f(x) = x * x + 1 and the set {(3, 6), (8, 2), (1, 2), (0, 0), (-5, 3)} are functions. Some other examples include the Y combinator. Non-mathematical examples include a strategy, for a game, assigning people to seats, the universal law of gravitation, and economic cost functions.


1   Form

A function consists of:

  1. A set called the domain
  2. A set called the range, whose members are assigned to some member of the domain.
  3. A set called the codomain, a superset of the range chosen for convenience

We typically think of the domain of a function as the set of inputs and the range as the set of outputs. If it often helpful to think of a function a sort of computer.

If f(x) = y, we say that y is the image of x under f, and x is a preimage of y under f. We also say that y is the value of f at x and f maps x to y.

If f and g are functions whose domains are disjoint, then union(f, g) is a function. This is the basis for defining function functions by cases.

1.1   Definitions

For the vast majority of functions used in mathematics, the rules used to define them are equations.

Function may also be defined by cases. For example, the absolute value function is defined as f(x) == x if x >= 0 else x * -1.

1.2   Signature

A function has different types of parameters.

Ordinal parameters are parameters whose arguments must be sent in the order established in the method's signature

Variadic ordinal parameters are those for which there can be any numbers.

Keyword arguments can be passed in by name.

2   Properties

A function from A to itself is called a function on A.

2.1   Arity

A function with k arguments is called a k-ary function, and k is called the arity of the function. If k is 1, f is called a unary function. If k is 2, f is a binary function. If k is 3, f is called a ternary function.

2.2   Idempotence

A function is called idempotent if it can be applied multiple times without changing the result beyond the initial application. That is, \(f(f(x)) = f(x)\). For example, the identity function and absolute value function are both idempotent.

2.3   Partial & Total

A partial function from X to Y is a function for some subset of X. A function may be called a total function if X' = X to distinguish it from a partial function, though "function" and "total function" are synonymous.

2.4   Finitary & Infinitary

A function is called finitary if it takes a finite number of input values to produce an output, otherwise it is called infinitary.

For example, + is finitary, but sum is infinitary.

2.5   Injective


A function is called injective (or an injection) iff it is one-to-one; if every element of the codomain is mapped to by at most one element of the domain.

2.6   Surjective


A function is called surjective (= onto = a surjection) if its range is identical to its codomain; if every element of the codomain is mapped to by at least one element of the domain.

2.7   Bijective


A function is called bijective iff it is both injective and surjective; if every element of the codomain is mapped to by exactly one element of the domain.

A bijection from a set A to itself is called a permutation on A.

2.8   Tail recursive

A recursive function is tail recursive if the result of the recursive call is immediately the result of the whole function. For example, the following is not tail recursive because after the recursive call to fact (n - 1) completes, fact n isn't done; a multiplication by n still has to take place:

fact 0 = 1
fact n = n * fact (n - 1)

We can make a tail-recursive variant of the factorial function by adding an extra "accumulating parameter" which accumulates the result:

factTR r 0 = r
factTR r n - factTR (n * r) (n - 1)

To preserve the original interface but make fact tail recursive, we could rewrite fact in terms of factTR:

fact = factTR 1

Any tail recursive function can easily be implemented iteratively.

One method of converting recursive functions into tail recursive function is `continuation passing style`_. The idea is that instead of returning a result directly, we take an extra argument, called a continuation, which specifies "what we do next". We pass the result to this function instead of returning it. For example, here is factorial written using continuation-passing style:

factCPS k 0 = k 1
factCPS k n = factCPS (k . (\r -> n * r)) (n - 1)

In the case that n is 0, we pass it to the continuation k. In this case, to preserve the original interface we can rewrite fact as:

fact = factCPS id

3   Operations

3.1   Composition


The composition of two functions is the new function obtained by performing one followed by the other.

Composition is denoted with a small circle, which is read "circle".

\begin{equation*} z = f(y) = f(g(x)) = f \circ g(x) \end{equation*}

3.2   Currying

Currying takes a function \(f: X \rightarrow Y \rightarrow R\) and turns into a function \(f': X \rightarrow (Y \rightarrow R)\). Thus, uncurried \(f\) is invoked as \(f(3, 5)\) and curried \(f'\) is invoked as \(f'(3)(5)\). The operation is named after `Haskell Curry`_.

To curry a binary function \(f\):

function curry(f) {
  return function(x) {
    return function(y) {
      return f(x, y);

3.3   Partial application

Partial application takes a function \(f: X \rightarrow Y \rightarrow R\) and a fixed value for the first argument to produce a new function \(f': Y \rightarrow R\).

To partially apply a function \(f\) with the value \(x\):

function apply(f, x) {
  return function(y) {
    return f(x, y);

This is more generally just called application? Something like

\begin{equation*} apply(f, x) = \lambda y: f(x, y) \end{equation*}

4   Relations

4.1   Inverse

As a relation, a function has an inverse. But the inverse of a function is not necessarily a function. For example, consider the inverse f(x) = x * x.

4.3   Equivalence

Two functions are equal iff:

  1. Their domains are identical
  2. Their codomains are identical
  3. They contain the same members

4.5   Exponential decay

A quantity is subject to exponential decay if it decreases at a rate proportional to its current value. Exponential decay can be described as decaying quickly, then slowly.

5   Classification

A function is called first-class iff its domain includes other functions.

Functions nest? Inner functions can refer to variables bound in the outer function (e.g partials).

A variable is free if the variable is not defined in subject scope else it is bound.

A term with no free variables is called closed else it is called open.

5.1   Higher-order function

A higher-order function is a function that accepts or returns a function. For example, filter(xs, p).

Function combinators are higher-order functions that accept and return functions.

5.2   Operator

An operation (= product) on a set G is a function from G x G to G. An operation gives a rule for combining two elements of G for produce another element of G. For example, addition on the real numbers.

5.3   Inclusion map

An inclusion map is the identify function with a larger codomain:

\begin{equation*} \forall A, B, A \subseteq B \rightarrow \exists f: A \rightarrow B, \forall x \in A, f(x) = x \end{equation*}

5.4   Characteristic function

Let A and B be sets with B <= A. The characteristic function of B (with domain understood to be A) is the function X_B: A -> {0, 1} defined by cases as:

\begin{equation*} \chi_B(x) = \begin{cases} 1, x \in B \\ 0, \neg(x \in B) \end{cases} \end{equation*}

5.5   Hash function

A hash function (= message digest) is a function that has the following three properties: it's input can be any string of any size, it produces a fixed size output (a "digest"), and it can be computed efficiently. [1]

5.5.1   Cryptographic hash function

A cryptographic hash function (= one-way function) is a short, fixed-length representation of a longer variable-length message. Digest algorithms are designed to produce unique digests for different messages.


For a hash function to be cryptographically secure, it must have the following three properties: collision resistance, hiding, and puzzle-friendliness.

A collision occurs when two distinct inputs produce the same output. A hash function H is said to be collision resistant if its infeasible to find two values, x and y, such x != y, but H(x) == H(y).

We can prove collisions will exist by a simple counting argument. The iput space to the hash function contains all strings of all lengths yet the output space contains only strings of a specific fixed length. Because the input space is larger than the output space (indeed, the input space is infinite, while the outspace space is finite), there must be input strings that map to the same output string.

There are even methods that are guaranteed to find a collision. Consider the follow method for fidning a collision for a has function with a 256-bit output size: pick 2^256 + 1 distinct values, compute the hashes of each of them, and check if are any two outputs are equal. Since we picked more inputs than possible outputs, some two of them must collide when you apply the has function.

The method above is guaranteed to find a collision, but if we pick random inputs and compute the hash values, we'll find a collision with high probabiity long before examining all the outputs. In fact, if we randomly chose just 2^130 + 1 inputs, it turns out there's a 99.8% chance that at least two of them are going to collide. (Due to the birthday paradox.)

The problem is that takes a very long time to do. For a hash function with a 256-bit output, you would have to compute the has function 2^256 + 1 times in the worst case, and about 2^128 on average, an astronomically large number If a computer calculates 10K per second, it would make more than 10^27 years to calculate 2^128 hashes. For another way of thinking about this, we can say that, if every computer ever made by humanity was computing since the beginning of the entire universe, up to now, the odds that they would have found a collision is still infinitesimally small. So small that it’s way less than the odds that the Earth will be destroyed by a giant meteor in the next two seconds.

A more difficult question is: is there some other method that could be used on a particular has function in order to find a collision? Although the generic collision detection algorithm is not feasible to use, there still may be some other algorithm that can effeciently find a collision for a specific hash function.

5.6   Logarithm

The logarithm is the inverse operation to exponentiation. It was discovered by `John Napier`_ in 1610.

5.7   Distance function

A distance function is a function that defines a distance between each pair of point elements of a set. A distance function must satisfy the following three axioms:

  1. Identity of indiscenribles: d(x, y) = 0 <=> x = y
  2. Symmetry: d(x, y) == d(y, x)
  3. Triangle inequality d(x, y) <= d(x, z) + d(z, y)

6   Representation

If f is a function and x is any member of the domain of f, then the expression f(x) denotes the unique output that f assigns to x. More concretely:

\begin{equation*} f(x) = y \leftrightarrow (x, y) \in f \end{equation*}

The letters 'f', 'g', 'F', and 'G' are used to denote functions.

The statement that f is a function from A to B is abbreviated f: A -> B.

6.1   Infix notation, prefix notation & postfix notation

Prefix notation (= Polish notation):

+ 3 4

Infix notation:

3 + 4

Postfix notation (= Reverse Polish notation):

3 4 +

7   Induced set operations

7.1   Forward image

Let f: A -> B be a function. For any C <= A, the set {f(x) | x in C} is called the image of C under F, denoted f(C). Note that for any C <= A, f(C) <= B. In other words we have defined a new function from powerset(A) to powerset(B) which is called the forward set operation induced by f.

For example, let f(x) = x * x on the domain R. Then we can write f(3) = 9 and say that the image of 3 under f is 9. Further, we can write f(N) = {1, 4, 9, ...}, f({-3, 3}) = {9}.

7.2   Preimage

Let f: A -> B. For any C <= B, the set {x in A | f(x) in C} is called the inverse image of C under f and is denoted f^-1(C).

For example, let f(x) = x * x on the domain R. Then we can write f^-1({4, 9}) = {2, -2, 3, -3}.

8   Further reading

9   References

[1](1, 2) Narayanan etal. Jan 1, 2015. Bitcoin and Cryptocurrency Technologies.
[2]Oct 4, 2017. Elliptic Curve Cryptography for Beginners.

The effectiveness of cryptography in essence rests on a single truth about mathematics: that it is impossible to factorise big numbers. For any number, there is no way of working out if a smaller number divides into it, short of actually doing the calculation. This might sound like a small point, but it means that when you have very long numbers – numbers that are hundreds or thousands of digits long – there is no way of breaking them down into factors other than by trying every smaller number and seeing if it fits. With very very big numbers, that process is, in practice, impossibly time-consuming.


There are two main ways to design one-way functions: factorization, and elliptic curve logarithms.

An elliptic curve is defined by the function \(y^2 = x^3 + ax + b\) where \(a\) and \(b\) are parameters of the curve. A constraint is imposed to eliminate curves that have cups or self-intersections.

You can define an elliptic curve over any algebraic field. In cryptography, we usually use the field "integers mod p" (\(Z mod p\)), where \(p\) is a large prime number.

We can define a `algebraic group`_ of the points on the elliptic curve like this: "To add two points, A and B, draw a line through points A and B and label the third intersections with the line and the curve C. We define addition such that A + B = -C". If there is no point C on the curve (for example when \(B = -A\)), then we define a point \(0\) and say that \(A + B = 0\).

The other question that comes up is what should happen in the case that A = B A=B? In this case, we simply take the line tangent to the curve at A, and use that instead of the line going through A and B.

In doing this, we also define multiplication, which involves adding the same point multiple times.

Multiplication of points on an elliptic curve by a scalar is easy, but finding the scalar given the original point and the result of the multiplication is very difficult. This is known as the Elliptic Curve Discrete Logarithm Problem.

reduce() is a function similar to fold that takes a list of objects, a first values, and a function which takes the last value and the current value and produces a new value:

function reduce(comb, b, as) {
    return (as.length == 0) ? b : reduce(as.slice(1), comb, comb(b, as[0]));