:Author: Ammar Hakim
:Date: Oct 23rd 2022
:Completed: 
:Last Updated: Feb 6th 2022

PN2: Concatenative Programming for Data Analysis
================================================

Consider the following function written in a typical programming
language

.. code::

   sq(x) : return x*x

This function takes an input number and multiplies it by itself. The
key observation here is that inside the function body the variable "x"
is bound to the input value of "x". So when calling sq(5), x inside
the function body is bound to the value 5. Although this is a
perfectly good way to design programing languages it requires careful
binding of names to values. This is far from a trivial task, requiring
each function to have a private environment that must be managed in a
way that it does not interact with the calling environment.

An alternative to this approach is to eliminate the need for names
completely, instead using a stack to hold values. In such programming
languages, instead we define the sq function as follows

.. code::

   sq : dup *

To call this function one would write the following program:

.. code::

   5 sq

To run this program, we scan from left to right. When we encounter a
number or a literal (strings, arrays etc) we push it on the
stack. When we encounter a function, we pop the appropriate number of
top values from the stack, run the function with those values and push
the result back in the stack. So, after substituting the definition of
the sq function. the stack looks like

.. code::

   5 dup *
   5 5 *
   25

Note the stack modifier "dup" duplicated the value on top of the stack
and pushed it on the stack. The multiply function, as it is a binary
function, then popped two values, multiplied them, and pushed the
answer back on the stack again. When the function completes, the
answer is on top of the stack.

Note that in this approach we never introduced any variables like "x"
above, hence eliminating the need for binding variables to
values. Instead, we introduced two features: a stack to hold values
and a set of "stack rewriting rules" that modify the stack based on
the function one encounters.

The stack rewrite rule for dup are

.. code::

   dup : (a -- a a)

The left of the "--" is the state of the stack before encountering the
dup function, and the right of "--" the state when the dup function is
complete. The stack rewrite rule of * is

.. code::

   * : ( a -- b)

Note the rule does not say what "b" is. It merely describes the state
of the stack. In this case, the top of the stack is replaced by
another value.     

In this stack-based language, we never need to pass an environment to
a function, nor do we need to worry about binding variables to
values. Everything occurs on the stack, including intermediate
calculations.

Now consider the following function in a typical language

.. code::

   double(x) : return 2*x

To double the square of 5, say, we would need to do double(sq(5)). The
nesting of calls is needed to pass the output of sq to double. In a
concatenative language, on the other hand, the double function is
defined as

.. code::

   double : 2 *

that is, we push two on the stack and then the * function pops two
values, multiplying them and pushing the result back on the stack. So,
to double the sqaure of 5 we would write the following program

.. code::

   5 sq double

When this program completes, the value of 50 will be on top of the
stack. This way of combining functions by merely placing them one
after another is the reason why *concatenative languages* are called
such. Concatenative progamming allows chaining a series of functions,
all working on the present state of the stack, to perform a
computation. All this is achieved without introducing any variables in
the program at all.