Fibonacci in O(log(B))

A homework exercise

On Mou on Divide and Conquer

Introduction

Fibonacci is so simple, but so complex as to seem organic. It is a great homework exercise for Divacon.

Below are three derivations of a Fibonacci(N) function, (1) for \(N=2^a\), (2) for arbitrary \(N\), and (3) using Divacon to tighten things up.

The n'th (zero based) Fibonacci number F(n) is defined as:

F(0) = F(1) = 1
F(n) = F(n-1) + F(n-2)

Since each requires its immediate predecessors to calculate it, F(n) for large \(n\) seems to require O(n) operations. F(n) for \(n>2^{1000}\) could take a very long time indeed.
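For contrast, here is a minimal sketch of that O(\(n\)) baseline (my own code, in PHP like the rest of this page, using the zero-based convention F(0)=F(1)=1; the name fib_naive is mine):

// Naive linear-time Fibonacci: O(n) additions, shown only as the baseline to beat.
function fib_naive($n) {
  if ($n <= 1) return 1;     // F(0) = F(1) = 1
  $a = 1; $b = 1;            // F(0), F(1)
  for ($i = 2; $i <= $n; $i++) {
    $c = $a + $b;            // F(i) = F(i-1) + F(i-2)
    $a = $b;
    $b = $c;
  }
  return $b;                 // F(n)
}
// fib_naive(7) == 21, fib_naive(15) == 987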

George Mou says he can do it in O(log(n)) operations.

That would be O(1000) time to calculate F(\(2^{1000}\)): perhaps a second or two on this web server (since the results below seem to be sub-millisecond), instead of

O(\(2^{1000}\)) \(\approx\) O(\(10^{301}\)), a 1 followed by about 300 zeros.

On a system achieving one addition per clock cycle, at terahertz speeds (faster than may ever be possible given the fade-out of Moore's Law), and with the age of the universe at \(4\times10^{17}\) seconds, roughly on the order of \(2^{60}\) seconds, it would take \(\sim 2^{1000-40-60}=2^{900}\) times the age of the universe to solve such a problem directly. Just sayin'.

George suggested I use Divacon to do the same as an exercise. Here, then, is my homework.

Domain, Co-Domain, Operation, Co-Operation

The domain of the Fibonacci sequence is the set of Fibonacci numbers themselves, \(\forall\ n\in I\ \ F(n)\).

A convenient codomain would seem to be the set of pairs of Fibonacci summands: F(\(n\)-1), F(\(n\)-2), which sum to F(\(n\)). Can we please have a notation which hides the index arithmetic?

$$ \begin{align} for\ n\ \in \ I: & \\ \text{let}\ N'\ \ &\text{≜}\ \text{F(}n-1)\\ \text{let}\ N''\ &\text{≜}\ \text{F(}n-2) \\ \text{let}\ CAP(n) = N\ \ \ &\text{≜} \begin{bmatrix} \text{F(}n-1)\\ \text{F(}n-2) \end{bmatrix} = \begin{bmatrix} N' \\ N'' \end{bmatrix} \end{align}$$
What operation shall we associate with these vectors? Mapping from codomain to domain is accomplished by adding the summands. But how to combine them? Is there an \(N\# K\) in the codomain corresponding to f(n+k) in the domain?

This Fibonacci recurrence shifts from the n'th to the n+k'th (via a two-sample auto-convolution, meaning that the \(k\)'s count up while the \(n\)'s count down, and they are both taken from the same series):

 F(n+k) = F(k)*F(n-2)+F(k+1)*F(n-1)
This enables a doubling of the index from F(\(n\)) to F(\(2n\)) by setting \(k=n\). A doubling: a bit shift.
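As a quick sanity check (mine, reusing the fib_naive sketch above), the recurrence and the \(k=n\) doubling both hold numerically in this convention:

// Spot-check of F(n+k) = F(k)*F(n-2) + F(k+1)*F(n-1), e.g. with n=7, k=5:
// F(12) = 233 and F(5)*F(5) + F(6)*F(6) = 8*8 + 13*13 = 233.
$n = 7; $k = 5;
var_dump(fib_naive($n+$k) == fib_naive($k)*fib_naive($n-2) + fib_naive($k+1)*fib_naive($n-1)); // bool(true)
// Setting k=n doubles the index:
var_dump(fib_naive(2*$n) == fib_naive($n)*fib_naive($n-2) + fib_naive($n+1)*fib_naive($n-1));  // bool(true)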

Arrays of Fibonacci results can be arranged by doublings to any desired rank. Allow me to overload the square bracket notation, to specify the rank of such a doubling. \(F[a]\), then, is an array of length \(2^{(a+1)}\):

 index: 0 1 2 3 4 5  6  7  8  9  10 11  12  13  14  15
F[0] = [1 1]
F[1] = [1 1 2 3]
F[2] = [1 1 2 3 5 8 13 21]
F[3] = [1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987]
F[\(a\)](n) (with value F(n)) will refer to the n'th element of the \(a\)'th of these doubled arrays, so for example
F[3](8)=34. Arrays F[a] are half redundant since F[a] is always the first half of F[a+1] and F(i)=F[a](i) wherever F[a](i) is defined. But it may help us see and think clearly about the indexes involved in going from the end of one level to the end of the next. It is remarkable how confused I can become with such a simple function as Fibonacci!

Do you believe me when I say:

F[\(a\)] contains elements F[\(a\)](0) .. F[\(a\)]\((2^{(a+1)}-1)\).
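You can check it with a throwaway loop (mine; rankArray is a made-up name) that builds F[\(a\)] the slow way and confirms the lengths:

// Build F[a] by the plain recurrence, just to check sizes and entries.
function rankArray($a) {
  $len = 2 ** ($a + 1);
  $F = [1, 1];
  for ($i = 2; $i < $len; $i++) $F[$i] = $F[$i-1] + $F[$i-2];
  return $F;
}
// rankArray(1) == [1, 1, 2, 3];  count(rankArray(3)) == 16;  rankArray(3)[8] == 34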

To develop an intuition for the dependencies here, you might graphically draw out the Fibonacci numbers with their summands and daughters in a tree structure of dependencies. The subtrees responsible for a given sum recur, once then twice, then three, five, and eight times, in succeeding-Fibonacci-number dependency trees. If you skip a single index down in the array, you will find there are one of the first-smaller, two of the second-smaller, three of the third-smaller subtrees, \(F(n)\) of the \(n\)'th smaller subtrees, no matter what the value of \(n\). It's really a lovely self-inclusive mosaic pattern however you look at it. (I thought about drawing a fractal with it, but here's what I came up with, instead.)

Fortunately the F(\(n+k\)) recurrence above gives us a handle on this chaotic fractal. With it we may calculate, given the 2-vector \(N\) characterizing the \(n\)'th Fibonacci number, a second 2-vector, call it \((N+N)\) or \((2N)\), which characterizes the \(2n\)'th Fibonacci number, or shall we say, is that Fibonacci number in the codomain. We will need an operation on 2-vectors yielding a 2-vector. The double-cross symbol, \(\#\), evokes two pairwise combinations: perfect.

Although a pair of products of 1x2 and 2x1 matrices amounts to the same thing, I'll use two dot products in the following definition and derivation:

$$ \begin{align} N\ \#\ K\ \ &\text{≜} \begin{bmatrix} N \cdot (K+1)\\ N \cdot K \end{bmatrix} = \begin{bmatrix} N'(K+1)' + N''(K+1)'' \\ N'K' + N''K'' \end{bmatrix}\\ & =\begin{bmatrix} f(n-1)f(k+1-1) + f(n-2)f(k+1-2) \\ f(n-1)f(k-1) + f(n-2)f(k-2) \end{bmatrix} =\begin{bmatrix} f(n-1)f(k) + f(n-2)f(k-1) \\ f(n+(k-2)) \end{bmatrix} \\ & =\begin{bmatrix} f(n+(k-1))\\ f(n+(k-2)) \end{bmatrix} =\begin{bmatrix} (N+K)'\\ (N+K)'' \end{bmatrix} = CAP(n+k) \end{align}$$
That is, the codomain combination (\(\#\)) calculated according to the definition-given formula, becomes the domain combination (\(+\)) of indexes in the Fibonacci sequence.

The nervous reader, seeking to fit these calculations into a well-founded and incremental process (say, doubling the indexes) may wish to check that it doesn't assume its output before calculating it. Observe that although \((K+1)''=f(k+1-2)=f(k-1)\) is among the available components of \(N\) and \(K\), namely \(K'=f(k-1)\), we do not immediately find \((K+1)'=f(k)\) among those available components, which might suggest we are not ready to make this calculation. However the Fibonacci definition itself provides a substitution: \(f(k)=f(k-1)+f(k-2) = K' + K''\). This lets us proceed to double or shift an index while using only available components in \(N\) and \(K\) to calculate the shifted vector.
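Transcribed directly into code (a sketch of my own, with hypothetical names; the CombinePairs function further down this page does the same job in a different representation):

// The # operation on codomain pairs.  A pair [Xp, Xpp] stands for [X' = F(x-1), X'' = F(x-2)].
// Only components of N and K are used, substituting F(k) = K' + K''.
function sharp($N, $K) {
  list($Np, $Npp) = $N;
  list($Kp, $Kpp) = $K;
  $NKp  = $Np * ($Kp + $Kpp) + $Npp * $Kp;   // (N+K)'  = F(n+k-1)
  $NKpp = $Np * $Kp          + $Npp * $Kpp;  // (N+K)'' = F(n+k-2)
  return [$NKp, $NKpp];
}
// Example: N = CAP(8) = [F(7), F(6)] = [21, 13], K = CAP(4) = [F(3), F(2)] = [3, 2];
// sharp($N, $K) == [144, 89] == [F(11), F(10)] == CAP(12).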

In a doubling process which at each step arrives at \(F[a]\) with \(n=2^{a+1}\) elements, although we may have calculated values \(F(k)\) up to \(k \le n\), we cannot assume availability of \(F(k)\) for \(k>n\). Since we mustn't get ahead of ourselves, our maximum shift will be achieved with \(k=n\), which gives a doubling. Great.

$$ \begin{align} N\ \#\ N\ \ &=\begin{bmatrix} N \cdot (N+1)\\ N \cdot N \end{bmatrix} &= \begin{bmatrix} N'(N+1)' + N''(N+1)''\\ N'N' + N''N'' \end{bmatrix} &= \begin{bmatrix} N' (N' + N'') + N'' N'\\ N'N' + N''N'' \end{bmatrix} &= \begin{bmatrix} (2N)'\\ (2N)'' \end{bmatrix} \end{align}$$
Note the doubled array F[\(a\)+1] has length \(m=2n=2^{(a+2)}\). And its final-element 2-vector (2N) depends (only!) on the final-element 2-vector (N) of the F[\(a\)] array. Who knew that such potentially remote entries in the Fibonacci series had such close relationships?!

Shall we look at an example? Consider doubling from \(n=8\) to \(m=2n=16\). In the pattern we seek F(m-1) and F(m-2), so let's calculate F[4][14] and F[4][15] from F[3][6] and F[3][7].

Fib[4][14] = Fib[3][6]*Fib[3][6] + Fib[3][7]*Fib[3][7]
	   = 13*13 + 21*21
	   = 169 + 441
	   = 610
and

Fib[4][15] = Fib[3][7]*(2*Fib[3][6] + Fib[3][7])
	   = 21 * (2*13 + 21)
	   = 21 * (26+21)
	   = 21 * 47
	   = 987
To calculate each new 2-vector (e.g., F(\(2^a\)-1) and F(\(2^a\)-2)) requires 2 multiplies and one add for each element.

So F(\(2^{1000}\)-1) requires about 4000 multiplies and 2000 adds, or in general O(log(N)) operations.

The following table is re-generated every time this page is loaded:

n       F(n-2)                            F(n-1)                            Δ hrtime (ms)
4       F(2) = 2                          F(3) = 3                          0.037361
8       F(6) = 13                         F(7) = 21                         0.00146
16      F(14) = 610                       F(15) = 987                       0.001029
32      F(30) = 1346269                   F(31) = 2178309                   0.00098
64      F(62) = 6557470319842             F(63) = 10610209857723            0.00102
128     F(126) = 1.5557697022053E+26      F(127) = 2.5172882568355E+26      0.001911
256     F(254) = 8.7571595343019E+52      F(255) = 1.4169381771406E+53      0.001609
512     F(510) = 2.7745922289306E+106     F(511) = 4.489384531331E+106      0.001771
1024    F(1022) = 2.7852935506996E+213    F(1023) = 4.5066996336778E+213    0.001721
2048    F(2046) = INF                     F(2047) = INF                     0.001309

These results are limited by PHP LONG INT and DOUBLE FLOAT computation, but it seems the algorithm and results are correct as far as the eye can see, comparing to some reference results:

F[14]=610, F[15]=987
F[30]=1346269, F[31]=2178309
F[62]=6557470319842, F[63]=10610209857723
F[126]=155576970220531065681649693, F[127]=251728825683549488150424261
...
F[1022] = 278529355069959292393881241266809350935330735212370380691318266898736\\
	  950320346518362561675961332445274995854966996688219111789542501520845\\
	  546940373127265215824082562848481813148554423082730494051913219529946\\
	  6733282
F[1023] = 450669963367781981310438323572888604936786059621860483080302314960003\\
	  064570872139624879260914103039624487326658034501121953020936742558101\\
	  987106764609420026228520234665586889971108924677841335400410363155392\\
	  5405243


This PHP code generates the above table (live! those durations were re-measured just now).

// Fibonacci($n) for $n a power of 2: repeatedly double the index using
//   F(2n-2) = F(n-2)^2 + F(n-1)^2
//   F(2n-1) = F(n-1) * (2*F(n-2) + F(n-1))
function Fibonacci($n) {
  $A = log($n,2)-1;        // number of doublings needed beyond the base case n=2
  $Fn_1 = $Fn_2 = 1;       // F(1) = F(0) = 1
  for ($a=1; $a<=$A; $a++) {
    $Fm_2 = ($Fn_2 * $Fn_2) + ($Fn_1 * $Fn_1);   // F(2n-2)
    $Fm_1 = $Fn_1 * (2 * $Fn_2 + $Fn_1);         // F(2n-1)
    $Fn_1 = $Fm_1;
    $Fn_2 = $Fm_2;
  }
  return "F(" . ($n-2) . ") = $Fn_2, F(" . ($n-1) . ") = $Fn_1\n";
}
Containing an actual log(n) function call and an iteration counting up to that many, this is clearly O(log(n)) in complexity.
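The INF entries in the table come from overflowing PHP's native floats. The same loop in arbitrary precision (a minimal sketch of mine, assuming the GMP extension is installed; FibonacciGmp is my own name, not part of the page's code) reproduces the big reference values exactly:

// Same doubling recurrence as Fibonacci() above, but with GMP big integers.
function FibonacciGmp($n) {                 // $n a power of 2
  $A = (int)round(log($n, 2)) - 1;          // number of doublings beyond n=2
  $Fn_1 = gmp_init(1);                      // F(1)
  $Fn_2 = gmp_init(1);                      // F(0)
  for ($a = 1; $a <= $A; $a++) {
    $Fm_2 = gmp_add(gmp_mul($Fn_2, $Fn_2), gmp_mul($Fn_1, $Fn_1));   // F(2n-2)
    $Fm_1 = gmp_mul($Fn_1, gmp_add(gmp_mul($Fn_2, 2), $Fn_1));       // F(2n-1)
    $Fn_1 = $Fm_1;
    $Fn_2 = $Fm_2;
  }
  return "F(" . ($n-2) . ") = " . gmp_strval($Fn_2)
       . ", F(" . ($n-1) . ") = " . gmp_strval($Fn_1) . "\n";
}
// echo FibonacciGmp(1024);  // matches the F(1022), F(1023) reference values above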

So far so good, but so far this only works on powers of 2.

Fib(\(p\)) for \(p\) not a power of 2

Every counting number \(p\) is a sum of powers of 2, like 5 is a sum of \(2^2 + 2^0\), which are the binary digits in \(p\): decimal 5 = binary 101.

Considering \(p\) as a data structure comprising various properties, we might give names to

  • the list of indexes of all the "1" bits in \(p\) (pbits),
  • the count of how many there are (pbits.length), and
  • the highest index of all the "on" bits in p (pbits.max).

For example, for \(p=17=2^0+2^4\), pbits=[0,4] (the PHP code below stores these highest-first, as [4,0]), pbits.length=2, pbits.max=4.

Each bit being a power of 2 also corresponds to a doubling of the Fibonacci array, from F[\(a\)] to F[\(a\)+1], etc. And each \(N\)-sized array has its 2-vector, \([\ N'\ N''\ ]^T\).

Now we approach the nub of the question, and the answer is Yes:

Let \(n\) be a power of 2, \(n=2^{(a+1)}\), so F(n-1) is the last element in F[\(a\)].
Let \(0\le k\lt n\). Then F(n+k) is in the 2nd half of F[\(a+1\)]. Sorry for the index jiu jitsu! Please compare definitions above to satisfy yourself.

Using the recurrence equation, F(n+k) depends on values \(N', N'', K', K''\), all within F[\(a\)] or (redundantly) in the first half of F[\(a\)+1].

In a divide-and-conquer approach we might knock off the highest bit of \(p\) by collecting for its highest bit \(2^a=n\) the values \(N', N''\), and then, for one on bit at a time of rank, say, \(b\), such that \(k=2^b\) and \(b\lt a\), fold in that bit's contribution using the F(\(n+k\)) recurrence. We could also proceed in the other direction, accumulating \(K', K''\) for \(k\) comprised of lower bits and calculating (N+K)' and (N+K)'' for the next encountered bit \(n=2^a\) for some larger \(a\).

Either direction requires that we separately solve the trivial cases for n=0 itself as well as the 0th bit n=1, and then with those we can repeatedly build up the Fibonacci number 2-vectors for powers of two.

The modified recurrence allows us to combine the parameters for any two \(n\) and \(k\) elements to create the parameters for the \(n+k\)'th element. That, combined with the parameters of the next on bit, lets us repeatedly combine pairs until every on bit of \(p\) has been incorporated, at which point the summary is \(P = [\,P'\ P''\,]^T\) and F(\(p\)) = \(P' + P''\).

To summarize the process, we will scroll up from the 0'th to the highest bit in \(p\) collecting intermediate results which are our parameters, N', N'' for each level.

Then we will scroll across again, incrementally incorporating previous \(k\)'th results \(K', K''\), with on-bit-triggered single-bit results N', N'' for each of the on bits, using the recurrence for f(n+k).

Here is an implementation in PHP:

// Return the positions of the "1" bits of $p, highest-order bit first.
function as_bits($p) {
  $res = [];
  for ($i=$j=1;$i<$p;$i<<=1,$j++)   // find the first power of 2 >= $p
    ;
  $j--; 
  for (  ; $j>=0; $j--,$i>>=1  ) {  // scan downward from the high bit
    if ($p&$i) {
      $p-=$i;
      array_push($res,$j); // smaller j's go to the end of the results array.
    }
  }
  return $res;
}

// Combine the codomain pairs for indexes k and n into the pair for n+k.
// A pair is ["ult" => F(x-1), "pen" => F(x-2)], so that F(x) = ult + pen.
function CombinePairs($pk, $pn) {
  // $pth = F(n+k) = F(k)*F(n-2) + F(k+1)*F(n-1), using F(k) = pen+ult and F(k+1) = pen+2*ult
  $pth = ($pk["pen"] + $pk["ult"]) * $pn["pen"] + ($pk["pen"] + 2*$pk["ult"]) * $pn["ult"];
  $res = [];
  // F(n+k-1) = F(n-1)*F(k) + F(n-2)*F(k-1)
  $res["ult"] = ($pk["ult"] * $pn["pen"]) + (($pk["pen"]+$pk["ult"])*$pn["ult"]);
  $res["pen"] = $pth - $res["ult"];   // F(n+k-2) = F(n+k) - F(n+k-1)
  return ($res);
}

// AnyFibonacci($p): F($p) for any non-negative integer $p, in O(log p) combines.
function AnyFibonacci($p) {
  $pbits = as_bits($p);          // positions of the on bits, highest first

  // create per-bit intermediate results: $Fpair[$a] is the pair for n = 2^$a
  $Fpair = [];
  $Fpair[0] = array("ult" => 1,"pen" => 0);   // n=1: F(0)=1, F(-1)=0
  $Fpair[1] = array("ult" => 1,"pen" => 1);   // n=2: F(1)=1, F(0)=1
  if (count($pbits)>0 && $pbits[0]>0) {
    for ($a=2; $a<=$pbits[0]; $a++) {         // doublings up to the highest on bit
      $Fpair[$a] = CombinePairs($Fpair[$a-1],$Fpair[$a-1]);
    }
  }

  // Combine the pairs for the on bits, lowest bit first, up to the highest in pbits.
  $len = count($pbits);
  if ($len==0) // p has all bits zero
    return 1;  // F(0) = 1
  else if ($len==1) { // p has only one bit set to one
    return $Fpair[$pbits[0]]["ult"]+$Fpair[$pbits[0]]["pen"];  // F(p) = F(p-1) + F(p-2)
  } else { // p has more than one bit set to one
    $intres = CombinePairs($Fpair[$pbits[$len-1]],$Fpair[$pbits[$len-2]]);
    for ($i = $len-3; $i >= 0; $i--) {
      $intres = CombinePairs($Fpair[$pbits[$i]],$intres);
    }
    return $intres["ult"] + $intres["pen"];
  }
}

n       F(n)                      Δ hrtime (ms)
1       1                         0.00688
2       2                         0.001069
3       3                         0.00252
4       5                         0.00204
5       8                         0.001891
6       13                        0.001689
7       21                        0.00232
8       34                        0.001691
9       55                        0.001969
10      89                        0.00184
11      144                       0.002269
12      233                       0.00192
13      377                       0.002311
14      610                       0.002231
15      987                       0.002741
16      1597                      0.00196
17      2584                      0.00234
18      4181                      0.00226
19      6765                      0.00262
20      10946                     0.002251
31      2178309                   0.00346
32      3524578                   0.0023
33      5702887                   0.00265
35      14930352                  0.00297
32      3524578                   0.00233
54      139583862445              0.00347
59      1548008755920             0.00385
62      6557470319842             0.00386
63      10610209857723            0.00425
64      17167680177565            0.00265
126     1.5557697022053E+26       0.004951
127     2.5172882568355E+26       0.00527
128     4.0730579590408E+26       0.00324
200     4.5397369416531E+41       0.00434
255     1.4169381771406E+53       0.00634
300     3.5957932520658E+62       0.0054
400     2.8481229810849E+83       0.00476
510     2.7745922289306E+106      0.00654
511     4.489384531331E+106       0.006859
512     7.2639767602616E+106      0.00411


Incidentally, in our codomain notation, F(-2) = 0'' = 1; F(-1) = 0' = 0; F(0) = 0' + 0'' = 0 + 1 = 1, etc.

Divacon Fibonacci

Consider how to approach Fibonacci using Divacon. Rather than F(\(n\)) for a single index \(n\), we could simultaneously calculate an entire rank of Fibonacci numbers F(0) to F(\(2^a-1\)), that is, F(\(p\)) for \(p \in 0..2^a-1\).

Such a process does require knowledge of the endpoint, the last rank \(a\). Also it ends with all the values F(0)..F(\(2^a-1\)). This is not in the spirit of Divacon, which likes to take arrays of data for its input, such as in this case an array of indexes into the conceptually infinite Fibonacci array.

FibonacciArray \(X, X \subset \{x | x \in I, x\ge 0 \}\)

Consider a new division operation \(d_{01}\), similar to \(d_{eo}\) except that the even/odd decision is not based on the index of an entry in the input array, the 0'th being even, etc., but on the final (or initial) bit in the values of the input array. \(d_{01}\) then divides the inputs (which are to be interpreted as a set of Fibonacci indexes) into those with final bit 0 placed into the left output vector, say Y, and those with final bit 1 placed into the right output vector, say Z. To construct a partition of indexes, Y, those which are 0-final, and Z, those which are 1-final, we could write:

$$\begin{align} d_{01} X\ =\ Y\ Z\ |\ & \forall\ x=(b,k,N,K) \in X\ \\ \text{if}\ & x.k = (x.k\gg 1\ll 1)\ \ \text{// last bit is 0}\\ \text{then}\ & x\ \in\ Y,\\ \text{else}\ & x\ \in\ Z \end{align}$$
Notes:

  • Y or Z or both may be empty, if there are no inputs left at that level. At the leaf of the tree the last, most significant bit in any original index will always be 1, which implies a preterminal Z node.
  • This is imbalanced in the worst case if the inputs are themselves imbalanced, but if they are uniformly random (or full rank(s)) then it is approximately (or entirely) balanced.
  • Note if we had defined \(d_{01}\) using initial bits being \(0\) vs \(1\), we could no longer assume that indexes have no leading zeros, and we would also have to keep track of the bitstring lengths of everything, so as not to put 2(=0b0010) in the same group as 4(=0b0100); but by defining it using the final bit, we just automatically let the tree build out until empty, which will separately happen at appropriate bit depths for each case.
  • A possible drawback of this approach is that this is more a shuffle than a sort.
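For concreteness, here is a tiny serial sketch of \(d_{01}\) (my own illustration, not the parallel formulation), partitioning tuples by the last bit of their already-shifted index \(k\):

// Partition the input tuples (b,k,N,K) by the last bit of k.
function d01($X) {
  $Y = []; $Z = [];
  foreach ($X as $x) {
    if (($x["k"] & 1) == 0) $Y[] = $x;   // 0-final elements go to Y
    else                    $Z[] = $x;   // 1-final elements go to Z
  }
  return [$Y, $Z];
}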

Assuming Y or Z is not yet empty, then each of their elements must be adjusted for its new depth in the tree:

preadjust = \((b=k\&\text{0x1}, k:=k\gg 1, N:=N\# N,K:=(b\ ?\ N\# K;K)\ )\)
That is:
  • set the bit \(b\) to 0 or 1 based on the last bit of \(k\),
  • downshift \(k\) so further operations can work the same way, that is, on the 0'th bit,
  • double \(N\) so that the next and future bits relate to indexes \(2n\) instead of to \(n\), and
  • if this bit \(b\) is on, incorporate the now doubled \(N\) with the previous \(K\) to result in an updated \(K\) for this node.
We have made preadjust communication unnecessary by calculating \(N\#N\) locally, redundant to all the other processes working at this bit level which are doing the same calculation. We could instead communicate with a once-calculated array of \(N\# N\) for \(n=0..2^a\), but the communication distance could be (on average is dominated by) the diameter of the processor mesh which may be prohibitive, so here we calculate \(N\#N\) redundantly, locally.

Next we must stop at the bottom of the fully-expanded terminals of the computation tree. We can now discard the intermediate data \(N\) and make the important transition from codomain to domain here: f(k)=K'+K''. Retaining the rest we have:

  • basepred \(= (k==0)\)
  • basefunc \(= (b:=b, k:=k, f(k)=K'+K'') \)
Incidental observations:
  • So far the computation tree contains nodes with zero, one, or two daughter nodes, and each representing subtrees with a number of terminals countable separately in its daughter's nodes.

  • Now, if a node's \(k\) value is down to a single bit, being the most significant bit of an original index, that bit will always be 1, and it will be sent to the 1 daughter and adjusted to have its \(b:=1,\ k:=k\gg 1\ =\ 0\).

  • And if a node were down to a single member, with perhaps multiple 0's and 1's in its \(k\) but only one such index in the tree, then it will keep having singleton daughters down to its leaf which will represent a 1 at the most significant position, that is, at the end.

  • If we wanted to keep track of counts explicitly, we would allocate counts exhaustively upon division to daughters from parent array X to the partitioning 0-final and 1-final subarrays Y and Z, and coming up the tree, just adding them again, without at that point assuming there is any count for that particular node. So counts always assemble upward from the daughters.

Each node, with its daughters, knows what is needed to build up a tree combining its daughters. We are to separate structure manipulation from local operations. So, for the new array: we would add the counts if we were tracking array counts (we are not); concatenating the vectors is structure manipulation, therefore defined in or as \(c_{lr}\); and for each element, shifting \(k\) and incorporating \(b\) into the new \(k\) is a local operation, neither structure manipulation nor communication.

postadjust = \( \forall\ v\ \in Y \cup Z\ \ \ v:=(\ v.b,\ (v.k\ll 1)+v.b,\ v.K\ )\)
\(c_{lr}\ \ Y\ Z =\ \)concatenate \(\ Y\ Z\ \)
Maybe I should draw it out graphically to be sure, but I think we are there.

The sort order will be whatever falls out rather than the original index sort order, unless we stored a reverse mapping operation somewhere to undo the tail-bits shuffle. It's okay, the original index returns as part of the output tuple. Putting this into one code block we have

PDC(divide = \(d_{01}\), combine = \(c_{lr},\)
pre = \((b=k\&\text{0x1}, k:=k\gg 1, N:=N\# N,\ K:=(b\ ?\ N\# K;K)\ )\),
post = \( \forall\ v\ \in Y \cup Z\ \ \ v:=(\ v.b,\ (v.k\ll 1)+v.b,\ v.K\ )\)
basepred = \((k=0)\),
basefunc = \((b:=0,\ k:=k,\ f(k):=K'+K'')\)
)
Given an array of indexes \(k\) this PDC returns a shuffled array of tuples \((k,f(k))\).
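To see what each element's tuple ends up computing, here is a serial, per-index sketch (my own code and my own ordering: fold \(N\) in for the current bit, then double \(N\); it reuses CombinePairs from above and is not the parallel Divacon form):

// Per-index bit scan: pairs are ["ult" => F(x-1), "pen" => F(x-2)].
function FibByBitScan($k) {
  $N = array("ult" => 1, "pen" => 0);   // CAP(1): F(0)=1, F(-1)=0
  $K = array("ult" => 0, "pen" => 1);   // CAP(0): F(-1)=0, F(-2)=1, the identity under #
  while ($k > 0) {
    if ($k & 1)                         // this bit is on: fold N into K
      $K = CombinePairs($N, $K);
    $N = CombinePairs($N, $N);          // double: CAP(n) becomes CAP(2n)
    $k >>= 1;
  }
  return $K["ult"] + $K["pen"];         // back to the domain: F(k) = K' + K''
}
// FibByBitScan(10) == 89, matching the table above.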

Conclusion

To summarize, PDC seems to be overkill for Fibonacci F(\(2^a\)), which is already O(log(N)) on a single processor. The F(n+k) recurrence enables calculating the next doubled level's dependencies consecutively in constant time, four multiplies and two adds, O(log(N)) times. On this very page this algorithm is implemented twice to generate the data above, using mere iteration, not even requiring recursion, much less a parallel decomposition.

Still, writing a serial version of this function in Divacon brings a high degree of code compaction: the Divacon expression of this algorithm uses 1/4 the number of lines of the PHP iterations. We gain benefits from an orthogonalized and systematic definition of the standard ingredients of a Divacon (parallel Divide-and-Conquer) decomposition:

f = PDC(divide, combine, pre, post, basepred, basefunc)

This implements a composition of standard component functions:

f X = ( basepred ? basefunc ; combine : post : !f : pre : divide) X

Submitting to the discipline of this opinionated methodology of function decomposition not only provides but requires a certain simplicity and elegance, a desirable depth of thought, which neatly reveals the essential structure of the computation, the problem, the hidden dimensions, and the relations among levels and elements. I'd say it offers understanding as well as correct results, although the cost is that of some study:

  • first to see and understand the roles of each component function,
  • second to divide pre and post into operation : communication,
  • third to become disciplined in applying to any process concept the trichotomy of
    • structure manipulation,
    • communication, and
    • local operation,
  • fourth, checking that the data structures emitted by each stage in the pipeline are the same as those taken up by the next stage, and
  • fifth, some practice expressing some algorithms in your own words, drawing out the computation graphs that your algorithm builds, making sure it does the right thing.

The exercise is beneficial and empowering.

Here, my next step was to go from a single-input function solved by iteration to a FibonacciArray function solving any number of inputs in parallel, using the Divacon approach. This took some thinking. So far we have seen single inputs limited to powers of 2, and then an unrestricted single integer. Would a Divacon function take a single input number, such as a rank \(a\), to request an output array of values F(n) for \(n\in 0..2^{a}-1\)? Or should we prepare to handle a full enumeration of those indexes, or indeed any set of indexes?

The single-value input approach didn't make sense to me in the Divacon space, since it gives the divide operation nothing to divide; instead it would require building a tree by a recursion such as:

tree node = tree 0-daughter ; tree 1-daughter

without looking at input indexes. Also it could be done in depth order from high bit to low, or low bit to high, either way; I discovered no good reason to make that choice.

However, a rank's worth of indexes \(0..2^{a}-1\) could be divided down to the last bit, calculating \(N\#N\) from \(n=1\) by doublings up to \(n=2^a\), and on the way up or down combining those N's with accumulated K's as triggered by the current bit. Better yet we might relax the specification of a whole rank \(0..2^a-1\), to allow any set \(\{n|n\ge 0,n\in I\}\) of integral indexes into the conceptually infinite Fibonacci array. Building a tree for them might be imbalanced, but operations are linear in bit-depth (i.e. log(\(n\)) for the largest \(n\)), not in the number of indexes. If the index list were complete for a rank \(a\), it offers O(log(\(2^a\)))=O(log(n)) time cost, which seems pretty good, since we achieved no better for a single F(n) calculation above.

Once the array function input was thus specified as any list of integers, the rest followed in workmanlike fashion.

Postscript

A final remark: If the \(N[d]\) 2-vectors were previously calculated for some finite but sufficient depth, say \(a=1000\), which requires only 2k entries to store and O(a) time to populate once, then the K's could be calculated in separate nibbles of the original index's bit representation, suggesting a Divacon divide-and-conquer solution with O(log log N) time cost. I have in mind recursively dividing the bits of an index \(k\) into the low half and the high half, and in a postadjustment calculating \(K_{hi}\#K_{lo}\) as the codomain value for the combined low and high bits. This is allowable because \(+\) between indexes in the domain, as in \(f(k_{hi}+k_{lo})\), licenses \(\#\) between the codomain mappings \(K_{hi}, K_{lo}\). At the base of the division tree with bit \(b\) at depth \(d\), we return \(K=(b\ ?\ N[d]\ ;\ 0)\) as the value of a single bit, and in combining sets of bits we add them in the codomain: \(K_1\#K_2\).

A simple PDC with binary balanced division and fixed-cost postadjustment yields an O(log(a)) parallel decomposition, and our bit count \(a\) is itself log(n) for the \(a\)-deep index range, so the composition of these should be O(log(log(n))). Can I write the whole thing in one minute, like George claims? Maybe...

First, let the input be an (unsigned) integer \(k\) divided into an \(a\)-length bit array, tupled with their indexes in that array. Thus the integer k=5, binary 101, becomes the input \((d=2,b=1), (d=1,b=0), (d=0,b=1)\). Then divide can split these bit-array nibbles, each of which knows its depth \(d\).

Let's assume at the base we can refer to \(N[b] = [\ N'\ N''\ ]^T\ \text{for}\ n=2^b\).

Then our PDC would be

\(F\ k = (atom?\ ?\ (b\&\text{0x1}\ ?\ K=N[d]\ ;\ K=0)\ ;\ c_{lr} : K_{l}\ \#\ K_{r} : !F : d_{lr})\ k\)

or

$$\begin{align} F = PDC(& divide = d_{lr}, combine = c_{lr}, \\ & preadjust = id, postadjust = K_{l} \# K_{r}, \\ & basepred = atom?, basefunc = (b \& 0x1 ? K=N[d] ; K=0)); \end{align}$$

Basefunc sets outputs K as 1-bit values for \(n=1\ll d\): \(N=(N'\ N'')\) or \(0=(0'=0,\ 0''=1)\) by lookup. Having divided the bits of the input log(a) times down to these single-bit elements (b,d), the postadjust phase combines codomain K values associated to the increasingly larger nibbles of bits, in a local calculation, so the next combine operation puts double-sized nibbles together, all in another log(a) time steps.

Since log(a) is log(log(n)), this is O(log log n) on the input index n.

Yes, that was about a minute.
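As a purely serial illustration (my own code, reusing the sharp() sketch from earlier; capTable and capOfBits are made-up names, and the O(log log n) claim concerns the parallel depth, which this serial version does not demonstrate):

// Precompute N[d] = CAP(2^d) for d = 0..depth, by doubling.
function capTable($depth) {
  $N = [[1, 0]];                             // CAP(1) = [F(0), F(-1)] = [1, 0]
  for ($d = 1; $d <= $depth; $d++) $N[$d] = sharp($N[$d-1], $N[$d-1]);
  return $N;
}

// Recursively split the bit array of k and combine the halves in the codomain.
// $bits[$d] is bit d of the index k; the codomain zero is [0, 1].
function capOfBits($bits, $lo, $hi, $N) {
  if ($lo == $hi)                            // base: a single bit b at depth d
    return $bits[$lo] ? $N[$lo] : [0, 1];    // K = (b ? N[d] ; 0)
  $mid = intdiv($lo + $hi, 2);
  $Klo = capOfBits($bits, $lo, $mid, $N);    // these two halves could be
  $Khi = capOfBits($bits, $mid+1, $hi, $N);  // evaluated in parallel
  return sharp($Khi, $Klo);                  // K_hi # K_lo = CAP(k_hi + k_lo)
}
// e.g. k = 5 = binary 101: $bits = [1, 0, 1]; $N = capTable(2);
// $K = capOfBits($bits, 0, 2, $N);  $K[0] + $K[1] == 8 == F(5).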
