IntroductionFibonacci is so simple, but so complex as to seem organic. It is a great homework exercise for Divacon.Below are three derivations of a Fibonacci(N) function, (1) for \(N=2^a\), (2) for arbitrary \(N\), and (3) using Divacon to tighten things up. The n'th (zero based) Fibonacci number F(n) is defined as:
F(0) = F(1) = 1Since each requires its immediate predecessors to calculate it, F(n) for large \(n\) seems to require O(n) operations. F(n) for \(n>2^{1000}\) could take a very long time indeed. George Mou says he can do it in O(log(n)) operations. That would be O(1000) time to calculate F(\(2^{1000}\)): perhaps a second or two on this web server (since the results below seem to be sub-millisecond), instead ofGeorge suggested I use Divacon to do the same as an exercise. Here, then, is my homework.
Domain, Co-Domain, Operation, Co-OperationThe domain of the Fibonacci sequence is the set of Fibonacci numbers themselves, \(\forall\ n\in I\ \ F(n)\).A convenient codomain would seem to be the set of pairs of Fibonacci summands: F(\(n\)-1), F(\(n\)-2), which sum to F(\(n\)). Can we please have a notation which hides the index arithmetic?
$$ \begin{align} for\ n\ \in \ I: & \\ \text{let}\ N'\ \ &\text{≜}\ \text{F(}n-1)\\ \text{let}\ N''\ &\text{≜}\ \text{F(}n-2) \\ \text{let}\ CAP(n) = N\ \ \ &\text{≜} \begin{bmatrix} \text{F(}n-1)\\ \text{F(}n-2) \end{bmatrix} = \begin{bmatrix} N' \\ N'' \end{bmatrix} \end{align}$$What operation shall we associate with these vectors? Mapping from codomain to domain is accomplished by adding the summands. But how to combine them? Is there an \(N\# K\) in the codomain corresponding to f(n+k) in the domain? This Fibonacci recurrence shifts from the n'th to the n+k'th (via a two-sample auto-convolution, meaning that the \(k\)'s count up while the \(n\)'s count down, and they are both taken from the same series): This enables a doubling of the index from F(\(n\)) to F(\(2n\)) by setting \(k=n\). A doubling: a bit shift.F(n+k) = F(k)*F(n-2)+F(k+1)*F(n-1)
Arrays of Fibonacci results can be arranged by doublings to any desired rank. Allow me to overload the square bracket notation, to specify the rank of such a doubling. \(F[a]\), then, is an array of length \(2^{(a+1)}\):Fortunately the F(\(n+k\)) recurrence above gives us a handle on this chaotic fractal. With it we may calculate, given the 2-vector \(N\) characterizing the \(n\)'th Fibonacci number, a second 2-vector, call it \((N+N)\) or \((2N)\) which characterizes the \(2n\)'th Fibonacci number, or shall we say, is that Fibonacci number in the codomaina. We will need an operation on 2-vectors yielding a 2-vector. The double-cross symbol, \(\#\), evokes two pairwise combinations: perfect. Although a pair of products of 1x2 and 2x1 matrices amounts to the same thing I'll use two dot products in the following definition and derivation:
$$ \begin{align} N\ \#\ K\ \ &\text{≜} \begin{bmatrix} N \cdot (K+1)\\ N \cdot K \end{bmatrix} = \begin{bmatrix} N'(K+1)' + N''(K+1)'' \\ N'K' + N''K'' \end{bmatrix}\\ & =\begin{bmatrix} f(n-1)f(k+1-1) + f(n-2)f(k+1-2) \\ f(n-1)f(k-1) + f(n-2)f(k-2) \end{bmatrix} =\begin{bmatrix} f(n-1)f(k) + f(n-2)f(k-1) \\ f(n+(k-2)) \end{bmatrix} \\ & =\begin{bmatrix} f(n+(k-1)\\ f(n+(k-2)) \end{bmatrix} =\begin{bmatrix} (N+K)'\\ (N+K)'' \end{bmatrix} = CAP(n+k) \end{align}$$That is, the codomain combination (\(\#\)) calculated according to the definition-given formula, becomes the domain combination (\(+\)) of indexes in the Fibonacci sequence. The nervous reader, seeking to fit these calculations into a well-founded and incremental process (say, doubling the indexes) may wish to check that it doesn't assume its output before calculating it. Observe that although \((K+1)''=f(k+1-2)=f(k-1)\) is among the available components of \(N\) and \(K\), namely \(K'=f(k-1)\), we do not immediately find \((K+1)'=f(k)\) among those available components, which might suggest we are not ready to make this calculation. However the Fibonacci definition itself provides a substitution: \(f(k)=f(k-1)+f(k-2) = K' + K''\). This lets us proceed to double or shift an index while using only available components in \(N\) and \(K\) to calculate the shifted vector. In a doubling process which at each step arrives at \(F[a]\) with \(n=2^a+1\) elements, although we may have calculated values \(F(k)\) up to \(k \le n\), we cannot assume availability of \(F(k)\) for \(k>n\). Since we mustn't get ahead of ourselves, our maximum shift will be achieved with \(k=n\), which gives a doubling. Great.
$$ \begin{align} N\ \#\ N\ \ &=\begin{bmatrix} N \cdot (N+1)\\ N \cdot N \end{bmatrix} &= \begin{bmatrix} N'(N+1)' + N''(N+1)''\\ N'N' + N''N'' \end{bmatrix} &= \begin{bmatrix} N' (N' + N'') + N'' N'\\ N'N' + N''N'' \end{bmatrix} &= \begin{bmatrix} (2N)'\\ (2N)'' \end{bmatrix} \end{align}$$Note the doubled array F[\(a\)+1] has length \(m=2n=2^{(a+2)}\). And its final-element 2-vector (2N) depends (only!) on the final-element 2-vector (N) of the F[\(a\)] array. Who knew that such potentially remote entries in the Fibonacci series had such close relationships?!
Shall we look at an example? Consider the case of \(a=3\) so \(n=8, a+1=4, m=2n=16\).
In the pattern we seek F(m-1) and F(m-2), so let's calculate
F[4][14] and F[4][15] from F[3][6] and F[3][7]. Fib[4][14] = Fib[3][6]*Fib[3][6] + Fib[3][7]*Fib[3][7] = 13*13 + 21*21 = 169 + 441 = 610and
Fib[4][15] = Fib[3][7]*(2*Fib[3][6] + Fib[3][7]) = 21 * (2*13 + 21) = 21 * (26+21) = 21 * 47 = 987To calculate each new 2-vector, (e.g., F(\(2^a\)-1) and F(\(2^a\)-2)) requires 2 multiplies and one add for each element. For F(2^1000-1) requires 4000 multiplies and 2000 adds or in general O(log(N)) operations. The following table is re-generated every time this page is loaded:
These results are limited by PHP LONG INT and DOUBLE FLOAT computation, but it seems the algorithm and results are correct as far as the eye can see, comparing to some reference results: F[14]=610, F[15]=987 F[30]=1346269, F[31]=2178309 F[62]=6557470319842, F[63]=10610209857723 F[126]=155576970220531065681649693, F[127]=251728825683549488150424261 ... F[1022] = 278529355069959292393881241266809350935330735212370380691318266898736\\ 950320346518362561675961332445274995854966996688219111789542501520845\\ 546940373127265215824082562848481813148554423082730494051913219529946\\ 6733282 F[1023] = 450669963367781981310438323572888604936786059621860483080302314960003\\ 064570872139624879260914103039624487326658034501121953020936742558101\\ 987106764609420026228520234665586889971108924677841335400410363155392\\ 5405243
This PHP code generates the above table (live! those durations were re-measured just now). function Fibonacci($n) { $A = log($n,2)-1; $Fn_1 = $Fn_2 = 1; for ($a=1; $a<=$A; $a++) { $Fm_2 = ($Fn_2 * $Fn_2) + ($Fn_1 * $Fn_1); $Fm_1 = $Fn_1 * (2 * $Fn_2 + $Fn_1); $Fn_1 = $Fm_1; $Fn_2 = $Fm_2; } return "F(" . ($n-2) . ") = $Fm_2, F(" . ($n-1) . ") = $Fm_1\n"; }Containing an actual log(n) function call and an iteration counting up to that many, this is clearly O(log(n)) in complexity. So far so good, but so far this only works on powers of 2.
Fib(\(p\)) for \(p\) not a power of 2Every counting number \(p\) is a sum of powers of 2, like 5 is a sum of \(2^2 + 2^0\), which are the binary digits in \(p\): decimal 5 = binary 101.Considering \(p\) as a data structure comprising various properties, we might give names to
For example, for \(p=17=2^0+2^4\), pbits=[0,4], pbits.length=2, pbits.max=4. Each bit being a power of 2 also corresponds to a doubling of the Fibonacci array, from F[\(a\)] to F[\(a\)+1], etc. And each \(N\)-sized array has its 2-vector, \([\ N'\ N''\ ]^T\). Now we approach the nub of the question, and the answer is Yes:
Let \(n\) be a power of 2, \(n=2^{(a+1)}\), so F(n-1) is the last element in F[\(a\)]. Using the recurrence equation, F(n+k) depends on values \(N', N'', K', K''\), all within F[\(a\)] or (redundantly) in the first half of F[\(a\)+1]. |
n | F(n) | Δ hrtime |
1 | 1 | 0.01011ms |
2 | 2 | 0.00139ms |
3 | 3 | 0.00278ms |
4 | 5 | 0.002121ms |
5 | 8 | 0.00192ms |
6 | 13 | 0.00166ms |
7 | 21 | 0.0022ms |
8 | 34 | 0.00165ms |
9 | 55 | 0.00178ms |
10 | 89 | 0.00162ms |
11 | 144 | 0.00198ms |
12 | 233 | 0.00172ms |
13 | 377 | 0.00212ms |
14 | 610 | 0.002ms |
15 | 987 | 0.00241ms |
16 | 1597 | 0.00172ms |
17 | 2584 | 0.00203ms |
18 | 4181 | 0.00198ms |
19 | 6765 | 0.00231ms |
20 | 10946 | 0.00201ms |
31 | 2178309 | 0.00313ms |
32 | 3524578 | 0.002071ms |
33 | 5702887 | 0.0024ms |
35 | 14930352 | 0.00272ms |
32 | 3524578 | 0.00201ms |
54 | 139583862445 | 0.00318ms |
59 | 1548008755920 | 0.003521ms |
62 | 6557470319842 | 0.003489ms |
63 | 10610209857723 | 0.003849ms |
64 | 17167680177565 | 0.00242ms |
126 | 1.5557697022053E+26 | 0.0046ms |
127 | 2.5172882568355E+26 | 0.005471ms |
128 | 4.0730579590408E+26 | 0.003129ms |
200 | 4.5397369416531E+41 | 0.003972ms |
255 | 1.4169381771406E+53 | 0.006229ms |
300 | 3.5957932520658E+62 | 0.005031ms |
400 | 2.8481229810849E+83 | 0.004329ms |
510 | 2.7745922289306E+106 | 0.00646ms |
511 | 4.489384531331E+106 | 0.0063ms |
512 | 7.2639767602616E+106 | 0.00364ms |
Incidentally, in our codomain notation, F(-2) = 0'' = 1; F(-1) = 0' = 0; F[0] = 0' + 0'' = 1 + 0 = 1, etc. Divacon FibonacciConsider how to approach Fibonacci using Divacon. More than F(\(n\)) for a single index \(n\), we could simultaneously to calculate an entire rank of Fibonacci numbers F(0) to F(\(2^a-1\)), or F(\(p\)) for \(p \in 0..2^a-1\).
Such a process does require knowledge of the endpoint, the last rank
\(a\). Also it ends with all the values F[0]..F(\(2^a-1\)).
This is not in the spirit of Divacon, which likes to take arrays of
data for its input, such as in this case an array of indexes into the
conceptually infinite Fibonacci array. FibonacciArray \(X, X \subset \{x | x \in I, x\ge 0 \}\)Consider a new division operation \(d_{01}\), similar to \(d_{eo}\) except that the even/odd decision is not based on the index of an entry in the input array, the 0'th being even, etc., but on the final (or initial) bit in the values of the input array. \(d_{01}\) then divides the inputs (which are to be interpreted as a set of Fibonacci indexes) into those with final bit 0 placed into the left output vector, say Y, and those with final bit 1 placed into the right output vector, say Z. To construct a partition of indexes, Y, those which are 0-final, and Z, those which are 1-final, we could write:
$$\begin{align} d_{01} X\ =\ Y\ Z\ |\ & \forall\ x=(b,k,N,K) \in X\ \\
\text{if}\ & x.k = (x.k\gg 1\ll 1)\ \ \text{// last bit is 0}\\
\text{then}\ & x\ \in\ Y,\\
\text{else}\ & x\ \in\ Z
\end{align}$$
Assuming Y or Z is not yet empty, then each of their elements must be adjusted for its new depth in the tree:
preadjust = \((b=k\&\text{0x1}, k:=k\gg 1, N:=N\# N,K:=(b\ ?\ N\# K;K)\ )\)That is:
Next we must stop at the bottom of the fully-expanded terminals of the computation tree. We can now discard the intermediate data \(N\) and make the important transition from codomain to domain here: f(k)=K'+K''. Retaining the rest we have:
postadjust = \( \forall\ v\ \in Y \cup Z\ \ \ v:=( v.b; v.k\ll 1+v.b, v.K )\)Maybe I should draw it out graphically to be sure, but I think we are there. The sort order will be whatever falls out rather than the original index sort order, unless we stored a reverse mapping operation somewhere to undo the tail-bits shuffle. It's okay, the original index returns as part of the output tuple. Putting this into one code block we have Given an array of indexes \(k\) this PDC returns a shuffled array of tuples \((k,f(k))\).
ConclusionTo summarize, PDC seems to be overkill for Fibonacci F(\(2^a\)), which is already O(log(N)) on single processor. The F(n+k) recurrence enables calculating the next doubled level's dependencies consecutively in constant time, four multiplies and three adds, O(log(N)) times. On this very page this algorithm is implemented twice to generate the data above, using mere iteration, not even requiring recursion, much less a parallel decomposition.Still, writing a serial version of this function in Divacon still brings a high degree of code compaction: the Divacon expression of this algorithm uses 1/4 the number of lines of the PHP iterations. We gain benefits from an orthogonalized and systematic definition of the standard ingredients of a Divacon, parallel Divide-and-Conquer, decomposition: f = PDC(divide, combine, pre, post, basepred, basefunc) This implements a composition of standard component functions: f X = ( basepred ? basefunc ; combine : post : !f : pre : divide) X Submitting to the discipline of this opinionated methodology of function decomposition not only provides but requires a certain simplicity and elegance, a desirable depth of thought, which neatly reveals the essential structure of the computation, the problem, the hidden dimensions, and the relations among levels and elements. I'd say it offers understanding as well as correct results, although the cost is that of some study:
The exercise is beneficial and empowering. Here, my next step was to go from a single-input function solved by iteration to a FibonacciArray function solving any number of inputs in parallel, using the Divacon approach. This took some thinking. We've here seen single inputs limited to powers of 2, or then to an unrestricted integer. Would a Divacon function take a single input number, such as a rank \(a\) to request an output array of values F(n) for \(n\in 0..2^{a}-1\)? Or should we prepare to handle a full enumeration of those or anyhow some indexes? The single-value input approach didn't make sense to me in the Divacon space considering it has no divide operation, although it would require building a tree such as the following recursion:
tree node = tree 0-daughter ; tree 1-daughter without looking at input indexes. Also it could be done in depth order from high bit to low, or low bit to high, either way; I discovered no good reason to make that choice. However, a rank's worth of indexes \(0..2^{a}-1\) could be divided down to the last bit, calculating \(N\#N\) from \(n=1\) by doublings up to \(n=2^a\) and on the way up or down combining those N's with accumulated K's as triggered by the current bit. Better yet we might relax the specification of a whole rank \(0..2^a-1\), to allow any set \(\{n|n\ge 0,n\in I\}\) of integral indexes into the conceptually infinite Fibonacci array. Building a tree for them might be imbalanced, but operations are linear in bit-depth (i.e. log(\(n\)) for the largest \(n\), not in the number of indexes. If the index list were complete for a rank \(a\), it offers O(log(2^a))=O(log(n)) time cost, which seems pretty good, since we achieved no better for a single F(n) calculation above. Once the array function input was thus specified as any list of integers the rest followed workmanlike.
PostscriptA final remark: If the N-length array were previously calculated for some finite but sufficient depth, say a=1000 which requires only 2k entries to store and O(a) time to populate once, then the K's could be calculated in separate nibbles of the original index's bit representation, suggesting a Divacon divide-and-conquer solution with O(log log N) time cost. I have in mind recursively dividing the bits of an index \(k\) into the low half and the high half, and in a postadjustment calculating \(K_{hi}\#K_{lo}\) as the codomain value for the combined low and high bits. This is allowable because \(+\) between indexes in the domain as in \(f(k_{hi}+k_{lo})\) licenses \(\#\) between codomain mappings \(K_{hi}, K_{lo}\) in the codomain. At the base of the division tree with bit \(b\) at depth \(d\), we return K=b?N[d]:0) as the value of a single bit, and in combining sets of bits we add them in the codomain \(K_1\#K_2\).Since a simple PDC with binary balanced division and fixed-cost postadjustment yields a O(log(a)) parallel decomposition, but our bit count \(a\) is already O(a=log(n)) divisions of the possible \(a\)-deep index range, the composition of these should be O(log(log(n))). Can I write the whole thing in one minute, like George claims? Maybe... First, let the input be an (unsigned) integer \(k\) divided into an \(a\)-length bit array, tupled with their indexes in that array. Thus an integer k=5=0x101 becomes input \((d=2,b=0x1), (d=1,b=(0x0)), (d=0,b=0x1)\). Then divide can split these bit-array nibbles, each of which knows their depth d. Let's assume at the base we can refer to \(N[b] = [ N' N'' ]^T \text{for}\ n=2^b\). Then our PDC would be
\(F k = atom? (b\&0x1 ? K=N[d] ; K=N[0]) ; c_{lr} : K_{l}\ \#\ K_{r} : ! F : d_{lr}\) or $$\begin{align} F = PDC(& divide = f_{lr}, combine = c_{lr}, \\ & preadjust = id, postadjust K_{l} \# K_{r}, \\ & basepred = atom?, basefunc = (b \& 0x1 ? K=N[d] ; K=N[0])); \end{align}$$ Basefunc sets outputs K as 1-bit values for \(n=1\ll d: N=(Nā\ Nā)\) or \(0=(0ā=0, 0ā=1)\) by lookup. Having divided the bits of the input log(a) times down to these single-bit elements (b,d), the postadjust phase combines codomain K values associated to the increasingly larger nibbles of bits, in a local calculation, so the next combine operation puts double-sized nibbles together, all in another log(a) time steps. Since log(a) is log(log(n)), this is O(log log n) on the input index n. Yes, that was about a minute.
|