The Power of Division, Part 1

It’s no secret that I love using custom functions in FileMaker. This is probably because writing a big Let statement is about as close as you come in FileMaker to the experience of traditional programming, writing statements, setting variables in plain source code and so forth. Also, you can get a lot done with a well-written custom function, especially since they are tied into FileMaker’s calculation evaluation engine.

If you’ve done much with FileMaker custom functions, you know they’re capable of recursion. This is a computer science term for pieces of logic that call themselves, performing some task over and over until a terminating condition is reached. In FileMaker custom functions, this is useful, since it lets us simulate the looping we can do easily in scripts. The more technical term for looping is iteration. So we can use recursion in a custom function to implement iteration.

There are some limits to recursion, though. FileMaker will only let a custom function call chain build on itself a certain number of times: 10,000 times, or 50,000 if you program carefully and have the right kind of problem. (To explain this further would involve a digression into a concept called “tail recursion” that would make this post a little top-heavy.)

That limit shows us the drawbacks of using recursion for iteration. A loop that can run a maximum of 10,000 or 50,000 times isn’t much of a loop. A computer CPU can run that many iterations of a loop in sub-second time. This suggests that iteration, although easy to implement in a custom function, may not always be the best or most efficient choice.

Let’s consider what looks like a simple problem. I want a custom function that will take a piece of text and repeat it a number of times. If I call my function _Str_RepeatText( text; numberOfRepetitions), then _Str_RepeatText( "a"; 8 ) should give me “aaaaaaaa” and _Str_RepeatText( "one"; 4 ) should give me “oneoneoneone.”

A simple algorithm for this involves looping for the specified number of iterations and building up a result string by repeatedly adding one more copy of the text to the string. If you have much experience with recursive custom functions, this is easy to write in an iteration-as-recursion style. It might look like this.

Case ( numberOfRepetitions > 1;
text & _Str_RepeatText ( text; numberOfRepetitions - 1 ) ;
numberOfRepetitions = 1 ;
text ;
"" )

There are two problems with this approach. First, as mentioned, there will be a hard limit of 50,000 repetitions, which is … blah. Second, the algorithm is what the CS geeks call O(n) (read as “oh of n”), meaning that its running time has a linear relationship with input size — it will take twice as long to produce 5000 characters as it did to produce 2500. O(n) performance is kind of “meh”. It could be worse, but is pretty inefficient.

There is, however, a way to approach this that will break those limits wide open. A different approach will allow the production of strings as long as 2^10,000 characters (yes, two to the ten thousandth power) and run in O(log n) time — time proportional to the logarithm of n, which is about the best most algorithms can hope for. Producing a string of 10,000 characters this way will take about 56 recursive descents, instead of 10,000, thus cutting the running time to about 1/200 of what the “naive” version about would offer. Producing a string of 100,000 would take about 61 descents, not much more than the 10,000 case. It will still run at about 1/200 the time of the naive algorithm, and the naive method can never produce a string of that length — it will exceed FileMaker’s recursion limits first.

So what’s the secret? The secret lies in the divide-and-conquer approach that’s the hallmark of “true” recursion as opposed to iteration-as-recursion. It’s important to know that not all problems are amenable to a divide-and-conquer approach. When this approach is possible though, it can offer huge advantages over iteration.

The core idea here is that building up the repeated string one copy of the text at a time is not very efficient. Instead of going from a to aa to aaa, what if we double the string each time? Go from a to aa to aaaa to aaaaaaaa. When we start to to double things at each pass, they get big in a hurry (like the story about the fellow who asked a king for one grain of rice on the first square of a chessboard, two on the second etc.) Similarly, when we cut things in half at each pass they get small in a hurry. Recursive algorithms that can cut a problem size in half with each pass run very, very quickly. An input size of 1024 will be cut to an input size of 1 in just ten passes. That’s because 2^10 = 1024, and conversely, ten is the base-two logarithm of 1024. That’s why algorithms of this sort are said to run in “log-n” time.

Doubling sounds fine, but you might notice it only works if the requested number of repetitions is an exact power of two. In other words, if the caller wants 2, 4, 8, 16, 32, 64, or 128 repetions, you’re fine. But what about 77 repetitions? Here’s where the notion of divide and conquer comes in. You certainly can’t get to 77 by just doubling. But you can get to 64. Then you just have another 13 repetions to deal with. In other words, the problem _Str_RepeatText( "something"; 77) can be restated as _Str_RepeatText( "something"; 64 ) & _Str_RepeatText( "something"; 13 ). This has the classic hallmark of divide-and-conquer recursion: a problem is split into two similar-looking but smaller problems which are each solved on their own. In this case, solving for 64 is straightforward -- 64 is a power of 2, so we just double the string until we get 64 repetitions. But what about 13 repetions? Well, 13 is not a power of two, so we reapply the same logic: we split it into problems of size 8 (a power of 2) and size 5 (not a power of 2). The 8 is easy, so we’re left with the 5, which we again split, into 4 and 1. 4. is a power of 2, and 1 is, well, 1. We end with 12 operations (6 to get to 64, 3 to get to 8, 2 to get to 4, and 1 to get to 1).

So our algorithm looks like this:

  1. Is the desired number of repetitions a power 2?
    1. If so, double the current string, cut the number of desired repetitions in half (since we just doubled the number of repetitions we have) and repeat the process
    2. If the number of remaining repetitions is less than 1, we’re done
  2. If the desired number of repetitions is not a power of 2, find the largest power of 2 that’s less than the desired number of repetitions
    1. Call the function again for the power of 2
    2. Call the function again for the “remainder” repetitions — everything that exceeds the stated power of 2
    3. Combine the results of these two calls

Here’s the final result, with some explanatory comments. The function is defined as _Str_RepeatText( text; numberOfRepetitions ).

Let( [      // find the largest power of two that fits into the number of repetitions
         log = Int( _Math_Log ( 2 ; numberOfRepetitions ) );
            // we can do this many reps as a pure power of 2 without further splitting
         logReps = _Math_Power ( 2 ; log );
            // remainder of numberOfRepetitions that doesn't fit a power of 2
         remainderReps = numberOfRepetitions - logReps;
         result = Case(       // 1.  we need to split and process both the pure power of 2 and its remainder, then join them
                           remainderReps > 0;
                              _Str_RepeatTextNew( text; logReps ) & _Str_RepeatTextNew( text; remainderReps );

                              // or 2. we are working on a pure power of 2 but have not yet reached the desired length, so …
                           numberOfRepetitions >= 1;
                                 // double text, cut reps in half. and go again.
                              _Str_RepeatTextNew( text & text; numberOfRepetitions / 2 );
                            
                              // or 3. we are done processing a pure power of 2. so just return text.
                           text                                                                                                                     // 
                      )
      ];

      result
)

Again, with this approach, the running time is proportional to the base-2 logarithm of the input size, which is a value that grows much, much slowly than the input size itself. And since n recursions let us process an input of about size 2^n, FileMaker’s recursion limit of 10,000 limits us to a desired data size of around 2^10000 characters, a number so large it could not pose a limit to any modern computer (it’s equivalent to maybe 10^300 Libraries of Congress).

So the next time you approach a problem, certainty think about iterative solutions, but if there’s a way to approach the problem by repeatedly cutting it in half, you stand to save a lot of those precious recursive calls, and reap potentially massive gains in speed and problem size.

Note: a very similar approach to this function, but which takes the simpler approach of doubling until the string is “too big”, then using FileMaker’s Left function to cut it back down to size, is here.

Read The Power of Division, Part 2.

5 thoughts on “The Power of Division, Part 1”

  1. This is an example of Exponentiation by Squaring. While it is often used for math, it can be used any time you repeat an associative binary operation on one element.

    When was the last time you saw the words “associative law?” A binary operation, %, is associative if (a%b)%c = a%(b%c). Addition, multiplication, and string appending are all “associative.” Division and subtraction are not.

    Any time you have an associative binary operation, the repeated application of that operation to one element n times can be done with the “exponentiation by squaring” technique.

  2. Very cool post. I love this kind of stuff.

    That alternative you note (doubling until too big) might be a fair bit faster that iterating the remainder when the number of binary hops (that is, remainders) from a power of 2 to the repeat value is greater than 2. That is,

    _Str_RepeatText( “a”; 127 )

    would have more calculations than

    left ( _Str_RepeatText( “a”; 128 ); 127 )

    where 128 is calculated by ceiling ( log (127 ) / log ( 2 ) )

    I would expect similar execution times for numbers that are only 2 hops, like _Str_RepeatText( “a”; 96 )

    Of course, that’s a very specific solution, but thanks for prompting the brain cells

    r

  3. And sorry, I should have made the call signatures clear:

    _Math_Log( base; number )

    _Math_Power( number; exponent )

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top