Tuesday, July 11, 2017

A matrix formulation of extended Euclid algorithm.

I have written about Extended Euclid Algorithm in the past. However more recently while solving a problem I had to express it in a matrix form which I thought is something worthwhile.

Recall the problem statement: Given two non-zero integers $a,b$ (assume $a\geq b$), the problem is to determine integers $x,y$ such that $\mbox{gcd}(a,b) = a\times x + b \times y$. Notice that the original Euclid's algorithm can compute the gcd but not the coefficients $x,y$ whose linear combination yields the gcd. The idea here is to use the quotient instead of the remainder as in the original algorithm. We express the operands in each iteration a linear combination of the original inputs and use the following to update as follows. The quotient in $i^{th}$ iteration is denoted by $q_i$ leading to the following formulation.

$$

\begin{array}{cc}
\mbox{gcd}(a,b) = \mbox{ExGcd}(\left[\begin{array}{c} a \\ b \end{array}\right]) &\\
\left [
\begin{array}{cc}
x_{0,0} & y_{0,0}\\
x_{1,0} & y_{1,0}
\end{array}
\right ] = \left [ \begin{array}{cc}1 & 0\\ 0 & 1\end{array} \right ] & \mbox{initialization } i=0\\


\mbox{ExGcd}(\left [\begin{array}{cc} a & b \end{array}\right ] \left [ \begin{array}{cc} x_{0,i} & y_{0,i} \\ x_{1.i} & y_{1,i} \end{array} \right]) = \mbox{ExGcd}(\left[\begin{array}{cc} a & b \end{array}\right]
\left [
\begin{array}{cc}
x_{0,i-1} & y_{0,i-1} \\
x_{1,i-1} & y_{1,i-1}
\end{array}
\right ]
\left [
\begin{array}{cc}
0 & 1 \\
1 & -q_{i-1}
\end{array}
\right ]) & i > 0 \\

q_i = \mbox{Quotient}(x_{0,i}a+x_{1,i}b, y_{0,i}a+y_{1,i}b) & \\
\end{array}
$$


Following is an implementation using std::div:
// Input: a,b
// Output: x, y (y0,y1)
void ExtendedEuclid(long a, long b, long& y0, long& y1)
{
 long x0, x1, t0, t1;

 if (a < b) { std::swap(a, b); }
 // Initialize:
 x0 = 1; x1 = 0; // a = 1 * a + 0 * b
 y0 = 0; y1 = 1; // b = 0 * a + 1 * b
 
 std::ldiv_t div_res = std::div(a, b);
 while (div_res.rem) {
  t0 = y0; t1 = y1;

  y0 = x0 - (div_res.quot * y0);
  y1 = x1 - (div_res.quot * y1);
  x0 = t0; x1 = t1;

  // Invariant: (x0*a+x1*b >= y0*a+y1*b)
  div_res = std::div(x0*a+x1*b, y0*a+y1*b);
 }
}

Following are few examples obtained by calling ExtendedEuclid(a,b,x,y):
a=25375,b=667 x=1, y=-38, gcd=29 
29 = 25375 * 1 + 667*38

a=17765,b=13481 x=-343, y=452, gcd=17
17 = 17765 * -343 + 13481 * 452

a=19109,b=18321 x=-23, y=24, gcd=197
197 = 19109 * -23 + 18321 * 24

Sunday, May 28, 2017

Efficient polygon area computation with less multiplications.

Linear algebra computations typically try to prefer mathematical formulations which preserve numerical precision, reduce multiplications and avoid divisions if possible. More recently I came across an elegant formulation of polygon area computation with half the number of multiplications relative to the standard method. I will try to briefly derive it for posterity.

Standard Formulation: Let $P = [v_0, v_1, v_2, v_3\ldots v_{n-1}, v_{n}=v_0]$ be a polygon with $n$ vertices, $v_i = (x_i,y_i)$. Recall the standard formula for the area of the polygon $A(P) = \sum_{i=0}^{n-1} (x_{i}y_{i+1} - y_{i}x_{i+1})$. Every term $T(i) = (x_{i}y_{i+1} - y_{i}x_{i+1})$ corresponds to the edge $(v_i, v_{i+1})$ of the polygon -- more precisely this is the signed area of the triangle ${(0,0), (x_i,y_i), (x_{i+1}, y_{i+1})}$. Clearly computation of $A(P) = \sum_{i=0}^{n-1} T(i)$ needs exactly $2n$ multiplication operations.


Alternate Formulation: Consider $T(i) = x_{i}y_{i+1} - y_{i}x_{i+1}$, now add $x_iy_i$ and subtract $x_{i+1}y_{i+1}$ resulting in the following:
$$
\begin{array}{lll}
D(i) &=& x_iy_i + T(i) - x_{i+1}y_{i+1} \\
D(i) + D(i+1) &=& x_iy_i + T(i) - x_{i+1}y_{i+1} + x_{i+1}y_{i+1} + T(i+1) - x_{i+2}y_{i+2} \\
&=& x_iy_i + T(i) + T(i+1) - x_{i+2}y_{i+2}\\
\sum_{i=0}^{n-1}D(i) &=& x_{0}y_{0} + \sum_{i=0}^{n-1} T(i) - x_ny_n\,\, \mbox{notice that}\, (x_0,y_0) = (x_n,y_n)\\
&=& \sum_{i=0}^{n-1} T(i)
\end{array}
$$

Whats special about $D(i) = x_iy_i + T(i) - x_{i+1}y_{i+1}$ ?. Notice that $D(i)$ can be written as the following:
$$
\begin{array}{lll}
D(i) &=& x_iy_i + T(i) - x_{i+1}y_{i+1} \\
&=& x_iy_i + (x_iy_{i+1} - x_{i+1}y_i) - x_{i+1}y_{i+1} \\
&=& (x_i + x_{i+1})(y_i - y_{i+1}) \,\, \mbox{notice the only one multiplication}.
\end{array}
$$

Finally we state that the area of the polygon is equivalent to $A(P) = \sum_{i=0}^{n-1} (x_i+x_{i+1})(y_i - y_{i+1})$ and needs exactly $n$ multiplications instead of $2n$ with standard formulation.

Friday, January 20, 2017

An Efficient Algorithm to Determine the Order of a Permutation.

Consider at set $S = \{1,2,3\ldots n\}$ and recall that $S_n$ a set of all permutations of $S$ form a group under composition (<$S_n, *$>). E.g. $S=\{1,2,3,4\}$ , $\pi_1 = [3\,4\,2\,1]$ and $\pi_2 = [4\,1\,3\,2]$. To understand composition operation it may be easier to look at a permutation $\pi \in S_n$ as bijective function $\pi : S \rightarrow S$. The composition $\pi_1 * \pi_2$ would be equivalent to $\pi_1(\pi_2(i)), \forall i\in S$ which results in $\pi_1 * \pi_2 = [1\,3\,2\,4] $. We use the notation $\pi^k = \pi*\pi*\pi\ldots *\pi$ to denote a composition applied $k$ times. Since <$S_n,*$> is a group it must have an identity element $e$ which is a permutation such that $e(i) = i, \forall i\in S$.

Now lets focus on the following question:
Given any permutation $\pi \in S_n$ is there an integer $k \geq 0$ such that $\pi^k = e$ ?


Lets try few examples with $\pi_1 = [3\, 4\, 2\, 1]$, $\pi_1^2 = [2\, 1\, 4\, 3]$, $\pi_1^3 = [4\, 3\, 1\, 2]$, $\pi_1^4 = [1\, 2\, 3\, 4]$. So for this example $\pi_1$ we were able to find $k=4$ such that $\pi_1^4 = e$. Later we will show that finding such a $k$ is always possible for any permutation $\pi$ and such a $k$ is called the order of the permutation.

Recall that every permutation can be decomposed into one or more disjoint cycles. We call a permutation cyclic if it has exactly one cycle in it. E.g. $\pi_c = [3\, 1\, 4\, 2]$ is a cyclic permutation in $S_4$ with cycle being $1 \rightarrow 3 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 1$.

Lemma: If $\sigma \in S_n$ is a cyclic permutation then $\sigma^n = e$

Proof: Since $\sigma$ is cyclic all elements $S$ form a directed cycle with $n$ nodes. Starting at any node $i$ we need to traverse $n$ edges to get back to $i$. This can be algebraically expressed as $\underbrace{\sigma(\sigma(\ldots \sigma(i))\ldots)}_{n\text{-times}} =i$
$\Box$.


The above lemma states that if a permutation is cyclic on $n$ elements then composing it $n$ times on itself results in an identity permutation. Now we are ready to prove our main theorem:

Theorem: $\forall \pi \in S_n$ there exists a $k \geq 0$ such that $\pi^k = e$.

Proof: Let $\pi = \sigma_1\sigma_2\ldots \sigma_m$ be the cycle decomposition of $\pi$. Let $|\sigma_i|$ denote size of cycle. Since each of $\sigma_i$ is cyclic permutation in $S_{|\sigma_i|}$ by the previous lemma $\sigma_i^{|\sigma_i|} = e_{|\sigma_i|}$. Where $e_{|\sigma_i|}$ is an identity permutation in $S_{|\sigma_i|}$. Also notice that any multiple of $|\sigma_i|$ is also going to result in an identity permutation (i.e. for any integer $q \geq 1$ , $\sigma_i^{q|\sigma_i|} = e_{|\sigma_i|}$). Let $t = |\sigma_1|\times |\sigma_2|\times \ldots |\sigma_m|$ then $\pi^t = \sigma_1^t\sigma_2^t\ldots \sigma_m^t$ since $t$ is a multiple of $|\sigma_i|, 1\leq i \leq m$ we have $\pi^t = e_{|\sigma_1|}e_{|\sigma_2|}\ldots e_{|\sigma_m|} = e$
$\Box$.


From the above proof its clear that choosing $k = |\sigma_1|\times |\sigma_2|\ldots |\sigma_m|$ which is a multiple of all cycle lengths guarantees that $\pi^k = e$. However from a computational efficiency purpose we are much better if we let that multiple be the least common multiple of $|\sigma_i|,\, 1\leq i\leq m$. Following is an efficient algorithm -- linear time and takes at most $n$ l.c.m computations -- to find the order of any permutation:

// Finds the order of a valid permutation ( 1 \leq perm[i] \leq n).
size_t FindPermutationOrder(const std::vector< unsigned int > &perm) {
  size_t n = perm.size();
  size_t perm_order = 1; // This overflows if the l.c.m does not fit in 64-bits.
  std::vector< bool > cycle_mask(n, true);

  // Find the disjoint cycle lengths in the permutation.
  size_t cycle_index = 0;
  while (cycle_index < n) {
    // Find the start of the next disjoint cycle.
    if (cycle_mask[cycle_index]) {
      size_t i = cycle_index;
      size_t cycle_length = 0;
      do {
        cycle_mask[i] = false;
        i = perm[i] - 1;
        cycle_length++;
      } while (i != cycle_index);
      perm_order = boost::math::lcm(perm_order, cycle_length);
    }   
    cycle_index++;
  }
  return perm_order;
}
Following are few examples on the algorithm:
Order of [  5  3  1  2  4  ] is 5
Order of [  3  4  2  1  ] is 4
Order of [  1  2  3  4  5  ] is 1
Order of [  7  8  9  10  1  2  4  3  6  5  ] is 5