This article explains a technique to avoid calculating the greatest common divisor when performing cycle decomposition in std::rotate, as used in OpenJDK's Collections.rotate method. It provides a C++ implementation that tracks the count of rotated elements to determine when all cycles are complete.
<p>Last time, we looked at <a title="Rotation revisited: Cycle decomposition in clang's libcxx" href="https://devblogs.microsoft.com/oldnewthing/20260604-00/?p=112384"> how clang’s libcxx implementation of <code>std::<wbr />rotate</code> uses cycle decomposition</a> to minimize the number of swaps. Doing so requires calculating the greatest common divisor, but I noted that the OpenJDK implementation of the java standard library uses a trick to <a href="https://github.com/openjdk/jdk/blob/f1e0e0c25ec62a543b9cbfabd630fc4ef17a8b5c/src/java.base/share/classes/java/util/Collections.java#L822"> avoid doing the gcd calculation</a>.</p>
<p>The trick is realizing that the total number of elements is equal to the sum of the lengths of each of its cycles, and each of the initial elements belongs to a different cycle. Therefore, we can just keep rotating elements until the number of elements rotated is equal to the total. We don’t have to precalculate the number of cycles; we just let the counter tell us when we’re done.</p>
<pre>auto a = std::distance(first, mid); // number of "A" elements
auto n = std::distance(first, last); // total elements
auto count = 0;
auto k = 0;
while (count < n) {
// Rotate the elements in the cycle starting at k
auto save = std::move(first[k]);
auto i, next = k;
while (i = next, next = (i + a) % n, next != k) {
first[i] = std::move(first[next]);
++count;
}
first[i] = std::move(save);
++count;
}
</pre>
<p>The post <a href="https://devblogs.microsoft.com/oldnewthing/20260605-00/?p=112389">Rotation revisited: Avoiding having to calculate the gcd when doing cycle decomposition</a> appeared first on <a href="https://devblogs.microsoft.com/oldnewthing">The Old New Thing</a>.</p>
# Rotation revisited: Avoiding having to calculate the gcd when doing cycle decomposition - The Old New Thing
Source: [https://devblogs.microsoft.com/oldnewthing/20260605-00?p=112389](https://devblogs.microsoft.com/oldnewthing/20260605-00?p=112389)
June 5th, 2026
1 reaction

Last time, we looked at[how clang’s libcxx implementation of`std::rotate`uses cycle decomposition](https://devblogs.microsoft.com/oldnewthing/20260604-00/?p=112384)to minimize the number of swaps\. Doing so requires calculating the greatest common divisor, but I noted that the OpenJDK implementation of the java standard library uses a trick to[avoid doing the gcd calculation](https://github.com/openjdk/jdk/blob/f1e0e0c25ec62a543b9cbfabd630fc4ef17a8b5c/src/java.base/share/classes/java/util/Collections.java#L822)\.
The trick is realizing that the total number of elements is equal to the sum of the lengths of each of its cycles, and each of the initial elements belongs to a different cycle\. Therefore, we can just keep rotating elements until the number of elements rotated is equal to the total\. We don’t have to precalculate the number of cycles; we just let the counter tell us when we’re done\.
```
auto a = std::distance(first, mid); // number of "A" elements
auto n = std::distance(first, last); // total elements
auto count = 0;
auto k = 0;
while (count < n) {
// Rotate the elements in the cycle starting at k
auto save = std::move(first[k]);
auto i, next = k;
while (i = next, next = (i + a) % n, next != k) {
first[i] = std::move(first[next]);
++count;
}
first[i] = std::move(save);
++count;
}
```
### Category
### Topics
## Author

Raymond has been involved in the evolution of Windows for more than 30 years\. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie\-jeebies\. The Web site spawned a book, coincidentally also titled The Old New Thing \(Addison Wesley 2007\)\. He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information\.
## Read next
## Stay informed
Get notified when new posts are published\.
Follow this blog
- [https://twitter.com/ChenCravat](https://twitter.com/ChenCravat)
- [](https://www.youtube.com/playlist?list=PLlrxD0HtieHge3_8Dm48C0Ns61I6bHThc)
- [https://github.com/oldnewthing](https://github.com/oldnewthing)
- [https://devblogs.microsoft.com/oldnewthing/feed/](https://devblogs.microsoft.com/oldnewthing/feed/)
The article delves into the cycle decomposition algorithm used in clang's libcxx for rotation, explaining how it achieves the minimum number of swaps by computing the greatest common divisor (gcd) to determine the number of cycles.
Raymond Chen explores gcc libstdc++'s rotation algorithm for random-access iterators, revealing it is fundamentally the same as the forward-iterator rotation algorithm, just viewed from a different perspective. The post is part of a series comparing rotation implementations across compilers.
Raymond Chen revisits a unidirectional rotation algorithm for swapping adjacent memory blocks, explaining its recursive approach and performance characteristics.
Revisits Matt Austern's 2000 paper on segmented iterators, which enable hierarchical algorithms to exploit data structure segmentation for performance, and discusses modern adoption in libc++ and Boost libraries.
This draft book chapter provides an infographic and detailed analysis of operation costs in CPU clock cycles for modern C++ CPUs, covering multiplication, division, and RTTI with latency tables for various architectures.