After Thue's time, repetition-free words have been used in various fields of mathematics. For example, in group theory, in formal languages, in connection with unending games, and in symbolic dynamics (which constitutes a tool for studying chaos). Very recently repetition-free words have also aroused interest in the field of music, see eg. Laakso [5].
Let X be a four letter alphabet {a,b,c,d }. In [3] (Keränen 1992) we presented an Abelian square-free endomorphism g: X* -> X* which we found by the aid of computers. This endomorphism g: X* -> X* is a mapping that preserves the Abelian square-freeness of words. Its size is |g(abcd)| = 340, and it is defined as follows. The image word of a is
and the image words of letters b, c, d, i.e., the words g(b), g(c), g(d) are obtained by cyclic permutation of letters in the preceding words g(x), with x in X.
All currently known methods, see [1] (Carpi 1994), for constructing arbitrarily long Abelian square-free words over a four letter alphabet X are based on the structure of this endomorphism g. However, it is not known whether there exist Abelian square-free endomorphisms over X with size less than the above-mentioned 340. Solving this and other questions concerning algebral structures of words is developing into a challenging area of computer algebra.
In 1961 Erdös [2, s. 240] raised the question whether abelian squares can be avoided in infinitely long words, i.e. whether there exist infinitely many Abelian square-free words over a given alphabet. An Abelian square means a non-empty word uv, where u and v are permutations of each other. For example, abcacb is an Abelian square. A word is called Abelian square-free, if it does not contain any Abelian square as a subword. For example, the word abacaba is Abelian square-free, while abcdadcada is not (it contains the subword cda dca).
Later, in 1970, Pleasants [6] showed that there exists an infinite Abelian square-free word over five letters.
The case of four letters remained open until 1991 when we managed to show, see Keränen [3], that the iteration of the above-mentioned endomorphism g produces as a "limit" an infinite Abelian square-free word. In the picture (necklace), the beginning of this infinite self-reading word, namely g(abcacd ) = g(a) g(b) g(c) g(a) g(c) g(d), is represented by 6*85 coloured polyhedra in six circles. Circles start from below and each of them represents an image word for a letter read from the beginning of the chain. Thus the first circle is g(a), the second is g(b), then come g(c), g(a), g(c) and g(d).
It is easily seen that abelian squares cannot be avoided over a three letter alphabet. Indeed, in this alphabet, each word of length 8 contains an abelian square. Thus a four letter alphabet is special in the sense that it is the smallest alphabet over which Abelian squares can be avoided in words of unbounded length. Could this special state of affairs be in connection with the genetic code used by nature? - DNA is made up of four different characters.
A part of this abstract has been published in Finnish in [4].