star-free


Let and 𝒢 be the families of languagesPlanetmathPlanetmath represented by regular expressionsMathworldPlanetmath and generalized regular expressions respectively. It is well-known that

=𝒢.

This means that the additional symbols and ¬ in a generalized regular expressions are extraneous: removing them will not affect the representational power of the expressions with respect to regular languages.

What if we remove *, the Kleene star symbol, instead? With regard to regular expressions, removing * severely limits the power of the expressions. By inductionMathworldPlanetmath, the represented languages are all finite. In this entry, we briefly discuss what happens when * is removed from the generalized regular expressions.

Definition. Let Σ be an alphabet. A language L over Σ is said to be star-free if it can be expressed by a generalized regular expression without *. In other words, a star-free language is a language that can be obtained by applying the operationsMathworldPlanetmath of union, concatenationMathworldPlanetmath, and complementation to and atomic languages (singleton subsets of Σ) a finite number of times.

If we denote 𝒮 the family of star-free languages (over some alphabet Σ), then 𝒮 is the smallest set of languages over Σ such that

  • 𝒮,

  • {a}𝒮 for any aΣ,

  • if L,M𝒮, then LM,LM,Lc𝒮.

A shorter characterizationMathworldPlanetmath of a star-free language is a language with star height 0 with respect to representations by generalized regular expressions.

In relationsMathworldPlanetmathPlanetmath to finite and regular languages, we have the following:

𝒮, (1)

where denotes the family of finite languages over Σ.

Furthermore, it is easy to see that 𝒮 is closed under Boolean operations, so that 𝒮 contains infiniteMathworldPlanetmath languages, for example ¬ represents Σ*. As a result, the first inclusion must be strict. This example also shows that languages representable by expressions including the Kleene star may still be star-free. Here’s another example: {ab}* over the alphabet {a,b}. This language can be represented as

λ(abΣ*Σ*ab¬(Σ*a2Σ*)¬(Σ*b2Σ*))

The expression above, of course, is not star-free, and includes the symbol λ representing the empty wordPlanetmathPlanetmathPlanetmath. However, Σ* is just ¬, and λ is just Σ*¬(aΣ*)¬(bΣ*). Some substitutions show that {ab}* is indeed star-free.

Is the second inclusion strict? Are there regular languages such that representations by expressions including the Kleene star is inevitable? The following propositionPlanetmathPlanetmathPlanetmath answers the question:

Proposition 1.

A language L is star-free iff there exists a non-negative integer n such that, for any words u,v,w over Σ, uvnwL iff uvn+1wL.

A language satisfying the second statement in the proposition is known as noncounting, so the proposition can be restated as: a language is star-free iff it is noncounting.

As a result of this fact, we see that languages such as {(ab)2}* is not star-free, although it is regularPlanetmathPlanetmath. Indeed, if we pick u=v=w=ab as in the statement of the proposition above, we see that uvnw is in the language iff uvn+1w is not in the language, for any n0. Therefore, the second inclusion in chain (1) above is also strict.

The above proposition also strengthens chain (1): denote by 𝒯() the family of locally testable languages, then

𝒯()𝒮. (2)

The first inclusion is due to the fact that, for any k-testable language L (over some Σ), we have

swk(uvkw)=swk(uvk+1w)

(the definition of swk(u) is found in the entry on locally testable languages). Note the first inclusion is also strict. For example, the language represented by (ab)*(ba)* is star-free but not locally testable.

References

  • 1 A. Ginzburg, Algebraic Theory of Automata, Academic Press (1968).
  • 2 A. Salomaa, Formal LanguagesMathworldPlanetmath, Academic Press, New York (1973).
Title star-free
Canonical name Starfree
Date of creation 2013-03-22 18:59:05
Last modified on 2013-03-22 18:59:05
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 16
Author CWoo (3771)
Entry type Definition
Classification msc 68Q42
Classification msc 68Q45
Synonym star free
Synonym non-counting
Synonym noncounting
Related topic StarHeight