What are Extended Regular Expressions? How Are They Helpful?

Extended Regular Expressions

Extended Regular Expressions


This article is a section to the more extensive guide, The Ultimate Guide To Computational Computer Science Theory.

Regular Expressions are very basic. Suppose you want to create something like a string or an alphabet; you would have to have very many Regular Expressions connected throughout. The idea behind an Extended Regular Expression is that you are breaking a common complex expression down into many Regular Expressions.

Extended Regular Expressions are generally more of an implementation method compared to a theory method. That being said, it does exist in computational theory as well; you just have to define what you are doing to your reader in advance.

Strings

Strings are an Extended Regular Expression. The idea of having to concatenate a string of n length doesn’t seem difficult even on paper since most people just write the letters next to one another. But the moment we move to program a solution, it will become an immensely more daunting task.

If we wanted to implement a programming solution to strings, we would see something like the following.

function build-ere-string(string):
    if string.length == 1:
        return build-re-sym(string[0])
    else:
        return build-re-concat(build-re-sym(string[0]), build-ere-string(string[1:]))

Alphabets

Alphabets can be used in many ways. We typically will want to define a whole language, but sometimes we may want to differentiate a sequence of characters from a sequence of numbers. In computational theory, it’s not unheard of to throw the sigma into a regular expression either.

Programmatically again, this can be brutal to type out since it can be 26+ or statements. So we would implement the code below to allow us an easier time.

function build-ere-alphabet(symbols):
    if symbols.length == 1:
        return build-re-sym(symbols[0])
    else:
        return build-re-concat(build-re-sym(symbols[0]), build-ere-string(symbols[1:]))

Exclude

Exclusion is often a helpful thing to have. If we wanted all the alphabet except for an “a,” this would involve making a whole new alphabet. On paper, it’s quite the task, so we usually use something like the following expression.

$$\Sigma – \{a\}$$

In programming, we have something very similar to the alphabet, just with another parameter that is our excluded symbols to skip.

function build-ere-alphabet(symbols, exclude):
    if symbols[0] in exclude:
        return build-ere-alphabet(symbols[1:], exclude)
    if symbols.length == 1:
        return build-re-sym(symbols[0])
    else:
        return build-re-concat(build-re-sym(symbols[0]), build-ere-string(symbols[1:]))

Optional

Optional allows you to have either one or none of the regular expressions. This is pretty simple as far as theory goes; you simply or your regular expression with epsilon.

Implementing an optional in your regular expressions in programming tends to come in handy from time to time. Here is another implementation to try.

function build-ere-optional(regex):
    return build-re-or(regex, build-re-epsilon())

Plus / At Least One

Plus, or At Least one goes by both names depending on who you are talking to. The basic idea is that you have to have at least one of the regular expressions, but once you have the one, you can have infinite repetitions. This is simply the regular expression concatenated with a Kleene star of the regular expression.

function build-ere-plus(regex):
    return build-re-concat(regex, build-re-star(regex))

Books

Here is a list of books that taught me a great deal about Computational Theory. If there are any books that you have benefited from, feel free to share them in the comments and help grow this list!