# Regular Expressions The family of **regular languages** *(also called or type $3$)* is the simplest language family. - Regular expressions can be ambiguous if there are multiple structurally different derivations that result in the same sentence. Sufficient conditions for ambiguity exist. - Regular expressions can be extended with operators like union, concatenation, and star. - A language on an alphabet is regular if it can be defined by a regular expression. So only if it is defined by concatenation, union and star over the elementary languages of $L$ ($\{a_1\} , \{a_2\} , \{a_3\} \dots$). - Regular expressions have limits, as they cannot represent certain languages, such as those with unbalanced nesting or varying numbers of elements. To represent these languages, **generative grammars** must be used instead. ## Regex Regex basics: - ```x``` the ```x``` character - ```.``` any character except newline - ```[xyz]``` means ```x``` or ```y``` or ```z``` - ```[a-z]``` any character between ```a``` and ```z``` - ```[^a-z]``` any character except those between ```a``` and ```z``` Said ```R``` a regular expression: - ```RS``` concatenation of ```R``` and ```S``` - ```R|S``` either ```R``` or ```S``` - ```R*``` zero or more occurrences of ```R``` - ```R+``` one or more occurrences of ```R``` - ```R?``` zero or one occurrence of ```R ``` - ```R{m,n}``` a number or ```R``` occurrences ranging from ```n``` to ```m ``` - ```R{n,}``` at least n occurrences - ```R{n}``` exactly n occurrences of ```R```