{"id":1357,"date":"2011-03-29T21:51:35","date_gmt":"2011-03-30T01:51:35","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/?p=1357"},"modified":"2011-03-29T21:51:35","modified_gmt":"2011-03-30T01:51:35","slug":"regular-languages","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2011\/03\/29\/regular-languages\/","title":{"rendered":"Regular Languages"},"content":{"rendered":"<p> The simplest family of languages &#8211; and correspondingly, the simplest kind of computing machine &#8211; deal with <em>regular languages<\/em>. A regular language is <em>very<\/em> limited: there&#8217;s no counting, no deep patterns. They really don&#8217;t have much in the way of computational power. But they&#8217;re still very useful &#8211; the ubiquitous regular expression libraries in most programming languages are just easy ways of using regular languages to decompose simple patterned inputs.<\/p>\n<p> The regular languages can be described in three ways: as <em>regular grammars<\/em>; as regular expressions; or as a kind of computing machine called a <em>finite state machine<\/em> or finite state automaton.<\/p>\n<p><!--more--><\/p>\n<p> We&#8217;ll start with the machines. For each regular language, you can define a simple finite state machine that will determine whether or not a particular string is in the language. In a finite state machine, the machine has a small piece of state &#8211; a single atomic value, called <em>the<\/em> state, and it&#8217;s allowed to look at exactly one character of input. It&#8217;s not allowed to sneak a peak ahead; it&#8217;s not allowed to look at any previous characters to see how it got into its current state.<\/p>\n<p> Let&#8217;s get precise. A FSM can be described by a tuple:  <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28Sigma%2C%20S%2C%20i%2C%20f%2C%20t%29&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(Sigma, S, i, f, t)' style='vertical-align:1%' class='tex' alt='(Sigma, S, i, f, t)' \/>, where:<\/p>\n<ul>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=Sigma&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='Sigma' style='vertical-align:1%' class='tex' alt='Sigma' \/> is a set of symbols, called an <em>alphabet<\/em>;<\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S' style='vertical-align:1%' class='tex' alt='S' \/> is a set of <em>states<\/em>;<\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=i%20in%20S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='i in S' style='vertical-align:1%' class='tex' alt='i in S' \/> is a special distinguished state called the <em>start-state<\/em> or <em>initial state<\/em>;<\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=f%20subseteq%20S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='f subseteq S' style='vertical-align:1%' class='tex' alt='f subseteq S' \/> is a subset of the states called the <em>final states<\/em>; and<\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=t&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='t' style='vertical-align:1%' class='tex' alt='t' \/> is a <em>transition relation<\/em> which maps pairs of\t\tmachine state and input symbol to target states. This relation\tdefines how the machine actually works: if there is a relation <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28a%2C%20x%29%20rightarrow%20b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(a, x) rightarrow b' style='vertical-align:1%' class='tex' alt='(a, x) rightarrow b' \/>, that means that when the machine is in state <img src='http:\/\/l.wordpress.com\/latex.php?latex=a&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='a' style='vertical-align:1%' class='tex' alt='a' \/>, and it sees an input symbol <img src='http:\/\/l.wordpress.com\/latex.php?latex=x&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='x' style='vertical-align:1%' class='tex' alt='x' \/>, it will transition to state <img src='http:\/\/l.wordpress.com\/latex.php?latex=b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='b' style='vertical-align:1%' class='tex' alt='b' \/>.<\/li>\n<\/ul>\n<p> The machine starts to look at an input string in state <img src='http:\/\/l.wordpress.com\/latex.php?latex=i&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='i' style='vertical-align:1%' class='tex' alt='i' \/>. For each symbol in the input string in sequence, it performs a single transition, <em>consuming<\/em> that input symbol. When it&#8217;s consumed every symbol in the input string, if it&#8217;s in a state that&#8217;s part of the set <img src='http:\/\/l.wordpress.com\/latex.php?latex=f&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='f' style='vertical-align:1%' class='tex' alt='f' \/>, then it <em>accepts<\/em> the string.<\/p>\n<p> For example, we can create a machine which accepts strings that consist of any string containing at least one a, followed by at least one b. In tuple form, that would be (<img src='http:\/\/l.wordpress.com\/latex.php?latex=Sigma%3D%7Ba%2Cb%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='Sigma={a,b}' style='vertical-align:1%' class='tex' alt='Sigma={a,b}' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=S%3D%7BS_%7B0%7D%2C%20S_a%2C%20S_b%2C%20S_%7Berr%7D%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S={S_{0}, S_a, S_b, S_{err}}' style='vertical-align:1%' class='tex' alt='S={S_{0}, S_a, S_b, S_{err}}' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=i%3DS_0&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='i=S_0' style='vertical-align:1%' class='tex' alt='i=S_0' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=f%3D%7B%20S_a%2C%20S_b%20%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='f={ S_a, S_b }' style='vertical-align:1%' class='tex' alt='f={ S_a, S_b }' \/>, t = { <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_0%2C%20a%29rightarrow%20S_a&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_0, a)rightarrow S_a' style='vertical-align:1%' class='tex' alt='(S_0, a)rightarrow S_a' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_0%2C%20b%29%20rightarrow%20S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_0, b) rightarrow S_{err}' style='vertical-align:1%' class='tex' alt='(S_0, b) rightarrow S_{err}' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_a%2C%20a%29rightarrow%20S_a&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_a, a)rightarrow S_a' style='vertical-align:1%' class='tex' alt='(S_a, a)rightarrow S_a' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_a%2C%20b%29rightarrow%20b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_a, b)rightarrow b' style='vertical-align:1%' class='tex' alt='(S_a, b)rightarrow b' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_b%2C%20a%29rightarrow%20S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_b, a)rightarrow S_{err}' style='vertical-align:1%' class='tex' alt='(S_b, a)rightarrow S_{err}' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_b%2Cb%29rightarrow%20S_b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_b,b)rightarrow S_b' style='vertical-align:1%' class='tex' alt='(S_b,b)rightarrow S_b' \/>, <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_%7Berr%7D%2C%20a%29rightarrow%20S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_{err}, a)rightarrow S_{err}' style='vertical-align:1%' class='tex' alt='(S_{err}, a)rightarrow S_{err}' \/>,  <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28S_%7Berr%7D%2C%20b%29%20rightarrow%20S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(S_{err}, b) rightarrow S_{err}' style='vertical-align:1%' class='tex' alt='(S_{err}, b) rightarrow S_{err}' \/> }). In graphic form:<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/scientopia.org\/blogs\/goodmath\/files\/2011\/03\/fsm-ab.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/scientopia.org\/blogs\/goodmath\/files\/2011\/03\/fsm-ab.png?resize=351%2C218\" alt=\"\" title=\"fsm-ab\" width=\"351\" height=\"218\" class=\"inset size-full wp-image-1364\" srcset=\"https:\/\/i0.wp.com\/www.goodmath.org\/blog\/wp-content\/uploads\/2011\/03\/fsm-ab.png?w=351 351w, https:\/\/i0.wp.com\/www.goodmath.org\/blog\/wp-content\/uploads\/2011\/03\/fsm-ab.png?resize=300%2C186 300w\" sizes=\"auto, (max-width: 351px) 100vw, 351px\" \/><\/a><\/p>\n<p> In the picture, the initial state is marked with a big blue arrow, and the final states are outlined in green.<\/p>\n<p> Let&#8217;s step through a couple of examples.<\/p>\n<ol>\n<li> Suppose the input is &#8220;aaabb&#8221;. The machine starts in state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_0&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_0' style='vertical-align:1%' class='tex' alt='S_0' \/>. It consumes the first &#8220;a&#8221;, and follows the transition labelled &#8220;a&#8221; into state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_a&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_a' style='vertical-align:1%' class='tex' alt='S_a' \/>. The remaining input is &#8220;aabb&#8221;. It follows the transition from state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_a&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_a' style='vertical-align:1%' class='tex' alt='S_a' \/> labeled &#8220;a&#8221;, which again leads to state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_a&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_a' style='vertical-align:1%' class='tex' alt='S_a' \/>,         leaving the remaining input a &#8220;abb&#8221;. It consumes another &#8220;a&#8221;, doing the same thing, leaving the remaining input &#8220;bb&#8221;. It consumes a &#8220;b&#8221;,  following the transition labelled &#8220;b&#8221; to state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_b' style='vertical-align:1%' class='tex' alt='S_b' \/>, and leaving the remaining\tinput as just &#8220;b&#8221;. Finally, it does the transition labeled <img src='http:\/\/l.wordpress.com\/latex.php?latex=b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='b' style='vertical-align:1%' class='tex' alt='b' \/> from state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_b' style='vertical-align:1%' class='tex' alt='S_b' \/>, and again goes to state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_b' style='vertical-align:1%' class='tex' alt='S_b' \/>. The input is gone, and it&#8217;s in state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_b' style='vertical-align:1%' class='tex' alt='S_b' \/>. Since <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_b' style='vertical-align:1%' class='tex' alt='S_b' \/> is one of the final states, the machine accepts the string.<\/li>\n<li> Suppose the input is &#8220;baab&#8221;. The machine starts in <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_0&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_0' style='vertical-align:1%' class='tex' alt='S_0' \/>. It\tconsumes the first <img src='http:\/\/l.wordpress.com\/latex.php?latex=b&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='b' style='vertical-align:1%' class='tex' alt='b' \/>, which moves it to state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_{err}' style='vertical-align:1%' class='tex' alt='S_{err}' \/>. The\trest of the characters get consumed one at a time, but all of the  transitions from <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_{err}' style='vertical-align:1%' class='tex' alt='S_{err}' \/> in this machine return to <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_{err}' style='vertical-align:1%' class='tex' alt='S_{err}' \/>: it&#8217;s a dead end. Once it&#8217;s consumed all of the characters, it will \t\tstill be in <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_%7Berr%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_{err}' style='vertical-align:1%' class='tex' alt='S_{err}' \/>, so it won&#8217;t accept the string.<\/li>\n<li> Suppose the input is the empty string. The machine starts in <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_0&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_0' style='vertical-align:1%' class='tex' alt='S_0' \/>,\t\tand never runs any transitions at all, so it ends in state <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_0&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_0' style='vertical-align:1%' class='tex' alt='S_0' \/>.  Since <img src='http:\/\/l.wordpress.com\/latex.php?latex=S_0&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S_0' style='vertical-align:1%' class='tex' alt='S_0' \/> isn&#8217;t a final state, it doesn&#8217;t accept the empty  string.<\/li>\n<\/ol>\n<p> When you look at this, it&#8217;s really a trivial machine. It seems like there&#8217;s very little that it can do. And yet, at least in theory, any computation that can be performed with a fixed, finite amount of state can be implemented with a machine this simple. Everything that I can do on the computer that I&#8217;m typing this on is doable using a finite state machine. It would have a huge number of states, and an insanely complex state transition function, but it would be doable.<\/p>\n<p> Let&#8217;s move on to grammars. As I said in the last post, each of the Chomsky grammar levels can be characterized by some way of constraining the grammar. There are two versions of regular grammars, called <em>right regular<\/em> and <em>left regular<\/em> grammars. The idea of both is  similar; we&#8217;ll use right-regular as an example.<\/p>\n<ul>\n<li> there is exactly one non-terminal on the left-hand side of the \trule;<\/li>\n<li> There can <em>at most<\/em> one non-terminal symbol on the right-hand side of the rule;<\/li>\n<li> There can be <em>at most<\/em> one terminal symbol on the right-hand side of the rule<\/li>\n<li> If there&#8217;s both a non-terminal and a terminal symbol on the right-hand side of the rule, then the terminal will come <em>before<\/em> the non-terminal.<\/li>\n<\/ul>\n<p> So, for example, the following are all valid right-regular grammar rules.<em>(I&#8217;m writing terminals as boldface, and non-terminals as italics.)<\/em><\/p>\n<ul>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%7Bem%20a%7D%20rightarrow%20%7Bbf%20Y%7D%20%7Bem%20b%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='{em a} rightarrow {bf Y} {em b}' style='vertical-align:1%' class='tex' alt='{em a} rightarrow {bf Y} {em b}' \/><\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%7Bem%20b%7D%20rightarrow%20%7Bbf%20X%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='{em b} rightarrow {bf X}' style='vertical-align:1%' class='tex' alt='{em b} rightarrow {bf X}' \/><\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%7Bem%20c%7D%20rightarrow%20%7Bbf%20Z%7D%20%7Bem%20c%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='{em c} rightarrow {bf Z} {em c}' style='vertical-align:1%' class='tex' alt='{em c} rightarrow {bf Z} {em c}' \/><\/li>\n<\/ul>\n<p> But the following are <em>not<\/em> right regular:<\/p>\n<ul>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%7Bem%20a%7D%20rightarrow%20%7Bbf%20X%7D%20%7Bem%20b%7D%20%7Bbf%20X%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='{em a} rightarrow {bf X} {em b} {bf X}' style='vertical-align:1%' class='tex' alt='{em a} rightarrow {bf X} {em b} {bf X}' \/>. (The non-terminal is in the middle of two terminals, not at the end.)<\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%7Bem%20b%7D%20rightarrow%20%7Bbf%20a%7D%20%7Bbf%20W%7D%20%7Bbf%20X%7DB&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='{em b} rightarrow {bf a} {bf W} {bf X}B' style='vertical-align:1%' class='tex' alt='{em b} rightarrow {bf a} {bf W} {bf X}B' \/>. (The non-terminal\tis not at the end.)<\/li>\n<li> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%7Bem%20c%7D%20rightarrow%20%7Bbf%20W%7D%20%7Bbf%20X%7D%20%7Bbf%20W%7D%20%7Bbf%20Z%7D%20%7Bem%20c%7D%20%7Bem%20d%7D&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='{em c} rightarrow {bf W} {bf X} {bf W} {bf Z} {em c} {em d}' style='vertical-align:1%' class='tex' alt='{em c} rightarrow {bf W} {bf X} {bf W} {bf Z} {em c} {em d}' \/><\/li>\n<\/ul>\n<p> There&#8217;s a clear connection between the structure of a regular grammar, and a finite state machine. If you think of each non-terminal symbol as a state, each terminal symbol as an the label on a transition between states, and final states as states reached by a rule without a non-terminal on the right-hand side &#8211; you&#8217;ve pretty much got the a finite state machine. They&#8217;re really similar, and you can see how the regular grammar restrictions constrain the grammar to exactly the same languages that you can recognize with a finite state machine.<\/p>\n<p> FSMs are great, but if you want to describe a regular language, you probably don&#8217;t want to write one down. And grammars aren&#8217;t the easiest way to write down a regular language: for something so simple, they&#8217;re very verbose and not particularly easy to read. That&#8217;s why we have a third way of describing regular grammars: regular expressions. Regular expressions are compact, easy to read, and there&#8217;s a pretty clear correspondence between a regular expression and the corresponding regular grammar.<\/p>\n<p> A regular expression is:<\/p>\n<dl>\n<dt>A single character, e.g., &#8220;a&#8221;<\/dt>\n<dd> This is a literal match: the language matched by a single character  regular expression is exactly that character.<\/dd>\n<dt>Concatenation<\/dt>\n<dd> If <img src='http:\/\/l.wordpress.com\/latex.php?latex=R&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R' style='vertical-align:1%' class='tex' alt='R' \/> and <img src='http:\/\/l.wordpress.com\/latex.php?latex=S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S' style='vertical-align:1%' class='tex' alt='S' \/> are regular expressions, then <img src='http:\/\/l.wordpress.com\/latex.php?latex=RS&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='RS' style='vertical-align:1%' class='tex' alt='RS' \/> is a regular expression. The language matched by <img src='http:\/\/l.wordpress.com\/latex.php?latex=RS&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='RS' style='vertical-align:1%' class='tex' alt='RS' \/> is the concatenation of a string matched by <img src='http:\/\/l.wordpress.com\/latex.php?latex=R&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R' style='vertical-align:1%' class='tex' alt='R' \/> followed by a string matched by <img src='http:\/\/l.wordpress.com\/latex.php?latex=S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S' style='vertical-align:1%' class='tex' alt='S' \/>.<\/dd>\n<dt>Alternation<\/dt>\n<dd> Alternation describes a choice. If <img src='http:\/\/l.wordpress.com\/latex.php?latex=R&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R' style='vertical-align:1%' class='tex' alt='R' \/> and <img src='http:\/\/l.wordpress.com\/latex.php?latex=S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S' style='vertical-align:1%' class='tex' alt='S' \/> are regular expressions, then <img src='http:\/\/l.wordpress.com\/latex.php?latex=R%20%7C%20S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R | S' style='vertical-align:1%' class='tex' alt='R | S' \/> is a regular expression, which matches\tanything matched by either <img src='http:\/\/l.wordpress.com\/latex.php?latex=R&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R' style='vertical-align:1%' class='tex' alt='R' \/> or <img src='http:\/\/l.wordpress.com\/latex.php?latex=S&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='S' style='vertical-align:1%' class='tex' alt='S' \/>.<\/dd>\n<dt>Repetition (aka Kleene closure)<\/dt>\n<dd> If <img src='http:\/\/l.wordpress.com\/latex.php?latex=R&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R' style='vertical-align:1%' class='tex' alt='R' \/> is a regular expression, then <img src='http:\/\/l.wordpress.com\/latex.php?latex=R%5E%2A&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R^*' style='vertical-align:1%' class='tex' alt='R^*' \/> is a regular expression. <img src='http:\/\/l.wordpress.com\/latex.php?latex=R%5E%2A&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R^*' style='vertical-align:1%' class='tex' alt='R^*' \/> matches any sequence of zero or more strings matched by <img src='http:\/\/l.wordpress.com\/latex.php?latex=R&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='R' style='vertical-align:1%' class='tex' alt='R' \/> concatenated together. <\/dd>\n<\/dl>\n<p> You can also use grouping, written using parens, so that you can have alternation between concatenated groups.<\/p>\n<p> A few examples of regular expressions:<\/p>\n<dl>\n<dt> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28a%7Cb%7Cc%7Cd%29%5E%2A&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(a|b|c|d)^*' style='vertical-align:1%' class='tex' alt='(a|b|c|d)^*' \/><\/dt>\n<dd> any string of any length made from the characters &#8220;a&#8221;, &#8220;b&#8221;, &#8220;c&#8221;, and  &#8220;d&#8221;: &#8220;abcd&#8221;, &#8220;aaaa&#8221;, &#8220;ab&#8221;, &#8220;dabcbad&#8221;, etc.<\/dd>\n<dt> <img src='http:\/\/l.wordpress.com\/latex.php?latex=%28a%7Cb%29%5E%2A%28c%7Cd%29%5E%2A&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='(a|b)^*(c|d)^*' style='vertical-align:1%' class='tex' alt='(a|b)^*(c|d)^*' \/><\/dt>\n<dd> a string of any length made of &#8220;a&#8221;s and &#8220;b&#8221;s, followed by a string of  any length of &#8220;c&#8221;s and &#8220;d&#8221;s. &#8220;ababbbacdcdcccc&#8221;, &#8220;ab&#8221;, &#8220;b&#8221;, &#8220;c&#8221;, &#8220;cdcdddcc&#8221;,  &#8220;ddcc&#8221;.<\/dd>\n<dt> <img src='http:\/\/l.wordpress.com\/latex.php?latex=ab%28ab%29%5E%2A%28c%7Cd%29%2A&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='ab(ab)^*(c|d)*' style='vertical-align:1%' class='tex' alt='ab(ab)^*(c|d)*' \/><\/dt>\n<dd> a string of any number of repetitions of &#8220;ab&#8221;, followed by any number of &#8220;c&#8221;s and &#8220;d&#8221;s. &#8220;ababababcccd&#8221;, &#8220;abcc&#8221;,  &#8220;ab&#8221;, etc.<\/dd>\n<dt><img src='http:\/\/l.wordpress.com\/latex.php?latex=aa%5E%2Ab%5E%2A%28cd%29%5E%2A%28e%7Cf%29&#038;bg=FFFFFF&#038;fg=000000&#038;s=0' title='aa^*b^*(cd)^*(e|f)' style='vertical-align:1%' class='tex' alt='aa^*b^*(cd)^*(e|f)' \/><\/dt>\n<dd> strings consisting of at least one A; followed any number of &#8220;b&#8221;s (including zero); followed by any number of repetitions of &#8220;cd&#8221;; followed by either a single &#8220;e&#8221; or a single &#8220;f&#8221;.<\/dd>\n<\/dl>\n","protected":false},"excerpt":{"rendered":"<p>The simplest family of languages &#8211; and correspondingly, the simplest kind of computing machine &#8211; deal with regular languages. A regular language is very limited: there&#8217;s no counting, no deep patterns. They really don&#8217;t have much in the way of computational power. But they&#8217;re still very useful &#8211; the ubiquitous regular expression libraries in most [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[79],"tags":[],"class_list":["post-1357","post","type-post","status-publish","format-standard","hentry","category-computation"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-lT","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/1357","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=1357"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/1357\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=1357"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=1357"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=1357"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}