LangStudies/App/RegM/Lectures/Lecture.2.Notes.md

75 lines
1.7 KiB
Markdown

# Symbols, alphabets, and langauges and Regular Grammars
Alphabet : A set of characters.
Sigma = { a, b }
Langauge : A set of strings over a particular alphabet.
L1(Sigma) = { a, aa, b, ab, ba, bba, .. } (Infinite)
L2(Sigma) = { aa, bb, ab, ba }; (Length = 2, Finite)
Any time you constrain a langauge you are
defining a formal grammar.
## Formal Grammars:
FormalGrammer = (Non-Terminals, Terminals, Productions, Starting Symbol)
Non-Terminals : Variables (can be subsituted with a value)
Terminals : Cannot be replaced by anything (constant)
Productions : Rule in the grammar
**G = (N, T, P, S)**
Ex:
```
S -> aX
X -> b
```
**(This notation is known as BNF : Bakus-Naur Form)**
Ex.Non-Terminals = S, X
Ex.Terminals = a, b
Ex.Productions = S -> aX, X -> b (2)
Ex.Starting Symbol = S
Only valid string : "ab"
## Chomsky Hierachy :
0. Unrestricted : Natural Langauges, Turing Machines
1. Context-Sensitive : Programming Languages (Almost all in production)
2. Context-Free : Programming Langauges (Parsing Syntax only)
3. Regular : Regular Expressions
The lower in the hiearchy the less expressive it is.
RegExp is a vomit inducing terse notation that is equivalent to BNF.
BNF : RegExp
S -> aS :
S -> bA : `a*bc*`
A -> epsilon :
A -> cA :
epsilon : "The empty string".
Regular expressions may only have one non-terminal:
* A the very right side (right-linear, RHS)
* At the very left side (left-linear, LHS)
Regular expression have no support for *NESTING*
They can be *RECURSIVE*
Context-free grammers support nesting.
Ex:
(( () ))
`Parenthesis balacing`
Non-regular RegExp can support nesting but are not pure
finite automata and are slower implementation.