mirror of
				https://github.com/Ed94/LangStudies.git
				synced 2025-10-31 15:00:56 -07:00 
			
		
		
		
	RegEx : Completed lectures 1-7. NOT TESTED.
		
							
								
								
									
										35
									
								
								App/RegM/Lectures/Lecture.1.Notes.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,35 @@ | ||||
| # Automata Theory: Building a RegExp machine | ||||
|  | ||||
| ## Content: | ||||
| State Machines | ||||
| Formal Grammars | ||||
| Implement a regular expression processor | ||||
|  | ||||
| ## History: | ||||
|  | ||||
| *Pioneers:* | ||||
|  | ||||
| 1951 - Stephen Kleene introduced regular expressions (regular sets). | ||||
|  | ||||
| Regular Language : A language recognized by a finite automaton (state machine). | ||||
| Kleene's Theorem : Equivalence of regular expressions and finite automata. | ||||
|  | ||||
| Has a notation named after him: | ||||
| Kleene-Closure (AKA: Kleene star) : A* (Stands for zero or more repetitions of A) | ||||
|  | ||||
| 1956 - Chomsky defines his hierarchy of grammars | ||||
|  | ||||
| Regular grammars are considered type 3. | ||||
| See: https://en.wikipedia.org/wiki/Chomsky_hierarchy | ||||
|  | ||||
|  | ||||
|  | ||||
| Thus they are the weakest form of grammars. | ||||
|  | ||||
| 1968 - Ken Thompson used them for pattern matching in strings, and | ||||
| lexical analysis (scanners) | ||||
|  | ||||
| NFA - Thompson construction | ||||
|  | ||||
|  | ||||
|  | ||||
							
								
								
									
										74
									
								
								App/RegM/Lectures/Lecture.2.Notes.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,74 @@ | ||||
| # Symbols, Alphabets, Languages, and Regular Grammars | ||||
|  | ||||
| Alphabet : A set of characters. | ||||
|  | ||||
| Sigma = { a, b } | ||||
|  | ||||
| Language : A set of strings over a particular alphabet. | ||||
|  | ||||
| L1(Sigma) = { a, aa, b, ab, ba, bba, ... } (Infinite) | ||||
| L2(Sigma) = { aa, bb, ab, ba } (Length = 2, Finite) | ||||
|  | ||||
| Any time you constrain a language you are | ||||
| defining a formal grammar. | ||||
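
A minimal Python sketch of the alphabet and language definitions above. The names `SIGMA`, `strings_of_length` and `all_nonempty_strings` are illustrative, not from the lectures, and L1 is assumed to mean "every non-empty string over Sigma". The finite language L2 can be built outright, while an infinite language can only be sampled lazily:

```python
from itertools import count, islice, product

SIGMA = ("a", "b")  # the alphabet from the notes

def strings_of_length(n, alphabet=SIGMA):
    """All strings of exactly n symbols over the alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=n)]

def all_nonempty_strings(alphabet=SIGMA):
    """Lazily yield every non-empty string, shortest first (never ends)."""
    for n in count(1):
        yield from strings_of_length(n, alphabet)

# Finite language: constrained to length 2.
L2 = set(strings_of_length(2))

# Infinite language: only a prefix of it can ever be materialized.
L1_sample = list(islice(all_nonempty_strings(), 6))
```

The "length = 2" constraint is what turns the infinite set into a finite one, which is the sense in which constraining a language defines a grammar.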
|  | ||||
| ## Formal Grammars: | ||||
|  | ||||
| FormalGrammar = (Non-Terminals, Terminals, Productions, Starting Symbol) | ||||
|  | ||||
| Non-Terminals : Variables (can be substituted with a value) | ||||
| Terminals     : Cannot be replaced by anything (constant) | ||||
| Productions   : Rules of the grammar | ||||
|  | ||||
| **G = (N, T, P, S)** | ||||
|  | ||||
| Ex: | ||||
| ``` | ||||
| S -> aX | ||||
| X -> b | ||||
| ``` | ||||
| **(This notation is known as BNF : Backus-Naur Form)** | ||||
|  | ||||
| Ex.Non-Terminals   = S, X | ||||
| Ex.Terminals       = a, b | ||||
| Ex.Productions     = S -> aX, X -> b (2) | ||||
| Ex.Starting Symbol = S | ||||
|  | ||||
| Only valid string : "ab" | ||||
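
The example grammar can be written down as the tuple G = (N, T, P, S) and expanded mechanically. This is a rough sketch only: `derive_all` is a hypothetical helper, and it assumes every symbol is a single character.

```python
from collections import deque

# G = (N, T, P, S) for the example grammar: S -> aX, X -> b
N = {"S", "X"}
T = {"a", "b"}
P = {"S": ["aX"], "X": ["b"]}
S = "S"

def derive_all(productions, start, nonterminals, limit=100):
    """Breadth-first expansion of the leftmost non-terminal.

    Returns the terminal-only strings found within `limit` steps.
    """
    results, queue, steps = set(), deque([start]), 0
    while queue and steps < limit:
        form = queue.popleft()
        steps += 1
        idx = next((i for i, sym in enumerate(form) if sym in nonterminals), None)
        if idx is None:  # no non-terminals left: a finished string
            results.add(form)
            continue
        for body in productions[form[idx]]:
            queue.append(form[:idx] + body + form[idx + 1:])
    return results

language = derive_all(P, S, N)
```

The derivation is S, then aX, then ab, and nothing else, so `language` holds the single string the notes call out.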
|  | ||||
| ## Chomsky Hierarchy : | ||||
|  | ||||
| 0. Unrestricted      : Natural Languages, Turing Machines | ||||
| 1. Context-Sensitive : Programming Languages (Almost all in production) | ||||
| 2. Context-Free      : Programming Languages (Parsing Syntax only) | ||||
| 3. Regular           : Regular Expressions | ||||
|  | ||||
| The further down this list (the higher the type number), the less expressive the grammar is. | ||||
|  | ||||
| RegExp is a vomit inducing terse notation that is equivalent to a regular (type 3) grammar written in BNF. | ||||
|  | ||||
| BNF          : RegExp | ||||
| S -> aS      :  | ||||
| S -> bA      : `a*bc*` | ||||
| A -> epsilon : | ||||
| A -> cA      : | ||||
|  | ||||
| epsilon : "The empty string". | ||||
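
One way to convince yourself the table holds is to run both sides against the same inputs. The sketch below encodes the four productions in a hypothetical `RULES` dict (the encoding is an assumption made for illustration) and compares the result with Python's `re.fullmatch` on every short string over {a, b, c}:

```python
import re
from itertools import product

# Right-linear productions from the table: (terminal, next non-terminal).
# An empty terminal with next None encodes A -> epsilon.
RULES = {
    "S": [("a", "S"), ("b", "A")],
    "A": [("", None), ("c", "A")],
}

def grammar_accepts(text, symbol="S"):
    """True if `text` can be derived from `symbol` using RULES."""
    for terminal, nxt in RULES[symbol]:
        if nxt is None:
            if text == terminal:  # epsilon production: nothing may remain
                return True
        elif terminal and text.startswith(terminal):
            if grammar_accepts(text[len(terminal):], nxt):
                return True
    return False

PATTERN = re.compile(r"a*bc*")

# Compare grammar and RegExp on every string over {a, b, c} up to length 5.
candidates = ["".join(p) for n in range(6) for p in product("abc", repeat=n)]
agree = all(grammar_accepts(s) == bool(PATTERN.fullmatch(s)) for s in candidates)
```

An exhaustive check up to a fixed length is evidence, not a proof, but it catches most transcription mistakes in either notation.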
|  | ||||
| Regular grammars may only have one non-terminal in the body of a production: | ||||
| * At the very right side (right-linear, RHS) | ||||
| * At the very left side (left-linear, LHS) | ||||
|  | ||||
| Regular expressions have no support for *NESTING* | ||||
| They can be *RECURSIVE* | ||||
|  | ||||
| Context-free grammars support nesting. | ||||
| Ex: | ||||
| (( () )) | ||||
| `Parenthesis balancing` | ||||
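
A small sketch of why balancing is out of reach for a finite automaton: the checker below (a hypothetical `balanced` function, not from the lectures) needs a depth counter that can grow without bound, while a finite automaton only has a fixed number of states to remember with.

```python
def balanced(text):
    """Check parenthesis balancing with a depth counter.

    The counter is unbounded, which is exactly what a machine with a
    fixed, finite set of states cannot represent. Characters other
    than parentheses are ignored.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing parenthesis with nothing open
                return False
    return depth == 0
```

Capping `depth` at some maximum would make this a finite state machine again, but then it would reject valid input nested deeper than the cap.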
|  | ||||
| Non-regular RegExp can support nesting, but they are not pure | ||||
| finite automata and are slower to run. | ||||
|  | ||||
|  | ||||
|  | ||||
							
								
								
									
										85
									
								
								App/RegM/Lectures/Lecture.3.Notes.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,85 @@ | ||||
| # Finite Automata | ||||
| ***(AKA: Finite State Machine)*** | ||||
|  | ||||
| The mechanism and abstraction behind regular grammars. | ||||
|  | ||||
| Usually drawn as a graph, with states as nodes and transitions as edges. | ||||
|  | ||||
| Regular grammar: | ||||
| ``` | ||||
| S -> bA | ||||
| A -> epsilon | ||||
| ``` | ||||
| Equivalent to: `/b/` | ||||
|  | ||||
| State transition: | ||||
|  | ||||
| --label--> : Transition symbol  | ||||
| O          : State Symbol | ||||
| (o)        : Accepting State | ||||
| ->O.Start  : Starting State (State transition to Start) | ||||
|  | ||||
| Ex: | ||||
|  | ||||
| ->O.*Start* --*transition*--> (o).*Accepting* | ||||
|  | ||||
| *ε* - Epsilon (Empty String) | ||||
| `I will be spelling it out as I do not enjoy single glyph representation` | ||||
|  | ||||
| Two main types of Finite Automata : | ||||
|  | ||||
| FA w/ output | ||||
| * Moore machine | ||||
| * Mealy machine | ||||
|  | ||||
| FA w/o output | ||||
| * DFA - Deterministic | ||||
| * NFA - Non-deterministic | ||||
| * epsilon-NFA - (Epsilon Transition) special case | ||||
|  | ||||
| NFA : Non-deterministic FA - Allows transitions on the same symbol to | ||||
| different states | ||||
|  | ||||
| ``` | ||||
| 	a->o | ||||
|    / | ||||
| ->o.1---b-->o | ||||
|    \ | ||||
| 	a->o  | ||||
| ``` | ||||
|  | ||||
| epsilon-NFA : Extension of NFA that allows *epsilon* transitions | ||||
|  | ||||
| ``` | ||||
| 	a--->o---epsi--->(o) | ||||
|    /		 	    / | ||||
| ->o----b-->epsi--->o | ||||
|    \ | ||||
| 	a-->o--epsi-->(o) | ||||
| ``` | ||||
|  | ||||
| DFA : A state machine which forbids multiple transitions on the same symbol, and *epsilon* transitions | ||||
|  | ||||
| ``` | ||||
| 	a--->o | ||||
|    / | ||||
| ->o----b-->o | ||||
| ``` | ||||
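
Because a DFA has at most one transition per (state, symbol) pair, running it is a single table lookup per input character. The diagram above does not mark accepting states, so this sketch uses a different machine: a hand-written DFA for `a*bc*`. The state names `S` and `A` are an assumption borrowed from the grammar in the previous lecture.

```python
# (state, symbol) -> exactly one next state; no epsilon moves allowed.
TRANSITIONS = {
    ("S", "a"): "S",
    ("S", "b"): "A",
    ("A", "c"): "A",
}
START = "S"
ACCEPTING = {"A"}

def dfa_accepts(text):
    """Walk the table one character at a time, holding a single state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined for this symbol: reject
            return False
    return state in ACCEPTING
```

The machine is never in more than one state at once, which is why the DFA is the form an implementation wants to end up with.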
|  | ||||
| Use case: | ||||
|  | ||||
| Implementation Transformations: | ||||
| ```RegExp -> epsilon-NFA -> ... -> DFA``` | ||||
|  | ||||
| ## Formal Definition: | ||||
|  | ||||
| A non-deterministic finite automaton is a tuple of five elements: | ||||
| * All possible states | ||||
| * Alphabet | ||||
| * Transition Function | ||||
| * Starting State | ||||
| * Set of accepting states | ||||
|  | ||||
| NFA = ( States, Alphabet, TransitionFunction, StartingState, AcceptingStates ) | ||||
|  | ||||
| NFA = ( Q, Σ, Δ, q0, F ) | ||||
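
The five-element tuple maps directly onto a small record type. In this sketch the field names, the use of the empty string for epsilon, and the `example` machine are all assumptions made for illustration. The simulation tracks the set of states the machine could be in, following epsilon transitions for free:

```python
from typing import NamedTuple

EPSILON = ""  # assumption: the empty string labels epsilon transitions

class NFA(NamedTuple):
    states: frozenset     # Q
    alphabet: frozenset   # Sigma
    delta: dict           # Delta: (state, symbol) -> set of next states
    start: str            # q0
    accepting: frozenset  # F

def epsilon_closure(nfa, states):
    """All states reachable from `states` using only epsilon transitions."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for nxt in nfa.delta.get((q, EPSILON), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

def nfa_accepts(nfa, text):
    """Accept if any possible path ends in an accepting state."""
    current = epsilon_closure(nfa, {nfa.start})
    for ch in text:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, ch), set())
        current = epsilon_closure(nfa, moved)
    return bool(current & nfa.accepting)

# Hypothetical machine: 'a' leads to two different states (non-determinism),
# one of which reaches the accepting state through an epsilon move.
example = NFA(
    states=frozenset({"q0", "q1", "q2", "q3"}),
    alphabet=frozenset({"a", "b"}),
    delta={
        ("q0", "a"): {"q1", "q2"},
        ("q1", EPSILON): {"q3"},
        ("q2", "b"): {"q3"},
    },
    start="q0",
    accepting=frozenset({"q3"}),
)
```

Here `example` accepts both "a" and "ab": after reading `a` the machine is in q1, q2 and (via epsilon) q3 at the same time.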
							
								
								
									
										28
									
								
								App/RegM/Lectures/Lecture.4.Notes.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,28 @@ | ||||
| # Basic NFA Fragments | ||||
|  | ||||
| ### Single Character | ||||
| RegExp: `/^A$/` | ||||
| Pseudo: | ||||
| `str.start glyph(A) str.end` | ||||
|  | ||||
| ^ : Beginning of string	: Str.Start | ||||
| $ : End of a string		: Str.End | ||||
|  | ||||
| Machine: | ||||
| ->o.*Start* ---**Glyph**---> (o).*Accepting* | ||||
|  | ||||
| ### Epsilon-Transition | ||||
| RegExp: `/^$/` | ||||
| Pseudo: `str.start str.end` | ||||
|  | ||||
| Machine: | ||||
| ``` | ||||
| ->o --epsilon--> (o) | ||||
| ``` | ||||
|  | ||||
| Everything else can be built on top of these machines. | ||||
|  | ||||
| ``` | ||||
| Start = Input, Accepting = Output | ||||
| ``` | ||||
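
The two basic machines can be sketched as tiny data structures. The representation is an assumption, not the repository's implementation: a fragment is a `(start, accept, edges)` triple, states are plain integers, and `None` labels an epsilon edge. The `accepts` helper simulates a fragment against a whole string, which plays the role of the `^` and `$` anchors.

```python
from itertools import count

EPSILON = None        # assumption: None labels an epsilon edge
_new_state = count()  # fresh integer state ids

def char(symbol):
    """->o --symbol--> (o), returned as (start, accept, edges)."""
    start, accept = next(_new_state), next(_new_state)
    return start, accept, [(start, symbol, accept)]

def epsilon():
    """->o --epsilon--> (o)"""
    return char(EPSILON)

def accepts(fragment, text):
    """Simulate the fragment, tracking every state it could be in."""
    start, accept, edges = fragment

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for src, label, dst in edges:
                if src == state and label is EPSILON and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return accept in current
```

Each fragment exposes exactly one start (input) and one accepting state (output), which is what lets larger machines be wired together from smaller ones.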
|  | ||||
							
								
								
									
										39
									
								
								App/RegM/Lectures/Lecture.5.6.7.Notes.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							| @@ -0,0 +1,39 @@ | ||||
| ## Concatenation | ||||
|  | ||||
| Regex : `/^AB$/` | ||||
| Pseudo: `str.start str(AB) str.end` | ||||
|  | ||||
| Machine: | ||||
| ``` | ||||
| ->o --A--> o --epsilon--> o --B--> (o) | ||||
|  | ||||
| Submachine_A --epsilon--> Submachine_B | ||||
| ``` | ||||
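
Concatenation as a sketch: join the accepting state of the first sub-machine to the start of the second with an epsilon edge. The `(start, accept, edges)` fragment layout, the `char` and `accepts` helpers and the `None` epsilon label are assumptions made for this example.

```python
from itertools import count

EPSILON = None        # assumption: None labels an epsilon edge
_new_state = count()  # fresh integer state ids

def char(symbol):
    """->o --symbol--> (o), returned as (start, accept, edges)."""
    start, accept = next(_new_state), next(_new_state)
    return start, accept, [(start, symbol, accept)]

def concat(first, second):
    """Submachine_A --epsilon--> Submachine_B"""
    start_a, accept_a, edges_a = first
    start_b, accept_b, edges_b = second
    bridge = [(accept_a, EPSILON, start_b)]
    return start_a, accept_b, edges_a + edges_b + bridge

def accepts(fragment, text):
    """Simulate the fragment, following epsilon edges for free."""
    start, accept, edges = fragment

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for src, label, dst in edges:
                if src == state and label is EPSILON and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return accept in current

ab = concat(char("A"), char("B"))
```

The combined machine keeps the start of A and the accepting state of B, so the result is again a fragment with one input and one output.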
|  | ||||
| ## Union | ||||
|  | ||||
| Regex : `/^(A|B)$/` | ||||
| Pseudo: `str.start (glyph(A) | glyph(B)) str.end` | ||||
|  | ||||
| Machine: | ||||
| ``` | ||||
| 	epsilon--> o --A--> o --epsilon | ||||
|    /					 		   \ | ||||
| ->o						  			->(o) | ||||
|    \					 		   / | ||||
| 	epsilon--> o --B--> o --epsilon | ||||
| ``` | ||||
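
Union as a sketch: a new start state fans out to both sub-machines over epsilon edges, and both accepting states fan back in to a new accepting state. As before, the fragment layout and the helper names are assumptions for illustration.

```python
from itertools import count

EPSILON = None        # assumption: None labels an epsilon edge
_new_state = count()  # fresh integer state ids

def char(symbol):
    """->o --symbol--> (o), returned as (start, accept, edges)."""
    start, accept = next(_new_state), next(_new_state)
    return start, accept, [(start, symbol, accept)]

def union(first, second):
    """New start and accept states wrapped around both alternatives."""
    start, accept = next(_new_state), next(_new_state)
    start_a, accept_a, edges_a = first
    start_b, accept_b, edges_b = second
    bridges = [
        (start, EPSILON, start_a),    # take the upper branch
        (start, EPSILON, start_b),    # take the lower branch
        (accept_a, EPSILON, accept),  # upper branch finished
        (accept_b, EPSILON, accept),  # lower branch finished
    ]
    return start, accept, edges_a + edges_b + bridges

def accepts(fragment, text):
    """Simulate the fragment, following epsilon edges for free."""
    start, accept, edges = fragment

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for src, label, dst in edges:
                if src == state and label is EPSILON and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return accept in current

a_or_b = union(char("A"), char("B"))
```

Both branches are explored at once by the simulation, so no backtracking is needed to decide which alternative matched.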
|  | ||||
| ## Kleene Closure | ||||
|  | ||||
| Regex : `/^A*$/` | ||||
| Pseudo: `str.start glyph(A).repeating str.end` | ||||
|  | ||||
| Machine: | ||||
| ``` | ||||
| 				   <------epsilon-------- | ||||
| 				  /						 \ | ||||
| ->o --epsilon--> o --A--> o --epsilon--> (o) | ||||
|    \		  							 / | ||||
| 	-------------epsilon----------------> | ||||
| ``` | ||||
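
Kleene closure as a sketch: on top of the enter and leave edges, one epsilon edge skips the sub-machine entirely (zero repetitions) and another loops from its accepting state back to its start (more repetitions). The fragment layout and the helper names are assumptions for illustration.

```python
from itertools import count

EPSILON = None        # assumption: None labels an epsilon edge
_new_state = count()  # fresh integer state ids

def char(symbol):
    """->o --symbol--> (o), returned as (start, accept, edges)."""
    start, accept = next(_new_state), next(_new_state)
    return start, accept, [(start, symbol, accept)]

def star(inner):
    """Zero or more repetitions of the inner fragment."""
    start, accept = next(_new_state), next(_new_state)
    start_i, accept_i, edges_i = inner
    bridges = [
        (start, EPSILON, start_i),     # enter the sub-machine
        (accept_i, EPSILON, accept),   # leave it
        (start, EPSILON, accept),      # skip it: zero repetitions
        (accept_i, EPSILON, start_i),  # loop back: one more repetition
    ]
    return start, accept, edges_i + bridges

def accepts(fragment, text):
    """Simulate the fragment, following epsilon edges for free."""
    start, accept, edges = fragment

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for src, label, dst in edges:
                if src == state and label is EPSILON and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({dst for src, label, dst in edges
                           if src in current and label == ch})
    return accept in current

a_star = star(char("A"))
```

The `seen` set inside `closure` is what keeps the loop-back edge from sending the simulation around the epsilon cycle forever.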