The Chomsky Hierarchy classifies formal grammars by their generative power, breaking them into four types. This framework helps us understand how different languages relate to their grammars, revealing the complexity and capabilities of each grammar type.
-
Definition of Chomsky Hierarchy
- A classification of formal grammars based on their generative power.
- Consists of four types of grammars, each corresponding to a specific class of languages.
- Provides a framework for understanding the relationships between different types of languages and their grammars.
-
Four types of grammars: Type 0, Type 1, Type 2, and Type 3
- Type 0: Unrestricted grammars, can generate any language that can be recognized by a Turing machine.
- Type 1: Context-sensitive grammars, can generate context-sensitive languages and are more restrictive than Type 0.
- Type 2: Context-free grammars, generate context-free languages and are used in programming languages and compilers.
- Type 3: Regular grammars, generate regular languages and can be represented by finite automata.
-
Type 0: Unrestricted grammars
- No restrictions on the production rules; can have any form.
- Equivalent in power to Turing machines, capable of expressing any computable function.
- Can generate complex languages, including those that require memory beyond finite limits.
-
Type 1: Context-sensitive grammars
- Production rules must be context-sensitive; the length of the string on the left side must be less than or equal to the right side.
- Can describe languages that require context for interpretation, such as natural languages.
- More powerful than context-free grammars but less powerful than unrestricted grammars.
-
Type 2: Context-free grammars
- Production rules have a single non-terminal symbol on the left side.
- Widely used in programming languages and parsing algorithms.
- Can be represented by pushdown automata, which have a stack for memory.
-
Type 3: Regular grammars
- Production rules are limited to a specific form, either right-linear or left-linear.
- Can be represented by finite automata, making them the simplest type of grammar.
- Suitable for describing regular languages, such as those used in lexical analysis.
-
Relationship between grammar types and language classes
- Each grammar type corresponds to a specific class of languages, with Type 0 being the most powerful and Type 3 the least.
- Every language generated by a Type 3 grammar is also generated by Type 2, Type 1, and Type 0 grammars.
- The hierarchy illustrates the increasing complexity and computational power of each grammar type.
-
Chomsky Normal Form for context-free grammars
- A specific form of context-free grammar where each production rule is either A → BC or A → a.
- Facilitates parsing and simplifies the analysis of context-free languages.
- Any context-free grammar can be converted to Chomsky Normal Form.
-
Pumping lemma for regular and context-free languages
- A property that provides a method to prove that certain languages are not regular or context-free.
- For regular languages, states that long enough strings can be "pumped" (repeated) without leaving the language.
- For context-free languages, a similar property exists, but it involves splitting the string into parts that can be repeated.
-
Closure properties of each language class
- Regular languages are closed under union, intersection, complementation, and concatenation.
- Context-free languages are closed under union, concatenation, and Kleene star, but not under intersection or complementation.
- Context-sensitive languages are closed under union, intersection, and concatenation, and are also closed under complementation.