A compiler is translation software that converts programs written in a high-level language into the corresponding machine code for execution by the computer. In general, compilation is a two-step process, carried out by the front end and the back end of the compiler. Each step comprises a sequence of phases.
FRONT END
The front end of the compiler is responsible for translating high-level source code into intermediate code. The following phases constitute the front end of the compiler.
- Lexical Analyzer
- Syntax Analyzer
- Semantic Analyzer
- Intermediate Code Generator
Lexical Analyzer
- The first phase of the compiler architecture is lexical analysis, and the program that performs this analysis is called the lexical analyzer (lexer) or tokenizer.
- Role of the lexer:
- The lexer scans the input source code and produces a sequence of tokens as output, i.e., it divides the stream of input characters into groups called tokens.
- Stores the discovered tokens in the symbol table for later use.
- Removes comments and whitespace (such as tabs, newlines and blanks) from the source program.
- Reports lexical errors along with their line numbers.
- Expands macros (for example, the #define directive).
- The lexer identifies tokens with the help of regular expressions and pattern rules.
- The output of the lexical analyzer (i.e., the tokens) is given as input to the next phase, syntax analysis.
- Token, Lexeme, Pattern
- Token: represents a basic element of a program.
- Pattern: the rule describing the form that the lexemes of a token may take.
- Lexeme: an instance of a token, i.e., the actual sequence of characters in the source that matches the token's pattern.
- Types of tokens: identifier, keyword, operator, constant and special symbol.
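The lexer's role described above can be sketched with Python's `re` module. This is a minimal illustration, not a production lexer: the token patterns in `TOKEN_SPEC` and the function name `tokenize` are assumptions made for a tiny C-like language and do not come from any particular compiler. Each pattern is a regular expression, each matched substring is a lexeme, and comments and whitespace are discarded.

```python
import re

# Hypothetical token patterns for a tiny C-like language (illustrative only).
# Order matters: keywords must be tried before identifiers.
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*"),
    ("WHITESPACE", r"[ \t\n]+"),
    ("KEYWORD",    r"\b(?:int|if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("CONSTANT",   r"\d+"),
    ("OPERATOR",   r"==|[+\-*/=<>]"),
    ("SPECIAL",    r"[;(){},]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_type, lexeme, line_number) tuples."""
    tokens = []
    line = 1
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # Lexical error: report the offending character with its line number.
            raise SyntaxError(f"line {line}: unexpected character {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        if kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, lexeme, line))
        line += lexeme.count("\n")
        pos = match.end()
    return tokens
```

For example, `tokenize("int x = 42;")` yields the token types KEYWORD, IDENTIFIER, OPERATOR, CONSTANT and SPECIAL in that order, with the lexemes `int`, `x`, `=`, `42` and `;`.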
Syntax Analyzer
- The second phase of the compiler architecture is syntax analysis, and the program that performs this analysis is called the syntax analyzer or parser.
- Role of Parser
- In this phase, the tokens identified during lexical analysis are given as input to the parser. The output generated by the parser is a parse tree or syntax tree, which is a hierarchical representation of the grammatical structure of the token stream.
- During this phase, the parser checks whether the sequence of tokens conforms to the syntax (grammar) of the programming language being used.
- The parser reports syntax errors, if any.
- It also performs error recovery so that parsing can continue after an error is found.
- Examples of Syntax errors
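As a rough sketch of what a parser does, the following recursive-descent parser turns a list of token strings into a syntax tree built from nested tuples. The grammar, the function name `parse` and the tuple layout `(operator, left, right)` are assumptions chosen for illustration. The sketch reports the first syntax error it meets and stops, so it does not demonstrate error recovery.

```python
def parse(tokens):
    """Parse a list of token strings into a nested-tuple syntax tree.

    Grammar (assumed for illustration):
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | IDENTIFIER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isalnum():  # simplification: any alphanumeric token is an operand
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this grammar, `parse(["a", "+", "b", "*", "2"])` returns `("+", "a", ("*", "b", "2"))`, while a dangling operator as in `["a", "+"]` or an unclosed parenthesis as in `["(", "a"]` raises `SyntaxError`.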
Semantic Analyzer
- The parse tree produced by the syntax analyzer, along with the symbol table, acts as the input to the semantic analyzer. The output is a semantically consistent (annotated) parse tree.
- The main role of this phase is to check whether the structures and tokens identified in the source program carry a valid meaning.
- The tasks performed by Semantic Analyzer are as follows
- Types of errors identified by Semantic Analyzer
- Incompatible operand types.
- Use of an uninitialized variable.
- Mismatch of data types.
- Access to an out-of-scope variable.
- Mismatch between actual and formal arguments.
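Two of the checks listed above, incompatible operand types and access to an undeclared or out-of-scope variable, can be sketched as a small recursive type checker. The function name `check`, the nested-tuple tree layout and the use of a plain dictionary as the symbol table are assumptions for illustration only. Real semantic analyzers also handle implicit conversions, function signatures and much more.

```python
def check(node, symbols):
    """Return the type name of an expression tree, or raise on a semantic error.

    node    : an int or float literal, an identifier (str),
              or a tuple (operator, left, right)
    symbols : dict mapping declared variable names to type names
    """
    if isinstance(node, tuple):
        op, left, right = node
        left_type = check(left, symbols)
        right_type = check(right, symbols)
        if left_type != right_type:
            raise TypeError(
                f"incompatible operand types for {op!r}: {left_type} and {right_type}"
            )
        return left_type
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "float"
    if isinstance(node, str):
        if node not in symbols:
            raise NameError(f"undeclared or out-of-scope variable {node!r}")
        return symbols[node]
    raise TypeError(f"unsupported node {node!r}")
```

Given `symbols = {"x": "int"}`, the tree `("+", "x", 1)` checks as `"int"`, whereas `("+", "x", 1.5)` raises `TypeError` and any use of `"y"` raises `NameError`.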
Intermediate Code Generator
- During this phase, the compiler generates intermediate code from the source program.
- The intermediate code generator takes the annotated parse tree produced by the semantic analysis phase as input and produces intermediate code as output.
- The intermediate code lies between the high-level language and the machine language.
- It is machine independent, which enhances portability.
- Intermediate code is represented in one of the following ways:
- Postfix Notation
- Three-address code
- Syntax tree
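To make the representations concrete, the sketch below converts a syntax tree (nested tuples of the form `(operator, left, right)`, an assumed layout) into postfix notation and into three-address code. The function names `to_postfix` and `to_three_address` and the temporary names `t1`, `t2`, ... are illustrative choices.

```python
def to_postfix(tree):
    """Return the postfix (operands first, operator last) form as a list."""
    if not isinstance(tree, tuple):
        return [str(tree)]
    op, left, right = tree
    return to_postfix(left) + to_postfix(right) + [op]


def to_three_address(tree):
    """Flatten a syntax tree into three-address instructions.

    Returns (code, result), where code is a list of strings such as
    "t1 = b * c" and result names the place holding the final value.
    """
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if not isinstance(node, tuple):
            return str(node)          # a leaf is already an address
        op, left, right = node
        left_addr = emit(left)
        right_addr = emit(right)
        counter += 1
        temp = f"t{counter}"          # fresh temporary for this sub-expression
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    result = emit(tree)
    return code, result
```

For the tree `("+", "a", ("*", "b", "c"))`, `to_postfix` gives `["a", "b", "c", "*", "+"]` and `to_three_address` gives the instructions `t1 = b * c` and `t2 = a + t1`, with the result in `t2`.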
BACK END
The back end of the compiler is responsible for translating the intermediate code into the target machine code for execution. It is regarded as the synthesis phase of the compiler. The following phases constitute the back end of the compiler.
- Code Optimization
- Code Generation
Code Optimization
- The purpose of code optimization is to improve the generated intermediate code in terms of speed, CPU usage, memory usage and other resources.
- The main goal is to make sure that the code uses these resources efficiently.
- The optimized code should meet the following objectives:
- Optimization must not change the meaning or logic of the code.
- Optimization should provide higher speed and more efficient use of CPU and memory.
- Promotes code reusability.
- Types of code optimization
- Machine Independent Optimization
- Machine Dependent Optimization
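Constant folding is one common machine-independent optimization, and it shows how intermediate code can be improved without changing its meaning. In the sketch below, instructions are tuples `(dest, left, op, right)`, an assumed encoding of three-address code, and the function name `fold_constants` is hypothetical. It also assumes every destination is assigned exactly once, as compiler-generated temporaries usually are, and it skips division to avoid dealing with division by zero.

```python
import operator

# Operators that can be safely evaluated at compile time in this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Fold and propagate integer constants.

    Returns (optimized, known): the remaining instructions, plus a dict of
    destinations whose values were computed at compile time.
    """
    known = {}
    optimized = []
    for dest, left, op, right in code:
        # Substitute operands whose values are already known.
        left = known.get(left, left)
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int) and op in OPS:
            known[dest] = OPS[op](left, right)   # evaluated now, no instruction emitted
        else:
            optimized.append((dest, left, op, right))
    return optimized, known
```

Applied to `[("t1", 2, "*", 3), ("t2", "a", "+", "t1")]`, the first instruction disappears and the second becomes `("t2", "a", "+", 6)`, which computes the same value with one operation fewer at run time.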
Code Generation
- The final phase of the compilation process is code generation, and the program that performs this process is called the code generator.
- The code generator translates the optimized intermediate code into machine code for execution.
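A heavily simplified sketch of this translation is shown below. It maps three-address tuples `(dest, left, op, right)` to assembly-like text for a hypothetical single-accumulator machine. The mnemonics `LOAD`, `STORE`, `ADD`, `SUB`, `MUL` and `DIV` and the function name `generate` are invented for illustration and do not correspond to a real instruction set. Real code generators additionally perform instruction selection and register allocation.

```python
# Invented mnemonics for a hypothetical accumulator machine.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(code):
    """Translate (dest, left, op, right) tuples into assembly-like lines."""
    asm = []
    for dest, left, op, right in code:
        asm.append(f"LOAD {left}")               # accumulator := left
        asm.append(f"{MNEMONICS[op]} {right}")   # accumulator := accumulator op right
        asm.append(f"STORE {dest}")              # dest := accumulator
    return asm
```

For `[("t1", "b", "*", "c"), ("t2", "a", "+", "t1")]` this produces `LOAD b`, `MUL c`, `STORE t1`, `LOAD a`, `ADD t1`, `STORE t2`.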
Symbol Table
- The symbol table is an important data structure maintained throughout all the phases of the compiler.
- It is created by the compiler to keep track of information about variables, identifiers, classes, objects, functions, scopes, etc.
- The data structures used for implementing a symbol table are as follows:
- List
- Hash table
- Binary Search Tree
- Linked List
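Of the options above, the hash table is probably the easiest to demonstrate, since Python's `dict` is a hash table. The sketch below keeps one dictionary per scope on a stack so that inner declarations shadow outer ones. The class name `SymbolTable`, its method names and the stored fields are assumptions for illustration.

```python
class SymbolTable:
    """Scoped symbol table backed by a stack of hash tables (dicts)."""

    def __init__(self):
        self.scopes = [{}]            # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"redeclaration of {name!r} in the same scope")
        scope[name] = {"type": type_name, "depth": len(self.scopes) - 1}

    def lookup(self, name):
        """Search from the innermost scope outward; return None if not found."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

A typical use is to call `declare` when a declaration is seen, `lookup` when a name is used, and `enter_scope` / `exit_scope` at the start and end of each block, so that a name declared inside a block is no longer found after the block ends.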