Architecture of Compiler Design : Front End and Back End of the Compiler

A compiler is a translator that converts programs written in a high-level language into the corresponding machine code for execution by the computer. Compilation is generally divided into two stages, handled by the Front End and the Back End of the compiler. Each stage consists of a sequence of phases.

Phases of compiler

FRONT END 

The Front end of the compiler is responsible for translating high level source code into an intermediate code. The following phases constitute the front end of the compiler.

  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
These phases analyze the source program and check that it is well formed, so the front end is considered the Analysis phase of the compiler. The front end depends on the source language and is largely independent of the target machine.

Lexical Analyzer
  • The first phase of the compiler architecture is Lexical Analysis. The program that performs this analysis is called the Lexical Analyzer, Lexer or Tokenizer.
  • Role of Lexer
    • A Lexer scans the input source code and produces a sequence of Tokens as output, i.e., it divides the string of input characters into groups called Tokens.
    • Stores the discovered identifiers in the symbol table for later use.
    • Removes comments and white space (such as tabs, newlines and blanks) from the source program.
    • Reports lexical errors along with the line number on which they occur.
    • Expands macros (#define directive). In C, macro expansion is handled by the preprocessor.
    • The Lexer identifies tokens with the help of Regular Expressions and Pattern Rules.
  • The output of Lexical Analyzer (i.e., Tokens) is given as input to the next Syntax Analysis phase.
  • Token, Lexeme, Pattern
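    • Token : A category of lexical unit, such as identifier, keyword or operator. It is the basic element of a program.
    • Pattern : The rule that describes the set of character sequences that form a token.
    • Lexeme : An instance of a token, i.e., the actual sequence of characters in the source program that matches the pattern of a token.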
  • Types of token : Identifier, Keyword, Operator, Constants and Special symbol.
Example: Consider the following C program that is given as input to the Lexical Analyzer

Lexical analyzer example code

Tokens generated in Lexical analysis
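
The role of the Lexer described above can be sketched in a few lines of Python. This is only a minimal illustration, not a production lexer: the token names, the small keyword list and the patterns in TOKEN_SPEC are assumptions chosen for a tiny C-like language. It shows regular-expression patterns being used to split the input into (token, lexeme) pairs while comments and white space are dropped and an unknown character is reported with its line number.

```python
import re

# Hypothetical patterns for a tiny C-like language. Order matters:
# earlier alternatives win when several patterns match at one position.
TOKEN_SPEC = [
    ("COMMENT",    r"/\*.*?\*/|//[^\n]*"),
    ("WHITESPACE", r"[ \t\n]+"),
    ("KEYWORD",    r"\b(?:int|float|return|if|else|while)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("CONSTANT",   r"\d+(?:\.\d+)?"),
    ("OPERATOR",   r"==|!=|<=|>=|[+\-*/=<>]"),
    ("SPECIAL",    r"[(){};,]"),
    ("MISMATCH",   r"."),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return a list of (token, lexeme) pairs for the given source text."""
    tokens = []
    line = 1
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "MISMATCH":
            raise SyntaxError(f"line {line}: unexpected character {lexeme!r}")
        if kind not in ("COMMENT", "WHITESPACE"):
            tokens.append((kind, lexeme))
        line += lexeme.count("\n")  # keep the line counter up to date
    return tokens

print(tokenize("int x = 42; // answer\n"))
```

For the input `int x = 42; // answer`, the sketch yields the keyword `int`, the identifier `x`, the operator `=`, the constant `42` and the special symbol `;`. The comment and the blanks do not appear in the output.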

Syntax Analyzer
  • The second phase of the Compiler Architecture is Syntax Analysis and the software program that performs this analysis is called Syntax Analyzer or Parser.
  • Role of Parser
    • In this phase, the tokens identified during lexical analysis are given as input to the parser. The output generated by the parser is a Parse Tree or Syntax Tree, which is a hierarchical representation of the grammatical structure of the token sequence.
    • During this phase, the parser checks whether the sequence of tokens conforms to the syntax (grammar) of the programming language being used.
    • The parser displays Syntax Errors if any.
    • It also performs common error recovery activities.
  • Examples of Syntax errors
Example of syntax error
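
To make the role of the Parser more concrete, here is a small recursive-descent parser sketch in Python. It is an illustration under assumptions, not the parser of any real compiler: the grammar covers only arithmetic expressions, the input is assumed to be a list of (token, lexeme) pairs, and the syntax tree is encoded as nested tuples of the form (operator, left, right).

```python
class Parser:
    """Recursive-descent parser for a hypothetical expression grammar:
    expr   -> term (('+' | '-') term)*
    term   -> factor (('*' | '/') factor)*
    factor -> IDENTIFIER | CONSTANT | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", "end of input")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        kind, text = self.peek()
        if kind != "EOF":  # leftover tokens mean the input is not a valid expr
            raise SyntaxError(f"unexpected token {text!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek()[1] in ("*", "/"):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, text = self.advance()
        if kind in ("IDENTIFIER", "CONSTANT"):
            return text
        if text == "(":
            node = self.expr()
            kind, closing = self.advance()
            if closing != ")":
                raise SyntaxError(f"expected ')' but found {closing!r}")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

tokens = [("IDENTIFIER", "a"), ("OPERATOR", "+"), ("IDENTIFIER", "b"),
          ("OPERATOR", "*"), ("CONSTANT", "2")]
print(Parser(tokens).parse())  # ('+', 'a', ('*', 'b', '2'))
```

Because `term` is called from `expr`, multiplication binds tighter than addition in the resulting tree. A token sequence such as `a + * 2` does not fit the grammar and raises a SyntaxError. This sketch stops at the first error and does not attempt error recovery.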

Semantic Analyzer

  • The Parse Tree produced by the Syntax Analyzer, along with the symbol table, acts as input to the Semantic Analyzer. The output produced is a semantically verified (annotated) parse tree.
  • The main role of this phase is to check whether the structures and tokens identified in the source program are meaningful according to the rules of the language.
  • The tasks performed by Semantic Analyzer are as follows
Semantic analysis
  • Types of errors identified by Semantic Analyzer
    • Incompatible operand types
    • Uninitialized variable.
    • Mismatch of data types.
    • Trying to access out of scope variable.
    • Mismatch of actual and formal argument.
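
Two of the error types listed above, incompatible operand types and access to a variable that is not in scope, can be illustrated with a short type-checking sketch. The representation is an assumption made for this example: the syntax tree is a nested tuple (operator, left, right) with string leaves, the symbol table is a plain dictionary from names to types, and the rule that both operands must have exactly the same type is a deliberately strict, hypothetical rule.

```python
class SemanticError(Exception):
    """Raised when the program is syntactically valid but not meaningful."""

def check_types(node, symbols):
    """Return the type of an expression tree or raise SemanticError."""
    if isinstance(node, tuple):
        op, left, right = node
        left_type = check_types(left, symbols)
        right_type = check_types(right, symbols)
        if left_type != right_type:
            raise SemanticError(
                f"incompatible operand types for {op!r}: "
                f"{left_type} and {right_type}"
            )
        return left_type
    if node[0].isdigit():  # leaf is a numeric constant
        return "float" if "." in node else "int"
    if node not in symbols:  # leaf is a name that was never declared
        raise SemanticError(f"variable {node!r} is not declared in this scope")
    return symbols[node]

tree = ("+", "a", ("*", "b", "2"))
print(check_types(tree, {"a": "int", "b": "int"}))  # int
```

With the symbol table `{"a": "int", "b": "float"}`, the same tree is rejected because `b * 2` mixes float and int. A real compiler would usually insert an implicit conversion here instead of failing.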
Intermediate Code Generator

  • During this phase, the compiler generates an Intermediate code from which the code for the target machine is later produced.
  • The intermediate code generator takes the annotated parse tree produced by the semantic analysis phase as input and produces intermediate code as the output.
  • The intermediate code lies between the High level language and the machine language.
  • It is machine independent and thus portability is enhanced.
  • Intermediate code is commonly represented in one of the following forms:
    • Postfix Notation
    • Three-address code
    • Syntax tree
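
The relation between these representations can be shown with a short sketch. It assumes a syntax tree encoded as nested tuples (operator, left, right) with string leaves, and the temporary names t1, t2, ... are a convention made up for this example. One function flattens the tree into postfix notation, the other emits three-address instructions, each with at most one operator and a single result.

```python
def to_postfix(node):
    """Flatten a syntax tree into postfix order: operands first, then operator."""
    if isinstance(node, tuple):
        op, left, right = node
        return to_postfix(left) + to_postfix(right) + [op]
    return [node]

def to_three_address(node, code):
    """Append (result, operand1, operator, operand2) tuples to `code`
    and return the name that holds the value of `node`."""
    if not isinstance(node, tuple):
        return node  # a variable or constant is already a name
    op, left, right = node
    left_name = to_three_address(left, code)
    right_name = to_three_address(right, code)
    temp = f"t{len(code) + 1}"  # one new temporary per instruction
    code.append((temp, left_name, op, right_name))
    return temp

tree = ("+", "a", ("*", "b", "c"))  # a + b * c
print(" ".join(to_postfix(tree)))   # a b c * +

code = []
to_three_address(tree, code)
for result, arg1, op, arg2 in code:
    print(f"{result} = {arg1} {op} {arg2}")
# t1 = b * c
# t2 = a + t1
```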

BACK END

The Back end of the compiler is responsible for translating the intermediate code into the target machine code. It is considered the Synthesis phase of the compiler. The following phases constitute the back end of the compiler.

  1. Code Optimization
  2. Code Generation
Code Optimization
  • The purpose of code optimization is to improve the generated intermediate code so that it runs faster and uses less CPU time, memory and other resources.
  • The main goal is to make sure that the resulting code uses the available resources efficiently.
  • The optimized code should meet the following objectives
    • Optimization must not change the meaning and logic of the code.
    • The optimization should provide higher speed and optimal use of CPU and memory.
    • Promotes code reusability.
  • Types of code optimization
    • Machine Independent Optimization 
    • Machine Dependent Optimization
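
Constant folding is one common example of a machine independent optimization. The sketch below is a simplified illustration under several assumptions: the intermediate code is a list of (result, operand1, operator, operand2) tuples, temporaries are named t1, t2, ... and are assigned only once, and only non-negative integer constants are folded. It computes constant subexpressions at compile time and removes the temporaries that held them, without changing the meaning of the code.

```python
import operator

# Division is left out on purpose to avoid folding a division by zero.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def is_temporary(name):
    """Hypothetical naming convention: compiler temporaries are t1, t2, ..."""
    return name[0] == "t" and name[1:].isdigit()

def fold_constants(code):
    known = {}      # temporaries whose value is known at compile time
    optimized = []
    for result, arg1, op, arg2 in code:
        arg1 = known.get(arg1, arg1)  # substitute already folded values
        arg2 = known.get(arg2, arg2)
        constant = arg1.isdigit() and arg2.isdigit() and op in FOLDABLE
        if constant and is_temporary(result):
            known[result] = str(FOLDABLE[op](int(arg1), int(arg2)))
        else:
            optimized.append((result, arg1, op, arg2))
    return optimized

code = [("t1", "2", "*", "3"), ("x", "a", "+", "t1")]  # x = a + 2 * 3
print(fold_constants(code))  # [('x', 'a', '+', '6')]
```

The multiplication `2 * 3` is evaluated once by the compiler, so the optimized code has one instruction instead of two.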
Code Generation
  • The final phase of the compilation process is code generation, and the program that performs this process is called the code generator.
  • The code generator translates the optimized intermediate code into machine code for the target machine.
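
As a rough illustration of this translation, the sketch below turns three-address instructions into assembly-like text. Everything about the target is an assumption: the LOAD, STORE, ADD, SUB, MUL and DIV mnemonics and the two registers R1 and R2 belong to a made-up machine, and a real code generator would also handle register allocation and instruction selection.

```python
# Mnemonics of a hypothetical two-register target machine.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate(code):
    """Translate (result, operand1, operator, operand2) tuples into
    assembly-like lines for the hypothetical target machine."""
    lines = []
    for result, arg1, op, arg2 in code:
        lines.append(f"LOAD R1, {arg1}")           # first operand into R1
        lines.append(f"LOAD R2, {arg2}")           # second operand into R2
        lines.append(f"{MNEMONICS[op]} R1, R2")    # R1 = R1 op R2
        lines.append(f"STORE {result}, R1")        # write the result back
    return lines

for line in generate([("x", "a", "+", "6")]):
    print(line)
# LOAD R1, a
# LOAD R2, 6
# ADD R1, R2
# STORE x, R1
```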

SYMBOL TABLE
  • The symbol table is an important data structure maintained throughout all the phases of the compiler.
  • It is created by the compiler to keep track of information about variables, identifiers, classes, objects, functions, scopes, etc.
  • The data structures used for implementing symbol table are as follows
    • List
    • Hash table
    • Binary Search Tree
    • Linked List
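
A hash-table based symbol table can be sketched with Python dictionaries. This is a minimal example, and the class and method names are hypothetical: each scope is one dictionary, scopes are kept on a stack, and a lookup searches from the innermost scope outwards.

```python
class SymbolTable:
    """Stack of hash tables, one per scope (innermost scope is last)."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        if name in self.scopes[-1]:
            raise KeyError(f"{name!r} is already declared in this scope")
        self.scopes[-1][name] = type_name

    def lookup(self, name):
        for scope in reversed(self.scopes):  # innermost scope first
            if name in scope:
                return scope[name]
        return None  # the name is not visible here

table = SymbolTable()
table.declare("count", "int")
table.enter_scope()
table.declare("count", "float")  # shadows the outer declaration
print(table.lookup("count"))     # float
table.exit_scope()
print(table.lookup("count"))     # int
```

Dictionaries give constant-time insertion and lookup on average, which is one reason hash tables are a popular choice for this structure.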
