PHP is one of the most popular server-side programming languages in the world. When you write a PHP script, you don't just write text for a computer to display - it has to be processed, understood, and executed. This is where the PHP interpreter comes in. But what actually happens behind the scenes when you execute a PHP script? Let's dive into the internal workings of PHP and explore its stages: Lexing, Parsing, Compilation, and Interpretation.

PHP code is executed on the server. When a PHP file is requested (for example, by a web browser), the PHP interpreter runs it through several stages that turn your human-readable source into low-level instructions that PHP's own engine can execute. These stages are:

  1. Lexing - Breaking the code into tokens.
  2. Parsing - Understanding the structure of the code.
  3. Compilation - Translating the code into an intermediate form called opcodes.
  4. Interpretation - Executing the opcodes to produce output.

Stage 1. Lexing (Tokenization)

Lexing, also known as tokenization, is the first stage in PHP execution. Here, the interpreter reads the PHP script as a string of characters and converts it into meaningful chunks called tokens.

A token is a small unit of code, such as:

  • Keywords (if, else, function)
  • Variables ($username, $total)
  • Operators (+, -, *, =)
  • Punctuation (;, {, })

- Example:

<?php
$number = 10;
if ($number > 5) {
    echo "Greater than 5";
}

During lexing, PHP breaks this script into tokens like:

T_OPEN_TAG                   (<?php)
T_VARIABLE                   ($number)
"="
T_LNUMBER                    (10)
";"
T_IF                         (if)
"("
T_VARIABLE                   ($number)
">"
T_LNUMBER                    (5)
")"
"{"
T_ECHO                       (echo)
T_CONSTANT_ENCAPSED_STRING   ("Greater than 5")
";"
"}"

Only multi-character constructs get a named T_* token. Single-character symbols such as =, >, ;, parentheses, and braces have no token name; the lexer emits them as the character itself. Whitespace is also tokenized (as T_WHITESPACE) but is omitted from this listing.

Lexing ensures the interpreter knows what each piece of the code is before understanding the structure.
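You can inspect the token stream yourself. The following is a minimal sketch that uses token_get_all() and token_name() from PHP's bundled tokenizer extension; the variable names and the "(character)" label are just choices made for this illustration.

```php
<?php
// Source to tokenize. It must include the opening tag,
// otherwise the lexer treats everything as inline HTML.
$code = <<<'CODE'
<?php
$number = 10;
if ($number > 5) {
    echo "Greater than 5";
}
CODE;

$lines = [];
foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // Named token: [token id, source text, line number].
        [$id, $text] = $token;
        if ($id === T_WHITESPACE) {
            continue; // skip whitespace to keep the listing short
        }
        $lines[] = sprintf('%-28s %s', token_name($id), trim($text));
    } else {
        // Single-character token such as "=", ";" or "{" (a plain string).
        $lines[] = sprintf('%-28s %s', '(character)', $token);
    }
}

echo implode("\n", $lines), "\n";
```

Running this prints one line per token, with named tokens such as T_VARIABLE on the left and the matching source text on the right. On PHP 8 and later, PhpToken::tokenize() offers an object-based alternative to the same information.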


Stage 2. Parsing (Syntax Analysis)

Once the PHP code is tokenized, the next step is parsing. Parsing checks whether the sequence of tokens follows PHP's syntax rules and, since PHP 7, builds a structured representation of the code called an Abstract Syntax Tree (AST).

The AST represents the logical structure of your code:

  • Nodes represent operations like addition, function calls, or conditionals.
  • Leaf nodes represent operands like variables or literals.

- Example:

For the previous PHP code, the AST might conceptually look like this:

Program
 ├─ Assignment
 │    ├─ Variable: $number
 │    └─ Literal: 10
 └─ IfStatement
      ├─ Condition: GreaterThan (>)
      │    ├─ Variable: $number
      │    └─ Literal: 5
      └─ Body
           └─ Echo: "Greater than 5"

During parsing:

  • The interpreter verifies syntax correctness.
  • If syntax errors exist (e.g., missing semicolons, unmatched braces), parsing fails with a ParseError (a fatal "Parse error" before PHP 7), and none of the code in that file is executed.

Parsing is essentially PHP understanding what your code is supposed to do.
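The "nothing runs if parsing fails" behavior is easy to observe. Here is a small sketch, assuming PHP 7 or later, that uses eval() to parse a snippet at runtime. The helper name tryRun() is made up for this example, and because eval() executes valid code, it should only ever be given trusted snippets.

```php
<?php
// Hypothetical helper for illustration: parses and runs a snippet,
// returning the parser's error message (if any) and whatever was echoed.
function tryRun(string $code): array
{
    ob_start();                      // capture anything the snippet echoes
    $error = null;
    try {
        eval($code);                 // eval() expects code without the opening tag
    } catch (ParseError $e) {
        $error = $e->getMessage();   // parsing failed, so nothing was executed
    }
    $output = ob_get_clean();

    return ['error' => $error, 'output' => $output];
}

// A valid snippet: it parses, then runs.
$valid = tryRun('echo "before"; if (10 > 5) { echo " after"; }');

// The same snippet with the closing brace missing.
$broken = tryRun('echo "before"; if (10 > 5) { echo " after";');

var_dump($valid['error']);    // NULL
var_dump($valid['output']);   // string(12) "before after"
var_dump($broken['error'] !== null); // bool(true)
var_dump($broken['output']);  // string(0) ""
```

Note that the broken snippet prints nothing at all, not even "before": the whole snippet is parsed before any of it is executed, so the first echo never gets a chance to run.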


Stage 3. Compilation (Opcode Generation)

After parsing, PHP enters the compilation stage, where the AST is converted into opcodes (operation codes).

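  • Opcodes are low-level, intermediate instructions that the Zend Engine (PHP's core) can execute.
  • This stage also gives PHP a chance to optimize: for example, constant expressions can be evaluated ahead of time, and with OPcache enabled the compiled opcodes are cached in memory so that later requests can skip lexing, parsing, and compilation.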

What happens in this stage:

  • AST nodes are mapped to corresponding opcodes.
  • Local variables are assigned slots, and expressions made only of literals are evaluated ahead of time where possible.
  • Functions, loops, and expressions are transformed into executable instructions.

- Example:

$sum = 5 + 10;
echo $sum;

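Conceptually, this might be compiled into opcodes like these (a simplified notation, not a real opcode dump; because both operands are literals, the real compiler would typically fold 5 + 10 into 15 ahead of time):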

ASSIGN $sum, ADD 5, 10
ECHO $sum

Compiling doesn't execute the code yet; it prepares it in a format the interpreter can efficiently run.
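To make the AST-to-opcode mapping more concrete, here is a toy compiler written in PHP itself. It is only a sketch: the node shapes ('assign', 'add', and so on), the opcode names, and the "~0" temporary naming are all invented for this illustration and do not match the Zend Engine's real data structures. It also does no constant folding.

```php
<?php
// Toy model only: node shapes and opcode names are invented for illustration.

/** Compile an expression node; returns the operand that holds its value. */
function compileExpr(array $node, array &$ops, int &$tmp)
{
    switch ($node['kind']) {
        case 'literal':
            return $node['value'];             // literals are used directly as operands
        case 'var':
            return '$' . $node['name'];
        case 'add':
            $left   = compileExpr($node['left'], $ops, $tmp);
            $right  = compileExpr($node['right'], $ops, $tmp);
            $result = '~' . $tmp++;            // fresh temporary for the result
            $ops[]  = ['ADD', $left, $right, $result];
            return $result;
    }
    throw new InvalidArgumentException('Unknown node kind: ' . $node['kind']);
}

/** Compile a list of statement nodes into a flat list of opcodes. */
function compileProgram(array $statements): array
{
    $ops = [];
    $tmp = 0;
    foreach ($statements as $stmt) {
        $value = compileExpr($stmt['expr'], $ops, $tmp);
        if ($stmt['kind'] === 'assign') {
            $ops[] = ['ASSIGN', '$' . $stmt['name'], $value];
        } elseif ($stmt['kind'] === 'echo') {
            $ops[] = ['ECHO', $value];
        }
    }
    return $ops;
}

// Hand-built AST for: $sum = 5 + 10; echo $sum;
$ast = [
    ['kind' => 'assign', 'name' => 'sum', 'expr' => [
        'kind'  => 'add',
        'left'  => ['kind' => 'literal', 'value' => 5],
        'right' => ['kind' => 'literal', 'value' => 10],
    ]],
    ['kind' => 'echo', 'expr' => ['kind' => 'var', 'name' => 'sum']],
];

$opcodes = compileProgram($ast);
foreach ($opcodes as $op) {
    echo implode(' ', $op), "\n";
}
// Prints:
// ADD 5 10 ~0
// ASSIGN $sum ~0
// ECHO $sum
```

The key point the sketch shares with the real compiler is that a nested tree becomes a flat list of instructions, with temporaries carrying intermediate results from one instruction to the next. Nothing has been added or echoed by the program being compiled yet; only its instructions have been produced.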


Stage 4. Interpretation (Opcode Execution)

Finally, the PHP interpreter executes the compiled opcodes. This is handled by the Zend Virtual Machine (Zend VM).

During interpretation:

  • The Zend VM reads the opcodes one at a time, following jumps for conditionals and loops.
  • It executes each operation in memory (arithmetic, function calls, and so on).
  • It produces output (e.g., HTML sent to the browser).

- Example:

Opcodes from the previous example:

ASSIGN $sum, ADD 5, 10   -> $sum = 15
ECHO $sum                -> outputs: 15

The interpreter manages:

  • Memory allocation for variables.
  • Function stack and scopes.
  • Error handling at runtime.
  • Interactions with server resources (database, files, etc.).

This stage is where your PHP code actually does something and generates output.
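The execution loop can be sketched as a toy virtual machine. As before, this is an illustration under simplifying assumptions rather than how the Zend VM is implemented: the opcode names, the operand encoding (names starting with "$" or "~" are storage slots, anything else is a literal), and the function names are all made up for this example.

```php
<?php
// Toy model only: opcode names and operand encoding are invented for illustration.

/** Resolve an operand: "$name" and "~n" are looked up, anything else is a literal. */
function operandValue($operand, array $slots)
{
    if (is_string($operand) && $operand !== ''
        && ($operand[0] === '$' || $operand[0] === '~')) {
        return $slots[$operand];
    }
    return $operand;
}

/** Run a flat opcode list and return everything it echoed. */
function runOpcodes(array $opcodes): string
{
    $slots  = [];   // storage for variables and temporaries
    $output = '';

    foreach ($opcodes as $op) {
        switch ($op[0]) {
            case 'ADD':     // ADD left, right, result
                $slots[$op[3]] = operandValue($op[1], $slots) + operandValue($op[2], $slots);
                break;
            case 'ASSIGN':  // ASSIGN variable, value
                $slots[$op[1]] = operandValue($op[2], $slots);
                break;
            case 'ECHO':    // ECHO value
                $output .= operandValue($op[1], $slots);
                break;
            default:
                throw new RuntimeException('Unknown opcode: ' . $op[0]);
        }
    }
    return $output;
}

// Opcodes for: $sum = 5 + 10; echo $sum;
$program = [
    ['ADD', 5, 10, '~0'],
    ['ASSIGN', '$sum', '~0'],
    ['ECHO', '$sum'],
];

echo runOpcodes($program), "\n"; // prints 15
```

Even in this tiny form, the loop shows the interpreter's responsibilities from the list above: it owns the storage for variables, it decides what each instruction means, and it is the place where runtime errors (here, an unknown opcode) surface.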


Source: Orkhan Alishov's notes