Description
Is your feature request related to a problem? Please describe.
- Recently I was looking for a Python script to extract objects from pcap files. I found pyshark, which is a tshark wrapper, but none of its methods fulfilled my expectations in this area.
Describe the solution you'd like
- I wrote code that wraps the tshark --export-objects command. It lets you pass a pcap file path and retrieve the exported objects. The code is shown below:
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import NamedTuple

from rich import print

# Object types this script asks tshark to export.
OBJECT_TYPES = ("dicom", "http", "imf", "smb", "tftp")


class Colors:
    """
    https://replit.com/talk/learn/ANSI-Escape-Codes-in-Python/22803
    """

    BLACK = "\u001b[30m"
    RED = "\u001b[31m"
    GREEN = "\u001b[32m"
    YELLOW = "\u001b[33m"
    BLUE = "\u001b[34m"
    MAGENTA = "\u001b[35m"
    CYAN = "\u001b[36m"
    WHITE = "\u001b[37m"
    RESET = "\u001b[0m"


class PcapExport(NamedTuple):
    protocol: str
    name: str
    data: bytes

    def __repr__(self):
        cls_name = self.__class__.__name__
        # Truncate long payloads so the repr stays readable.
        if len(self.data) > 30:
            data = self.data[:27] + b"..."
        else:
            data = self.data
        protocol = self.protocol
        name = self.name
        return f"{cls_name}({protocol=}, {name=}, {data=})"


def export_all_objects(pcap_path: str, tshark_path: str = "tshark") -> list[PcapExport]:
    """Export objects of all known types.

    Args:
        pcap_path (str): pcap filename
        tshark_path (str): alternative tshark path
    """
    exported_objects = []
    for object_type in OBJECT_TYPES:
        exported = export_objects(pcap_path, tshark_path, object_type)
        exported_objects.extend(exported)
    return exported_objects


def export_objects(
    pcap_path: str, tshark_path: str = "tshark", object_type: str = "http"
) -> list[PcapExport]:
    """Export objects from a pcap for the specified object_type.

    Args:
        pcap_path (str): pcap filename
        tshark_path (str): alternative tshark path
        object_type (str): export object type (dicom, http, imf, smb, tftp)
    """
    if object_type not in OBJECT_TYPES:
        raise TypeError(
            f"Not allowed export object used - '{object_type}'\nAllowed types: {OBJECT_TYPES}"
        )
    with tempfile.TemporaryDirectory() as tmpdir:
        # prepare tshark command
        export_args = f"{object_type},{tmpdir}"
        command = [tshark_path, "-Q", "-r", pcap_path, "--export-objects", export_args]
        print(f"tshark command: {command}")
        # execute tshark command
        try:
            subprocess.run(command, check=True)
        except FileNotFoundError as err:
            custom_message = (
                f"{Colors.RED}tshark executable not found: {tshark_path}{Colors.RESET}"
            )
            err.strerror = custom_message
            raise
        except subprocess.CalledProcessError as err:
            returncode = err.returncode
            if returncode == 2:
                strerror = str(err)
                custom_message = f"{Colors.RED}The system cannot find the file specified.\nPath to file: {pcap_path}{Colors.RESET}"
                # add_note() requires Python 3.11 or newer
                err.add_note(f"{strerror}\n{custom_message}")
            raise
        # collect extracted objects before the temporary directory is removed
        tmpdir_path = Path(tmpdir)
        exported = [
            PcapExport(object_type, tmp_file.name, tmp_file.read_bytes())
            for tmp_file in tmpdir_path.iterdir()  # iterdir() already yields full paths
            if tmp_file.is_file()
        ]
        return exported


if __name__ == "__main__":
    if os.name == "nt":
        # makes the Windows console interpret ANSI escape codes
        os.system("color")
    args = sys.argv[1:]
    if not args:
        sys.exit(f"usage: python {sys.argv[0]} <pcap_path>")
    pcap_path = args[0]
    exported_objects = export_all_objects(pcap_path=pcap_path)
    for item in exported_objects:
        print(item)
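Since the script returns the exported objects in memory, a caller may want to persist them. Below is a minimal sketch of one way to do that. The save_exports helper and its directory layout (out_dir/protocol/name) are my own assumptions for illustration, not part of the script above or of pyshark; PcapExport is redefined only so the snippet runs on its own.

```python
from pathlib import Path
from typing import Iterable, NamedTuple


class PcapExport(NamedTuple):
    # Same fields as the tuple in the script above, repeated for a standalone example.
    protocol: str
    name: str
    data: bytes


def save_exports(exports: Iterable[PcapExport], out_dir: str) -> list[Path]:
    """Hypothetical helper: write each object to out_dir/<protocol>/<name>."""
    written = []
    for item in exports:
        # Keep only the last path component so a name like "../x"
        # cannot end up outside the output directory.
        safe_name = Path(item.name).name
        if safe_name in ("", ".", ".."):
            safe_name = "unnamed"
        target_dir = Path(out_dir) / item.protocol
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / safe_name
        # Objects sharing a protocol and name overwrite each other in this sketch.
        target.write_bytes(item.data)
        written.append(target)
    return written
```

With the script above this could be called as save_exports(export_all_objects(pcap_path), "exported"), where "exported" is an arbitrary directory name.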
Example pcaps for testing/debugging:
- https://wiki.wireshark.org/SampleCaptures
- https://wiki.wireshark.org/uploads/__moin_import__/attachments/SampleCaptures/http_with_jpegs.cap.gz
- https://www.malware-traffic-analysis.net/training/exporting-objects.html
How to quickly test this code?
- Download the http_with_jpegs.cap.gz file (the middle one in the list above) and decompress it.
- The only dependency is the rich package (for nicer output), so run the following commands in your terminal:
pip install rich
python .\pcap_parser.py .\http_with_jpegs.cap
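The sample capture is distributed gzip-compressed, while the command above expects the plain .cap file. If no gunzip tool is at hand, the decompression can be done from Python. This is only a sketch using the standard library; gunzip_capture is a hypothetical helper name, not something the script or pyshark provides.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_capture(gz_path: str) -> Path:
    """Decompress e.g. http_with_jpegs.cap.gz next to itself; return the new path."""
    src = Path(gz_path)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got: {src.name}")
    # with_suffix("") drops only the trailing ".gz", leaving "....cap"
    dst = src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For the file from the list above, gunzip_capture("http_with_jpegs.cap.gz") would write http_with_jpegs.cap into the same directory.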
I wonder if this could become part of the pyshark library. The code is a proof of concept, and the colors/print usage (termcolor/rich/ANSI coloring) is only there for nicer output while debugging. Personally, though, I would keep some sort of red styling for error handling.
I'd like to know whether you like this idea. If so, I could modify the code to fit your package's needs. I'm waiting for your feedback. Thanks!
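If the colors were dropped for a library version, one possible direction is to raise dedicated exceptions and leave any styling to the caller. The sketch below is only an illustration of that idea: TsharkError, TsharkNotFoundError, TsharkExportError and run_tshark are hypothetical names I made up, not existing pyshark classes or functions.

```python
import subprocess


class TsharkError(RuntimeError):
    """Hypothetical base class for failures while running tshark."""


class TsharkNotFoundError(TsharkError):
    """The tshark executable could not be started."""


class TsharkExportError(TsharkError):
    """tshark ran but exited with a non-zero status."""

    def __init__(self, returncode: int, stderr: str):
        super().__init__(f"tshark exited with status {returncode}: {stderr.strip()}")
        self.returncode = returncode
        self.stderr = stderr


def run_tshark(command: list[str]) -> None:
    """Run a tshark command line and translate failures into TsharkError subclasses."""
    try:
        # capture_output keeps tshark's stderr available for the error message
        subprocess.run(command, check=True, capture_output=True, text=True)
    except FileNotFoundError as err:
        raise TsharkNotFoundError(f"tshark executable not found: {command[0]}") from err
    except subprocess.CalledProcessError as err:
        raise TsharkExportError(err.returncode, err.stderr or "") from err
```

A command-line front end could then catch TsharkError and print the message in red, while library users would get plain exceptions without embedded escape codes.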
Describe alternatives you've considered
- I haven't found any Python-based alternative for this purpose.