Context
Recently I was analyzing binary files containing grayscale images whose dimensions I did not know; Binocle helped me a great deal to visualize the contents.
What I felt was missing was a feature to help me tune the width for viewing the data. This is related to what is proposed in #44, but it goes a bit further: I didn't want to look for identical patterns, I wanted to look for "similar" patterns (since I am looking at images with noise).
Proposed Algorithm
I made a small prototype in Python of an algorithm that could propose some widths that give nice images (at least for my situation of simple 8-bit grayscale images; extension to RGB or uint16 data should certainly be possible).
The algorithm in a few steps:
- autocorrelate the data,
- accentuate the peaks by removing the baseline correlation,
- perform peak detection; each peak is a good proposal for the width.
Python prototype
This is a basic Python implementation/prototype of the idea:
import numpy as np
import matplotlib.pyplot as plt
import scipy.signal
from scipy.signal import find_peaks, correlate, correlation_lags
file = r"path to the file under analysis"
data = np.fromfile(file, dtype=np.uint8)
MAX_CORRELATION_LENGTH = 10000 # don't bother with anything more than 10000 wide
MEDIAN_KERNEL_SIZE = 9 # this will depend a bit on the data
PEAK_THRESHOLD = 10 # this depends on the normalization of the correlation and the signal; probably this can be determined in a smarter way
def autocorr(data, dtype=np.float32, optimize=False):
    # here I cast to float32 out of laziness: the algorithm needs a bit of precision to compute;
    # in uint8 it overflowed immediately, so this was the easy way out
    # the algorithm is quite okay for large data: correlate() uses FFT (so O(N log N)) behind the scenes
    cast_data = data.astype(dtype)
    corr = correlate(cast_data, cast_data, mode="full") / len(cast_data)
    lags = correlation_lags(len(cast_data), len(cast_data))
    # only the non-negative-lag half of the correlation is really informative, the other half is mirrored
    corr = corr[len(corr) // 2:]
    lags = lags[len(lags) // 2:]
    if optimize:
        # attempt to optimize: don't bother with lags beyond the maximum width of interest
        corr = corr[:MAX_CORRELATION_LENGTH]
        lags = lags[:MAX_CORRELATION_LENGTH]
    return lags, corr
# step 1: compute autocorrelation
lags, corr = autocorr(data)
# step 2: accentuate the peaks
# I like to see the autocorrelation signal as 3 contributions:
# 1. a slowly varying "baseline": memory locations close to each other (i.e. pixels left and right of each other) likely describe something correlated, while memory locations far away likely describe uncorrelated things
# 2. peaks: in structured data such as images, pixels that are above/below each other are also strongly correlated
# 3. "noise": anything that does not fit in the categories above
# with the median filter, I try to locally estimate the baseline correlation between adjacent memory locations and remove it
# (the size of the kernel will vary a bit per dataset and might not be enough for e.g. RGB images or wider datatypes)
# by subtracting the baseline, I am left with mostly parts 2 and 3; assuming 2 >> 3, the peak detection will do the rest
baseline = scipy.signal.medfilt(corr, MEDIAN_KERNEL_SIZE)
excess_corr = corr - baseline
# step 3: peak detection
pks, _ = find_peaks(excess_corr, height=PEAK_THRESHOLD)
pks = pks[lags[pks] < MAX_CORRELATION_LENGTH]
proposed_widths = lags[pks]
plt.figure()
plt.grid()
plt.plot(lags, excess_corr)
plt.plot(proposed_widths, excess_corr[pks], "xk") # black crosses are the suggested widths
plt.show()
Possible improvements
There are a few computational/speed optimizations that I see as possible:
- instead of filtering lags[pks] < MAX_CORRELATION_LENGTH after the fact, this could be taken into account when computing the autocorrelation (only computing it up to those lags) by manipulating the FFT spectra directly instead of using an existing correlation function (a sketch follows after this list)
- The peak threshold and peak detection probably can be replaced with a simpler mechanism: retaining the N largest correlation values and corresponding lags (second sketch below).
- A further refinement that can be done is to remove what I would call "harmonics" from the lags: the peaks will contain multiples of each other, and those can be removed. Intuitively, if you are visualizing an NxM image in binary form, you will get peaks at N, 2N, 3N, ... in the autocorrelation, since with a stride of N you get the pixel that is below the current one, with a stride of 2N the pixel 2 rows below the current one, etc. (and those tend to be correlated with the pixel itself). (third sketch below)
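Below is a minimal sketch of the first idea. It builds the truncated autocorrelation from the FFT directly, so the full-mode correlation array and its mirrored half are never materialized; note that the inverse transform itself is still full length, so computing strictly "only until those lags" would need something more exotic (e.g. a chirp z-transform). The name autocorr_truncated and the zero-padding choice are mine, not part of the prototype above.

import numpy as np
import scipy.fft

def autocorr_truncated(data, max_lag=10000):
    # intended to give the same output as autocorr(..., optimize=True) above,
    # but computed from the FFT directly instead of going through correlate()
    x = data.astype(np.float32)
    n = len(x)
    max_lag = min(max_lag, n)
    # zero-pad to at least 2*n so the circular correlation equals the linear one
    nfft = scipy.fft.next_fast_len(2 * n)
    spectrum = np.fft.rfft(x, nfft)
    power = spectrum * np.conj(spectrum)  # the FFT of the autocorrelation
    corr = np.fft.irfft(power, nfft)[:max_lag] / n
    lags = np.arange(max_lag)
    return lags, corr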
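A sketch of the second idea; top_n_widths is a hypothetical name, and n and min_lag are made-up defaults, not tuned values:

import numpy as np

def top_n_widths(lags, excess_corr, n=5, min_lag=2):
    # keep the n lags with the largest excess correlation instead of running find_peaks
    valid = lags >= min_lag  # lags 0 and 1 always correlate strongly but are useless as widths
    cand_lags = lags[valid]
    cand_corr = excess_corr[valid]
    # argpartition finds the n largest entries in O(len) without a full sort
    top = np.argpartition(cand_corr, -n)[-n:]
    top = top[np.argsort(cand_corr[top])[::-1]]  # strongest proposal first
    return cand_lags[top]

proposed_widths = top_n_widths(lags, excess_corr)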
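And a sketch of the harmonic removal; the tolerance on the ratio is an assumption, meant to absorb peaks that land a sample or two away from an exact multiple:

import numpy as np

def remove_harmonics(widths, rel_tol=0.01):
    # walk the proposals from small to large and drop any width that is (close to)
    # an integer multiple of an already kept, smaller width; a ratio close to 1
    # also counts, so near-duplicate lags get merged as a side effect
    kept = []
    for w in sorted(widths):
        if not any(abs(w / k - round(w / k)) < rel_tol for k in kept):
            kept.append(w)
    return np.array(kept)

For the NxM example above, the peaks at 2N and 3N divide cleanly by the kept peak at N and are discarded, leaving N as the single proposal.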