Improve distance binning for FF,RR,FR,RF pairs "scalings" in stats output #81

@sergpolly

https://github.com/mirnylab/pairtools/blob/d1ddf9c39a336662f7fc725fa5a70ec68df9ba95/pairtools/pairtools_stats.py#L147

Consider replacing it with something more readable and usable, e.g. @mimakaev's robust bins:

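import numpy as np

# ~10 bins per order of magnitude, from 10**0 to 10**9
bins = np.logspace(0, 9, num=9 * 10 + 1, dtype=int)
# integer truncation creates duplicate edges at short distances; drop them
bins = np.unique(bins)
# sort the bin widths so they never decrease, then rebuild the edges
bins = np.cumsum(np.sort(np.r_[1, np.diff(bins)]))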

Currently we have:

min_log10_dist = 0
max_log10_dist = 9
log10_dist_bin_step = 0.25
# plain int replaces np.int, which newer NumPy releases no longer provide
bins = np.r_[0, np.round(10**np.arange(min_log10_dist, max_log10_dist + 0.001, log10_dist_bin_step)).astype(int)]

which are also non-decreasing, but are too sparsely spaced (a step of 0.25 in log10 gives only 4 bins per order of magnitude), and the code is hard to read.
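
To make the comparison concrete, here is a minimal sketch that wraps both schemes in functions and checks the properties discussed above. The function names `robust_log_bins` and `current_log_bins` and their parameters are made up for illustration and are not part of pairtools; plain `int` is used instead of `np.int` so the snippet also runs on recent NumPy.

```python
import numpy as np


def robust_log_bins(max_log10=9, bins_per_decade=10):
    """Proposed scheme: roughly `bins_per_decade` integer edges per decade."""
    edges = np.logspace(0, max_log10, num=max_log10 * bins_per_decade + 1, dtype=int)
    # Truncation to int produces repeated edges at short distances; drop them.
    edges = np.unique(edges)
    # Sort the bin widths so they never decrease, then rebuild the edges.
    return np.cumsum(np.sort(np.r_[1, np.diff(edges)]))


def current_log_bins(min_log10=0, max_log10=9, step=0.25):
    """Scheme quoted above: a leading 0, then rounded powers of 10 every `step`."""
    exponents = np.arange(min_log10, max_log10 + 0.001, step)
    return np.r_[0, np.round(10 ** exponents).astype(int)]


robust = robust_log_bins()
current = current_log_bins()

# Number of edges produced by each scheme.
print(len(robust), len(current))
# First few edges of each scheme, to compare resolution at short distances.
print(robust[:12])
print(current[:12])
# True if the robust edges strictly increase and their widths never shrink.
print(np.all(np.diff(robust) > 0), np.all(np.diff(robust, n=2) >= 0))
```

If something like this were adopted, `bins_per_decade` would be the single knob for resolution, which seems easier to reason about than a fractional log10 step.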
