Replies: 11 comments
-
|
also should we expose
|
-
|
So the problem is matching bins between cooler and pairtools. It's a little tricky. There are a few options.
Current proposal for nice cooler bins: 1,2,3,4,5,6,8,10,13,16,20,25,32,40,50,63,79,100. At 1kb those bins become: 1000,2000,3000,4000,5000,6000,8000,10000,13000,16000,20000,25000,32000,40000,50000,63000,79000... The two are clearly different.
pairtools bins not matched to cooler: pairtools=
pairtools bins matched to cooler at 1kb:
pairtools bins matched to cooler at 1kb and 100bp/200bp, if the 100bp/200bp cooler uses modified bins:
And bins matched to cooler at 100, 200, 1000bp resolutions, with extra bins for pairs.
I couldn't think of a more general solution. Powers of two obviously have one, but not here... |
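To make "matched" concrete, here is a minimal sketch that checks which cooler bin edges (converted to bp at a given resolution) are missing from a pair-level bin list. The helper names `cooler_edges_bp` and `unmatched_edges` are hypothetical and are not part of cooler, cooltools, or pairtools. The pair-level list is the candidate `bins` list posted later in this thread; the nice cooler bins are the ones proposed above.

```python
# Nice cooler bins (in units of cooler bins), copied from the proposal above.
NICE_COOLER_BINS = [1, 2, 3, 4, 5, 6, 8, 10, 13, 16, 20, 25, 32, 40, 50, 63, 79, 100]

# Candidate pair-level bins (in bp), copied from a later comment in this thread.
PAIR_BINS = [10, 13, 16, 20, 25, 32, 40, 50, 63, 79, 100, 126, 159, 200, 240, 300,
             400, 490, 600, 800, 1000, 1200, 1600, 2000, 2400, 3000, 4000, 5000,
             6000, 8000, 10000, 13000, 16000, 20000, 25000, 32000, 40000, 50000,
             63000]


def cooler_edges_bp(resolution_bp, nice_bins=NICE_COOLER_BINS):
    """Convert nice bin edges from units of cooler bins to base pairs."""
    return [b * resolution_bp for b in nice_bins]


def unmatched_edges(cooler_edges, pair_edges):
    """Cooler edges inside the pair-bin range that are not pair-bin edges."""
    pair_set = set(pair_edges)
    lo, hi = min(pair_edges), max(pair_edges)
    return [e for e in cooler_edges if lo <= e <= hi and e not in pair_set]


for resolution in (100, 1000):
    missing = unmatched_edges(cooler_edges_bp(resolution), PAIR_BINS)
    print(f"{resolution} bp cooler: unmatched edges = {missing}")
```

With these two lists, the 1kb cooler edges all coincide with pair-level edges, while the unmodified nice bins at 100bp do not, which is consistent with the remark that a 100bp/200bp cooler would need modified bins.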
-
|
@golobor @sergpolly @agalitsyna - what do you guys think? We should probably decide on this before we merge in cooltools logbin_expected. -- Should we aim at matching at one resolution, or at two, or matching at all? |
-
|
alternatively, we can let users decide between two options: (a) keep bins
nice within all orders of magnitude, (b) do not use nice bins at all.
|
-
|
IMHO, ~100 bp is needed, at least for pair-level stuff, because of DNase/MNase-based methods like microC, OmniC, and whateverC might happen "tomorrow". 100bp coolers for microC aren't a crazy thing to do, so perhaps it makes sense to match it like @mimakaev suggested:
but this would only work for high-resolution coolers and wouldn't be applicable to sparse data (<50-100M usable pairs in a cooler). So, like @golobor is suggesting, this matching between bins for coolers and pairs could be optional. Another IMHO: I don't think it is THAT crucial to match bins for |
-
|
Yeah, that would probably be ideal. I will engineer that set a little better and make sure it is actually matched. |
-
|
These are ratios of neighboring pair bins in the current version of bins.
bins = [10,13,16,20,25,32,40,50,63,79,100,126,159,200,240,300,400,490,600,800,1000,1200,1600,2000,2400,3000,4000,5000,6000,8000,10000,13000,16000,20000,25000,32000,40000,50000,63000]
Bins for 100bp and 200bp (just without 100 and 300) |
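For reference, the ratios of neighboring bins can be recomputed directly from the list above. This is a minimal sketch; `neighbor_ratios` is a hypothetical helper name, not a pairtools function.

```python
# Pair-level bins (in bp), copied from the comment above.
bins = [10, 13, 16, 20, 25, 32, 40, 50, 63, 79, 100, 126, 159, 200, 240, 300,
        400, 490, 600, 800, 1000, 1200, 1600, 2000, 2400, 3000, 4000, 5000,
        6000, 8000, 10000, 13000, 16000, 20000, 25000, 32000, 40000, 50000,
        63000]


def neighbor_ratios(edges):
    """Ratio of each bin edge to the one before it."""
    return [right / left for left, right in zip(edges, edges[1:])]


ratios = neighbor_ratios(bins)
print(f"smallest step: x{min(ratios):.3f}, largest step: x{max(ratios):.3f}")
```

A perfectly log-spaced list would have a constant ratio, so the spread between the smallest and largest step is a quick measure of how far the nice bins deviate from that.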
-
|
i'll repeat this, but - why not make bins nice in all orders of magnitude?
(
[1,2,3,4,5,6,8,10]
+ [1,2,3,4,5,6,8,10] * 10
+ [1,2,3,4,5,6,8,10] * 100
)
What negative consequences would this have?
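The expression above is shorthand: in Python, `list * 10` repeats the list rather than scaling its elements, and the value 10 would appear twice (as 10 and as 1 * 10). A runnable sketch of the same idea, with `nice_bins` as a hypothetical helper name, could look like this:

```python
def nice_bins(base=(1, 2, 3, 4, 5, 6, 8, 10), n_orders=3):
    """Repeat the same nice values in every order of magnitude.

    Edges shared between neighboring decades (10, 100, ...) are kept once.
    """
    edges = {b * 10 ** k for k in range(n_orders) for b in base}
    return sorted(edges)


print(nice_bins())
```

The number of decades (`n_orders`) is an assumption made here for illustration; the original expression shows three.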
On Fri, 6 Mar 2020 at 01:55, Maksim Imakaev wrote:
[image: image] <https://user-images.githubusercontent.com/9454715/76039728-a7af5080-5f23-11ea-9501-8f288256dfa4.png>
These are ratios of neighboring pair bins in the current version of bins.
bins = [10,13,16,20,25,32,40,50,63,79,100,126,159,200,240,300,400,490,600,800,1000,1200,1600,2000,2400,3000,4000,5000,6000,8000,10000,13000,16000,20000,25000,32000,40000,50000,63000]
Bins for 100bp and 200bp (just without 100 and 300)
100,200,300,400,600,800,1000,1200,1600,2000,2400,3000,4000,5000,6000,8000,10000,13000,16000,20000,25000,32000,40000,50000,63000
|
-
|
ok, now I get it. A large negative consequence is a two-fold jump from 1 to 2. Could have used 1 2 5 10 instead - that's at least even. A partial remedy is to use these bins, and drop #2,3,5 in the first order of magnitude |
-
|
I will convert this to the discussion for now, but feel free to comment or open an issue if binning improvements are needed! |
-
|
Since 1.0.0 we'll have |

-
https://github.com/mirnylab/pairtools/blob/d1ddf9c39a336662f7fc725fa5a70ec68df9ba95/pairtools/pairtools_stats.py#L147
consider replacing it with something more readable and usable, e.g. @mimakaev's robust bins:
currently we have:
which are also non-decreasing, but are too sparsely spaced ... and the code is hard to read
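As an illustration of what a more readable binning could look like, here is a minimal sketch of strictly increasing, roughly log-spaced integer bin edges using only the standard library. It is not the code linked above and not @mimakaev's implementation; the function name `log_integer_bins` and the `bins_per_decade` parameter are assumptions made for this example.

```python
import math


def log_integer_bins(lo, hi, bins_per_decade=10):
    """Strictly increasing integer edges, roughly log-spaced from lo to hi.

    Edges that round to the same integer at short distances are merged,
    so the result never contains repeated or decreasing values.
    """
    n_steps = max(1, round(math.log10(hi / lo) * bins_per_decade))
    raw = (lo * (hi / lo) ** (i / n_steps) for i in range(n_steps + 1))
    return sorted({int(round(x)) for x in raw})


print(log_integer_bins(1, 1000))
```

With ten steps per decade, rounding to integers happens to give 1,2,3,4,5,6,8,10 in the first decade, the same values as the nice cooler bins discussed in the replies.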