-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add uniform noise op #554
base: master
Are you sure you want to change the base?
Add uniform noise op #554
Conversation
do { | ||
final double newVal = (rangeMax - rangeMin) * rng.nextDouble(true, true) + | ||
rangeMin + input.getRealDouble(); | ||
if (newVal <= input.getMaxValue() && newVal >= input.getMinValue()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what would be the least surprising behavior here. When I have an UnsignedByteType
image with all pixels at 255
, this would create negative noise only, thereby shifting the image mean to be <255
. When adding any noise, I'd expect the image mean to be essentially unchanged. Although that's a bit special case, how about making it optional if the result should be 1) clipped, 2) overflowing, or 3) some smarter way of adding noise, such as the one here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reply!
I think that clipping can be pretty easily done by manipulating the rangeMin
and rangeMax
parameters. rangeMax
just has to be set to 0
in this scenario.
I do not understand what you mean by overflow. Are you suggesting that you would like to have values in the ouptut that are greater than 255
? If so you can just convert your image to ShortType
or something else that allows larger values. Or are you meaning that it would be desired to wrap values with positive noise values back around to the lower values of the type (i.e. if given rangeMin = -4
and rangeMax = 4
we generate a noise value of 4
for your input
of 255
, thus the output
is set to 3
)? If so this is not correct, seeing as output < rangeMin
. Can you elaborate on what you mean, if you mean something different?
I guess by this I mean that I do not see any reason to add these options, since it all can be controlled by the type of the input and the rangeMin
and rangeMax
parameters. And you are right in that the mean of your example will be less than 255. However in this scenario I think that is expected, because there is no way to go above the starting value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry, when I said "overflow", I meant wrapping values from max to min and vice versa, e.g. 255 + 4 -> 3
. When working with ImgLib2 types (such as UnsignedByteType
), this is the observed behavior, and it actually comes in handy when working with cyclic colormaps (see also http://peterkovesi.com/projects/colourmaps/) to represent e.g. directionality. For that special case, it would be nice to be able to choose between wrapping, clipping, or possibly type-widening behavior for ops in general (from a user perspective).
I suggest caution: there is no guarantee that all ImgLib2 types will behave in this way. (@tpietzsch @axtimwalde please correct me if I am wrong. ) Behavior in this case is left up to the implementation. E.g., IIRC, there are some types that when you |
I don't remember a problem with |
Yes, I figured it would be undefined behavior (as always in ImgLib2, it's the callers responsibility to conform to types and intervals...), that's why i said "observed" behavior 😄 Ok, I will not rely on that, but the "cyclic scale" use case came to my mind as something where this would be really useful.
BTW, are there terms to distinguish between out-of-bounds for intervals (i.e. x,y,z,...) and out-of-bounds for types (i.e. values)? Anyhow, this is now far off-topic for this PR. I suggest to simply ignore my comments (and maybe just keep in mind that it would be useful to have consistent clipping/wrapping behavior across ops in general). |
This op was renamed to make it clear that the noise generated by this op is NON-uniform. This distinction is being made to separate this op from a uniform noise generation op (coming in a later commit)
In previous commits the op did not allow input image data to influence the output values. Now the algorithm allows, given a nonzero input image (in which all of the pixels do not have to be the same value), the input image to influence the output noise, i.e. if it is desired to add uniform noise with range [-4, 4] upon an imported image, the algorithm will now add a value x such that -4 <= x <= 4 to each pixel in the image, whatever value that may be, as long as it does not extend beyond the range of the type.
Previously the Op would recompute the calculation whenver the Mersenne Twister provided a random value that would result in the input going outside of the type bounds, in the hopes that another computation would provide a result that was inside the type bounds. Not only can this be slow, but also does not line up with what the user might expect. Suppose that the input to the Op is an UnsignedByteType of value 254, with rangeMin=0 and rangeMax=4. Since we have uniform noise, we would expect that the probability that the output has an equal probability of being 254, 255, 256, 257, 258 (with a probability of 0.2 for each). But we know that this UnsignedByteType cannot hold a value greater than 255. If the Op redoes the calculation every time the output is greater than 255, then we know that the output will be 254 half of the time and 255 half of the time. Therefore no probability matches what might be expected. Consider the same example, but instead of a recomputation we clamp the output when it is outside the type bounds. We now have P(output=254)=0.2, P(output=255)=0.8. In this case, all outputs within the type range will have the expected probability of occurring, which makes more sense than the scenario explained above. This is why the change was made.
67f7e97
to
04af3fe
Compare
Alright, I gave this PR some TLC. @imagejan, after some thinking I realized that I liked your solution a bit better, as I described in 5440061. I found it difficult to treat floating point and integer types the same, however (see 04af3fe), so I created two different Ops. I did, however, leave the computations for the integer type Op in |
This allows us to handle integer types and realTypes separately. The reason for this is as follows: It seems clear that we would want to allow floating point randomness when our type allows floating points If we always base our calculations on floating points, this results in edge effects when we add noise to integer types. This is due to rounding: suppose we have a pseudorandomly generated double between 0 and 2. Most IntegerTypes simply round that double to a long, integer, etc in setReal. Thus the probability we add 0 to the input is 0.25, 1 to the input is 0.5, and 2 to the input is 0.25. To get around this case we want to generate the random number differently when we have an integer type. For these reasons we have two different Ops to perform the different noise generation and have shared logic for setting the output. N.B. We do not allow wrapping for floating point types. This is because we operate within the following constraint: the number after the max of the type should map to the minimum of the type. In the case of floating points, what then should happen to a number equal to 0.5 + the maximum of the type? In this paradigm, we would expect it to be the minimum of the type - 0.5; this is impossible, however, since this would be outside of the range of the type. For this reason, we always clamp.
The default seed resulted in non-random noise. Removing it ensures that the result is different every time (unless the user passes a seed)
Thanks for catching that @rimadoma! Should be updated in the uniform noise Ops. It looks like
Sure, yes. I used |
The previous default seed parameter was correct. One rule of ops is that they are deterministic. If you want TRULY RANDOM noise in your script, then you must generate a random number (externally to ops) and pass that as the seed. I know it's annoying, but it's one of the guarantees of ops. Therefore, we also need to be careful about multithreaded pseudorandomness. Just because |
This PR adds an op to add uniformly distributed noise throughout an image, as an alternative to the non-uniform
AddGaussianNoise
op, renamed in this PR fromAddNoise
.