I think the outer shortcut connection is unnecessary for RRDB. Each dense block already has its own shortcut connection, so the original input is already propagated through the dense blocks. Adding an extra outermost shortcut connection is equivalent to
output = (1 + beta) * input + beta * residual(input)
where beta is 0.2 in this repo. In other words, the input is scaled by a factor of (1 + beta). When many RRDBs are stacked, the input component grows exponentially, as (1 + beta)^(n_blocks). Perhaps this is why residual scaling is necessary for ESRGAN?
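
To make the growth concrete, here is a rough numeric sketch, not the repo's actual code. It assumes each dense block computes `x + beta * residual(x)` and that the outer shortcut has the form `x + beta * out`, which is what the formula above implies. The names `dense_block`, `rrdb`, and `identity_gain` are made up for illustration, and the residual branch is set to zero so that only the shortcut paths are measured.

```python
def dense_block(x, beta, residual):
    # Assumed form of one dense block with its own shortcut.
    return x + beta * residual(x)


def rrdb(x, beta, residual, n_dense=3):
    # Chain the dense blocks, then apply the outer shortcut (assumed form).
    out = x
    for _ in range(n_dense):
        out = dense_block(out, beta, residual)
    return x + beta * out


def identity_gain(n_blocks, beta=0.2):
    # Zero out the residual branch to isolate how the shortcuts alone
    # scale a unit input after n_blocks stacked RRDBs.
    zero_residual = lambda x: 0.0
    x = 1.0
    for _ in range(n_blocks):
        x = rrdb(x, beta, zero_residual)
    return x


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, identity_gain(n), (1 + 0.2) ** n)
```

With the residual branch silenced, the two printed columns should agree up to floating-point rounding, i.e. the shortcuts alone give a gain of (1 + beta)^(n_blocks). Dropping the outer shortcut in this toy model (returning `out` instead of `x + beta * out`) would leave the gain at 1.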