Description
I think there are a number of flaws in corr_mask_2_rsp_rsp
from https://github.com/fortran-lang/stdlib/blob/stdlib-fpm/src/stdlib_stats_corr.f90. It is listed below.
module function corr_mask_2_rsp_rsp(x, dim, mask) result(res)
real(sp), intent(in) :: x(:, :)
integer, intent(in) :: dim
logical, intent(in) :: mask(:,:)
real(sp) :: res(merge(size(x, 1), size(x, 2), mask = 1<dim)&
, merge(size(x, 1), size(x, 2), mask = 1<dim))
integer :: i, j
real(sp) :: centeri_(merge(size(x, 2), size(x, 1), mask = 1<dim))
real(sp) :: centerj_(merge(size(x, 2), size(x, 1), mask = 1<dim))
logical :: mask_(merge(size(x, 2), size(x, 1), mask = 1<dim))
select case(dim)
case(1)
do i = 1, size(res, 2)
do j = 1, size(res, 1)
mask_ = merge(.true., .false., mask(:, i) .and. mask(:, j))
centeri_ = merge( x(:, i) - mean(x(:, i), mask = mask_),&
0._sp,&
mask_)
centerj_ = merge( x(:, j) - mean(x(:, j), mask = mask_),&
0._sp,&
mask_)
res(j, i) = dot_product( centerj_, centeri_)&
/sqrt(dot_product( centeri_, centeri_)*&
dot_product( centerj_, centerj_))
end do
end do
case(2)
do i = 1, size(res, 2)
do j = 1, size(res, 1)
mask_ = merge(.true., .false., mask(i, :) .and. mask(j, :))
centeri_ = merge( x(i, :) - mean(x(i, :), mask = mask_),&
0._sp,&
mask_)
centerj_ = merge( x(j, :) - mean(x(j, :), mask = mask_),&
0._sp,&
mask_)
res(j, i) = dot_product( centeri_, centerj_)&
/sqrt(dot_product( centeri_, centeri_)*&
dot_product( centerj_, centerj_))
end do
end do
case default
call error_stop("ERROR (corr): wrong dimension")
end select
end function corr_mask_2_rsp_rsp
The spacing is poor. Why is there a continuation line such as
0._sp,&
mask_)
Continuation lines should only be used when a line is too long. Another problem is that merge is used when one of the tsource
and fsource
arguments require computation. Since Fortran does not mandate short-circuiting, an if block should be used instead. It also looks like the row and column means are computed more often than needed. If they are needed, they should be computed once and stored. The correlation equals the covariance divided by the product of the standard deviations. The standard deviations of the rows or columns should be computed once and stored. Instead, in a line such as
res(j, i) = dot_product( centeri_, centerj_)&
/sqrt(dot_product( centeri_, centeri_)*&
dot_product( centerj_, centerj_))
the standard deviations for the denominator are computed repeatedly instead of being stored and reused.
Finally,
mask_ = merge(.true., .false., mask(:, i) .and. mask(:, j))
should be written
mask_ =mask(:, i) .and. mask(:, j)
as I wrote in another issue #953.