Comments on stdlib_stats_corr.f90

I think there are a number of flaws in `corr_mask_2_rsp_rsp` from https://github.com/fortran-lang/stdlib/blob/stdlib-fpm/src/stdlib_stats_corr.f90. It is listed below.

```fortran
    module function corr_mask_2_rsp_rsp(x, dim, mask) result(res)
      real(sp), intent(in) :: x(:, :)
      integer, intent(in) :: dim
      logical, intent(in) :: mask(:,:)
      real(sp) :: res(merge(size(x, 1), size(x, 2), mask = 1<dim)&
                          , merge(size(x, 1), size(x, 2), mask = 1<dim))

      integer :: i, j
      real(sp) :: centeri_(merge(size(x, 2), size(x, 1), mask = 1<dim))
      real(sp) :: centerj_(merge(size(x, 2), size(x, 1), mask = 1<dim))
      logical :: mask_(merge(size(x, 2), size(x, 1), mask = 1<dim))

      select case(dim)
        case(1)
          do i = 1, size(res, 2)
            do j = 1, size(res, 1)
             mask_ = merge(.true., .false., mask(:, i) .and. mask(:, j))
             centeri_ = merge( x(:, i) - mean(x(:, i), mask = mask_),&
                0._sp,&
                mask_)
             centerj_ = merge( x(:, j) - mean(x(:, j), mask = mask_),&
                0._sp,&
                mask_)

              res(j, i) = dot_product( centerj_, centeri_)&
               /sqrt(dot_product( centeri_, centeri_)*&
                     dot_product( centerj_, centerj_))

            end do
          end do
        case(2)
          do i = 1, size(res, 2)
            do j = 1, size(res, 1)
             mask_ = merge(.true., .false., mask(i, :) .and. mask(j, :))
             centeri_ = merge( x(i, :) - mean(x(i, :), mask = mask_),&
                0._sp,&
                mask_)
             centerj_ = merge( x(j, :) - mean(x(j, :), mask = mask_),&
                0._sp,&
                mask_)

              res(j, i) = dot_product( centeri_, centerj_)&
               /sqrt(dot_product( centeri_, centeri_)*&
                     dot_product( centerj_, centerj_))
            end do
          end do
        case default
          call error_stop("ERROR (corr): wrong dimension")
      end select

    end function corr_mask_2_rsp_rsp
```

The spacing is poor. Why is there a continuation line such as

```fortran
                0._sp,&
                mask_)
```
Continuation lines should only be used when a line is too long. Another problem is that merge is used when one of the `tsource` and `fsource` arguments require computation. Since Fortran does not mandate short-circuiting, an if block should be used instead. It also looks like the row and column means are computed more often than needed. If they are needed, they should be computed once and stored. The correlation equals the covariance divided by the product of the standard deviations. The standard deviations of the rows or columns should be computed once and stored. Instead, in a line such as

```fortran
              res(j, i) = dot_product( centeri_, centerj_)&
               /sqrt(dot_product( centeri_, centeri_)*&
                     dot_product( centerj_, centerj_))
```

the standard deviations for the denominator are computed repeatedly instead of being stored and reused. 

Finally,

`mask_ = merge(.true., .false., mask(:, i) .and. mask(:, j))`

should be written

`mask_ =mask(:, i) .and. mask(:, j)`

as I wrote in another issue https://github.com/fortran-lang/stdlib/issues/953.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Comments on stdlib_stats_corr.f90 #955

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Comments on stdlib_stats_corr.f90 #955

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions