Skip to content

Wrong derivatives for cholesky and symmetric eigen decomposition #1803

Open
@IvanYashchuk

Description

@IvanYashchuk

Description

Currently, derivatives of cholesky_decompose and eigenvalues_sym return lower triangular matrix.
This is what finite differences return because the implementation of those functions considers only part of the input matrix. That's why this bug is ignored by the test system. I think that the correct derivative matrix should be symmetric and not lower triangular. This also affects any other function involved in the computation and the calculated autodiff derivative is not correct anymore.

Example

Consider calculating log det A when A is an invertible symmetric matrix. This can be done using log_determinant or with cholesky_decompose, eigenvalues_sym. The derivative of logdetA is the inverse of A. The code below demonstrates that a.adj() is the lower triangular part of the inverse of A, while it is expected to be the full symmetric matrix.

#include <stan/math.hpp>
#include <iostream>

int main() {
    using stan::math::eigenvalues_sym;
    using stan::math::cholesky_decompose;
    using stan::math::determinant;
    using stan::math::log_determinant;
    using stan::math::diagonal;
    using stan::math::log;
    using stan::math::sum;

    stan::math::matrix_d a(4, 4);
    // Random symmetric matrix
    a << 1.8904,  0.7204, -0.1599,  1.2028,
         0.7204,  7.3394,  2.0895, -0.6151,
        -0.1599,  2.0895,  2.4601, -1.7219,
         1.2028, -0.6151, -1.7219,  4.5260;

    stan::math::matrix_v a1(a);
    stan::math::matrix_v a2(a);
    stan::math::matrix_v a3(a);
    stan::math::matrix_v a4(a);

    std::cout << "Here is the input matrix a:\n" << a << std::endl;

    // Way 1
    stan::math::matrix_v chol_mat = cholesky_decompose(a1);
    auto logdet1 = 2 * sum(log(diagonal(chol_mat)));

    // Way 2
    auto w = eigenvalues_sym(a2);
    auto logdet2 = sum(log(w));

    // Way 3
    auto det = determinant(a3);
    auto logdet3 = log(det);

    // Way 4
    auto logdet4 = log_determinant(a4);

    std::cout << "logdet1 = " << logdet1 << std::endl;
    std::cout << "logdet2 = " << logdet2 << std::endl;
    std::cout << "logdet3 = " << logdet3 << std::endl;
    std::cout << "logdet4 = " << logdet4 << std::endl;
    std::cout << "Are these all log(det(A))? " << ( (logdet1 - logdet2 < 1e-8) && (logdet2 - logdet3 < 1e-8) ? "True" : "False") << std::endl;

    stan::math::matrix_d a_inv = stan::math::inverse(a);
    std::cout << "Here is the matrix a_inv:\n" << a_inv << std::endl;

    stan::math::set_zero_all_adjoints();
    logdet1.grad();
    std::cout << "Here is the matrix a1_adj:\n" << a1.adj() << std::endl;
    bool way_1 = a_inv.val().isApprox(a1.adj());

    stan::math::set_zero_all_adjoints();
    logdet2.grad();
    std::cout << "Here is the matrix a2_adj:\n" << a2.adj() << std::endl;
    bool way_2 = a_inv.val().isApprox(a2.adj());

    stan::math::set_zero_all_adjoints();
    logdet3.grad();
    std::cout << "Here is the matrix a3_adj:\n" << a3.adj() << std::endl;
    bool way_3 = a_inv.val().isApprox(a3.adj());

    stan::math::set_zero_all_adjoints();
    logdet4.grad();
    std::cout << "Here is the matrix a4_adj:\n" << a4.adj() << std::endl;
    bool way_4 = a_inv.val().isApprox(a4.adj());

    std::cout << "Is way 1 correct? " << (way_1 ? "True" : "False") << std::endl;
    std::cout << "Is way 2 correct? " << (way_2 ? "True" : "False") << std::endl;
    std::cout << "Is way 3 correct? " << (way_3 ? "True" : "False") << std::endl;
    std::cout << "Is way 4 correct? " << (way_4 ? "True" : "False") << std::endl;
}

Current Version:

v3.1.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions