QR Decompositions and Rank Deficient Matrices

How to decompose rank deficient matrices

2020-10-04 5 min read Mathematics

Problem Formulation

We have developed fairly accurate algorithms for QR decompositions in a series of recent posts. However, all algorithms that we considered so far are specifically tailored towards full rank matrices. If we consider for example the vectors $v=(1, 2, 3)^\top$ and $w=(7, 3, 1)^\top$ and build the matrix $M=vw^\top$, then $M$ is clearly a 3x3 matrix with rank 1. Applying the QR decomposition with classical Gram-Schmidt yields the matrix product.

$$ \underbrace{\begin{pmatrix} 0.267 & \text{NaN} & \text{NaN} \\ 0.534 & \text{NaN} & \text{NaN} \\ 0.801 & \text{NaN} & \text{NaN} \end{pmatrix}}_{=:Q} \begin{pmatrix} 26.191 & 11.225 & 3.741 \\ 0.0 & \text{NaN} & \text{NaN} \\ 0.0 & 0.0 & \text{NaN} \end{pmatrix} $$

We get similar results with the modified Gram-Schmidt algorithm as well as with the variant with reorthogonalisation.

The first column vector of $Q$ is correct, however all these NaNs are a hindrance. If we had 0s instead of the NaNs, the product $QR$ would yield our original input matrix.

Where do these NaNs come from?

The orthogonalisation process generates orthogonal vectors that span the same space as our original input column vectors. We discussed this in the post about Gram-Schmidt. If our input matrix doesn’t have full rank, then at some point there will be a vector which can be expressed as a linear combination of the previous ones. In this case the orthogonalisation process will return a 0 vector. In that case the normalization step will attempt to divide 0 by 0 which, according to our floating point standards yields a NaN.

Obviously the idea should be to check the norm of a column vector in the $Q$ matrix before normalizing it. In view of potential rounding issues (we know they are there), one should however come up with a strategy that is a bit more flexible than just checking whether the norm of a certain column is exactly 0. As before we use ideas from [1].

Real matrix rank vs. numerical matrix rank

The numerical rank of a matrix may differ from its true rank. Rounding issues with floating point numbers can suggest that a vector is in the span of the previously computed basis vectors even though in reality it is only very close and should have generated an additional orthogonal basis vector. The converse might also happen. We might compute an additional basis vector even though the current column vector was in the span of the previous ones. In our algorithms we must make a hard cut at a certain threshold, which might cause our column vectors of the Q matrix to be slightly off.

Handling rank deficiency

Previously we argued that if the norm of the current orthogonalized column vector is significantly smaller than the current column vector of our input matrix, then a reorthogonalisation is necessary. Now we extend that criteria and say that whenever the norm is below a small positive threshold, then that column vector is already in the span of the previously computed basis vectors and can be set to 0. In terms of code, this looks as follows.

function qr_reorth_rank(matrix, reorth_threshold=10, rank_deficiency_factor=100, update_r=false)
    # perform a QR decomposition using classical Gram Schmidt with reorthogonalisation
    # We use the rule of thumb suggested by Rutishauser to decide when to reorthogonalise
    # Note that this version does not work for rank deficient setups!
    num_vectors = size(matrix)[2]
    orth_matrix = copy(matrix)
    r_matrix = zeros(num_vectors, num_vectors)
    for vec_idx = 1:num_vectors
        norm_orth_column = 0
        perform_reorthogonalisation = true
        while (perform_reorthogonalisation)
            norm_current_column = norm(orth_matrix[:, vec_idx])
            for span_base_idx = 1:(vec_idx - 1) # orthogonalisation
                projection_length = dot(orth_matrix[:, span_base_idx], orth_matrix[:, vec_idx])
                if (norm_orth_column == 0)
                    # do not update R when performing reorthogonalisation
                    # norm_orth_column is exactly 0 on the first pass thgough
                    # this ensures that R gets updated properly
                    r_matrix[span_base_idx, vec_idx] = projection_length
                elseif (update_r)
                    r_matrix[span_base_idx, vec_idx] += projection_length
                end
                orth_matrix[:, vec_idx] -= projection_length * orth_matrix[:, span_base_idx]
            end
            norm_orth_column = norm(orth_matrix[:, vec_idx])
            # Handle rank deficiency and check if reorthogonalisation is necessary
            if ((norm_orth_column < norm_current_column / reorth_threshold) &&
                (norm_orth_column > rank_deficiency_factor * eps() * norm_current_column))
                perform_reorthogonalisation = true
            else
                perform_reorthogonalisation = false
                if (norm_orth_column <= rank_deficiency_factor * eps() * norm_current_column)
                    orth_matrix[:, vec_idx] .= 0
                end
            end
        end
        if (norm(orth_matrix[:, vec_idx]) > eps())
            # linear dependend columns, we cannot normalise
            orth_matrix[:, vec_idx] = orth_matrix[:, vec_idx] / norm(orth_matrix[:, vec_idx])
        else
            orth_matrix[:, vec_idx] .= 0
        end
        r_matrix[vec_idx, vec_idx] = dot(orth_matrix[:, vec_idx], matrix[:, vec_idx])
    end
    return (orth_matrix, r_matrix)
end

The default values for the threshold parameters coincide with those suggested in [1]. In practice, they work surprisingly well.

Evaluation

We use similar test criteria as before. However, this time we cannot compare $Q^\top Q$ to the identity matrix. Instead, we compare the trace of $Q^\top Q$ with the rank of the matrix. Since Hilbert matrices have full rank, we cannot use them either. Therefore, we consider sums of rank 1 matrices with randomly generated vectors. All our matrices have size 512 times 512 but their rank varies from 1 to 512.

error_qr_reorth_rank_1 = []
error_qr_reorth_rank_2 = []

error_qr_reorth_rank_rupdate_1 = []
error_qr_reorth_rank_rupdate_2 = []

siz_m = 512
for max_rank = siz_m:-2:1
    H = zeros(siz_m, siz_m)
    for count = 1:max_rank
        vector = rand(siz_m, 1)
        H += vector*vector'
    end
    (Q_reorth_rank, R_reorth_rank) = qr_reorth_rank(H, 10, 100, false)
    (Q_reorth_rank_rupdate, R_reorth_rank_rupdate) = qr_reorth_rank(H, 10, 100, true)

    append!(error_qr_reorth_rank_1,         rank(H) - tr(Q_reorth_rank'*Q_reorth_rank))
    append!(error_qr_reorth_rank_rupdate_1, rank(H) - tr(Q_reorth_rank_rupdate'*Q_reorth_rank_rupdate))

    append!(error_qr_reorth_rank_2,         norm(H - Q_reorth_rank*R_reorth_rank, Inf))
    append!(error_qr_reorth_rank_rupdate_2, norm(H - Q_reorth_rank_rupdate*R_reorth_rank_rupdate, Inf))
end

The orthogonality of the non-zero column vectors in the matrix Q is comparable to what we had before in our analysis for the reorthogonalisation. As already mentioned, the number of non-zero columns in the $Q$ matrix is an estimate for the rank of the matrix. As we can see in the plot below, we are off by at most 2 for all tested matrices. In all cases the rank was overestimated.

The distribution in the rank errors is given in the following histogram. From 512 matrices, we computed the correct rank in 288 cases and were off by one in 212 cases. Only in 12 cases we were off by 2.