Hi Andres and Christoph,
Here are three ways to compute the sensitivities dQ, dR in
QR + \varepsilon \, dQ\, R + \varepsilon \, Q \, dR + \mathcal{O}(\varepsilon^2) = A + \varepsilon \, dA.
using LinearAlgebra
using FiniteDifferences

# Reference solution: forward-mode sensitivities via finite differences.
function dqr_fdm(A,dA)
    @assert size(A) == size(dA)
    m,n = size(A)
    fdm = central_fdm(5,1)  # fifth-order central difference, first derivative
    dQ,dR = jvp(
        fdm,
        A->begin
            Q,R = qr(A)
            return Q[:,1:n],R  # thin Q
        end,
        (A,dA)
    )
    return dQ,dR
end
using ForwardDiff

# Forward-mode sensitivities via dual numbers: seed A with the
# perturbation dA and read the partials off the factors.
function dqr_dual(A,dA)
    @assert size(A) == size(dA)
    m,n = size(A)
    Q,R = qr(ForwardDiff.Dual.(A,dA))
    dQ = ForwardDiff.partials.(Matrix(Q),1)  # Matrix(Q) materialises the thin Q
    dR = ForwardDiff.partials.(R,1)
    return dQ,dR
end
# Explicit forward-mode formulae for the thin QR factorisation.
function dqr_expl(A,dA)
    @assert size(A) == size(dA)
    m,n = size(A)
    Q,R = qr(A)
    Q = Matrix(Q)             # thin Q (m×n)
    QdA = Q'*dA
    QdAR = QdA/R
    X = tril(QdAR,-1)
    X = X - X'                # X = Q'*dQ is skew-symmetric
    dQ = Q*X + dA/R - Q*QdAR  # tangential part plus normal part (I - Q*Q')*dA/R
    dR = QdA - X*R
    return dQ,dR
end
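For reference, the identities used in dqr_expl follow from differentiating the thin factorisation A = QR, where Q \in \mathbb{R}^{m \times n} has orthonormal columns and R \in \mathbb{R}^{n \times n} is upper triangular. A sketch of the derivation:

```latex
% Product rule on A = QR, then multiply by Q^T on the left
% and R^{-1} on the right:
dA = dQ\,R + Q\,dR
\quad\Longrightarrow\quad
Q^T dA\,R^{-1} = Q^T dQ + dR\,R^{-1}.
% Differentiating Q^T Q = I shows that X := Q^T dQ is skew-symmetric,
% while dR\,R^{-1} is upper triangular, so X is determined by the
% strictly lower triangle of Q^T dA\,R^{-1}:
X = \operatorname{tril}\!\bigl(Q^T dA\,R^{-1},-1\bigr)
  - \operatorname{tril}\!\bigl(Q^T dA\,R^{-1},-1\bigr)^T,
% and hence
dR = Q^T dA - X R,
\qquad
dQ = Q X + (I - Q Q^T)\,dA\,R^{-1}.
```

The last line is exactly what the code computes: `Q*X + dA/R - Q*QdAR` is the tangential part Q X plus the normal part (I - Q Q^T) dA R^{-1}.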
using Test
# Check that all three implementations agree on a small random problem.
function test()
    m,n = 5,3
     A = rand(m,n)
    dA = rand(m,n)
    dQ_fdm,dR_fdm = dqr_fdm(A,dA)
    dQ_dual,dR_dual = dqr_dual(A,dA)
    dQ_expl,dR_expl = dqr_expl(A,dA)
    @test dQ_dual ≈ dQ_fdm
    @test dR_dual ≈ dR_fdm
    @test dQ_expl ≈ dQ_fdm
    @test dR_expl ≈ dR_fdm
end
# Compare runtimes of the dual-number and explicit implementations.
function benchmark()
    m,n = 10000,500
     A = rand(m,n)
    dA = rand(m,n)
    println("Dual numbers:")
    @time dqr_dual(A,dA)
    println()
    println("Explicit derivative:")
    @time dqr_expl(A,dA)
    println()
end
dqr_expl is based on the formulae derived here, and it significantly outperforms dqr_dual on large problems:
Dual numbers:
 12.101934 seconds (18 allocations: 349.053 MiB, 0.31% gc time)
Explicit derivative:
  2.076238 seconds (54 allocations: 360.773 MiB)
The likely reason is that dqr_dual falls back to a generic Julia implementation of the QR factorisation (BLAS and LAPACK cannot operate on dual numbers), while dqr_expl can leverage BLAS and LAPACK throughout.
The paper that I linked above also presents formulae for adjoints, i.e. reverse-mode differentiation. These would let you compute gradients of scalar-valued functions much more efficiently than forward-mode differentiation.