Digression about privacy over OpenTelemetry.jl

At the time of writing, privacy is not listed in the OTel Mission, Vision, and Values.

The Julia devs listened to the negative community feedback about telemetry (what a relief!):

Thanks for contributing to Julia, but I’d like to qualify your post
and prevent repetition (the linked thread is extensive):
not everyone thinks telemetry should be widespread.

I’m not seeing any documentation, so I’m not sure if opt-in or opt-out is possible by the user. I would recommend making all telemetry collection opt-in, and using Preferences to allow users to toggle telemetry collection and transmission via some function in OpenTelemetry.jl (which would then enable it for all packages in the current Project).

Additionally, the transmission endpoint (where logs are sent) should also be configurable via Preferences per-package, but can default to whatever location the library author prefers.

Thanks for contributing to Julia, but I’d like to qualify your post
and prevent repetition (the linked thread is extensive):
not everyone thinks telemetry should be widespread.

But “telemetry” and “privacy” have nothing inherently to do with each other, beyond what applies to any software system that stores or processes information whose privacy (or other things like security) might need to be considered.

To me, telemetry in this context just refers to distributed tracing/logging. OpenTelemetry (and OpenTracing before it) helps deal with the proliferation of proprietary standards and the lack of interoperability between different tracing services.

Telemetry usually happens on cloud machines and has little to do with user data. Architecturally, it typically looks like this:
https://opentracing.io/docs/overview/

I feel like there might need to be some clarification. When I saw this and first looked at the project, I interpreted the intent as providing a better, more unified telemetry system for a user to implement. As someone who does a lot of distributed computing and has worked with multiple pieces of software that need to talk to each other, this looked great to me.

The comments so far seem to imply that the system would send telemetry to some external location, like the Pkg telemetry. Is the intent that a package creator embeds telemetry collection in their package, or that someone writing complicated software can better manage the logs, backtraces, etc. that may come from multiple locations?

I will look through the package more carefully (and probably answer my own question doing so), but I thought it might help to state it more explicitly.

OpenTelemetry is a merger of OpenTracing (https://opentracing.io/) and OpenCensus,
as seen from the top of the latter's site:
[screenshot: Screenshot_20211104_154156]
Partners and contributors can be seen on the OpenCensus introduction page.

So while I can believe that for you it “just refers to distributed tracing/logging”,
it may actually encompass the kind of telemetry that has been rejected.
That’s why I hope OpenTelemetry.jl will be developed and used in other packages
with specific caution about privacy
(that is the general culture in Julia, so I’m pretty confident the outcome will be satisfying).

To put it in other words: when working on a local machine or grid (no cloud of course),
I want to be confident that no data is leaked elsewhere.
For now this seems to be the default,
which makes it possible to add packages without losing time investigating “privacy policies”.

If packages using OpenTelemetry.jl can still respect that, then it’s fine.

This is all a wholly off-topic digression based purely upon a superficial reading of the name. It’s about logging, not some sort of scheme to violate your privacy. Could someone use those logs to “phone home” user data in a manner that violates your view of privacy? Sure. But it’s not relevant to the functionality implemented nor the intended use-cases.

This would be like objecting to the registration of an HTTP package because it allows POSTing data to arbitrary servers.

I’ve split this from the announcement thread.

I agree with the split, if it helps.
But your summary is grossly exaggerated and hence counterproductive.
IMO, it has little to do with posts that were more carefully balanced.
(In particular, I never intended to prevent the registration of a package that
might be good if designed and used correctly.)
This is my last post in this thread.

I’m not seeing any documentation, so I’m not sure if opt-in or opt-out is possible by the user. I would recommend making all telemetry collection opt-in, and using Preferences to allow users to toggle telemetry collection and transmission via some function in OpenTelemetry.jl (which would then enable it for all packages in the current Project).

Additionally, the transmission endpoint (where logs are sent) should also be configurable via Preferences per-package, but can default to whatever location the library author prefers.

Yes, it is opt-in. Several transmission endpoints will be provided by default, but application users can also control which libraries to collect data from and how to transmit the collected data.

[referring to the top post of this thread]
I want to add that my post, which you quoted, should not be interpreted as an endorsement of telemetry. I simply wanted to ask whether any statistics will be published (I understand this will eventually be the case).