Any good alternative to GitHub?

GitLab is not exactly a saint either. Its current “good policy” is largely the result of past backlash and GitHub shooting itself in the foot.

GitLab has its own history with data usage. In 2019, it planned to add product usage telemetry to GitLab.com and some proprietary self-managed packages, involving third-party analytics services like Pendo and Snowplow. Unsurprisingly, self-hosted users and the open source community pushed back hard. GitLab then rolled back the related Terms of Service change on 2019-10-24 and said it would not enable user-level product usage tracking on GitLab.com or self-managed instances before re-evaluating the plan. After Sid Sijbrandij’s apology, GitLab further narrowed the scope to GitLab.com only, explicitly stating that self-hosted customers would not be affected. Third-party media also summarized the incident as GitLab walking back its telemetry plan after user pushback.

The AI side also has some history. According to GitLab’s own Terms of Use history page, GitLab’s early AI Functionality Terms (V1 and V2, 2023-08-14 to 2024-08-29) classified AI feature Input / Output as Customer Content, but did not yet say, as plainly as today’s wording does, that GitLab would not use customer code to train models.

The more obvious backlash happened around 2024-04-22, in the Hacker News discussion on GitLab Duo. At the time, GitLab Duo’s FAQ said: “GitLab does not train generative AI models based on private (non-public) data. The vendors we work with also do not train models based on private data.” That wording only protected private / non-public data, so people immediately pointed out the obvious implication: public code might not be protected.

Then on 2026-04-20, GitLab used GitHub Copilot’s policy change as an opportunity to fire back. In “GitHub Copilot’s policy for AI training: A governance wake-up call,” GitLab’s wording became much stronger: it does not train AI models on customer code at any tier, and its vendors are contractually prohibited from using customer inputs / outputs for their own purposes.

So yes, GitLab’s current “we don’t train on user code” stance is a commercial strategy to compete against GitHub. Codeberg’s position, on the other hand, is an ideological choice rooted in free software and community self-governance.