How to Ensure You Don’t Have Sourcegraph Secrets in Source Code

Nir Valtman

September 4, 2023


Sourcegraph is an AI-powered platform used by developers at some of the world’s largest software companies, including Uber, Reddit, and Dropbox. On August 30, 2023, Sourcegraph’s Head of Security, Diego Comas, disclosed that an attacker had taken over administrative functions of Sourcegraph and may have accessed Sourcegraph user information.

In this post, we explore the causes of the incident and possible ways to prevent similar issues. We also share an update to Arnica’s secret detection and validation, which now includes a Sourcegraph token validator.

What happened? Sourcegraph admin credentials found in source code and exploited

According to Sourcegraph’s account of the incident, on July 14, a Sourcegraph engineer mistakenly committed code that included an active site-admin token. This token had extensive privileges on Sourcegraph.com, including the ability to view and alter account information. Existing safeguards failed to detect and remove the token.

On August 28, a user created a new Sourcegraph account and, two days later, used the exposed site-admin credentials to elevate that account to site-admin. This gave the user full administrative access to the Sourcegraph admin dashboard.

Next, the user created a proxy application that could make direct calls to Sourcegraph’s APIs and the underlying large language model (LLM). On the same day, prompted by the massive spike in API calls, the Sourcegraph security team identified the malicious user, revoked their permissions, and began an investigation.

What was the impact? Sensitive user data possibly exposed

The application created by the malicious user, which granted free access to Sourcegraph APIs, generated 2 million views within hours. As the app spread, users created free Sourcegraph accounts and were able to access Sourcegraph APIs.

According to Comas, the exposure was limited to Sourcegraph.com, which contains public code only. With admin access, the user could have seen the name, email address, and license key of paid users and the email addresses of free users. However, the Sourcegraph team stated that they were unsure whether the exposed data was actually viewed, changed, or extracted.

What can be done? We built a custom Sourcegraph secret validator

At Arnica, we advocate helping companies implement a “No new secrets in source code” policy so that they can stop the bleeding and then shift their focus to the backlog. In service of that principle, Arnica builds validators for a wide range of common secret types, and we’ve added a custom validator for Sourcegraph tokens.

How does the Sourcegraph secret validator work?

Arnica runs secret detection and validation on a schedule and on every code push to all git repositories, without the need to integrate into any CI/CD pipeline. As part of this capability, we’ve defined a detection pattern for Sourcegraph access tokens. When the regular expression matches, Arnica validates the access token against Sourcegraph’s systems to establish the context of the token.
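The detect-then-validate flow can be sketched roughly as follows. This is an illustrative sketch only, not Arnica’s implementation. The `sgp_` token pattern, the `/.api/graphql` endpoint, the `Authorization: token` header, and the `currentUser { username siteAdmin }` query are assumptions drawn from Sourcegraph’s public API documentation and should be checked against your Sourcegraph version. The helper names `find_candidates` and `validate` are hypothetical.

```python
import json
import re
import urllib.request
from typing import Callable, List, Optional

# Assumed token shape: "sgp_", an optional 16-hex instance id, then 40 hex chars.
# This is NOT Arnica's actual pattern; adjust it to the token formats you expect.
CANDIDATE = re.compile(r"\bsgp_(?:[0-9a-f]{16}_)?[0-9a-f]{40}\b")


def find_candidates(text: str) -> List[str]:
    """Return the unique token-shaped strings found in pushed content."""
    return sorted(set(CANDIDATE.findall(text)))


def validate(
    token: str,
    base_url: str = "https://sourcegraph.com",
    opener: Callable = urllib.request.urlopen,
) -> Optional[dict]:
    """Ask Sourcegraph who the token belongs to.

    Returns the `currentUser` object if the token is live, else None.
    `opener` is injectable so the function can be exercised without network access.
    """
    query = {"query": "query { currentUser { username siteAdmin } }"}
    request = urllib.request.Request(
        f"{base_url}/.api/graphql",  # assumed GraphQL endpoint
        data=json.dumps(query).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
    try:
        with opener(request, timeout=10) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # Network/HTTP failures and non-JSON bodies are treated as "not validated".
        return None
    return (payload.get("data") or {}).get("currentUser")
```

A caller would run `find_candidates` over the pushed diff and pass each hit to `validate`. Only tokens for which Sourcegraph returns a user are confirmed as live secrets, which keeps look-alike strings from raising alerts.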

Based on the response, Arnica can both validate the token and determine the severity and risk of a given Sourcegraph token from its context. For example, if the token grants site-admin access, the risk severity is high; otherwise, it is medium.
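The severity rule can be expressed as a small function. This is a hedged sketch: `classify_severity` is a hypothetical name, the `siteAdmin` field is assumed to come from a Sourcegraph `currentUser` response, and returning `None` for a token that failed validation is an assumption, since the rule above covers only valid tokens.

```python
from typing import Optional


def classify_severity(current_user: Optional[dict]) -> Optional[str]:
    """Map a validation response to a risk severity.

    `current_user` is the user object returned for a live token,
    or None when the token could not be validated.
    """
    if current_user is None:
        # Assumption: an unvalidated token gets no severity from this rule.
        return None
    # Site-admin tokens are high severity; every other live token is medium.
    return "high" if current_user.get("siteAdmin") else "medium"
```

For example, `classify_severity({"username": "alice", "siteAdmin": True})` returns `"high"`, while the same call with `"siteAdmin": False` returns `"medium"`.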

What it means for your secrets posture: A “No new secrets in code!” policy

There is a running joke among application security leaders that “we don’t have any secrets in our source code,” which is *always* said with a sly smile. The problem has existed for a long time and continues to wreak havoc on organizations that experience a breach or leak due to exposed secrets.

With the Sourcegraph token validator now in Arnica, any time a developer pushes a Sourcegraph token, or any other secret type, in a PR or commit, they’ll get a real-time message in chat (Slack, Teams, etc.) letting them know they’re pushing a secret. If the developer chooses, they can click “Fix it for me” in the chat and, behind the scenes, Arnica will remove the secret from the commit (and all git history) before resubmitting.

Here is our Sourcegraph validator in action: