SECURITY 101

Detecting & Preventing Source Code Exfiltration

Simon Wenet
VP, Marketing & Product Growth
July 5, 2023
Simon has spent the last decade in security leading product management & growth teams at various companies focused on DNS security, DLP, and now application security.

TL;DR

Source code exfiltration is a rising form of cybercrime. It occurs when business competitors, disgruntled employees, or malicious actors obtain access to and disclose the contents of private code repositories. Exfiltration is commonly used as a form of extortion, but it also facilitates commercial espionage and intellectual property theft. To detect source code exfiltration, you need tools that can spot anomalous activity, provide immediate alerts, and help you investigate suspected historical losses.


How to Detect Source Code Exfiltration

Source code exfiltration occurs when closed-source components of a company's systems are leaked outside the organization. It can happen as a result of malware attacks, cyber espionage, or blackmail attempts by disgruntled employees and contractors.

Developers and security personnel often remain unaware that exfiltration has occurred until an attacker demands a ransom or leaks the code online. Being able to anticipate and prevent exfiltration is essential because code losses can be more devastating than physical asset theft; they can let competitors replicate your IP, or expose sensitive information about your system’s operations or customers. Threat actors could use the leaked information to plan further attacks using their inside knowledge of the vulnerabilities in your code.

In this article, we'll explore the risks of source code exfiltration and share some mitigation techniques. We'll also look at how to defend robustly against exfiltration attempts by implementing tools that block code misappropriation and detect breaches quickly and effectively.

The Risks of Source Code Exfiltration

The loss of source code should be treated as an existential threat. Any code leaks diminish the unique value of your product. When exfiltration occurs as part of a commercial espionage campaign, your competitors may use your assets to develop their own platforms. Less scrupulous firms might copy your product wholesale and resell it with modified branding.

If your source contains hardcoded secrets, those credentials will also be exposed, permitting further attacks against your infrastructure.

As if that wasn’t enough, cybercriminals also use code exfiltration as a means of extortion. In 2022, hacker group Lapsus$ made headlines for stealing source code from companies including NVIDIA, Samsung, and Microsoft. The group demanded ransoms before publishing the stolen data. This kind of exfiltration damages your reputation and compliance standing, and can result in costly regulatory penalties.

How Source Code Exfiltration Occurs

Exfiltration can be carried out in multiple ways, often unpredictably. Source can leave organizations as a result of deliberate human actions, or due to oversights such as setting improper access controls in source management systems.

Here are a few ways in which exfiltration can happen:

  • Employees with grievances can leak their employer's source code. Copied source code could be used as leverage when a disgruntled employee seeks reparations for a layoff or faces disciplinary action. Discontented workers may also leak code as a form of whistle-blowing or activism, for example when they have ethical or moral objections to their organization's direction.
  • CI/CD servers can be compromised so that code is sent to an attacker. This facilitates ongoing code exfiltration each time a CI/CD pipeline is run.
  • Malicious source code dependencies can clone their environments. Code can exfiltrate itself if it includes malicious dependencies that have been added by supply chain attacks. Innocuous packages, such as linter plugins, could stream code to external servers when developers or CI/CD pipelines test your project.
  • Staff can be tricked into distributing code by social engineering campaigns. A seemingly innocent request to download a repo and send it to the project manager for distribution to a client could be the start of a disastrous code exfiltration situation.
  • Exfiltration can occur anywhere in the software delivery process. It's not just your own access that needs to be protected. Code could be exfiltrated from—or by—contractors and other organizations you work with. Protecting yourself starts with enforcing strong security standards for your entire supply chain.

The variability in how exfiltration occurs makes incident detection more challenging. Nonetheless, using a combination of automated tools and employee vigilance can allow you to recognize exfiltration attempts as they occur.

Detecting Source Code Exfiltration

Ideally, source code exfiltration would be prevented in all cases. In reality, you need to be prepared for it by setting up alert mechanisms that detect it as it happens. You also need tools that retrospectively report the sequence of events leading up to an attempt, and help you find stolen code that has been published online. Here are three techniques for finding and blocking source exfiltration attempts:

1. Use Source Control Audit Trails

Audit trails are a simple but effective way to analyze source code movements. Enabling audit logging for your repositories lets you track who has accessed a project, when they were active, and the location from which they logged in. You can use this information as the basis for detecting abnormal clones and pulls.

Major source control providers offer integrated support for audit logs on their team and organization plans. GitHub, GitLab, and Bitbucket all have similar event-based systems that record when repositories are downloaded, mirrored, migrated, and forked. Enabling these features in your environment will provide vital information if you suspect unauthorized source access has occurred. They also provide evidence to support your case in any disciplinary or legal proceedings that may arise.

Audit trails alone are not enough to defend against exfiltration, though. They’re a historical record of access attempts, useful when you’re forensically investigating known or suspected breaches. Audit logs must be combined with other tools to detect exfiltration and alert you when it happens.
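As a rough illustration of using audit events as the basis for spotting abnormal clones, the sketch below counts how many distinct repositories each user has exported. The AuditEvent record, the action names, and the threshold are all assumptions made for this example; real providers each use their own event schema, so the field mapping would need to be adapted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    """One simplified audit-log record (hypothetical schema, not a vendor format)."""
    actor: str
    action: str     # e.g. "clone", "fork", "push"
    repo: str
    timestamp: str  # ISO 8601 string, kept only for reporting


# Actions treated as "code leaving the server" (assumed names; map your
# provider's real event names onto these).
EXPORT_ACTIONS = {"clone", "fork", "download_zip", "mirror"}


def repos_exported_per_actor(events):
    """Count the distinct repositories each actor has exported."""
    seen = set()
    counts = Counter()
    for event in events:
        if event.action not in EXPORT_ACTIONS:
            continue
        key = (event.actor, event.repo)
        if key not in seen:  # repeat clones of one repo count once
            seen.add(key)
            counts[event.actor] += 1
    return counts


def flag_heavy_exporters(events, max_repos=3):
    """Return actors who exported more than max_repos distinct repositories."""
    counts = repos_exported_per_actor(events)
    return sorted(actor for actor, total in counts.items() if total > max_repos)


SAMPLE_EVENTS = [
    AuditEvent("alice", "clone", "billing", "2023-07-01T09:00:00Z"),
    AuditEvent("alice", "clone", "billing", "2023-07-02T09:00:00Z"),
    AuditEvent("bob", "push", "auth", "2023-07-02T10:00:00Z"),
    AuditEvent("mallory", "clone", "billing", "2023-07-03T02:00:00Z"),
    AuditEvent("mallory", "clone", "auth", "2023-07-03T02:01:00Z"),
    AuditEvent("mallory", "clone", "infra", "2023-07-03T02:02:00Z"),
    AuditEvent("mallory", "fork", "mobile", "2023-07-03T02:03:00Z"),
]

if __name__ == "__main__":
    print(flag_heavy_exporters(SAMPLE_EVENTS, max_repos=2))
```

A report like this is still retrospective: it only tells you about exports that have already been logged, which is why it needs to be paired with alerting.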

2. Monitor Public Repositories for Your Code

Unfortunately, not every instance of code exfiltration is systematically detectable as it occurs. Some events are inherently invisible, such as when an employee locally copies a repo they've previously cloned, then re-uploads it to the internet. No audit event will be recorded in this circumstance.

You can gain awareness of this kind of activity by monitoring public repos for the presence of unique portions of your code. Use the search APIs of GitHub, GitLab, and similar platforms to detect stolen code and analyze whether it is being spread further across the internet. Write your own script that regularly calls these APIs to find occurrences of uniquely named files from your repositories. 
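A script of this kind might look roughly like the sketch below, which targets GitHub's REST code search endpoint. It assumes a personal access token, that the filename: search qualifier is available to your account, and a made-up file name (acme_billing_core.py) and organization name (acme); rate limits and pagination are ignored for brevity.

```python
import json
import urllib.parse
import urllib.request

GITHUB_SEARCH_URL = "https://api.github.com/search/code"


def build_search_url(filename, extra_terms=()):
    """Build a code-search URL for a uniquely named file."""
    query = " ".join([f"filename:{filename}", *extra_terms])
    return f"{GITHUB_SEARCH_URL}?{urllib.parse.urlencode({'q': query})}"


def extract_matches(response_body, own_owners):
    """Return (repository, path) pairs for hits outside our own accounts.

    own_owners is a set of lowercase user or organization names to ignore.
    """
    matches = []
    for item in response_body.get("items", []):
        full_name = item["repository"]["full_name"]
        owner = full_name.split("/", 1)[0]
        if owner.lower() not in own_owners:
            matches.append((full_name, item["path"]))
    return matches


def search_github(filename, token, own_owners):
    """Query the search endpoint (needs network access and a valid token)."""
    request = urllib.request.Request(
        build_search_url(filename),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_matches(json.load(response), own_owners)


if __name__ == "__main__":
    # Hypothetical file name; replace with names unique to your codebase.
    print(build_search_url("acme_billing_core.py"))
```

Running search_github on a schedule, once per distinctive file name, and alerting on any non-empty result is one simple way to put this into practice. Searching by file name only catches copies that keep the original names, so distinctive strings from inside the files are worth adding as extra search terms.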

Repository scans are a useful additional defense layer, but are naturally limited in scope. Stolen code might not appear online for months after the theft, if it shows up at all. Attackers that exfiltrate code to incorporate into their own products may never publicize the theft, leaving you oblivious to the incident.

3. Implement Automated Anomaly Detection

Automated anomaly detection tools are the most sophisticated defense against source code exfiltration. These agents spot suspicious source code access patterns, alert the relevant developers and security teams, and apply automatic actions that stop the malicious activity from continuing until the alert has been triaged.

Policy-driven anomaly detection systems make decisions about access attempts based on predefined rules and knowledge of permissible past behavior. You can, for example, automatically block clients that try to pull a large number of repositories, guess different repo URLs, or initiate a download from an unknown location.
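To make the idea of predefined rules concrete, here is a minimal sketch of a policy check covering the three examples above: too many repositories, guessed repo URLs, and unknown locations. The ClientProfile and Policy structures, the thresholds, and the location labels are hypothetical; a production system would build its baseline from observed behavior rather than hardcoded values.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class Decision(NamedTuple):
    allowed: bool
    reason: str


@dataclass
class ClientProfile:
    """What we know about one client (hypothetical structure)."""
    known_locations: set
    repos_pulled: set = field(default_factory=set)
    failed_lookups: int = 0  # requests for repositories that do not exist


@dataclass(frozen=True)
class Policy:
    max_repos: int = 5
    max_failed_lookups: int = 3


def evaluate_pull(profile, policy, repo, location):
    """Decide whether a pull request for repo from location is permitted."""
    if location not in profile.known_locations:
        return Decision(False, "unknown location")
    if profile.failed_lookups > policy.max_failed_lookups:
        return Decision(False, "too many guessed repository URLs")
    if len(profile.repos_pulled | {repo}) > policy.max_repos:
        return Decision(False, "too many distinct repositories")
    profile.repos_pulled.add(repo)  # record only pulls that were allowed
    return Decision(True, "within policy")


if __name__ == "__main__":
    profile = ClientProfile(known_locations={"office-vpn"})
    policy = Policy(max_repos=2)
    for repo in ("billing", "auth", "infra"):
        print(repo, evaluate_pull(profile, policy, repo, "office-vpn"))
```

Returning a reason alongside the decision keeps the alert actionable: the person triaging it can see which rule fired without replaying the logs.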

Automated anomaly detection empowers security teams by increasing the speed and precision with which exfiltration attempts can be addressed. You're informed each time code reaches your perimeter, without having to manually inspect audit logs or wait for it to appear in public repos.

Preventing Source Code Exfiltration

Preventing exfiltration starts with knowing when it's happening, using the strategies outlined above. Once you've established this basic informational layer, you can implement additional hardening measures to increase your protection.

Collectively, these techniques allow you to stay ahead of code exfiltration by limiting the situations in which it's possible, alerting you when it happens, and allowing you to investigate suspected but undetected leaks. It's still not possible to catch every incident, however; there's always the risk of lone developers copying repos onto private storage devices, or malware succeeding in slipping through your net.

Restrict User Access to Repositories

Reducing the number of people with access to source code helps to lower the risk of its exposure. Restricting repos to essential users means there are fewer credentials that could be stolen, and it reduces the chance of a disillusioned employee misappropriating your assets. Non-developers, such as support and admin staff, rarely read or modify code, so they’re unlikely to require access.

Enforce Least-Privilege Access to Source Code

For developers, testers, security teams, and others who must interact with code, you should precisely scope their repository access to the minimum set of privileges they require. Very few engineers need to download complete archives of a repository regularly, or set up mirroring to another server, for example.
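One way to check whether access is scoped this tightly is to compare what each person has been granted with what their role needs. The sketch below assumes a simple four-level permission model (none, read, write, admin) and hand-written mappings; both are illustrative stand-ins for whatever your source control platform exposes.

```python
# Assumed permission levels, ordered from least to most privileged.
LEVELS = {"none": 0, "read": 1, "write": 2, "admin": 3}


def find_excess_access(granted, required):
    """List grants that exceed what the user needs.

    Both arguments map (user, repo) to a permission level name. A pair that
    is missing from required is treated as needing no access at all.
    """
    findings = []
    for (user, repo), level in sorted(granted.items()):
        needed = required.get((user, repo), "none")
        if LEVELS[level] > LEVELS[needed]:
            findings.append((user, repo, level, needed))
    return findings


if __name__ == "__main__":
    granted = {
        ("alice", "billing"): "admin",
        ("bob", "billing"): "read",
        ("carol", "auth"): "write",
    }
    required = {
        ("alice", "billing"): "write",
        ("bob", "billing"): "read",
    }
    for user, repo, level, needed in find_excess_access(granted, required):
        print(f"{user} has {level} on {repo} but needs {needed}")
```

Each finding is a candidate for downgrading or revoking access, subject to a human check that the required mapping is up to date.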

Secure Your Software Supply Chain

Take steps to secure your supply chain by scanning for malicious and vulnerable packages. Malware can be unwittingly added by legitimate developers; once it has entered your project, the code could exfiltrate details of its surroundings. Avoid using external dependencies from sources you don't trust.
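Dedicated scanners do the heavy lifting here, but a simple allowlist check illustrates the principle of only using dependencies you trust. The sketch below is not a malware scanner: it just reads a Python requirements file and reports any package that is not on an approved list. The package names and the approved set are invented for the example.

```python
import re


def requirement_names(requirements_text):
    """Extract normalized package names from requirements.txt-style text."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name ends at the first version specifier, extra, or marker.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0]
        names.append(name.lower().replace("_", "-"))
    return names


def unapproved_packages(requirements_text, approved):
    """Return packages that are not in the approved (lowercase) set."""
    return [n for n in requirement_names(requirements_text) if n not in approved]


if __name__ == "__main__":
    sample = """
    requests==2.31.0
    # linting
    flake8-totally-legit>=1.0
    PyYAML
    """
    print(unapproved_packages(sample, {"requests", "pyyaml"}))
```

Running a check like this in CI before dependencies are installed means an unexpected package blocks the build instead of executing inside it.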

Use Anomaly Detection and Developer Behavior Analysis Tools

Developer anomaly detection automates the discovery and prevention of exfiltration attempts. Centralized policy management and tuned awareness of your team's normal access patterns increase the accuracy and coverage of the alerts you receive. You can choose to respond with immediate actions, such as revoking an access token that has been used in a suspicious location.

Conclusion

Source code exfiltration differs from other kinds of cyber threats because it involves information leaving your organization, whereas DDoS attacks, account takeovers, and ransomware are normally targeted attacks from outside. Preventing code from crossing company perimeters requires a dedicated prevention strategy.

Addressing the challenge of code exfiltration starts with deploying tools that can detect when exfiltration happens. Combining public repo monitoring, granular audit logs, and network traffic analysis solves this part of the problem, but still leaves you several steps behind the perpetrator.

Anomalous developer behavior detection is the most comprehensive method for defending against source exfiltration. These tools work in real time, identifying exfiltration attempts as they occur and providing relevant administrators with immediate alerts and possible actions. Try Arnica today to explore how early, effective developer anomaly detection can secure your repositories.
