Malicious Code Campaign on GitHub Repos: Is it Hype or a Dire Threat?

Nir Valtman

March 5, 2024

So, you've probably heard the buzz about the attack that created hundreds of thousands of malicious open-source repositories on GitHub, right? It's been all over the news, with articles warning us about the dangers lurking in forked repos. But let's take a step back and assess this situation.

Is the attack being overhyped, or are we looking at a genuine cybersecurity apocalypse? Let's dive in.

What is a repo confusion attack?

Picture this: threat actors fork a reputable repository, then sprinkle some malicious code into the mix. They automate the process, repeating it thousands of times for each repo. Now an unsuspecting developer clones one of these compromised repos and runs the program, malicious code included, which might hand over data on a silver platter or, worse, infect the developer’s own system!

Scary, right?

Well, that is what happened across 100,000+ GitHub repositories in late February 2024. But before we all panic, let's not get carried away just yet...

What is the real risk of a repo confusion attack?

Here's where things get a bit more nuanced. For this dastardly plan to work, you'd first need to somehow convince a developer to clone the forked open-source repo from GitHub, which contains malicious code. Remember, the original open-source repo is just fine. It's a bit like convincing someone to take a wrong turn on their drive to work – possible, but it requires some social engineering effort.

For the attack to be executed, the developer then needs to run the code from the spoofed repo on their workstation. Even then, the impact is confined to that workstation, assuming the malicious code doesn’t propagate across the network.

If the developer imports the repo into the organization’s source code management solution, the stakes rise: other developers may trust the spoofed repo by default because it is now a company asset. That's when the alarm bells should really start ringing.

But let's be clear: this attack isn't executed when packages are merely downloaded via pip or other package managers; it requires cloning from GitHub and the execution of the code within those cloned (spoofed) repos.

Defending against a repo confusion attack

As we just saw, the path for this type of attack to work is long, with many opportunities for effective security controls to limit its impact.

Git posture hardening

First off, harden your git posture. Limit who can create repos in your organization to prevent unwanted "guests." Do this by going to your organization settings on GitHub, then to “Member Privileges” and prevent repository creation by members.

This step gives you the ability to narrow the group of people within your organization who can import or create repos, thus proportionately limiting your org’s exposure to this type of repo confusion attack.
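
If you prefer to apply this setting programmatically, here is a minimal sketch. It assumes the GitHub REST API's "Update an organization" endpoint (`PATCH /orgs/{org}`) and its `members_can_create_repositories` field; the function name, organization name, and token are placeholders made up for this example. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_disable_repo_creation_request(org: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that stops members from creating repos."""
    payload = {"members_can_create_repositories": False}
    return urllib.request.Request(
        url=f"https://api.github.com/orgs/{org}",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    # "example-org" and "<token>" are placeholders, not real values.
    request = build_disable_repo_creation_request("example-org", "<token>")
    print(request.get_method(), request.full_url)
    # An organization owner could then send it with urllib.request.urlopen(request).
```

Keeping the request construction separate from the network call makes it easy to review exactly what would be changed before anyone runs it against a real organization.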

Identify indicators of compromise in source code

Next up, make sure that you are scanning all source code repos for indicators of compromise. And if you spot something fishy, notify your developers and the security team straightaway.

At Arnica, we embedded these indicators of compromise in our platform for all of our customers, but you can use them as well by running a custom Semgrep rule that we developed.

The rule is available as a GitHub gist:

https://gist.github.com/nir-valtman/a4f743dc0570b68ab20743b2123b65ac
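
To show the general idea of pattern-based scanning without Semgrep, here is a minimal Python sketch that walks a cloned repo and flags suspicious lines. The two patterns are illustrative placeholders chosen for this example; they are not the indicators from the Semgrep rule, and the function name is hypothetical.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Illustrative placeholder patterns only; NOT the indicators from the Semgrep rule.
SUSPICIOUS_PATTERNS = {
    "exec-of-decoded-payload": re.compile(r"exec\s*\(\s*base64\.b64decode"),
    "code-after-long-whitespace-run": re.compile(r"[ \t]{200,}\S"),
}


def scan_repo(root: str, suffixes: Tuple[str, ...] = (".py",)) -> List[Tuple[str, int, str]]:
    """Return (file path, line number, pattern name) for each suspicious line under root."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_no, line in enumerate(text.splitlines(), start=1):
            for name, pattern in SUSPICIOUS_PATTERNS.items():
                if pattern.search(line):
                    findings.append((str(path), line_no, name))
    return findings


if __name__ == "__main__":
    # Scan the current directory and print one line per finding.
    for file_path, line_no, name in scan_repo("."):
        print(f"{file_path}:{line_no}: {name}")
```

A real deployment would route the findings to the developers and the security team rather than printing them.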

Pipelineless application security testing

A pipelineless approach to application security closes a major gap left by CI/CD- or IDE-based risk detection: it covers 100% of your code from day one onward. This means that even if malicious code is pushed by mistake, no matter where it is pushed, it will be scanned and detected. A pipelineless approach also gives you the opportunity to message the developer directly so that they can fix the problem right away.

Security training

Training is key. Educate your devs about the signs of a low-reputation repo – think fewer stars than a cloudy night, no recent commits, or lack of releases. It's like teaching someone to spot a fake designer bag; the details matter!
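
The signs above can also be turned into a quick automated check. The following is a minimal sketch, assuming the repo metadata is a dict shaped like a GitHub REST API repository object (`stargazers_count`, `pushed_at`, `fork`) and that the release count is fetched separately. The thresholds and the function name are arbitrary choices for illustration, and the fork check is an addition based on the attack described earlier.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def low_reputation_signals(repo: dict, release_count: int,
                           now: Optional[datetime] = None) -> List[str]:
    """Return warning signs for a repo; an empty list means none were found."""
    now = now or datetime.now(timezone.utc)
    signals = []
    # Threshold of 10 stars is an arbitrary assumption.
    if repo.get("stargazers_count", 0) < 10:
        signals.append("few stars")
    pushed_at = datetime.strptime(
        repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    # "Recent" is assumed to mean within the last 365 days.
    if now - pushed_at > timedelta(days=365):
        signals.append("no recent commits")
    if release_count == 0:
        signals.append("no releases")
    if repo.get("fork", False):
        signals.append("is a fork")
    return signals


if __name__ == "__main__":
    example = {"stargazers_count": 2, "pushed_at": "2022-01-01T00:00:00Z", "fork": True}
    print(low_reputation_signals(example, release_count=0))
```

None of these signals proves a repo is malicious; they are prompts to compare it against the original project before cloning and running it.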

So, are we getting too worked up about repo confusion attacks?

Look, I'm not saying there's no risk. What I am saying is that with the right precautions, the threat becomes far less menacing.

The media may love a good cyber scare story, but the reality is that with a little bit of savvy and a lot of precaution, we can handle repo confusion attacks with our usual cool-headed competence. Malicious repositories on GitHub represent a real threat, but add a bit of awareness and some good practices and, voilà, you're already on your way to safeguarding your development ecosystem. So, let's keep our heads, educate and empower our teams, and remember that not every headline spells imminent doom.
