This blog provides a comprehensive framework for evaluating secret detection solutions compatible with modern software development. It delves into key aspects such as the types of secrets detected, the flexibility and completeness of detection coverage, the role of context and validation in assessing severity, and effective mitigation strategies. The goal is to help organizations choose tools that not only detect secrets but also help manage and mitigate them effectively.
There is a heaping pile of secret detection tools on the market today. Open source, commercial, point solutions, platforms, lions, tigers, and bears, oh my!
In this blog, I hope to provide a framework for evaluating secret detection solutions, specifically through the lens of compatibility with the modern software development lifecycle (SDLC). Though we start there, we aim to go beyond the obvious platitudes of what types of secrets can be detected and the obvious requirement of accuracy. Consider it point #0 that a waterfall of false positives is a bad product experience. Okay, next.
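In evaluating secret detection tools, we will explore four core questions: which types of secrets are detected, how flexible and complete the detection coverage is, how context and validation inform severity, and how detected secrets are mitigated.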
Let’s dive in!
As a starting point, you should have a clear sense of what types of secrets are most important to you. Where is your critical infrastructure and data, such as cloud providers (Azure, GCP, AWS, etc.) and specific hosted systems and services (Databricks, Azure Storage, DynamoDB, etc.)? Are you connected to external processors, such as payment providers or email distribution systems? As a baseline, your tool selection should be able to detect secrets from these environments with a high degree of accuracy.
There are plenty of secret detection tools that tout a nauseatingly long list of secret types. But they often stop at detection and alerting. The result is an all-too-familiar set of side effects: alert fatigue, annoyed devs, bloated backlogs, and ultimately… a lack of clarity as to what is actually important.
In fact, it’s highly debatable whether secret detection functionality alone is even worth paying for at all. There are piles of free open source secret scanners like Gitleaks, git-secrets, and detect-secrets. I would be remiss if I didn’t mention that Arnica’s secret detection is free as well.
Developers may tell you they’ve never pushed a secret into code and never will. Just to be safe, it’s critical to consider if and how your tools handle existing secrets as well as secrets committed in the future.
In all likelihood, there are secrets in your git history today. If there are none, please write a blog on how you accomplished this so that everyone else can learn from it! The rest of us need to ensure that our secret detection tools take a thoughtful approach to historical secrets. An effective detection approach should include regular rescans so that the results stay up to date and you avoid prioritizing secrets that have already been rotated.
Now that you’ve got a good handle on historical secret detection, if you are stuck trying to keep pace as new secrets are consistently committed, you’ll go mad. You’ll also probably never get to actually work on the secrets backlog because you’ll be playing whack-a-mole with the new ones. That’s why it’s equally important to be able to detect secrets within the feature branch, on every code change, rather than waiting for the pull request as many products do.
Implementing real-time secret detection helps “stop the bleed” of new secrets getting added to code, which ultimately frees up your security and development teams to focus on rotating high severity historical secrets and chipping away at the backlog.
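To make the idea of scanning every code change concrete, here is a minimal Python sketch that looks only at the lines a change adds. It assumes the change arrives as unified diff text (the kind `git diff` produces) and checks for a single well-known shape, an AWS access key ID starting with `AKIA`. The function name and the sample diff are made up for illustration; this is not how any particular product does it.

```python
import re
from typing import List, Tuple

# AWS access key IDs are commonly matched as "AKIA" plus 16 uppercase
# letters or digits. A real scanner would carry many more patterns.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_added_lines(diff_text: str) -> List[Tuple[str, str]]:
    """Return (file_path, matched_text) for matches on lines the diff adds."""
    findings = []
    current_file = "<unknown>"
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # In a git diff, "+++ b/path" names the file after the change.
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("+"):
            # Added content. Removed ("-") and context (" ") lines are skipped.
            # Simplification: an added line whose own text begins with "++ "
            # would be mistaken for a file header.
            for match in AWS_ACCESS_KEY_ID.finditer(line[1:]):
                findings.append((current_file, match.group(0)))
    return findings


# Hypothetical diff using AWS's documented example key ID.
sample_diff = """diff --git a/config.py b/config.py
--- a/config.py
+++ b/config.py
@@ -1,2 +1,2 @@
 REGION = "us-east-1"
-OLD_KEY = "AKIAABCDEFGHIJKLMNOP"
+AWS_KEY = "AKIAIOSFODNN7EXAMPLE"
"""
new_findings = scan_added_lines(sample_diff)
```

Only the added line is reported here. The key on the removed line is skipped because it is not new, though it still sits in history and belongs to the historical scan.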
There are certain types of secrets that you may care about that can’t be validated externally because they are unique to your organization. A good example of this would be an API key that you use internally in your product. You can’t validate it externally, but because you know how the key is constructed, you know what to look for.
If this is important to your organization, your secrets tool should be able to accommodate regular expressions in order to detect this type of secret so that you can dictate, “I don’t want anything like this to be in my source code.”
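As a rough sketch of what such a custom rule could look like, the Python below searches text for an internal key format. The format itself (the prefix `acme_live_` followed by 32 lowercase hex characters) is invented for this example, as are the names `INTERNAL_KEY_PATTERN` and `find_internal_keys`. Substitute whatever construction your own keys follow.

```python
import re
from typing import Iterator, NamedTuple

# Hypothetical internal key format, assumed purely for illustration:
# "acme_live_" followed by exactly 32 lowercase hex characters.
INTERNAL_KEY_PATTERN = re.compile(r"\bacme_live_[0-9a-f]{32}\b")


class Finding(NamedTuple):
    line_number: int
    match: str


def find_internal_keys(text: str) -> Iterator[Finding]:
    """Yield the line number and matched text for every custom-pattern hit."""
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in INTERNAL_KEY_PATTERN.finditer(line):
            yield Finding(line_number, match.group(0))


sample = 'timeout = 30\napi_key = "acme_live_0123456789abcdef0123456789abcdef"\n'
custom_findings = list(find_internal_keys(sample))
```

The word boundaries keep the rule from firing on longer strings that merely contain something key-shaped, which matters if you want to stay away from the false positive waterfall.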
A secret detection solution might tout a list of hundreds of secret types that it finds. That’s great! But if it’s deployed in CI/CD pipelines, for example, and you’ve only been able to prioritize configuring the pipelines for a small subset of all repositories in your organization… that’s not so great. Scanners deployed in the CI/CD pipeline often require developer configuration and so, in the real world, security teams are forced to prioritize which pipelines get scanner coverage.
IDE security plugins and pre-commit hooks can detect all the way left, for the developer, but present a similar challenge. If a security plugin lacks compatibility for a specific IDE used by a subset of your developers, or the pre-commit hook is not configured, you may be out of luck in trying to cover those developers’ code. As is more often the case, if developers aren’t required to use the security plugin or the pre-commit hook, they… well… won't use it.
At Arnica, we take a pipelineless approach – integrated directly into the source code management tools – that guarantees 100% coverage of the development ecosystem from day one and seamless integration into the tools your developers already use without requiring them to learn a new one. Check out our full comparison of the pipelineless approach to application security vs. IDE plugins vs. CI/CD pipelines.
It is important for your secret detection tools to be able to provide granular context of the secret in order to help you effectively prioritize what needs to be fixed. Secret detection and mitigation tools often get criticized for spewing alerts indiscriminately. An effective tool should help leverage deep contextual data and validation techniques to ensure that you’re laser focused on what is most important to you.
It’s nice to know (though maybe “nice” is the wrong way to put it) that your secret scanning tool found credentials that are associated with your AWS environment. But to fully understand the true risk associated with those credentials, you’d want to know, for example, if those credentials are associated with a root account or perhaps a lower environment account such as dev or QA. Or maybe it is just a canary token being used as a honeypot!
The canary token used as a honeypot may present a valid set of credentials, but it is not a risk at all. An AWS dev account token may represent a medium severity that you want to handle, but maybe you don’t need to do so with as much urgency as you would for your AWS root account creds. These three scenarios have three wildly different risk severities associated with them, despite all being “AWS secrets in code.”
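The three scenarios above can be sketched as a small severity function in Python. Everything here is illustrative: the `account_kind` labels, the four severity levels, and the fallback to `HIGH` for anything unclassified are assumptions of this sketch, not a description of how any tool scores findings.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    NONE = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class AwsCredentialFinding:
    location: str      # where the credential was found
    account_kind: str  # hypothetical labels: "root", "prod", "dev", "qa", "canary"


LOWER_ENVIRONMENTS = {"dev", "qa"}


def assess_severity(finding: AwsCredentialFinding) -> Severity:
    """Map the account context of a finding to a severity level."""
    if finding.account_kind == "canary":
        return Severity.NONE      # honeypot token: valid, but not a risk
    if finding.account_kind == "root":
        return Severity.CRITICAL  # root account credentials
    if finding.account_kind in LOWER_ENVIRONMENTS:
        return Severity.MEDIUM    # lower environment such as dev or QA
    return Severity.HIGH          # assumption: unclassified accounts rank high


def prioritize(findings: List[AwsCredentialFinding]) -> List[AwsCredentialFinding]:
    """Order findings from most to least severe."""
    return sorted(findings, key=assess_severity, reverse=True)


aws_findings = [
    AwsCredentialFinding("docs/setup.md", "canary"),
    AwsCredentialFinding("scripts/seed.py", "dev"),
    AwsCredentialFinding("infra/bootstrap.sh", "root"),
]
ordered = prioritize(aws_findings)
```

All three findings are “AWS secrets in code,” yet the root credential lands at the top of the list and the canary token at the bottom.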
There are also certain key types that are dependent on the environment in which you attempt to validate them. In Arnica, for example, we have a validator that checks for credentials in the URL. We try to authenticate with those credentials in the URL but if we run from an environment that doesn’t have access to the corresponding asset via those credentials, the credentials will appear to be invalid. This is useful information, because that means if an attacker also tries to use the credentials from an external endpoint, they will come back invalid for the attacker as well. However, it may be the case that were you to try to validate from a different environment, you might get validated access using those same credentials. In this case you’d want to know that this is a valid secret but only when accessed from an internal network.
The context of where you validate from matters.
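One way to picture this is to record the validation outcome per vantage point and then summarize it. The Python sketch below assumes the validation attempts have already been run elsewhere and only combines their results. The vantage point names `external` and `internal` and the `Exposure` labels are hypothetical, and this is not a description of Arnica’s validator.

```python
from enum import Enum
from typing import Mapping


class Exposure(Enum):
    INVALID = "invalid from every vantage point tested"
    INTERNAL_ONLY = "valid only from an internal network"
    EXTERNALLY_VALID = "valid from an external network"


def classify_exposure(results: Mapping[str, bool]) -> Exposure:
    """Summarize validation outcomes keyed by vantage point.

    `results` maps a vantage point name ("external" or "internal") to
    whether authentication with the credential succeeded from there.
    A missing vantage point is treated as not validated.
    """
    if results.get("external", False):
        return Exposure.EXTERNALLY_VALID
    if results.get("internal", False):
        return Exposure.INTERNAL_ONLY
    return Exposure.INVALID


# Credentials in a URL that only resolve from inside the network.
exposure = classify_exposure({"external": False, "internal": True})
```

A result of `INTERNAL_ONLY` tells you the secret is real but that an outside attacker would see it as invalid, which is exactly the nuance a single pass/fail check would lose.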
The old adage – “you can’t fix what you can’t find” – is certainly true. But it’s important to ask a follow-up question: “can you fix what you find?” The differentiation of secret detection and mitigation tools often comes from how secrets are handled once they are detected.
Alerting on newly detected secrets should truly be the bare minimum requirement for a secret solution. When a secret is found, someone should know about it. Better yet, a targeted alert should be directly sent to a developer, in real-time, when they attempt to push a secret to the source code repository.
But notifications alone just leave you with a to-do list of secrets to manually rotate. It’s a pain in the butt. You don’t want to do that and neither do your developers.
In order to truly get a handle on secrets, you need to be able to both detect and eliminate new secrets, before anyone else gains a copy of this secret, for example, by cloning the repository. This means leveraging a secret detection & mitigation solution that is able to detect and mitigate secrets as soon as they are pushed to the source code repository, even in a feature branch.
Secret tools should also provide a mechanism to mitigate existing risks in a timely manner. To accomplish this, your secret mitigation tool has to leverage deep git internals expertise in order to provide a developer with the commands necessary to overwrite the secret across all commits. This is where the power of real-time notifications sent directly to the developer is really highlighted. If that notification gives the developer the exact git commands to run to eliminate the secret from the commit history, you’ve successfully empowered a ‘no new secrets’ policy.
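To give a feel for what such a generated command might look like, here is a hedged Python sketch built around `git filter-repo`, a separate history-rewriting tool that has to be installed alongside git. It is one option among several, chosen here as an assumption rather than because it is what any particular product emits. The sketch writes a replacement rule in the `--replace-text` file format and returns the command to run. The function name, file name, and sample secret are hypothetical.

```python
import shlex
from pathlib import Path
from typing import List


def build_history_rewrite(
    secret: str, rules_path: Path, placeholder: str = "***REMOVED***"
) -> List[str]:
    """Write a git-filter-repo replacement rule and return the command to run.

    Each rule line reads "<expression>==><replacement>". The "literal:"
    prefix asks for an exact text match rather than a regex or glob.
    """
    if "\n" in secret or "==>" in secret:
        # These would corrupt the one-rule-per-line file format.
        raise ValueError("secret cannot be expressed as a single literal rule")
    rules_path.write_text(f"literal:{secret}==>{placeholder}\n", encoding="utf-8")
    return ["git", "filter-repo", "--replace-text", str(rules_path)]


def render_for_notification(command: List[str]) -> str:
    """Quote the command so it can be pasted into a shell as-is."""
    return shlex.join(command)


# Example call (the rules file contains the secret, so delete it afterwards):
#   cmd = build_history_rewrite("hunter2-example", Path("replacements.txt"))
#   print(render_for_notification(cmd))
```

A few caveats a real notification would need to spell out: `git filter-repo` expects to run in a fresh clone by default, the rewrite changes commit hashes so the branch has to be force-pushed afterwards, and none of this replaces rotation if someone may already have fetched the commit.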
With new secrets under control, it’s time to focus on your secret backlog. A secret scanning tool should be able to tell you what secrets exist in your code. Again, by leveraging context and validation to inform severity, you can effectively prioritize your backlog. Your security and engineering teams can then work together to rotate the most critical secrets. Secret rotation is the critical mitigation step for historical secrets because any existing secret is already “distributed” to every developer who has a clone of the source code repository. Each clone carries the historical commits across all branches, which may contain the historical secrets as well.
As your engineering and security teams work together to tackle the backlog and rotate the prioritized secrets, your secrets tool should regularly rescan your code, which is important for two reasons. First, you want to ensure that your teams are working against a list of still-valid secrets. If a scan is a month old, it could contain credentials that have already been rotated and so they shouldn’t be prioritized. Second, you want to ensure that the key rotations were done successfully in order to close out any backlog items that have been resolved.
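Both reasons come down to reconciling the backlog against the latest scan. A minimal Python sketch of that reconciliation follows. It assumes each finding has a stable identifier (called `fingerprint` here) and that the scan reports whether the credential still validates. Both names are hypothetical, and treating a finding that is missing from the latest scan as resolved is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScanResult:
    fingerprint: str   # hypothetical stable identifier for a finding
    still_valid: bool  # did the credential validate in the latest scan?


def reconcile(
    backlog: List[str], latest_scan: List[ScanResult]
) -> Tuple[List[str], List[str]]:
    """Split backlog fingerprints into (still_open, resolved).

    An item stays open only if the latest scan saw it and it still
    validated. Items that were rotated (now invalid) or that no longer
    appear in the scan are treated as resolved.
    """
    validity = {result.fingerprint: result.still_valid for result in latest_scan}
    still_open = [fp for fp in backlog if validity.get(fp, False)]
    resolved = [fp for fp in backlog if not validity.get(fp, False)]
    return still_open, resolved


backlog = ["aws-root-key", "stripe-key", "old-db-password"]
latest_scan = [
    ScanResult("aws-root-key", still_valid=True),
    ScanResult("stripe-key", still_valid=False),  # rotated since the last scan
]
still_open, resolved = reconcile(backlog, latest_scan)
```

The team keeps working on `still_open`, and the items in `resolved` can be closed out instead of lingering in the backlog.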
When it comes to selecting the right secret detection solution for your organization, it’s important to take a holistic approach to ensure it aligns with your software development lifecycle. The right tool should offer more than just detection and should guarantee full coverage of your environment, provide actionable context, and facilitate secret mitigation.
Developer experience – across detection, alerting, prioritization, and mitigation – should be kept at the forefront of your evaluation. At the end of the day, if it’s easy for developers to drive security outcomes, they’re more likely to do so. A tool that integrates seamlessly into your developer workflows will be more effective and more likely adopted by your engineering teams.
Real-time secret detection is critical for preventing new secrets from being added to your code, thereby “stopping the bleed.” Pairing real-time detection with regularly updated historical scans empowers a holistic approach to secret detection and mitigation. Every secret, new or existing, should come with rich context in order to inform priority.
Finally, facilitating effective and efficient secret mitigation is paramount. A value-add secret solution should not only alert but also provide clear, actionable steps for remediation. Better yet, it should facilitate the mitigation for you!
Learn more about Arnica's pipelineless approach to application security including secret detection and mitigation.