The following is an analysis of Castle's performance in detecting and blocking credential stuffing attacks from Fall 2020.

Introduction

One question that we get asked all the time when talking with potential customers is: how effective is Castle at detecting and blocking threats in the context of credential stuffing or brute force attacks on an application login endpoint?

The technical answer, as with most complex analyses, is "well, it depends..."

However, in this article I examine two real-world attacks that Castle successfully defended against. Castle's rate of blocking malicious logins is 99.8% for the low-complexity example, and 97.5% for the high-complexity example. Keep in mind, this is in-app blocking that occurs after any WAF and IP-rate-limiting rules have been executed.

Castle reduces the perceived "success rate" of an attack, as viewed through the eyes of a bad actor, to 0.001% for the low-complexity example, and 0.03% for the high-complexity example.

Quick Note About Castle

I want to quickly mention that Castle isn't just a solution to protect against and recover from credential stuffing attacks. Check out castle.io to learn more about our suite of account security and fraud prevention solutions.

Low-Complexity Attacks

Let's define a low-complexity attack to have the following characteristics:

  • uses a single User-Agent
  • originates from a (relatively) limited set of IP addresses
  • uses an unfiltered list of credentials (most of the "stuffed" email addresses are non-existent accounts)
  • attempts each set of credentials once
  • makes no attempt to keep endpoint requests within normal limits

IP rate-limiting and WAF rules are usually the first line of defense against the type of attack outlined here. When such defenses fall short (as they frequently do), Castle is extremely effective at guarding user accounts from this kind of attack.

Attack A (Low-Complexity)

The application targeted in this example of a low-complexity attack is a boutique e-commerce website that sells fashion and lifestyle products. The credential stuffing attack shown took place over about 2 days. The graphs shown reflect traffic at the login endpoint, and nothing else.

1.png
Attack-of-interest login attempts (red) plotted against the rest of login attempts on the application

We can easily see that the traffic has distinct waves where the application sees an order-of-magnitude increase in login attempts.

Attack A Characteristics

Attack A, our low-complexity example, had the following characteristics:

  • ~150,000 login attempts
  • 1 distinct User-Agent (a widely used version of Chrome)
  • ~1,500 distinct IP addresses (85% from the USA, where most of the app users reside)

Attack A Accuracy

It's important to note that the accuracy figures below come from our privileged log access; the attacker is not able to calculate them. Because Castle protected the accounts, the attacker thinks their credential list is far less accurate than it really is.

Attack A exhibited the following stats when it comes to accuracy:

  • 3.9% of the login attempts targeted user accounts that existed
  • 0.8% of the login attempts used valid username/password combinations

This example was a typical credential stuffing attack that scanned a list of credentials, tried each set of credentials one time, and moved on to the next set.

The graph below shows the events Castle received at our API from the application. It reflects login attempts with valid vs. invalid credentials. It does not take into account the Castle response, which blocked almost all of the valid-credential login attempts (we'll look at that in the next section).

3.png
Login attempts of the attack that had valid credentials ($login.succeeded) vs. invalid credentials ($login.failed)

Castle's Performance Against Attack A

  • Castle blocked 99.8% of valid-credential login attempts
  • Castle initiated automated account recovery processes for 99.8% of affected accounts
  • The attacker perceived just a 0.001% success rate

This attack utilized a common Chrome User-Agent and rotated through many IP addresses in the US, but it was still a relatively low-complexity attack compared to most of the attacks that we see. With a 0.001% success rate, the attacker may be discouraged from trying to attack this site again. They will also (likely) be discouraged from using this same credential list on other sites in the future, given their experience that the credentials are almost all invalid.

The graph below shows that, with Castle integrated at the login endpoint, just 2 accounts were breached. These breaches happened near the beginning of the attack. Without Castle, the application would have seen several hours with more than 100 account breaches per hour.

5.png
Number of account breaches (allowed logins) with vs. without Castle in-line integration
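
As a rough check on the figures above, the arithmetic can be sketched in Python. This is an illustration only: the function name and its parameters are hypothetical, and the inputs are the rounded figures quoted in this article (~150,000 attempts, 0.8% valid credentials, 99.8% blocked), so the result only approximates the reported numbers.

```python
def perceived_success_rate(attempts, valid_fraction, block_rate):
    """Return (valid_attempts, allowed_logins, perceived_rate) for an attack.

    attempts:       total login attempts in the attack
    valid_fraction: share of attempts that used valid credentials
    block_rate:     share of valid-credential attempts that were blocked
    """
    valid_attempts = attempts * valid_fraction
    allowed_logins = valid_attempts * (1.0 - block_rate)
    # The attacker only sees allowed logins, so this is their observed hit rate.
    return valid_attempts, allowed_logins, allowed_logins / attempts


# Rounded Attack A figures from this article.
valid_a, allowed_a, rate_a = perceived_success_rate(150_000, 0.008, 0.998)
print(f"valid-credential attempts: {valid_a:.0f}")
print(f"allowed logins: {allowed_a:.1f}")
print(f"perceived success rate: {rate_a:.4%}")
```

With these rounded inputs the sketch yields about 1,200 valid-credential attempts, of which roughly 2 are allowed. That is in line with the 2 breached accounts shown in the graph and a perceived success rate on the order of 0.001%.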

High-Complexity Attacks

In contrast to low-complexity attacks, we'll define high-complexity attacks to have the following characteristics:

  • uses thousands of unique User-Agents
  • uses thousands of unique IP addresses, from the country with the largest user base
  • uses a filtered list of credentials (high proportion of "stuffed" email addresses are tied to real accounts)
  • attempts credentials multiple times, always with a new UA/IP combination for subsequent requests
  • attempts to keep endpoint requests within normal traffic expectations

Attack B (High-Complexity)

Attack B targeted a banking and investment application, also based in the US. This attack took place over several hours and overlapped precisely with American working hours, when the application sees peak usage. The graphs in this next section represent traffic at the login endpoint of the application.

2.png
Attack-of-interest login attempts (red) plotted against the rest of login attempts on the application

Compared with the low-complexity attack, this high-complexity attack kept the login endpoint traffic to the same order of magnitude as other application traffic. Most of the attack fell between two other login traffic spikes that may have been a "decoy". The big yellow spikes consisted entirely of unsuccessful login attempts, so it's likely that they were designed to distract from the "real" attack. We're not including the yellow spikes in our analysis because their features differ from those of our attack-of-interest traffic, which is in red.

Attack B Characteristics

Attack B, our high-complexity example, had the following characteristics:

  • ~60,000 login attempts
  • ~20,000 distinct User-Agents
  • ~5,000 distinct IP addresses (>95% from USA, >99% from USA & Canada)

It is important to note the use of thousands of unique IP addresses and tens of thousands of unique User-Agents, originating from locations where this application's users live. Each login attempt can come from a unique User-Agent and IP address combination.
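
One way to quantify this kind of spread is to count the distinct User-Agents, IP addresses, and User-Agent/IP pairs across the login events of an attack. The sketch below is a minimal illustration; the `LoginEvent` record and its field names are hypothetical and are not Castle's event schema.

```python
from collections import namedtuple

# Hypothetical minimal login event record (not Castle's actual schema).
LoginEvent = namedtuple("LoginEvent", ["user_agent", "ip"])


def summarize(events):
    """Count attempts and distinct User-Agents, IPs, and UA/IP pairs."""
    return {
        "attempts": len(events),
        "distinct_user_agents": len({e.user_agent for e in events}),
        "distinct_ips": len({e.ip for e in events}),
        "distinct_ua_ip_pairs": len({(e.user_agent, e.ip) for e in events}),
    }


# Tiny made-up sample using documentation-range IP addresses.
sample = [
    LoginEvent("UA-1", "203.0.113.1"),
    LoginEvent("UA-2", "203.0.113.1"),
    LoginEvent("UA-1", "203.0.113.2"),
    LoginEvent("UA-3", "203.0.113.3"),
]
summary = summarize(sample)
print(summary)
```

When the number of distinct UA/IP pairs approaches the number of attempts, as it does in this made-up sample, rules keyed on a single User-Agent or IP address have very little to latch onto.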

Attack B Accuracy

As with the accuracy figures for Attack A, the accuracy figures for Attack B are computed from privileged information. The bad actor did not get this same feedback, because Castle blocked many of the $login.succeeded events from starting a user session.

Attack B exhibited the following characteristics when it comes to accuracy of the credential list:

  • ~38% of the login attempts targeted user accounts that existed
  • ~36% of the login attempts used valid username/password combinations

The graph below shows the events Castle received at our API from the application. It shows login attempts with valid vs. invalid credentials. Castle blocked sessions from starting for most of the $login.succeeded events (we'll look at that in the next section).

4.png
Login attempts of the attack that had valid credentials ($login.succeeded) vs. invalid credentials ($login.failed)

This attack likely started with a set of credentials that had been validated at many other websites, giving the attacker high confidence that the users on their list re-use passwords everywhere. A 36% rate of "success" in a credential stuffing attack is far higher than most figures reported elsewhere (1, 2). Of course, with the Castle integration in place, this bad actor did not realize they had a 36% hit rate.

Castle's Performance Against Attack B

  • Castle blocked 97.5% of valid-credential login attempts
  • Castle initiated automated account recovery processes for 98.3% of affected accounts
  • The attacker perceived a 0.9% success rate

6.png
Number of account breaches (allowed logins) with vs. without Castle in-line integration

These numbers look great at first glance, but this attack submitted many login attempts (up to hundreds) per set of credentials. This bad actor knew, probably from prior experience, that persistence pays off when trying to breach user accounts for this application. They may have used fake accounts in order to build reputation.

Practically, this means that after tens of thousands of attempts from unique User-Agents and IP addresses, the bad actor was able to gain access to a few hundred user accounts. This still represents a small fraction of their entire credential list. Castle detected over 98% of these breached accounts and assisted with automated recovery after the fact. However, the best news is that for 96% of the accounts that were eventually breached in the attack, Castle blocked at least one login attempt before any attempt slipped through the detection cracks. This means that, if the application had locked accounts or forced password resets on Castle's recommendations, 96% of the breaches would have been avoided. Just 20 accounts would have been breached during this attack.

To put this in simple perspective: login attempt 1 might be blocked by Castle. Login attempt 100 might get through our detection in these state-of-the-art attacks. Because of this reality, Castle recommends locking accounts or forcing password resets when Castle detects that a set of credentials is compromised. If accounts are locked and password resets are forced at login attempt 1, login attempt 100 does not allow for an account breach.
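
The lock-on-first-detection idea can be sketched as follows. This is a simplified illustration under stated assumptions, not Castle's API: the `LoginGate` class, its method, and the `flagged_as_attack` input (standing in for whatever risk verdict the application receives) are all hypothetical.

```python
class LoginGate:
    """Hypothetical in-app gate that locks an account once it is flagged."""

    def __init__(self):
        self.locked = set()  # accounts that need a password reset

    def attempt(self, account, credentials_valid, flagged_as_attack):
        if account in self.locked:
            # Locked earlier: even undetected attempts cannot start a session.
            return "denied: password reset required"
        if not credentials_valid:
            return "denied: invalid credentials"
        if flagged_as_attack:
            # Valid credentials used by a suspected attacker: treat the
            # password as compromised and lock the account.
            self.locked.add(account)
            return "denied: account locked"
        return "allowed"


gate = LoginGate()
# Login attempt 1 is detected and blocked, which locks the account.
first = gate.attempt("alice@example.com", credentials_valid=True, flagged_as_attack=True)
# Login attempt 100 evades detection, but the earlier lock still applies.
later = gate.attempt("alice@example.com", credentials_valid=True, flagged_as_attack=False)
print(first)
print(later)
```

Without the lock, the second call would return "allowed", which is the gap that persistent, high-complexity attacks exploit.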

In this example scenario, with our best practices incorporated, the bad actor's observed success rate for their login attempts would have fallen to 0.03%.
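
Both observed success rates for Attack B can be reproduced from the rounded figures quoted in this article. The variable names below are illustrative, and the calculation is a back-of-the-envelope check rather than the exact method used to produce the reported numbers.

```python
# Rounded Attack B figures from this article.
attempts = 60_000            # ~60,000 login attempts
valid_fraction = 0.36        # ~36% used valid credentials
block_rate = 0.975           # 97.5% of valid-credential attempts blocked
breached_with_lockout = 20   # accounts still breached if lockouts were enforced

# Without lockouts: share of all attempts that ended in an allowed login.
observed_without_lockout = valid_fraction * (1 - block_rate)

# With lockouts: the remaining breached accounts relative to all attempts.
observed_with_lockout = breached_with_lockout / attempts

print(f"without lockouts: {observed_without_lockout:.1%}")  # 0.9%
print(f"with lockouts:    {observed_with_lockout:.2%}")     # 0.03%
```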

We understand that locking accounts and forcing password resets can result in a negative user experience, but it is the only way to ensure account security when a password is known to be leaked.

Conclusion

The tactics used to execute credential stuffing attacks are ever-evolving. We believe that many security solutions are failing to detect account breaches that result from the more sophisticated attacks that we see, such as Attack B. We are constantly improving our own capabilities to detect and defend against these attacks, while minimizing negative effects on the end-user experience.

Thank you for taking the time to read this, and we hope you were able to learn about how Castle can help applications manage user account security. As I mentioned at the beginning, protection against credential stuffing attacks, and automation of account recovery, are just two of many things that Castle does for our customers. Learn more about our product offerings at castle.io.