Why canvas fingerprinting matters in fraud detection
Canvas fingerprinting is often presented as a textbook example of browser fingerprinting. The idea is simple: instruct the browser to draw complex shapes, capture the pixel values, hash them, and use the result as a distinctive attribute. On its own, canvas is not enough, but as part of a broader fingerprinting strategy, it can significantly strengthen bot and fraud detection.
For instance, it helps track sessions and detect account takeovers by comparing the device used during a login against the device history of an account. Even if an attacker has valid stolen credentials, this extra layer can expose that the login is coming from an unknown device. Canvas can also support attribution, allowing investigators to follow the same attacker across multiple accounts even if they rotate IP addresses with VPNs or proxies. Finally, it is often used for device classification and consistency checks, such as grouping devices into families (as shown in the “Picasso” paper) or identifying mismatches between the environment a user reports and what the canvas fingerprint reveals.
Attributes like canvas are powerful because they tend to be highly distinctive and relatively stable over time. The same qualities also make them attractive targets for manipulation: attackers and bot developers often randomize or spoof canvas to avoid correlation.
At first glance, the defensive logic might seem obvious: if the canvas output is randomized, treat the session as malicious. In practice, it is more complicated. Modern browsers like Firefox, Brave, and Samsung Browser deliberately randomize canvas as a privacy feature. A significant share of legitimate users would therefore look suspicious if you relied on this attribute in isolation.
The real challenge is not the use of canvas itself, but how to interpret inconsistencies in high-entropy attributes that may be randomized by both attackers and genuine browsers. As we explained in another article, privacy protections can easily increase false positives if handled incorrectly.
In this article we use canvas as a case study to examine this problem. The same principles apply to other high-entropy attributes such as audio fingerprinting or font enumeration. Our goal is not to prescribe a one-size-fits-all recipe, but to outline guidelines and techniques that help preserve detection accuracy without penalizing legitimate users.
How canvas fingerprinting works
Canvas fingerprinting is a technique that leverages the fact that rendering graphics is never exactly the same across devices. When a site asks the browser to draw text or shapes on an invisible canvas element, the result depends on many low-level factors such as the graphics card and driver, the operating system, the browser engine, and the fonts installed locally.
To the human eye, the images often look identical. At the pixel level, however, they are slightly different. By serializing the canvas output with methods like toDataURL and hashing it, you obtain a fingerprint attribute that is relatively stable for a given device.
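The capture step can be sketched as follows. The drawing and serialization calls are the standard HTML canvas API (browser-only, so the sketch declares `document` ambiently), and the simple FNV-1a hash stands in for whatever digest your pipeline actually uses; the specific shapes, colors, and canvas size are illustrative choices.

```typescript
declare const document: any; // ambient declaration so the sketch type-checks outside a browser

// Simple FNV-1a hash; a stand-in for your pipeline's real digest function.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Browser-only: draw shapes and text on an offscreen canvas,
// serialize the pixels with toDataURL, and hash the result.
function captureCanvasFingerprint(): string {
  const canvas = document.createElement("canvas");
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");
  ctx.textBaseline = "top";
  ctx.font = "16px Arial";
  ctx.fillStyle = "#f60";
  ctx.fillRect(10, 10, 100, 30);        // rendering differs subtly per GPU/driver
  ctx.fillStyle = "#069";
  ctx.fillText("canvas-fp 😃", 4, 20);  // text and emoji expose font and engine quirks
  return fnv1a(canvas.toDataURL());     // serialize pixels, then hash
}
```

The same input always produces the same hash, which is what makes the result usable as a stable device attribute.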

Because the output reflects both hardware and software, canvas has far higher entropy than attributes like screen size or user agent. This makes it useful not just for linking one user across sessions, but also for consistency checks. For example, if a browser reports one type of GPU but the canvas fingerprint points to another, that mismatch is worth investigating.
Another important point is that canvas fingerprints are not always used to pinpoint a single device. They can also be applied to group devices into families that share similar rendering quirks, as demonstrated in the “Picasso” paper. This approach is often more resilient than relying on perfect uniqueness, because even if attackers try to randomize outputs, the broader device class may still leak through.
The same strengths that make canvas powerful also make it a target. Attackers who rely on tools for credential stuffing or fake account creation often randomize or spoof canvas to avoid correlation. At the same time, browsers like Firefox and Brave deliberately introduce noise as a privacy feature. This dual role of canvas — a high-value, high-entropy signal that can be randomized both by attackers and by legitimate browsers — is what creates the challenge we explore next.
Dealing with randomized canvas fingerprints
Feeding the raw canvas fingerprint hash directly into your logic quickly breaks down once randomization enters the picture. If the value changes at every draw, the same user will appear to come from a different device each time. Tracking becomes unreliable, and any attempt to map the canvas to a set of known OS or browser characteristics also fails because the signal is unstable.
The better approach is not to discard canvas but to detect randomization and interpret it in context.
1. Detect randomization
The first step is to establish whether the canvas output is stable or manipulated. One simple method is to perform multiple draws in the same session or within a short timeframe and compare the results. Significant variation points to randomization. We described more detailed techniques in this article.
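The repeated-draw check reduces to a small comparison, assuming you already have a capture function that returns one hash per draw (the number of attempts is a tunable assumption):

```typescript
// Draw the same scene several times and compare the resulting hashes.
// A stable device yields identical hashes; any divergence implies randomization.
function detectRandomization(draw: () => string, attempts = 3): boolean {
  const hashes = new Set<string>();
  for (let i = 0; i < attempts; i++) {
    hashes.add(draw());
  }
  return hashes.size > 1;
}
```

In practice you would pass your real capture function as `draw`, spacing the attempts within the same session or a short timeframe as described above.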
2. Separate normal and abnormal randomization
Not every case of randomization is suspicious. Some browsers, such as Firefox, Brave, and Samsung Browser, introduce noise intentionally as a privacy feature. That behavior should be treated as expected. Randomization becomes suspicious when it occurs in environments that do not ship it, like a vanilla Chrome build, or when the randomization pattern does not match the claimed browser. For example, Brave uses a specific and consistent style of noise. If a Brave client produces a different pattern, it strongly suggests tampering.
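This triage can be expressed as a small classifier. The set of browsers assumed to ship canvas noise by default is illustrative and must be kept in sync with real browser behavior; a production check would also verify that the observed noise pattern matches the claimed browser's known style, which this sketch omits.

```typescript
type CanvasVerdict = "stable" | "expected-noise" | "suspicious";

// Illustrative set of browsers that ship canvas noise as a privacy feature.
const NOISE_BY_DEFAULT = new Set(["firefox", "brave", "samsung-internet"]);

function classifyCanvas(claimedBrowser: string, randomized: boolean): CanvasVerdict {
  if (!randomized) return "stable";
  // Noise in a browser known to ship it is expected; anywhere else it is a flag.
  // A fuller version would also compare the noise pattern itself against the
  // browser's documented style (e.g. Brave's consistent noise).
  return NOISE_BY_DEFAULT.has(claimedBrowser.toLowerCase())
    ? "expected-noise"
    : "suspicious";
}
```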
3. Handling normal randomization
When the randomization is consistent with what the browser claims, the right approach is to neutralize its effect rather than penalize it. Replace the raw canvas value with a placeholder in your fingerprinting logic so that tracking remains stable across sessions. You lose the uniqueness of the canvas, but you also avoid polluting the fingerprint with unstable data. At the same time, adjust your consistency checks so that this known behavior does not trigger false positives.
4. Handling abnormal randomization
When randomization is unexpected or inconsistent, treat it as a signal of possible manipulation. The raw canvas should again be excluded from the fingerprint to prevent instability, but replaced with a distinct placeholder that marks the case as “abnormal randomization.” This way, you retain the information that tampering was observed. On top of that, raise the user’s risk score or trigger deeper consistency checks. This does not always justify blocking outright, since some users rely on niche privacy extensions, but it is a strong indicator that the session deserves more scrutiny.
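Steps 3 and 4 can be combined into one resolution function. The placeholder strings and the risk-score increment below are illustrative assumptions; the point is that both randomization cases are neutralized in the fingerprint, but only the abnormal one contributes risk.

```typescript
interface CanvasSignal {
  fingerprintValue: string; // what goes into the composite fingerprint
  riskDelta: number;        // contribution to the session risk score
}

// Replace unstable canvas values with placeholders so the composite
// fingerprint stays stable, while preserving what was observed.
function resolveCanvasSignal(
  hash: string,
  verdict: "stable" | "expected-noise" | "suspicious"
): CanvasSignal {
  switch (verdict) {
    case "stable":
      return { fingerprintValue: hash, riskDelta: 0 };
    case "expected-noise":
      // Known privacy feature: neutralize, do not penalize.
      return { fingerprintValue: "canvas:randomized-expected", riskDelta: 0 };
    case "suspicious":
      // Abnormal randomization: distinct placeholder plus extra scrutiny.
      return { fingerprintValue: "canvas:randomized-abnormal", riskDelta: 25 };
  }
}
```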
Why hashing canvas outputs is not enough
Canvas fingerprinting also exposes a broader weakness in how many detection systems treat high-entropy attributes. Most libraries and commercial bot detection products do not store the raw canvas fingerprint output. Instead, they hash it. Hashing is efficient: it reduces storage and bandwidth requirements, and it makes comparisons simple. But this efficiency comes with an important tradeoff.
When the canvas is randomized, the hash by itself loses all context. If two hashes do not match, you cannot tell whether the difference comes from a legitimate privacy feature, from an attacker overriding the toDataURL function, or from a forged payload where the value has been replaced entirely. From the perspective of the hash, all three cases look the same.
Storing the raw image would provide clarity, but in most production systems it is too heavy and unnecessary. A more practical approach is to strengthen the hash with lightweight side signals that reveal whether manipulation is likely. You can verify the integrity of toDataURL and getImageData by inspecting their toString output or deliberately throwing errors to observe the stack trace.
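The toString probe looks roughly like this: calling Function.prototype.toString on an unmodified native method yields a body containing "[native code]", whereas a JavaScript override exposes its own source. The probe is itself bypassable by an attacker who also patches toString, which is why it is one side signal among several rather than a verdict on its own.

```typescript
// Returns true when a function still looks like a native built-in.
// An attacker who overrides toDataURL with a JS function fails this check
// unless they also patch Function.prototype.toString.
function looksNative(fn: Function): boolean {
  return /\{\s*\[native code\]\s*\}/.test(Function.prototype.toString.call(fn));
}

// In the browser you would probe the canvas methods directly, e.g.:
//   looksNative(HTMLCanvasElement.prototype.toDataURL)
//   looksNative(CanvasRenderingContext2D.prototype.getImageData)
```

The complementary stack-trace probe mentioned above works by deliberately throwing inside the canvas call path and checking the resulting stack for frames injected by an override.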
You can also sample specific pixels at deterministic positions to confirm expected values, or compute aggregate metrics such as average pixel color, which are less sensitive to noise than full hashes. Measuring the execution time of canvas operations is another useful angle, since tampering or emulation often introduces detectable delays.
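The pixel-level side signals can be sketched over raw RGBA data as returned by getImageData. Probe coordinates, expected values, and the tolerance are assumptions to tune per drawing; unlike an exact hash, both checks stay comparable even when small per-pixel noise is present.

```typescript
// Check a few fixed pixel positions against the values expected for this
// drawing (the expected values and tolerance here are illustrative).
function probePixels(
  rgba: Uint8ClampedArray,
  width: number,
  probes: { x: number; y: number; expected: [number, number, number] }[],
  tolerance = 8
): boolean {
  return probes.every(({ x, y, expected }) => {
    const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
    return (
      Math.abs(rgba[i] - expected[0]) <= tolerance &&
      Math.abs(rgba[i + 1] - expected[1]) <= tolerance &&
      Math.abs(rgba[i + 2] - expected[2]) <= tolerance
    );
  });
}

// Average channel values: a coarse aggregate that tolerates small noise.
function averageColor(rgba: Uint8ClampedArray): [number, number, number] {
  let r = 0, g = 0, b = 0;
  const pixels = rgba.length / 4;
  for (let i = 0; i < rgba.length; i += 4) {
    r += rgba[i]; g += rgba[i + 1]; b += rgba[i + 2];
  }
  return [r / pixels, g / pixels, b / pixels];
}
```

Timing the canvas operations themselves (e.g. bracketing the draw with performance.now calls) follows the same spirit: a cheap correlated signal that tampering tends to disturb.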
These techniques are not unique to canvas. Any costly, high-entropy signal that is reduced to a single hash should be paired with correlated checks. Otherwise, the moment an attacker tampers with it, you lose all diagnostic power and are left with an opaque value that cannot be interpreted reliably.
Putting canvas fingerprinting in context
Canvas fingerprinting highlights both the usefulness and the fragility of high-entropy attributes. On the one hand, it can expose account takeovers, link sessions that belong to the same attacker, and surface inconsistencies across browsers and devices. On the other, its uniqueness makes it a frequent target. Attackers randomize it to avoid correlation, while mainstream browsers like Firefox and Brave deliberately add noise to protect user privacy.
The real challenge is not whether to use canvas but how to interpret it. Randomization must be placed in context. When it matches what a browser is known to do, the value should be normalized so it does not pollute your fingerprinting logic. When the randomization is abnormal or inconsistent with the claimed browser, it should not cause immediate blocking but it should be captured, marked distinctly, and reflected in risk scoring or secondary checks.
Equally important, canvas demonstrates the limits of relying on hashes alone. Hashing removes the very context you need to distinguish legitimate privacy noise from tampering. Augmenting hashes with light but meaningful side signals—integrity checks, pixel probes, timing measurements—provides the visibility required to make that distinction.
In short, canvas remains a valuable part of a broader fingerprinting strategy. Used carefully, it strengthens detection without penalizing legitimate users. The key is nuance: treat expected randomization as normal, treat abnormal patterns as suspicious, and always secure high-entropy signals with enough context to remain resilient against both privacy features and evasive attackers.