TLS Fingerprinting with JA3 and JA3S

TL;DR

In this blog post, I’ll go over how to use JA3 together with JA3S to fingerprint the TLS negotiation between client and server. This combined fingerprinting can help produce higher-fidelity identification of the encrypted communication between a specific client and its server. For example:

Standard Tor Client:
JA3 = e7d705a3286e19ea42f587b344ee6865 ( Tor Client )
JA3S = a95ca7eab4d47d051a5cd4fb7b6005dc ( Tor Server Response )

The Tor servers always respond to the Tor client in exactly the same way, providing higher confidence that the traffic is indeed Tor. Further examples:

Trickbot malware:
JA3 = 6734f37431670b3ab4292b8f60f29984 ( Trickbot )
JA3S = 623de93db17d313345d7ea481e7443cf ( C2 Server Response )

Emotet malware:
JA3 = 4d7a28d6f2263ed61de88ca66eb011e3 ( Emotet )
JA3S = 80b3a14bccc8598a1f3bbe83e71f735f ( C2 Server Response )

In these malware examples, the command and control server always responds to the malware client in exactly the same way; it does not deviate. So even though the traffic is encrypted and one may not know the command and control server’s IPs or domains as they are constantly changing, we can still identify, with reasonable confidence, the malicious communication by fingerprinting the TLS negotiation between client and server.

JA3 and JA3S have been open sourced and can be found here: https://github.com/salesforce/ja3

Some Background on JA3

We open sourced JA3, a method for fingerprinting TLS clients on the wire, in this blog post in 2017:

The primary concept for fingerprinting TLS clients came from Lee Brotherston’s 2015 research which can be found here and his DerbyCon talk which is here. If it weren’t for Lee’s research and open sourcing of it, we would not have started work on JA3. So, thank you Lee and all those who blog and open source!

To recap: TLS and its predecessor, SSL, are used to encrypt communication for both common applications, to keep your data secure, and malware, so it can hide in the noise. To initiate a TLS session, a client will send a TLS Client Hello packet following the TCP 3-way handshake. This packet and the way in which it is generated are dependent on the packages and methods used when building the client application. The server, if accepting TLS connections, will respond with a TLS Server Hello packet that is formulated based on server-side libraries and configurations as well as details in the Client Hello. Because TLS negotiations are transmitted in the clear, it’s possible to fingerprint and identify client applications using the details in the TLS Client Hello packet.

This exquisitely drawn network diagram shows the SSL/TLS initial communication pattern.

The JA3 method is used to gather the decimal values of the bytes for the following fields in the Client Hello packet: Version, Accepted Ciphers, List of Extensions, Elliptic Curves, and Elliptic Curve Formats. It then concatenates those values together in order, using a “,” to delimit each field and a “-” to delimit each value in each field.

Example Client Hello packet as viewed in Wireshark

The field order is as follows:
TLSVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats

Example:
769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0

If there are no TLS Extensions in the Client Hello, the fields are left empty.

Example:
769,4-5-10-9-100-98-3-6-19-18-99,,,

These strings are then MD5 hashed to produce an easily consumable and shareable 32 character fingerprint. This is the JA3 TLS Client Fingerprint.

769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0 → ada70206e40642a3e4461f35503241d5
769,4-5-10-9-100-98-3-6-19-18-99,,, → de350869b8c85de67a350c8d186f11e6
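As a rough illustration of the steps above, here is a minimal Python sketch that builds the JA3 string from already-parsed Client Hello values and hashes it. The function names and the idea of passing in plain lists of integers are my own assumptions for the example; this is not the code from the JA3 repository, and it skips packet parsing entirely. Note that the value delimiter is the plain ASCII hyphen.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join the five Client Hello fields into a JA3 string.

    Fields are separated by "," and the values inside a field by "-".
    An empty list yields an empty field, as in the no-extensions case.
    """
    fields = [[version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(ja3):
    """MD5 the JA3 string to get the 32 character fingerprint."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Values taken from the two examples in this post.
with_ext = ja3_string(
    769,
    [47, 53, 5, 10, 49161, 49162, 49171, 49172, 50, 56, 19, 4],
    [0, 10, 11],
    [23, 24, 25],
    [0],
)
no_ext = ja3_string(769, [4, 5, 10, 9, 100, 98, 3, 6, 19, 18, 99], [], [], [])

print(with_ext, ja3_hash(with_ext))
print(no_ext, ja3_hash(no_ext))
```

If the inputs match the examples, the printed strings should be identical to the ones shown above, and the hashes should line up with the fingerprints listed there.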

We also needed to introduce some code to account for Google’s GREASE (Generate Random Extensions And Sustain Extensibility) as described here. Google uses this as a mechanism to prevent extensibility failures in the TLS ecosystem. JA3 ignores these values completely to ensure that programs utilizing GREASE can still be identified with a single JA3 hash.
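To give a feel for what ignoring GREASE looks like, here is a small Python sketch. It assumes the sixteen reserved GREASE values defined in RFC 8701 (0x0A0A, 0x1A1A, and so on up to 0xFAFA) and simply drops them from a list before the JA3 string is built. The helper name is hypothetical, and the real JA3 implementations may structure this differently.

```python
# The 16 reserved GREASE values from RFC 8701: 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE_VALUES = frozenset(0x0A0A + 0x1010 * i for i in range(16))


def strip_grease(values):
    """Return the values with any GREASE entries removed, order preserved."""
    return [v for v in values if v not in GREASE_VALUES]


# A made-up cipher list where the client put a GREASE value first.
ciphers = [0x2A2A, 4865, 4866, 49195]
print(strip_grease(ciphers))
```

Because a GREASE-enabled client picks these values at random per connection, leaving them in would give the same program a different hash every time. Stripping them from the cipher, extension, and curve lists keeps the fingerprint stable.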

Does JA3 work on TLS 1.3? Yes.

Here we have TLS 1.3 Client Hello packets for two different browsers, each ordering their ciphers and extensions differently as well as including (or excluding) different ciphers and extensions. Therefore the JA3 will still be unique per client.

JA3S

After creating JA3 we started playing with using the same method to fingerprint the server side of the TLS handshake, the TLS Server Hello message. The JA3S method is to gather the decimal values of the bytes for the following fields in the Server Hello packet: Version, Accepted Cipher, and List of Extensions. It then concatenates those values together in order, using a “,” to delimit each field and a “-” to delimit each value in each field.

The field order is as follows:
TLSVersion,Cipher,Extensions

Example:
769,47,65281-0-11-35-5-16

If there are no TLS Extensions in the Server Hello, the Extensions field is left empty.

Example:
769,47,

These strings are then MD5 hashed to produce an easily consumable and shareable 32 character fingerprint. This is the JA3S Fingerprint.

769,47,65281-0-11-35-5-16 → 4835b19f14997673071435cb321f5445
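The same idea in a minimal Python sketch, this time for the Server Hello. As before, the function names are assumptions made for illustration, the inputs are already-parsed integers, and this is not the repository code.

```python
import hashlib


def ja3s_string(version, cipher, extensions):
    """Build the JA3S string: TLSVersion,Cipher,Extensions.

    The server picks a single cipher, so only the extensions are a list.
    """
    ext = "-".join(str(e) for e in extensions)
    return f"{version},{cipher},{ext}"


def ja3s_hash(ja3s):
    """MD5 the JA3S string to get the 32 character fingerprint."""
    return hashlib.md5(ja3s.encode("ascii")).hexdigest()


# Values taken from the two examples in this post.
with_ext = ja3s_string(769, 47, [65281, 0, 11, 35, 5, 16])
no_ext = ja3s_string(769, 47, [])

print(with_ext, ja3s_hash(with_ext))
print(no_ext, ja3s_hash(no_ext))
```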

We MD5 hash because there is no limit to how many ciphers or extensions can be added to the Client Hello or Server Hello respectively, and our rule of thumb is that if the fingerprint cannot fit in a tweet, then it’s too long. We also use MD5 so the JA3 method can be more easily integrated into existing technologies. Remember that JA3 is a method that is designed to work within any application on any hardware. I admit, fuzzy hashing would be better, but we wanted to use a method that could be incorporated into currently-deployed technologies and most of them do not yet have fuzzy hashing support, while even the oldest Netscreen Firewall can churn out MD5s. Also, given the limited data set, hash collisions are not a concern here. I know MD5 can be a point of contention within the security community so I hope this helps explain our reasons behind using it.

Our code does allow for the entire string to be logged along with the hash value for added analysis. I highly recommend that if you are able, you log the entire fingerprint string for JA3 and JA3S as well as the hash values. The added analysis capability can come in handy. Though, if your organization is the type that’s short on log space, just logging the hash should do you just fine.

Why JA3S Works

We found that the same server will formulate its Server Hello message differently depending on the Client Hello message and its contents. So it’s not possible to fingerprint a server just based on its Hello message like we could with clients and JA3. Because of this, some suggested that there was no value here. But we ran with it anyway because Salesforce has a never-ending supply of caffeine. After some time we found that, though servers will respond to different clients differently, they will always respond to the same client in the same way.

In this network diagram, we can see that the client is sending a TLS Client Hello packet of all A’s. Therefore the server responds with A and will always respond to A’s with A.

Here a different client sends all B’s. The same server as before now responds with B and will always respond to B’s with B. Different client, different response, but always the same for each client.

Real World Example:

In this log output JA3 is on the left and JA3S is on the right

In this example I have contacted the same server 4 times over using the same client. I then contacted it again using a different client 4 times over. The way that the server responds is always the same for the same client, though different for a different client.
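If you want to check this behavior against your own logs, something like the following Python sketch could do it. It assumes you have already extracted (JA3, JA3S) pairs for connections to a single server; the placeholder strings stand in for real hashes and are entirely made up.

```python
from collections import defaultdict


def ja3s_by_ja3(pairs):
    """Group the JA3S values seen for each JA3 talking to one server."""
    seen = defaultdict(set)
    for ja3, ja3s in pairs:
        seen[ja3].add(ja3s)
    return dict(seen)


# Hypothetical log: client A connects four times, then client B four times.
pairs = [("ja3-client-a", "ja3s-resp-a")] * 4 + [("ja3-client-b", "ja3s-resp-b")] * 4

grouped = ja3s_by_ja3(pairs)
for ja3, responses in grouped.items():
    # One JA3S per JA3 means the server answered that client consistently.
    print(ja3, sorted(responses), "consistent" if len(responses) == 1 else "varies")
```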

Usage for Security

In the event that a threat actor custom-built their own malware executable, it’s likely that the JA3 fingerprint will be unique to that executable. For example here is the Client Hello of a custom piece of malware developed by pen testers for an engagement:

You can see that there is only a single, strong accepted cipher suite, which is anomalous, and the resulting JA3 was unique in our environment, making this easy to detect no matter the destination.

Other pen testing tools such as PupyRAT will specify their ciphers and ordering as seen here in the Pupy code:

This makes for an unusual and unique splattering of ciphers in the Client Hello which therefore generates a unique JA3:

One can then pivot on the JA3 to enhance their hunting or response operations.
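One simple way to start that kind of hunt is to look for JA3 values that only a handful of hosts produce. The Python sketch below assumes you can pull (source host, JA3) pairs out of your logs; the record layout, the threshold, and the placeholder values are all hypothetical.

```python
from collections import defaultdict


def rare_ja3(records, max_hosts=1):
    """Return JA3 values seen from at most max_hosts distinct source hosts."""
    hosts_by_ja3 = defaultdict(set)
    for src_host, ja3 in records:
        hosts_by_ja3[ja3].add(src_host)
    return sorted(j for j, hosts in hosts_by_ja3.items() if len(hosts) <= max_hosts)


# Made-up records: three hosts share a common client, one host has an odd one.
records = [
    ("host-1", "ja3-common-browser"),
    ("host-2", "ja3-common-browser"),
    ("host-3", "ja3-common-browser"),
    ("host-3", "ja3-odd-client"),
    ("host-3", "ja3-odd-client"),
]

print(rare_ja3(records))
```

Counting distinct hosts rather than raw connections keeps a single chatty beacon from looking common.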

But what if the client application uses common libraries or OS sockets for communication like Python or Windows Socket? The JA3 would be common in the environment and therefore not as useful for detection. This is where JA3S can assist in identifying the malicious communication.

For example, both Metasploit’s Meterpreter and Cobalt Strike’s Beacon use a Windows socket to initiate TLS communication. For Windows 10, that is JA3=72a589da586844d7f0818ce684948eea (when going to an IP) and JA3=a0e9f5d64349fb13191bc781f81f42e1 (when going to a domain). Other legitimate applications on Windows use the same socket, making identification of the malicious communication difficult. However, the way that the C2 servers on Kali Linux respond to this client application is unique compared to the way normal servers on the internet respond to this socket. So if we combine JA3 + JA3S, we are then able to identify this malicious communication regardless of destination IP, Domain, or Certificate Details. The search (at the time of this writing) could look like:

Metasploit Win10 to Kali:
(JA3=72a589da586844d7f0818ce684948eea OR JA3=a0e9f5d64349fb13191bc781f81f42e1) AND JA3S=70999de61602be74d4b25185843bd18e

Cobalt Strike Win10 to Kali:
(JA3=72a589da586844d7f0818ce684948eea OR JA3=a0e9f5d64349fb13191bc781f81f42e1) AND JA3S=b742b407517bac9536a77a7b0fee28e9
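Outside of a search tool, the same logic can be expressed in a few lines of Python. This sketch assumes each log record is a dict with "ja3" and "ja3s" keys, which is my own assumption about the log format. The hash values are the ones from the searches above and, as noted, reflect the tools at the time of this writing.

```python
# JA3 of the Windows 10 socket (to an IP and to a domain), from the searches above.
WIN10_SOCKET_JA3 = {
    "72a589da586844d7f0818ce684948eea",
    "a0e9f5d64349fb13191bc781f81f42e1",
}

# JA3S of each C2 server's response to that socket, from the searches above.
C2_JA3S = {
    "70999de61602be74d4b25185843bd18e": "Metasploit",
    "b742b407517bac9536a77a7b0fee28e9": "Cobalt Strike",
}


def classify(record):
    """Return the suspected tool name, or None if the pair does not match."""
    if record.get("ja3") in WIN10_SOCKET_JA3:
        return C2_JA3S.get(record.get("ja3s"))
    return None


# Hypothetical records: one matching pair, one with the common JA3 only.
hit = {"ja3": "72a589da586844d7f0818ce684948eea", "ja3s": "b742b407517bac9536a77a7b0fee28e9"}
miss = {"ja3": "72a589da586844d7f0818ce684948eea", "ja3s": "some-other-server-response"}

print(classify(hit), classify(miss))
```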

As with everything, there is a risk of false positives. You could think of JA3 as the TLS equivalent of the User-Agent string. Just because one piece of software or malware has a particular string doesn’t mean it will always be unique to that software. It is possible for other software to use the same string. However, there’s no reason not to use the string to augment your analysis and detections. Just like other network metadata, JA3 is an extra piece of information to be used in enriching your data. JA3S, when used in conjunction with JA3, can significantly reduce the level of false positives if you’re looking for something specific.

Pen Tester Example

In another example we have pen testers using the Python version of Empire as their malware of choice. The JA3 in this case would be that of Python, not unique in any developer environment.

If we were to search for this JA3 across the environment the results would look something like this:

However, the pen tester’s C2 server responded to the Python client in a unique way. So when we search for the JA3 of Python and the JA3S of the way their C2 server responded, the results looked more like this:

I forgot to take screenshots so you’ll just have to trust me that this is exactly what Splunk looked like.

The resulting output shows the malware's beacons to its C2 server. As you can see, JA3 and JA3S combined essentially create a fingerprint of the cryptographic negotiation between client and server.

Following the eradication of the pen testers, they moved their C2 image to another IP and domain. However, the malware and server remained the same applications and therefore the fingerprints remained the same. The previous detection worked immediately. Finally the pen testers purchased space in a completely different service provider, purchased a new legitimate looking certificate, purchased a new domain, and moved their C2 image there. Detection was instant.

Because detection was based on the infrastructure and technology, not on destination IPs, domains, or certs, we no longer had to rely on traditional IOCs which are easily changed. This moved the detection near the top of David Bianco’s Pyramid of Pain and increased the cost of engagement for the adversary.

Conclusion

JA3 and JA3S are TLS fingerprinting methods. JA3 fingerprints the way that a client application communicates over TLS and JA3S fingerprints the server response. Combined, they essentially create a fingerprint of the cryptographic negotiation between client and server. While not always a silver bullet for TLS-based detection or a guaranteed mapping to client applications, they are always valuable as a pivot point for analysis.

We designed these methods so that they can be easily applied to existing technologies. The resulting fingerprints are easy to consume and easy to share. The BSD 3-Clause license makes it easy to implement. We just wanted it to be easy. In doing so, our hope is that it becomes a valuable addition to your defensive arsenal and that it inspires others to build off of our research and push the industry forward.

Zeek/Bro and Python versions of JA3 and JA3S are available at https://github.com/salesforce/ja3 as well as links to other tools which have implemented the methods.

JA3 was created by:
John Althouse
Jeff Atkinson
Josh Atkins

For SSH client and server fingerprinting, please see HASSH at https://github.com/salesforce/hassh

For automatic client to JA3 or HASSH mapping, please see Bro-Sysmon at https://github.com/salesforce/bro-sysmon/

For any questions or comments, please feel free to email me or reach me at @4A4133.