The encryption of network traffic is an ever-growing phenomenon as the world
marches steadily further into the information age. Much of the world’s day-to-day
correspondence now relies on network communication for the transmission
and reception of vital data. The benefits of encrypting communications are
obvious, but the security that encryption provides can be a double-edged sword.
It can make it difficult for network security measures to identify malicious
communications and the precursors to attacks, leaving malicious traffic all but
invisible to many standard detection measures. One of the most common network
traffic encryption protocols is TLS.

Transport Layer Security (TLS) is a security protocol that, among other things,
provides a method of selecting encryption algorithms and protocol parameters
that both an HTTPS client and a server can use. TLS provides security for a wide
variety of communication between networks, ranging from financial transactions
on major retail websites, to private communications between individuals, all the
way down to malware returning the data it has illicitly acquired to its creator.
TLS is effective because of the extreme difficulty any eavesdropper would face
in analyzing the encrypted traffic, as opposed to simply recording whether or
not communication had occurred. TLS users operate confidently under this belief:
although an eavesdropper could easily observe the existence of their session,
the content itself will remain secure and unintelligible without access to the
cryptographic keys that would remove the obfuscation. However, there do exist
tools that can partially undercut this assurance and quickly determine which
HTTPS client is taking part in the communication. Enter the TLS fingerprinting
technique.

The primary reason that this identification technique can succeed is that
Transport Layer Security must carry out an initial exchange between the HTTPS
client and server before any sort of encrypted communication can take place.
This initial exchange consists of packets that inform the HTTPS client and the
server of each other’s capabilities and preferences with regard to security
algorithms, so that a selection of algorithms acceptable to both can be
determined. These selections include cryptographic methods and cipher suites,
compression methods, hashing algorithms, and more. Logically, this communication
of preferences has to be done in the open, without encryption, since no method
of encryption has been selected yet and any attempt to apply one would have no
guarantee of being intelligible to the other party. This does not present an
opportunity to break whatever encryption the parties go on to select, since
merely observing the negotiation reveals nothing about the content that
follows. Nevertheless, this unguarded exchange provides the key element that
TLS fingerprinting requires to function.
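
To make the unguarded exchange concrete, the following is a minimal sketch of
reading the negotiation fields out of the client's first handshake message (the
ClientHello) using nothing but byte offsets. It assumes the whole message
arrives in a single TLS record and performs only basic validation. The function
names and the sample message are illustrative assumptions, not part of any
particular tool.

```python
import struct

def parse_client_hello(record: bytes) -> dict:
    """Extract the plaintext negotiation fields from one TLS record (sketch)."""
    # Record type 0x16 is "handshake"; handshake type 0x01 is ClientHello.
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        raise ValueError("not a TLS ClientHello record")
    _, record_version, _ = struct.unpack_from("!BHH", record, 0)
    pos = 5 + 4                       # skip record header and handshake header
    (client_version,) = struct.unpack_from("!H", record, pos)
    pos += 2 + 32                     # version field, then 32 bytes of random
    pos += 1 + record[pos]            # session id (1-byte length prefix)
    (n,) = struct.unpack_from("!H", record, pos)
    pos += 2
    cipher_suites = list(struct.unpack_from(f"!{n // 2}H", record, pos))
    pos += n
    n = record[pos]
    pos += 1
    compression_methods = list(record[pos:pos + n])
    pos += n
    extensions = []
    if pos < len(record):             # the extensions block is optional
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            extensions.append(ext_type)   # keep the type, skip the payload
            pos += 4 + ext_len
    return {
        "record_version": record_version,
        "client_version": client_version,
        "cipher_suites": cipher_suites,
        "compression_methods": compression_methods,
        "extensions": extensions,
    }

def build_sample_hello() -> bytes:
    """Hand-built ClientHello used only to exercise the parser (illustrative)."""
    body = struct.pack("!H", 0x0303) + bytes(32) + b"\x00"   # version, random, empty session id
    body += struct.pack("!HHH", 4, 0x1301, 0xC02F)           # two offered cipher suites
    body += b"\x01\x00"                                      # one compression method: null
    ext = struct.pack("!HH", 0x0017, 0)                      # extension with empty payload
    ext += struct.pack("!HH", 0xFF01, 1) + b"\x00"           # extension with 1-byte payload
    body += struct.pack("!H", len(ext)) + ext
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return struct.pack("!BHH", 0x16, 0x0301, len(handshake)) + handshake

fields = parse_client_hello(build_sample_hello())
```

Every field recovered here is available to any observer on the path, since no
key is needed to read it.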

Because of this unguarded exchange, it is possible to build a metric for
identifying a particular HTTPS client by capturing the data contained in the
initial packet (the ClientHello message) that the client sends to the server
when negotiating the protocols for the TLS session. The nature of these initial
packets changes only infrequently, and a fingerprint can be built from their
elements and then used to recognize a particular HTTPS client in a future
session. The fields that need to be captured are: the TLS version the HTTPS
client requests, the TLS version in the HTTPS client’s record layer, the cipher
suites the client offers, the compression methods it offers, and the list of
extensions it includes. Of these, the field with the most variance, and thus
one of the best to use for identification, is the HTTPS client’s list of
supported cipher suites. A cipher suite is a collection of cryptographic
techniques that defines a secure communication. There are hundreds of cipher
suites, and even more combinations of them, but they are all built from a small
number of rudimentary elements: key exchange, encryption algorithms, and
integrity validation methods. Different programs often offer very distinct
lists of cipher suites. The combined set of fields changes only rarely for any
particular HTTPS client, and thus offers far greater granularity than assessing
cipher suites alone.
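
As a rough illustration of how the five captured fields might be combined into
a single identifier, the sketch below joins them into a canonical string and
hashes it. The string layout, the choice of SHA-256, and the function name are
assumptions made for this example and do not correspond to any standard
fingerprint format. The order of each list is preserved, since the order in
which a client lists its cipher suites and extensions is itself distinctive.

```python
import hashlib

def tls_fingerprint(record_version, client_version, cipher_suites,
                    compression_methods, extensions) -> str:
    """Combine the ClientHello fields into one hex digest (illustrative format)."""
    parts = [
        str(record_version),
        str(client_version),
        "-".join(str(c) for c in cipher_suites),        # order is preserved
        "-".join(str(c) for c in compression_methods),
        "-".join(str(e) for e in extensions),
    ]
    canonical = ",".join(parts)
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()

# Two hypothetical clients offering the same suites in a different order
# still end up with different fingerprints.
client_a = tls_fingerprint(0x0301, 0x0303, [0x1301, 0xC02F], [0], [0x0017, 0xFF01])
client_b = tls_fingerprint(0x0301, 0x0303, [0xC02F, 0x1301], [0], [0x0017, 0xFF01])
```

Hashing is only a convenience for storage and comparison; the canonical string
carries the same information.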

Capturing the initial communication between the HTTPS client and server is an
excellent method for fingerprinting clients for several reasons. First and
foremost, it is possible to capture the packets from the initial TLS handshake
with a high degree of accuracy: initial communication bursts occur rarely
enough that it is a manageable task to observe and record all of them on a
target network. Storing these initial packets also requires little storage
space, reducing the overall cost of acquiring and analyzing the data. This is
in direct contrast to the normally exorbitant cost associated with full data
surveillance and recording. Finally, these packets can be collected without
keeping track of the state of the Transmission Control Protocol (TCP)
connection or of the packet’s position within that particular stream. This
reduces the overall cost again, this time in the processing power and memory
that would otherwise be needed to track associated packets. This ties into the
real-world applications of the TLS fingerprinting technique as an economical
and low-upkeep method of surveillance.
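
The stateless collection described above can be sketched as a per-packet test
on the first bytes of a TCP payload, with no record kept of earlier packets.
The function name is hypothetical, and the check is a simplification: it
assumes the ClientHello begins at the start of the payload and ignores other
traffic that happens to share the same leading bytes.

```python
def looks_like_client_hello(payload: bytes) -> bool:
    """Decide from one TCP payload alone whether it starts a TLS ClientHello."""
    return (
        len(payload) >= 6
        and payload[0] == 0x16    # record type: handshake
        and payload[1] == 0x03    # major version 3 (SSL 3.0 and all TLS versions)
        and payload[5] == 0x01    # handshake type: ClientHello
    )

# Hypothetical payloads: the start of a ClientHello, and an application-data record.
hello_start = bytes([0x16, 0x03, 0x01, 0x00, 0x2E, 0x01, 0x00, 0x00, 0x2A])
app_data = bytes([0x17, 0x03, 0x03, 0x00, 0x10]) + bytes(16)
matches = [looks_like_client_hello(p) for p in (hello_start, app_data)]
```

Because each packet is judged in isolation, a collector built this way needs
no per-connection memory.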

The practical application of TLS fingerprinting lies in its use as a form of
passive surveillance and detection. The technique allows for a low-cost,
low-investment form of communication monitoring that enables the detection of a
wide variety of traffic without requiring access to either the server or the
HTTPS client endpoints. The ability to detect malicious programs or unwanted
software without having to search specifically for a narrow range of known
threats is very useful for anyone monitoring network traffic, whether
legitimately or not. This works because almost every application that makes a
TLS connection possesses its own semi-unique and inherent fingerprint. The
detection of unusual communication streams is also a simple and useful
application of the TLS fingerprinting technique, and is something most network
security plans should find worthy of investigation, given how simple such
streams are to detect. For example, many web services expect a human to
interact with them via a browser and are designed accordingly. Amazon, for
instance, would be very interested in a connection from a script or bot program
that starts buying large amounts of products, whether with legitimate funds or
not.
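
A service that expects browsers could apply this idea roughly as follows. This
is only a sketch: the table of known fingerprints, its placeholder values, and
the function names are all hypothetical, and a real deployment would need a
maintained fingerprint database.

```python
# Hypothetical lookup table. The keys are placeholders, not real fingerprints.
KNOWN_CLIENTS = {
    "fp-browser-a": "browser",
    "fp-browser-b": "browser",
    "fp-script-lib": "scripting library",
}

def classify_client(fingerprint: str, known=KNOWN_CLIENTS) -> str:
    """Return the label recorded for a fingerprint, or 'unknown'."""
    return known.get(fingerprint, "unknown")

def is_unexpected(fingerprint: str, expected_label: str = "browser",
                  known=KNOWN_CLIENTS) -> bool:
    """Flag any client whose fingerprint is not labeled as the expected kind."""
    return classify_client(fingerprint, known) != expected_label

# A session from the scripting library, or from a never-seen client, is flagged.
flags = {fp: is_unexpected(fp) for fp in ("fp-browser-a", "fp-script-lib", "fp-never-seen")}
```

A flag raised this way is a reason to look more closely, not proof of
malicious intent.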

Network traffic analysis is also made easier by the TLS fingerprinting
technique. Given that each HTTPS client we identify has a largely distinctive
fingerprint, we can use the collected data for network traffic analysis. By
counting the number of distinct fingerprints that share the same IP address, we
can estimate the number of HTTPS clients using a specific machine and infer the
presence of any NAT mechanisms. This sort of HTTPS client identification can
make a large contribution to a network’s security and its ability to detect
hostile activity targeting network assets or users. By monitoring the activity
of HTTPS clients and applying a metric for suspicious activity, we can detect
network attacks and the spread of malware earlier and drastically reduce them.
However, establishing such a metric for behavior, and a dynamic, accurate
method for determining malice, is beyond the scope of this writing.
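
The per-address counting described above can be sketched in a few lines. The
observation format (pairs of source IP and fingerprint), the function name, and
the sample data are assumptions for illustration. The addresses come from a
range reserved for documentation.

```python
from collections import defaultdict

def fingerprints_per_ip(observations):
    """Count the distinct TLS fingerprints seen from each source IP address."""
    seen = defaultdict(set)
    for ip, fingerprint in observations:
        seen[ip].add(fingerprint)
    return {ip: len(fps) for ip, fps in seen.items()}

# Hypothetical observations as (source IP, fingerprint) pairs.
observations = [
    ("192.0.2.10", "fp-a"),
    ("192.0.2.10", "fp-a"),   # repeat session from the same client
    ("192.0.2.10", "fp-b"),
    ("192.0.2.10", "fp-c"),
    ("192.0.2.20", "fp-a"),
]
counts = fingerprints_per_ip(observations)
```

An address with many distinct fingerprints may indicate several applications
on one machine or several machines behind a NAT device. The count alone cannot
distinguish the two cases.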

Of course, the TLS fingerprinting technique is not foolproof, and there do
exist techniques to counter it. The natural response, for an HTTPS client, is
to modify its own TLS fingerprint in order to subvert this form of
identification. While possible, there are several complexities inherent to this
idea. To avoid being identified by an existing fingerprint, the initial
handshake message that the HTTPS client sends must be modified, which by
necessity entails artificially choosing to support, or not support, many cipher
suites and other features of encryption, communication and compression. Doing
this can lower the security of communication between the HTTPS client and
server by introducing the requirement to support different, potentially weaker
or less efficient, options. Even worse, if a specific server has strict
communication requirements, these changes might prevent the HTTPS client from
exchanging any information with it, or necessitate a change in the server’s
configuration as well. Another response for an HTTPS client is to use a proxy
when connecting. This causes the technique to detect the extensions, protocol
versions, and cipher suites of the proxy instead of those of the true HTTPS
client. This method is only a stop-gap solution, however, as any given proxy
can still be fingerprinted, identified, and its traffic marked accordingly or
refused.
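
As one concrete illustration of a client changing what it offers, Python's
standard ssl module lets a program restrict the cipher suites in its handshake.
The sketch below compares a default context with a restricted one. The cipher
string "ECDHE+AESGCM" is an arbitrary choice for the example, and the exact
lists returned depend on the local OpenSSL build.

```python
import ssl

def offered_cipher_names(ctx: ssl.SSLContext) -> list:
    """List the names of the cipher suites enabled on a context."""
    return [cipher["name"] for cipher in ctx.get_ciphers()]

default_ctx = ssl.create_default_context()

restricted_ctx = ssl.create_default_context()
# Restrict the TLS 1.2-and-earlier suites. OpenSSL configures TLS 1.3 suites
# separately, so set_ciphers() leaves those in place.
restricted_ctx.set_ciphers("ECDHE+AESGCM")

default_names = offered_cipher_names(default_ctx)
restricted_names = offered_cipher_names(restricted_ctx)
```

A client using the restricted context would present a different cipher list,
and so a different fingerprint, at the cost of being unable to connect to any
server that supports none of the remaining suites.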

In conclusion, as the variety of communication avenues grows, HTTPS
client/server communication will continue to rely on TLS to provide a swift
method of reasonable security and privacy via cryptographic techniques. TLS
fingerprinting offers a quick and resource-cheap method of determining which
cipher suites clients offer, and thus which client software is present,
allowing network administrators and security professionals to apply defensive
strategies and communication filtering more precisely. TLS fingerprinting also
enhances network traffic analysis by providing an economical, network-based
form of identification of HTTPS clients. The technique is lightweight, not
limited in the scope of its deployment, and does not violate the
confidentiality, integrity, or availability of a client’s data, making it an
excellent candidate for implementation even in networks handling traffic
composed of sensitive material.