Honeypots are an exciting new technology with enormous potential for the security
community. The concepts were first introduced by several icons in computer security,
specifically Cliff Stoll in the book 'The Cuckoo's Egg' and Bill Cheswick in the
paper 'An Evening with Berferd.' Since then, honeypots have continued to evolve,
developing into the powerful security tools they are today. The purpose of this
paper is to explain exactly what honeypots are, their advantages and
disadvantages, and their value to security.
Definitions
The first step to understanding honeypots is defining what a honeypot is. This
can be harder than it sounds. Unlike firewalls or Intrusion Detection Systems,
honeypots do not solve a specific problem. Instead, they are a highly flexible
tool that comes in many shapes and sizes. They can do everything from detecting
encrypted attacks in IPv6 networks to capturing the latest in on-line credit
card fraud. It is this flexibility that gives honeypots their true power. It
is also this flexibility that can make them challenging to define and understand.
As such, I use the following definition of a honeypot.
A honeypot is an information system resource whose value
lies in unauthorized or illicit use of that resource.
This is a general definition covering all the different manifestations of
honeypots. In this paper we will discuss different examples of honeypots
and their value to security. All fall under the definition above:
their value lies in the bad guys interacting with them. Conceptually, almost all
honeypots work the same way. They are a resource that has no authorized activity
and no production value. Theoretically, a honeypot should see no
traffic because it has no legitimate activity. This means any interaction with a
honeypot is most likely unauthorized or malicious activity. Any connection
attempt to a honeypot is most likely a probe, attack, or compromise. While
this concept sounds very simple (and it is), it is this very simplicity that
gives honeypots their tremendous advantages (and disadvantages). I highlight
these below.
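To make the idea concrete, here is a minimal sketch of that rule in Python. The honeypot address range, the sample connection records, and the function name are all hypothetical, chosen only for illustration. The point is that no signatures or heuristics are needed: a connection is flagged simply because its destination is a honeypot.

```python
from ipaddress import ip_address, ip_network

# Hypothetical address range set aside for honeypots; no production host lives here.
HONEYPOT_NET = ip_network("192.0.2.0/28")

def flag_honeypot_hits(connections):
    """Return every (src, dst, port) record whose destination is a honeypot address."""
    return [c for c in connections if ip_address(c[1]) in HONEYPOT_NET]

# Made-up connection records for illustration only.
log = [
    ("198.51.100.7", "192.0.2.5", 21),     # honeypot address: suspicious by definition
    ("198.51.100.7", "203.0.113.10", 80),  # production web server: not flagged here
    ("198.51.100.9", "192.0.2.14", 445),   # honeypot address
]
alerts = flag_honeypot_hits(log)
```

Only the two records aimed at the honeypot range end up in `alerts`, which is the 'small data sets of high value' property discussed in the advantages.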
Advantages: Honeypots are a tremendously simple concept, which gives
them some very powerful strengths.
- Small data sets of high value: Honeypots collect small amounts of
information. Instead of logging one GB of data a day, they can log only one
MB of data a day. Instead of generating 10,000 alerts a day, they can generate
only 10 alerts a day. Remember, honeypots only capture bad activity; any
interaction with a honeypot is most likely unauthorized or malicious activity.
As such, honeypots reduce 'noise' by collecting only small data sets, but
information of high value, as it comes only from the bad guys. This means it is much
easier (and cheaper) to analyze the data a honeypot collects and derive value
from it.
- New tools and tactics: Honeypots are designed to capture anything
thrown at them, including tools or tactics never seen before.
- Minimal resources: Honeypots require minimal resources because they only
capture bad activity. This means an old Pentium computer with 128MB of RAM can
easily handle an entire class B network sitting off an OC-12 network.
- Encryption or IPv6: Unlike most security technologies (such as IDS
systems) honeypots work fine in encrypted or IPv6 environments. It does not
matter what the bad guys throw at a honeypot, the honeypot will detect and
capture it.
- Information: Honeypots can collect in-depth information that few,
if any, other technologies can match.
- Simplicity: Finally, honeypots are conceptually very simple. There
are no fancy algorithms to develop, state tables to maintain, or signatures to
update. The simpler a technology, the less likely there will be mistakes or
misconfigurations.
Disadvantages: Like any technology,
honeypots also have their weaknesses. It is because of this that they do not replace
any current technology, but work with existing technologies.
- Limited view: Honeypots can only track and capture activity that
directly interacts with them. Honeypots will not capture attacks against other
systems, unless the attacker or threat interacts with the honeypots also.
- Risk: All security technologies have risk. Firewalls have risk of
being penetrated, encryption has the risk of being broken, IDS sensors have
the risk of failing to detect attacks. Honeypots are no different; they have
risk also. Specifically, honeypots have the risk of being taken over by the
bad guy and being used to harm other systems. This risk varies for different
honeypots. Depending on the type of honeypot, it can have no more risk than an
IDS sensor, while some honeypots have a great deal of risk. We identify which
honeypots have what levels of risk later in the paper.
It is how you leverage these advantages and disadvantages that defines the
value of your honeypot (which we discuss later).
Types of Honeypots
Honeypots come in many shapes and sizes, making them difficult to get a grasp
of. To help us better understand honeypots and all the different types, we break
them down into two general categories, low-interaction and high-interaction
honeypots. These categories help us understand what type of honeypot we are
dealing with, its strengths, and its weaknesses. Interaction defines the level of
activity a honeypot allows an attacker. Low-interaction honeypots have limited
interaction; they normally work by emulating services and operating systems.
Attacker activity is limited to the level of emulation by the honeypot. For
example, an emulated FTP service listening on port 21 may just emulate an FTP
login, or it may support a variety of additional FTP commands. The advantage
of low-interaction honeypots is their simplicity. These honeypots tend to be
easier to deploy and maintain, with minimal risk. Usually they involve installing
software, selecting the operating systems and services you want to emulate and
monitor, and letting the honeypot go from there. This plug and play approach
makes deploying them very easy for most organizations. Also, the emulated services
mitigate risk by containing the attacker's activity, the attacker never has
access to an operating system to attack or harm others. The main disadvantages
of low-interaction honeypots are that they log only limited information and
are designed to capture known activity. The emulated services can only do so
much. Also, it is easier for an attacker to detect a low-interaction honeypot;
no matter how good the emulation is, a skilled attacker can eventually detect
its presence. Examples of low-interaction honeypots include Specter, Honeyd,
and KFSensor.
High-interaction honeypots are different: they are usually complex solutions
as they involve real operating systems and applications. Nothing is emulated;
we give attackers the real thing. If you want a Linux honeypot running an FTP
server, you build a real Linux system running a real FTP server. The advantages
of such a solution are twofold. First, you can capture extensive amounts
of information. By giving attackers real systems to interact with, you can learn
the full extent of their behavior, everything from new rootkits to international
IRC sessions. The second advantage is that high-interaction honeypots make no
assumptions about how an attacker will behave. Instead, they provide an open
environment that captures all activity. This allows high-interaction solutions
to learn behavior we would not expect. An excellent example of this is how a
Honeynet captured encoded back door commands on a non-standard IP protocol
(specifically IP protocol 11, Network Voice Protocol). However, this also
increases the risk of the honeypot, as attackers can use these real operating
systems to attack non-honeypot systems. As a result, additional technologies
have to be implemented that prevent the attacker from harming other non-honeypot
systems. In general, high-interaction honeypots can do everything
low-interaction honeypots can do and much more. However, they can be more
complex to deploy and maintain. Examples of high-interaction honeypots include
Symantec Decoy Server and Honeynets.
You can find a complete listing of both low- and high-interaction honeypots on
the Honeypot Solutions page. To better understand both low- and high-interaction
honeypots, let's look at two examples. We will start with the low-interaction
honeypot Honeyd.
Honeyd: Low-interaction honeypot
Honeyd is a low-interaction honeypot. Developed by Niels Provos, Honeyd is
open source and designed to run
primarily on Unix systems (though it has been ported to Windows). Honeyd works
on the concept of monitoring unused IP space. Anytime it sees a connection attempt
to an unused IP, it intercepts the connection and then interacts with the attacker,
pretending to be the victim. By default, Honeyd detects and logs any connection
to any UDP or TCP port. In addition, you can configure emulated services to
monitor specific ports, such as an emulated FTP server monitoring TCP port 21.
When an attacker connects to the emulated service, not only does the honeypot
detect and log the activity, but it captures all of the attacker's interaction
with the emulated service. In the case of the emulated FTP server, we can potentially
capture the attacker's login and password, the commands they issue, and perhaps
even learn what they are looking for or their identity. It all depends on the
level of emulation by the honeypot. Most emulated services work the same way.
They expect a specific type of behavior, and then are programmed to react in
a predetermined way. If attack A does this, then react this way. If attack B
does this, then respond this way. The limitation is if the attacker does something
that the emulation does not expect, then it does not know how to respond. Most
low-interaction honeypots, including Honeyd, simply generate an error message.
You can see what commands the emulated FTP server for Honeyd supports by reviewing the
source code.
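The 'if the attacker does this, respond that way' logic described above can be sketched in a few lines of Python. This is not Honeyd's FTP script; the function name, the session dictionary, and the exact reply strings are assumptions made for illustration. It shows the two properties discussed: everything the attacker types is captured, and anything the emulation does not expect falls through to a generic error.

```python
def emulated_ftp_reply(command_line, session):
    """Return a canned FTP reply for one client line and record what was captured."""
    parts = command_line.strip().split(" ", 1)
    verb = parts[0].upper()
    arg = parts[1] if len(parts) > 1 else ""
    # Capture everything the attacker sends, whether or not we understand it.
    session.setdefault("log", []).append((verb, arg))
    if verb == "USER":
        session["user"] = arg
        return "331 Password required."
    if verb == "PASS":
        session["password"] = arg
        return "530 Login incorrect."
    if verb == "QUIT":
        return "221 Goodbye."
    # Anything the emulation does not expect gets a generic error message.
    return "500 Command not understood."
```

Feeding it `USER root`, then `PASS toor`, then an unsupported command such as `SITE EXEC x` records the login and password in the session and answers the last line with the generic error. Supporting more commands means adding more branches, which is exactly the limit on emulation described above.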
Some honeypots, such as Honeyd, can not only emulate services, but emulate
actual operating systems. In other words, Honeyd can appear to the attacker
to be a Cisco router, WinXP webserver, or Linux DNS server. There are several
advantages to emulating different operating systems. First, the honeypot can
better blend in with existing networks if the honeypot has the same appearance
and behavior of production systems. Second, you can target specific attackers
by providing systems and services they often target, or systems and services
you want to learn about. There are two elements to emulating operating systems.
The first is with the emulated services. When an attacker connects to an emulated
service, you can have that service behave like and appear to be a specific OS.
For example, if you have a service emulating a webserver, and you want your
honeypot to appear to be a Win2000 server, then you would emulate the behavior
of an IIS webserver. For Linux, you would emulate the behavior of an Apache webserver.
Most honeypots emulate operating systems in this manner. Some sophisticated honeypots take
this emulation one step further (as Honeyd does). Not only do they emulate at
the service level, but also at the IP stack level. If someone uses active fingerprinting
measures to determine the OS type of your honeypot, most honeypots respond with
the IP stack of whatever OS the honeypot is installed on. Honeyd spoofs the replies,
making not only the emulated services, but also the emulated IP stacks, behave as the
operating systems would. The level of emulation and sophistication depends on
what honeypot technology you choose to use.
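As a rough illustration of the service-level half of this, the sketch below picks the web server identity that matches the operating system being emulated. The personality names, the version strings, and the function are hypothetical and are not taken from any honeypot product. Note that it does nothing at the IP stack level, so active fingerprinting would still reveal the real host OS.

```python
# Hypothetical personalities: emulated OS mapped to the HTTP Server header it presents.
PERSONALITIES = {
    "win2000": "Microsoft-IIS/5.0",
    "linux": "Apache/1.3.27 (Unix)",
}

def http_response_head(personality):
    """Build a minimal HTTP response head whose Server header matches the emulated OS."""
    server = PERSONALITIES[personality]
    return f"HTTP/1.1 200 OK\r\nServer: {server}\r\nContent-Length: 0\r\n\r\n"
```

Calling `http_response_head("win2000")` yields a response that claims to come from IIS, while `http_response_head("linux")` claims Apache.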
Honeynets: High-interaction honeypot
Honeynets are a prime example
of high-interaction honeypots. Honeynets are not a product; they are not a software
solution that you install on a computer. Instead, Honeynets are an architecture,
an entire network of computers designed to be attacked. The idea is to have an
architecture that creates a highly controlled network, one where all activity
is controlled and captured. Within this network we place our intended victims,
real computers running real applications. The bad guys find, attack, and break
into these systems on their own initiative. When they do, they do not realize
they are within a Honeynet. All of their activity, from encrypted SSH sessions
to emails and file uploads, is captured without them knowing it. This is done
by inserting kernel modules on the victim systems that capture all of the attacker's
actions. At the same time, the Honeynet controls the attacker's activity. Honeynets
do this using a Honeywall gateway. This gateway allows inbound traffic to the
victim systems, but controls the outbound traffic using intrusion prevention
technologies. This gives the attacker the flexibility to interact with the victim
systems, but prevents the attacker from harming other non-Honeynet computers.
An example of such a deployment can be seen in Figure 1.
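The asymmetry of the gateway (inbound allowed, outbound controlled) can be sketched as follows. This is only an assumption about one simple form of outbound control, a cap on the number of outbound connections per time window; it is not the Honeywall's actual implementation, and the class name and parameters are hypothetical.

```python
class OutboundLimiter:
    """Allow all inbound traffic, but cap outbound connections per time window."""

    def __init__(self, max_outbound, window_seconds):
        self.max_outbound = max_outbound
        self.window_seconds = window_seconds
        self.outbound_times = []  # timestamps of outbound connections already allowed

    def allow(self, direction, now):
        if direction == "inbound":
            return True  # attackers must be able to reach the victim systems
        # Forget outbound connections that have aged out of the window.
        self.outbound_times = [
            t for t in self.outbound_times if now - t < self.window_seconds
        ]
        if len(self.outbound_times) >= self.max_outbound:
            return False  # limit reached: drop to protect non-Honeynet systems
        self.outbound_times.append(now)
        return True
```

With `OutboundLimiter(2, 3600)`, an attacker on a victim system gets two outbound connections per hour, enough to download a toolkit, but a third attempt inside that hour is refused. Choosing the limit is a trade-off: a higher value lets you learn more, while a lower value reduces risk.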
Value of Honeypots
Now that we have an understanding of the two general categories of honeypots, we can
focus on their value, specifically how we can use honeypots. Once again, we
have two general categories: honeypots can be used for production purposes or
research. When used for production purposes, honeypots are protecting an organization.
This would include preventing, detecting, or helping organizations respond to
an attack. When used for research purposes, honeypots are being used to collect
information. This information has different value to different organizations.
Some may want to study trends in attacker activity, while others are interested
in early warning and prediction, or law enforcement. In general, low-interaction
honeypots are often used for production purposes, while high-interaction honeypots
are used for research purposes. However, either type of honeypot can be used
for either purpose. When used for production purposes, honeypots can protect
organizations in one of three ways: prevention, detection, and response. We
will take a more in-depth look at how a honeypot can work in all three.
Honeypots can help prevent attacks in several ways. The first is against automated
attacks, such as worms or auto-rooters. These attacks are based on tools that
randomly scan entire networks looking for vulnerable systems. If vulnerable
systems are found, these automated tools will then attack and take over the
system (with worms self-replicating, copying themselves to the victim). One
way that honeypots can help defend against such attacks is slowing their scanning
down, potentially even stopping them. Called sticky honeypots, these solutions
monitor unused IP space. When probed by such scanning activity, these honeypots
interact with and slow the attacker down. They do this using a variety of TCP
tricks, such as a TCP window size of zero, putting the attacker into a holding
pattern. This is excellent for slowing down or preventing the spread of a worm
that has penetrated your internal organization. One such example of a sticky
honeypot is LaBrea Tarpit. Sticky
honeypots are most often low-interaction solutions (you can almost call them
'no-interaction solutions', as they slow the attacker down to a crawl). Honeypots
can also protect your organization from human attackers. The concept is deception
or deterrence. The idea is to confuse an attacker, to make him waste his time
and resources interacting with honeypots. Meanwhile, your organization has detected
the attacker's activity and has the time to respond and stop the attacker.
This can even be taken one step further. If an attacker knows your organization
is using honeypots, but does not know which systems are honeypots and which
systems are legitimate computers, they may be concerned about being caught by
honeypots and decide not to attack your organization. Thus the honeypot deters
the attacker. An example of a honeypot designed to do this is Deception Toolkit,
a low-interaction honeypot.
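To show what the zero window trick used by sticky honeypots looks like on the wire, here is a sketch that packs a bare TCP header advertising a window of zero. It is not LaBrea's code; the function name is hypothetical, the checksum is deliberately left unset, and actually sending such a packet would require raw sockets and a correct checksum, which are out of scope here.

```python
import struct

def zero_window_synack(src_port, dst_port, seq, ack):
    """Pack a 20-byte TCP SYN+ACK header that advertises a receive window of zero."""
    data_offset = 5 << 4  # header length of five 32-bit words, no options
    flags = 0x12          # SYN (0x02) + ACK (0x10)
    window = 0            # tells the peer it may not send any data yet
    checksum = 0          # left unset in this sketch; a real sender must compute it
    urgent = 0
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port, seq, ack,
        data_offset, flags, window, checksum, urgent,
    )
```

A scanner that completes the handshake against such a reply has an open connection but no permission to send data, so it sits in the holding pattern described above until it gives up.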
The second way honeypots can help protect an organization is through
detection. Detection is critical; its purpose is to identify a failure or
breakdown in prevention. Regardless of how secure an organization is, there will
always be failures, if for no other reason than that humans are involved in the
process. By detecting an attacker, you can quickly react to them, stopping or
mitigating the damage they do. Traditionally, detection has proven extremely
difficult to do. Technologies such as IDS sensors and system logs have proven
ineffective for several reasons: they generate far too much data, produce a
large percentage of false positives, cannot detect new attacks, and cannot
work in encrypted or IPv6 environments. Honeypots excel at
detection, addressing many of these problems of traditional detection. Honeypots
reduce false positives by capturing small data sets of high value, capture
unknown attacks such as new exploits or polymorphic shellcode, and work in
encrypted and IPv6 environments. You can learn more about this in the paper
'Honeypots: Simple, Cost Effective Detection'. In general, low-interaction
honeypots make the best solutions for detection. They are easier to deploy and
maintain than high-interaction honeypots and have reduced risk.
The third and final way a honeypot can help protect an organization is in
response. Once an organization has detected a failure, how do they respond? This
can often be one of the greatest challenges an organization faces. There is
often little information on who the attacker is, how they got in, or how much
damage they have done. In these situations detailed information on the
attacker's activity is critical. There are two problems compounding incident
response. First, often the very systems compromised cannot be taken offline to
analyze. Production systems, such as an organization's mail server, are so
critical that even after one has been hacked, security professionals may not be
able to take the system down and do a proper forensic analysis. Instead, they
are limited to analyzing the live system while it still provides production
services. This cripples the ability to analyze what happened, how much damage the
attacker has done, and even whether the attacker has broken into other systems. The
other problem is that even if the system is pulled offline, there is so much data
pollution it can be very difficult to determine what the bad guy did. By data
pollution, I mean there has been so much activity (users logging in, mail
accounts read, files written to databases, etc.) that it can be difficult to determine
what is normal day-to-day activity, and what is the attacker. Honeypots can help
address both problems. Honeypots make an excellent incident response tool, as
they can quickly and easily be taken offline for a full forensic analysis,
without impacting day-to-day business operations. Also, the only activity a
honeypot captures is unauthorized or malicious activity. This makes hacked
honeypots much easier to analyze than hacked production systems, as any data you
retrieve from a honeypot is most likely related to the attacker. The value
honeypots provide here is quickly giving organizations the in-depth information
they need to rapidly and effectively respond to an incident. In general,
high-interaction honeypots make the best solution for response. To respond to an
intruder, you need in-depth knowledge on what they did, how they broke in, and
the tools they used. For that type of data you most likely need the capabilities
of a high-interaction honeypot.
Up to this point we have been talking about how honeypots can be used to protect
an organization. We will now talk about a different use for honeypots, research.
Honeypots are extremely powerful: not only can they be used to protect your
organization, but they can be used to gain extensive information on threats,
information few other technologies are capable of gathering. One of the greatest
problems security professionals face is a lack of information or intelligence
on cyber threats. How can we defend against an enemy when we don't even know
who that enemy is? For centuries military organizations have depended on information
to better understand who their enemy is and how to defend against them. Why
should information security be any different? Research honeypots address this
by collecting information on threats. This information can then be used for
a variety of purposes, including trend analysis, identifying new tools or methods,
identifying attackers and their communities, early warning and prediction, or
understanding motivations. One of the best-known examples of using honeypots for research
is the work done by the Honeynet Project, an all-volunteer, non-profit
security research organization. All of the data they collect comes from Honeynets
distributed around the world. As threats are constantly changing, this information
is proving more and more critical.
Getting Started
If you have never worked with honeypots before and want to learn more, I recommend
starting with simple low-interaction honeypots, such as KFSensor or Specter
for Windows users, or Honeyd for Unix users. There is even a Honeyd Linux Toolkit
for easy deployment of Honeyd on Linux computers. Low-interaction
honeypots have the advantage of being easier to deploy and carrying little risk, as they
contain the activity of the attacker. Once you have had an opportunity to work
with low-interaction solutions, you can take the skills and understanding you
have developed and work with high-interaction solutions. To help you better
understand honeypots, below is a chart summarizing what we just covered.
Low-interaction: Solution emulates operating systems and services.
- Easy to install and deploy. Usually requires simply installing and
configuring software on a computer.
- Minimal risk, as the emulated services control what attackers can
and cannot do.
- Captures limited amounts of information, mainly transactional data
and some limited interaction.
High-interaction: No emulation, real operating systems and services are provided.
- Can capture far more information, including new tools,
communications, or attacker keystrokes.
- Can be complex to install or deploy (commercial versions tend to be
much simpler).
- Increased risk, as attackers are provided real operating systems to
interact with.
Finally, no paper on honeypots would be complete without a discussion about
legal issues. There are many misconceptions about the legal issues of honeypots.
Instead of briefly covering the legal issues in this paper, I will be releasing
a new paper at the end of May, 2003 dedicated to the legal issues of honeypot
technologies.
Conclusion
The purpose of this paper was to define what honeypots are and their value
to the security community. We identified two different types of honeypots, low-interaction
and high-interaction honeypots. Interaction defines how much activity a honeypot
allows an attacker. The value of these solutions is for both production and research
purposes. Honeypots can be used for production purposes by preventing, detecting,
or responding to attacks. Honeypots can also be used for research, gathering
information on threats so we can better understand and defend against them.
If you are interested in learning more about honeypots, you may want to consider
the book 'Honeypots: Tracking Hackers', the first and only book dedicated to
honeypot technologies.