IP Traceback: Information Security Technical Update

/* The following article is extracted from the "Information Security Newsletter" published by the JUCC IS Task Force. */ 
A Distributed Denial of Service (DDoS) attack is one of the most menacing security threats on the Internet. To stop such attacks, their real source must be identified. IP traceback is the process of tracing IP packets back through Internet traffic to their origin.
The Internet Protocol
Structure of an IP packet

IP (Internet Protocol) is the primary protocol of the Internet communication standards. It delivers packets from the source host to the destination host based on the information carried in the packet header.
An IP packet consists of a header and a payload. The header carries the source IP address, the destination IP address and other metadata required to route and deliver the packet. Although the source IP address is stored in the header, it can be spoofed by exploiting security loopholes.
Vulnerabilities of the protocol
The TCP/IP protocol suite was designed to send data reliably, but it does not secure the process [1]. In fact, the authenticity of the source address carried in IP packets is never checked by the network routing infrastructure. Thus, a motivated attacker can easily launch a Denial of Service (DoS) attack. These attacks mainly rely on forged IP addresses, also known as source address spoofing.
Source address spoofing
In an IP spoofing attack, an intruder uses a forged source IP address and establishes a one-way connection in order to execute malicious code at the remote host [2]. This technique is usually used for DoS attacks, especially SYN flood attacks. A SYN flood is a form of DoS in which the attacker sends a succession of SYN requests to a target system in an attempt to consume enough server resources to make the system unresponsive to legitimate traffic [3].
Every TCP connection starts with a three-way handshake. The client sends a synchronisation segment (SYN) to the server, which acknowledges it by sending back a SYN-ACK and then waits for the client to send an ACK.
In an attack, the SYN-ACK is sent to a spoofed IP address, so the ACK never arrives. The server's resources stay tied up in half-open connections, degrading the service for legitimate users.
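The effect of spoofed SYNs on a server can be illustrated with a small simulation. The sketch below is not a real TCP stack: the HalfOpenTable class, its backlog size and the addresses are hypothetical names chosen for illustration, and real stacks also time out half-open entries, which is omitted here.

```python
class HalfOpenTable:
    """Toy model of a server's queue of half-open (SYN received) connections."""

    def __init__(self, backlog):
        self.backlog = backlog   # maximum number of half-open entries (assumed value)
        self.pending = set()     # sources that were sent a SYN-ACK and owe an ACK

    def on_syn(self, src):
        """Handle a SYN; return True if a SYN-ACK is sent, False if it is dropped."""
        if len(self.pending) >= self.backlog:
            return False         # queue full: the request is dropped
        self.pending.add(src)
        return True

    def on_ack(self, src):
        """Handle the final ACK; the connection leaves the half-open queue."""
        if src in self.pending:
            self.pending.remove(src)
            return True
        return False


server = HalfOpenTable(backlog=4)

# A legitimate client completes the handshake, which frees its slot.
server.on_syn("198.51.100.7")
server.on_ack("198.51.100.7")

# Spoofed SYNs: the SYN-ACKs go to forged addresses, so no ACK ever comes back.
for i in range(4):
    server.on_syn(f"203.0.113.{i}")

# The queue is now full, so the next legitimate SYN is dropped.
legit_accepted = server.on_syn("198.51.100.8")
print(legit_accepted)  # False
```

The legitimate client is refused not because the server is busy with real work, but because every slot is waiting for an ACK that will never arrive.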
(Distributed) Denial of Service attacks
A DoS attack disables network services for legitimate users. A distributed attack starts when computers are infected with malware and recruited into a botnet. These machines become the compromised hosts. There are two kinds of compromised hosts:
The Zombies: They launch the SYN flooding attack through uncompromised machines that communicate with the victim’s machine through the three-way handshake protocol. These uncompromised machines are called “reflectors”.
The Stepping Stones: They act as intermediate nodes between the attacker and the zombies to make it harder to discover the attacker.
Forging an IP address is easy, especially with the Python tool Scapy [4]. Scapy is a powerful interactive packet manipulation program. It can forge or decode packets of a wide range of protocols, send them on the wire, capture them, and match requests and replies.
The intended receiver can use Wireshark to analyse the received packets and inspect the information in the forged packet.
Because TCP/IP does not check the source address, the source address shown in Wireshark is not the true source.
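Why a receiver cannot trust the source field can be seen from the header layout alone. The following sketch uses only the Python standard library rather than Scapy, builds a 20-byte IPv4 header in memory and reads the source address back; nothing is sent on the wire. The helper names build_ipv4_header and parse_source are hypothetical, the addresses come from documentation ranges, and the checksum is left at zero for brevity.

```python
import socket
import struct


def build_ipv4_header(src, dst, payload_len=0, ttl=64, proto=socket.IPPROTO_TCP):
    """Pack a minimal 20-byte IPv4 header (checksum left at 0 for brevity)."""
    version_ihl = (4 << 4) | 5           # IPv4, header length of five 32-bit words
    total_length = 20 + payload_len
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,    # version/IHL, type of service, total length
        0, 0,                            # identification, flags and fragment offset
        ttl, proto, 0,                   # TTL, protocol, header checksum (not computed)
        socket.inet_aton(src),           # source address: whatever the sender writes
        socket.inet_aton(dst),           # destination address
    )


def parse_source(header):
    """Return the source address exactly as recorded in bytes 12 to 15."""
    return socket.inet_ntoa(header[12:16])


header = build_ipv4_header(src="203.0.113.99", dst="192.0.2.10")
print(parse_source(header))  # 203.0.113.99
```

The parser, like a packet analyser, can only report the four bytes the sender chose to write; nothing in the header ties them to the machine that actually emitted the packet.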
DDoS attack: Expensive Damages
Denial of Service attacks are among the three most expensive types of cyber-attack. Together with malicious insider attacks and web-based attacks, they account for 55% of all cybercrime costs per year.
The estimated cost of an attack is $40,000 per hour, and 49% of DDoS attacks last between 6 and 24 hours, according to an Incapsula survey [6]. The average cost of a DDoS attack is thus about $500,000, with some exceptionally expensive cases.
However, the damage is not only financial: loss of customer trust, virus/malware infection and loss of intellectual property are other consequences of a DDoS attack.
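The cost figures quoted above can be cross-checked with simple arithmetic. The article does not say how the $500,000 average was derived, so the sketch below only shows that it is consistent with the hourly cost and the 6 to 24 hour duration range; the variable names are illustrative.

```python
HOURLY_COST = 40_000                  # dollars per hour, as quoted in the text
SHORT_ATTACK_H, LONG_ATTACK_H = 6, 24 # duration range of 49% of attacks

low = HOURLY_COST * SHORT_ATTACK_H    # cost of a 6-hour attack
high = HOURLY_COST * LONG_ATTACK_H    # cost of a 24-hour attack

# Duration at which the hourly rate gives the quoted $500,000 average.
hours_for_average = 500_000 / HOURLY_COST

print(low, high, hours_for_average)   # 240000 960000 12.5
```

The quoted average therefore corresponds to an attack of about 12.5 hours, which falls inside the 6 to 24 hour range.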
IP Traceback

DDoS attacks are a growing concern because they target a broad range of industries, from e-commerce to financial institutions, and can cause significant financial loss through unavailability of service. Preventive measures against these attacks are available, but identifying the source of an attack and preventing any recurrence are also crucial to good cyber security practice.

One way to achieve IP traceback is hop-by-hop link testing. When an attack is launched, the network administrator logs into the router closest to the victim and analyses the packet flow to determine where the malicious packets come from. This identifies the next upstream router, and the process is repeated there. The major drawbacks of this simple method are that it requires strong interoperability between routers and that the attack must still be in progress while the tracing takes place.
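
Hop-by-hop link testing can be sketched as a loop that walks upstream one router at a time. This is a simplified model under stated assumptions: the topology is a plain dictionary without cycles, there is a single attack source, and carries_attack is a hypothetical callback standing in for the administrator inspecting live traffic on a router.

```python
def link_test_traceback(victim_router, upstream, carries_attack):
    """Walk upstream from the router closest to the victim, one hop at a time.

    upstream:       maps a router to the routers directly upstream of it
    carries_attack: callable(router) -> True while attack packets are seen there
    """
    path = [victim_router]
    current = victim_router
    while True:
        suspects = [r for r in upstream.get(current, []) if carries_attack(r)]
        if not suspects:
            return path            # no upstream router shows the attack flow
        current = suspects[0]      # single-source assumption: follow the first match
        path.append(current)


upstream = {"R1": ["R2", "R3"], "R2": ["R4"], "R3": [], "R4": []}
attack_routers = {"R1", "R2", "R4"}

path = link_test_traceback("R1", upstream, lambda r: r in attack_routers)
print(path)  # ['R1', 'R2', 'R4']

# Once the attack has stopped, no router reports the flow and the trace stalls.
stalled = link_test_traceback("R1", upstream, lambda r: False)
print(stalled)  # ['R1']
```

The second call shows the main drawback in practice: the trace gets no further than the first router if the attack is no longer in progress.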

Comparison of Existing IP Traceback Techniques 

IP traceback techniques can be classified as pro-active or reactive.

A pro-active approach records tracing information as packets travel through the network, so that the source can be located after the attack by examining the record files and logs of the network.

A reactive approach locates the attacker on the fly, once the attack has been detected by specialised hardware.

The comparison of traceback techniques focuses on three illustrative methods, each belonging to a different class of IP traceback techniques. These techniques are still at the research stage and are not yet commercially available. The Source Path Isolation Engine (SPIE, or hash-based) algorithm is an in-band pro-active technique. iCaddie is an evolution of the out-of-band ICMP traceback technique. The third is the reactive IDIP mechanism.


Source Path Isolation Engine (SPIE) 

SPIE, or hash-based IP traceback, is used to trace the origin of a single packet. The system was proposed by Snoeren et al. [5]. It is a packet logging technique, which means that it stores packet digests at certain key routers. The main issue with packet logging is that storing packet data requires a lot of memory. SPIE is highly storage-efficient and reduces the memory requirement to about 0.5% of the link capacity per unit time. Instead of storing the packets themselves, it uses auditing techniques: it computes and stores 32-bit packet digests. Storing these digests also requires an efficient data structure, for which SPIE uses Bloom filters.
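
A Bloom filter answers the question "has this item been seen?" using a fixed-size bit array instead of storing the items. The sketch below is a generic Bloom filter, not SPIE's actual implementation: the sizes, the use of SHA-256 to derive bit positions and the class name are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size set membership structure: false positives possible, no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a counter.
        for i in range(self.num_hashes):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:4], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
seen.add(b"packet-A")
print(b"packet-A" in seen)  # True
```

A recorded item is always found, while an item that was never recorded is reported as present only with a small probability that depends on the filter size and how full it is. This is the trade-off that lets a router remember traffic in a small, fixed amount of memory.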

Another important issue with packet logs is the risk of eavesdropping. Because SPIE stores only packet digests and not entire packets, the logs cannot be misused by attackers to read traffic. The network is therefore protected from eavesdropping, which is one of the criteria of an effective IP traceback system.

There are two options for determining the route of a packet flow. The first is to audit the flow while it passes through the network; the second is to infer the route from its impact on the state of the network. Both become harder as the size of the packet flow decreases. In particular, the second becomes impossible because small flows have no detectable impact on the network. SPIE therefore uses the audit option.

SPIE is also called hash-based IP traceback because each router stores a hash of the invariant fields of the IP header as a 32-bit digest. The digest is kept only for a limited time because of space constraints.
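
The idea of hashing only the invariant fields can be sketched as follows. Two assumptions are made here that the article does not state: the fields treated as variable are the TTL and the header checksum, which change at every hop, and CRC-32 is used as a stand-in 32-bit hash, since the hash functions actually used by SPIE are not given. The helper names and checksum values are placeholders.

```python
import socket
import struct
import zlib


def make_header(src, dst, ttl, checksum):
    """Pack a minimal 20-byte IPv4 header with the given TTL and checksum."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, ttl, socket.IPPROTO_TCP, checksum,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def packet_digest(ip_header):
    """32-bit digest over the header with the hop-variant fields zeroed (assumed mask)."""
    masked = bytearray(ip_header[:20])
    masked[8] = 0                  # TTL is decremented at every hop
    masked[10:12] = b"\x00\x00"    # header checksum is recomputed at every hop
    return zlib.crc32(bytes(masked)) & 0xFFFFFFFF


# The same packet as seen by two consecutive routers (placeholder checksums).
at_hop_1 = make_header("203.0.113.99", "192.0.2.10", ttl=64, checksum=0x1234)
at_hop_2 = make_header("203.0.113.99", "192.0.2.10", ttl=63, checksum=0x1335)

print(packet_digest(at_hop_1) == packet_digest(at_hop_2))  # True
```

Because the fields that routers rewrite are masked out, every router on the path computes the same digest for the same packet, which is what allows the digest to be matched from one router to the next.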

Packet digests are created by a Data Generation Agent (DGA) at each router. Before a traceback can begin, an attack packet must be identified, which is the job of an intrusion detection system (IDS). The IDS provides the packet, the last-hop router and the time of the attack to the SPIE Traceback Manager (STM), which verifies the authenticity and integrity of the request. Upon successful verification, the STM sends the signature information to the SPIE Collection and Reduction Agent (SCAR) responsible for the victim’s network area. If a match is found, the SCAR returns a partial attack graph of the routers involved. The STM then sends new queries to the SCAR of another region. This process continues until the attack path is constructed. Finally, the STM sends the result back to the IDS.
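
The query process described above can be approximated by a backwards walk over per-router digest tables. The sketch below is a deliberate simplification: it collapses the STM and SCAR roles into a single function, uses plain sets instead of Bloom filters, and ignores time windows and request verification. The function name, router names and digest values are hypothetical.

```python
def build_attack_graph(digest, last_hop, upstream, digest_tables):
    """Return edges (upstream_router, downstream_router) along which the digest was seen."""
    if digest not in digest_tables.get(last_hop, set()):
        return []                          # the last-hop router never saw the packet
    edges = []
    visited = {last_hop}
    frontier = [last_hop]
    while frontier:
        router = frontier.pop()
        for neighbour in upstream.get(router, []):
            if neighbour in visited:
                continue
            if digest in digest_tables.get(neighbour, set()):
                visited.add(neighbour)
                edges.append((neighbour, router))
                frontier.append(neighbour)
    return edges


# Digests each router recorded during the relevant time window (illustrative values).
digest_tables = {
    "R1": {0xAAAA},
    "R2": {0xAAAA, 0xBBBB},
    "R3": {0xBBBB},
    "R4": {0xAAAA},
}
upstream = {"R1": ["R2", "R3"], "R2": ["R4"]}

graph = build_attack_graph(0xAAAA, "R1", upstream, digest_tables)
print(graph)  # [('R2', 'R1'), ('R4', 'R2')]
```

Each step plays the part of one query to the agent responsible for the next region: routers that recorded the digest are added to the graph, and the search continues upstream from them until no further match is found.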

iCaddie ICMP
In out-of-band pro-active schemes, tracing relies on separate packets generated by the routers as the malicious packet traverses them.
The most widely used technique is ICMP Traceback (iTrace). As packets traverse the network, each router generates an ICMP (Internet Control Message Protocol) [7] packet for about one in every 20,000 packets that pass through it. ICMP messages are carried within standard IP packets, so they have the same IP header structure. The iTrace payload contains useful information about the router visited by the IP packets, such as the router’s ID, information about the adjacent routers, a timestamp and the MAC address pair of the link traversed. From this information, the victim can infer the true source of the IP packets. As most DoS attacks are flooding attacks, a sufficient number of trace packets is likely to be generated. This technique does not require any modification of the existing infrastructure. However, it consumes considerable bandwidth and requires a large number of packets to trace back an attacker. It also handles DDoS poorly. In recent years, improvements have addressed the issues of the original scheme [8].
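The claim that flooding attacks generate enough trace packets can be checked with a short calculation. The sketch assumes that each forwarded packet independently triggers a trace message with probability 1/20,000; the text only gives the 20,000 figure, so the probabilistic model and the names below are assumptions.

```python
SAMPLING_INTERVAL = 20_000           # figure quoted in the text
P_TRACE = 1 / SAMPLING_INTERVAL      # assumed per-packet probability of a trace message


def prob_at_least_one_trace(n_packets, p=P_TRACE):
    """Probability that a router emits at least one trace message for n_packets."""
    return 1 - (1 - p) ** n_packets


for n in (1_000, 20_000, 100_000):
    print(n, round(prob_at_least_one_trace(n), 3))
```

Under this model a router has roughly a 63% chance of emitting at least one trace message after 20,000 attack packets and more than a 99% chance after 100,000, whereas a low-volume flow of 1,000 packets is traced less than 5% of the time. This is why the scheme suits flooding attacks but needs a large number of packets.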
One variant of classic ICMP traceback is iCaddie [9]. Its design takes several properties into account:
i. The cost and time required for upgrading network equipment
ii. The extra traffic load due to out-of-band IP traceback techniques
iii. The computational overhead of the attack path construction process
iv. The storage capacity to collect and store information for path construction
Like an iTrace message, the Caddie message is an extra ICMP message generated by a router or an application, called a Caddie initiator. Attached to it is the entire packet history of one randomly selected packet, called a Ball packet, that the router forwards. In other words, while forwarding packets, a router randomly selects one of them as a Ball packet. The Caddie message then collects path information, namely the identities of the sequence of routers (called Caddie propagators) along the way to the Ball packet’s destination.
However, whereas iTrace generates an ICMP traceback message every 20,000 packets, iCaddie works differently. To reduce the number of traceback messages produced, each router maintains a timer that indicates how long it has gone without receiving a traceback message. If this time exceeds a specified threshold, the router starts to act as a Caddie initiator.
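The timer rule can be sketched in a few lines. The class name, the threshold value and the use of explicit timestamps (passed in so that the example is deterministic) are illustrative assumptions; the article does not specify them.

```python
class CaddieRouter:
    """Toy model of the rule deciding when a router becomes a Caddie initiator."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s   # assumed threshold, in seconds
        self.last_seen = 0.0             # time the last traceback message was received

    def on_traceback_message(self, now):
        """Record that a traceback message has just passed through this router."""
        self.last_seen = now

    def should_initiate(self, now):
        """True once the router has gone longer than the threshold without a message."""
        return now - self.last_seen > self.threshold_s


router = CaddieRouter(threshold_s=5.0)
router.on_traceback_message(now=10.0)

print(router.should_initiate(now=12.0))  # False
print(router.should_initiate(now=16.0))  # True
```

A router that keeps seeing Caddie messages from upstream stays a propagator, so new messages are created only where the path is not already being traced.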
As routers act as Caddie propagators, they append their IP address to the Router List (RL), along with the incoming interface and next-hop information. Each RL element is protected with a hash-based message authentication code (HMAC) to prevent it from being modified undetected. The HMAC keys come from a Time-Released Key Chain (TRKC) [10]. To generate its sequence of secret keys, each router repeatedly applies the HMAC function to a randomly selected seed, and it reveals each key after a delay, at the end of the corresponding time interval. The destination of a Caddie message can retrieve the newest key, compute from it all the secret keys for previous time intervals, and finally compute and verify the HMACs of every RL element in the Caddie message.
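The key chain idea can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the scheme's actual parameters: HMAC-SHA256 with a fixed label serves as the one-way step, the seed and the RL element encoding are made up, and the timing checks a real receiver needs (that a tag arrived before its key was disclosed) are omitted.

```python
import hashlib
import hmac


def one_way_step(key):
    """Derive the key of the previous interval from the key of the current one."""
    return hmac.new(key, b"chain", hashlib.sha256).digest()


def make_key_chain(seed, n_intervals):
    """Return keys[0..n-1], where keys[i] is the secret key used in interval i."""
    keys = [seed]
    for _ in range(n_intervals - 1):
        keys.append(one_way_step(keys[-1]))
    keys.reverse()                 # the seed becomes the key of the last interval
    return keys


def derive_earlier_key(revealed_key, revealed_interval, wanted_interval):
    """Recompute an earlier key from a later, already revealed one."""
    key = revealed_key
    for _ in range(revealed_interval - wanted_interval):
        key = one_way_step(key)
    return key


def tag(key, rl_element):
    """HMAC protecting one Router List element."""
    return hmac.new(key, rl_element, hashlib.sha256).digest()


keys = make_key_chain(seed=b"router-secret-seed", n_intervals=4)
rl_element = b"192.0.2.1|if0|192.0.2.2"     # address, interface, next hop (made up)

mac = tag(keys[1], rl_element)              # router tags the element in interval 1
revealed = keys[3]                          # later, the key of interval 3 is revealed
recovered = derive_earlier_key(revealed, 3, 1)

print(hmac.compare_digest(mac, tag(recovered, rl_element)))  # True
```

Keys are used in the opposite order to that in which they were generated, so a revealed key lets anyone recompute older keys but not future ones. This is what allows the destination to verify past RL elements from the newest key alone.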
The benefit of this approach is that fewer trace packets are produced. Their number is independent of the attack path and depends solely on the number of attack sources. The scheme also produces fewer falsely identified attack sources and false positives, because the chance of two packet digests being forwarded within a short interval is much smaller. More generally, this ICMP-based scheme is attractive because it handles DDoS well, recognises attacks quickly and requires little interoperability between ISPs, as Caddie propagators transmit the Caddie packet like any other packet. However, it still requires more bandwidth than an in-band technique, and the deployment cost is non-negligible.
Reactive IDIP
An Intrusion Detection System (IDS) can also be used by reactive attack defence methods. The IDS raises an alert when an attack occurs, and the system responds with a traceback. The defence can be handled by the network or by the host [11]. For instance, the Intruder Detection and Isolation Protocol (IDIP) is handled by the network and therefore uses fewer resources.
IDIP is used to trace the path and source of an intrusion in real time [12]. IDIP systems are organised into IDIP communities. Each community has its own intrusion detection system, and its response is managed by the Discovery Coordinator. Each community is divided into neighbourhoods in such a way that only one IDIP component belongs to each neighbourhood. This architecture allows intrusion-related information to be collected at a central site and intrusion reports to be exchanged, giving a better understanding of the situation. In each neighbourhood, a local IDS agent monitors traffic and sends its reports to a boundary controller.
When an attack occurs, the detector node sends an attack report to its neighbours, which help trace the attack path and forward the attack report along it. Before forwarding it, each node decides how to respond to the attack (disabling the user account, installing filtering rules, etc.), depending on the type of attack and the responses of other IDIP nodes. All the attack reports are then sent back to the Discovery Coordinator, which gains an overview of the situation and can modify local node decisions if necessary. This technique stops the spread of the attack and at the same time rebuilds the attack path.
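The report propagation described above can be modelled as a breadth-first walk over neighbouring nodes. The sketch is a toy model, not the IDIP message format: the node names, the saw_attack callback and the single "install filter" response are hypothetical, and the Discovery Coordinator is reduced to the list of reports it receives.

```python
from collections import deque


def propagate_attack_report(detector, neighbours, saw_attack, response="install filter"):
    """Forward an attack report from the detector; return the reports sent to the coordinator."""
    coordinator_reports = []
    visited = {detector}
    queue = deque([detector])
    while queue:
        node = queue.popleft()
        if not saw_attack(node):
            continue                       # node is off the attack path: do not forward
        coordinator_reports.append((node, response))   # local response, then report
        for neighbour in neighbours.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return coordinator_reports


neighbours = {
    "D": ["B1"],
    "B1": ["D", "B2", "B3"],
    "B2": ["B1"],
    "B3": ["B1"],
}
on_attack_path = {"D", "B1", "B3"}

reports = propagate_attack_report("D", neighbours, lambda n: n in on_attack_path)
attack_path = [node for node, _ in reports]
print(attack_path)  # ['D', 'B1', 'B3']
```

Only nodes that recorded the attack respond and forward the report, so the reports collected centrally both describe the attack path and list the places where a filter has been installed.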
The efficiency of IDIP is linked to the effectiveness of intrusion identification at different boundary controllers. Each controller needs to have the same intrusion detection capability as the IDS. The automated response allows the system to react quickly.
One of the main advantages of this technique is its minimal dependence on the system infrastructure. This method can trace connections that use spoofed source addresses, because the IDIP protocol relies on what the components have recorded rather than on network routing tables. The drawbacks are that it requires a high degree of ISP cooperation, especially at the boundary controllers, and that it depends on the reliability of the routers. IDIP can successfully trace back the source unless it encounters stepping stones, a sequence of intermediate hosts that help the attacker remain anonymous.
Comparative study
Three main approaches have been studied:
The pro-active in-band technique: The traceback information is carried within the packet header. A logging scheme like SPIE can only trace packets that were delivered in the recent past, as the packet digests expire after a certain period of time.
The pro-active out-of-band technique: The trace information is sent within a separate packet. This technique requires more network bandwidth to deliver the trace information.
The reactive IDS-assisted approach: It requires a significant amount of cooperation between ISPs to perform the traceback. The scheme is only as reliable as its routers are secure against attackers.
The objective of IP traceback is to determine the true source of DoS/DDoS attacks. Together, IP traceback and attack detection form an efficient collaborative defence against DoS attacks across the Internet. Various pro-active techniques exist, all evolved from basic schemes such as IP marking, IP logging and ICMP-based traceback, each overcoming shortcomings that researchers had identified. Some are better suited to one aspect of network attacks than to others. Network administrators should therefore consider their company’s resources, business requirements and objectives in order to implement the best-suited approach. However, these techniques have so far been demonstrated only at laboratory scale and have not yet been deployed in the field.
  1. “Requirements for Internet Hosts – Communication Layers”, R. Braden, October 1989 (PDF).
  2. “TCP/IP Vulnerabilities: IP spoofing and Denial-of-Service Attacks”, A. Kak, 25 April 2015 (PDF).
  3. “SYN flood”, 7 July 2015. Web, 23 July 2015.
  4. “Scapy”, December 2014. Web, 24 July 2015.
  5. “Hash-Based IP Traceback”, A. C. Snoeren et al., 2001 (PDF).
  6. “Incapsula survey – DDoS Impact Survey”, T. Matthews, 2014 (PDF).
  7. “Internet Control Message Protocol”, June 2015. Web, 23 July 2015.
  8. “Comparative study of IP Traceback Techniques”, M. Lapeyre, 2015 (PDF).
  9. “A DoS-Resistant IP Traceback Approach”, Bao-Tung Wang, 2003 (PDF).
  10. “Advanced and Authenticated marking Schemes for IP traceback”, D. X. Song and A. Perrig, 2001 (PDF).
  11. “Taxonomy of IP Traceback”, L. Santhanam et al., 2006 (PDF).
  12. “Infrastructure for Intrusion Detection and Response”, D. Schnackenberg et al.