The portion of the ISAKMP key exchange where the ESP SPI values are
communicated is encrypted, so the ESP SPI values must be determined by
inspection of the actual ESP traffic. Also, the outbound ESP traffic does
not contain any indication of what the inbound SPI will be. This means
there is no perfectly reliable way to associate inbound ESP traffic with
outbound ESP traffic.
The IPSec masq patch attempts to associate inbound and outbound ESP traffic
by serializing initial ESP traffic on a by-remote-host basis. What this
means is:
- If an outbound ESP packet arrives with an SPI value that has not
previously been seen, or whose masquerade table entry has expired
(hereafter an "initial packet"), a masquerade table entry for that
SourceAddr+SPI+DestAddr combination is created. It is
marked as "outstanding", that is, no inbound ESP traffic has been
received for it yet. This is done by setting the "inbound SPI"
value in the masq table entry to zero, which is a value reserved for uses
such as this. This will happen at the initiation of a new ESP connection
and at regular intervals when an existing ESP connection rekeys.
- As long as the masq table entry is outstanding, no other initial ESP
packets for the same remote host will be processed. The packets
are immediately discarded, and a system log entry is made saying the
traffic is temporarily blocked. This also applies to initial traffic from
the same masqueraded host going to the same remote host if the SPI values
differ. Traffic to other remote hosts, and traffic where both SPI values
are known ("established" traffic) is not affected by this.
- This blocking could easily lead to a Denial of Service of the remote host, so
this outstanding ESP masq table entry is given a short lifetime, and only a
limited number of retries of the same traffic are allowed. This permits
round-robin access to the remote host if several masqueraded hosts are
attempting to initialize simultaneously and responses aren't coming back
very quickly, for example due to network congestion or a slow remote host.
The retry limitation begins once there is a collision, so the masqueraded
IPSec host can wait as long as necessary for a reply until there's a need
for serialization.
- When an ESP packet from the outstanding remote host is received and
the SPI value does not appear in any masq table entry, it is assumed that
the packet is the response to the outstanding initial packet. The SPI value
is stored in that masq table entry, thus associating the SPI values, and
the inbound ESP traffic is routed to the masqueraded host. At this point
another initial packet for the remote host may be processed.
- Any ESP traffic with a zero SPI value is discarded as invalid.
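The serialization steps above can be sketched as a small model. This is an illustrative Python sketch, not the code in the patch: the class and method names, the explicit `now` argument, and both lifetime values are assumptions made for the example, and the retry limit and the refreshing of entry lifetimes by traffic are left out.

```python
from dataclasses import dataclass

INIT_LIFETIME = 10.0          # seconds an outstanding entry lives (assumed value)
ESTABLISHED_LIFETIME = 600.0  # seconds an associated entry lives (assumed value)


@dataclass
class Entry:
    masq_host: str
    remote_host: str
    out_spi: int
    expires: float
    in_spi: int = 0  # zero is reserved: no inbound ESP traffic seen yet


class EspMasqTable:
    def __init__(self):
        self.entries = []

    def _purge(self, now):
        # Drop entries whose lifetime has run out.
        self.entries = [e for e in self.entries if e.expires > now]

    def outbound(self, masq_host, remote_host, spi, now):
        """Return "forward", "blocked" or "discard" for an outbound ESP packet."""
        if spi == 0:
            return "discard"  # a zero SPI is never valid
        self._purge(now)
        for e in self.entries:
            if (e.masq_host, e.remote_host, e.out_spi) == (masq_host, remote_host, spi):
                return "forward"  # known SPI: established, or a retry of the same packet
        # Initial packet: only one outstanding entry per remote host is allowed.
        if any(e.remote_host == remote_host and e.in_spi == 0 for e in self.entries):
            return "blocked"
        self.entries.append(Entry(masq_host, remote_host, spi, now + INIT_LIFETIME))
        return "forward"

    def inbound(self, remote_host, spi, now):
        """Return the masqueraded host for an inbound ESP packet, or None to discard."""
        if spi == 0:
            return None
        self._purge(now)
        for e in self.entries:
            if e.remote_host == remote_host and e.in_spi == spi:
                return e.masq_host  # established traffic
        for e in self.entries:
            if e.remote_host == remote_host and e.in_spi == 0:
                # Unknown SPI from the outstanding remote host: assume it is the reply.
                e.in_spi = spi
                e.expires = now + ESTABLISHED_LIFETIME
                return e.masq_host
        return None
```

In this model a second masqueraded host sending an initial packet to the same remote host gets "blocked" until `inbound()` has stored an inbound SPI in the outstanding entry or that entry has expired.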
There are several ways this can fail to associate traffic properly:
- Network delays or a slow remote host can cause the response to the
first initial packet to be delayed long enough that the init masq table
entry expires and a different masqueraded host is given a chance to
initialize. This could cause the response to be associated with the wrong
outbound SPI, which would cause inbound traffic to be routed to the wrong
masqueraded host. It is assumed that if this happens the masqueraded host
receiving the traffic in error will discard it because it has an unexpected
SPI value, and everybody will eventually time out, rekey and try again.
This can be addressed by editing
/usr/src/linux/net/ipv4/ip_masq.c
and increasing the INIT lifetime or the number of INIT retries permitted,
at the cost of increasing the blocking (and DoS) window.
- Sessions that are idle or semi-idle (with infrequent inbound traffic
and no outbound traffic) may stay quiet long enough for the masq table
entry to expire. If the remote host sends traffic to an
established yet expired session while an outstanding init to the same
remote host is underway, the traffic may be misrouted for the same reason
as described above. This can be addressed by making sure the IPSec Masq
Table Lifetime kernel configuration parameter is slightly longer than the
rekey interval, which is the longest time any given SPI pair will be used.
The problem here is that you may not know all of the rekey intervals if
you're masquerading for many remote servers, or some may have their rekey
intervals set to unreasonably high values, such as several hours, causing
the IPSec masq table to become cluttered with stale entries having long
lifetimes - and possibly leading to misrouted traffic if a given remote
host reuses an SPI value. A better workaround would be to run a ping
process on the masqueraded hosts to send some traffic over the link at
regular intervals that are less than the masq table lifetime. Perhaps at
some point the masq timeout could be made adaptive, by monitoring the
ISAKMP channel and shortening the timeout on all masq table entries for a
masqed-host/remote-host pair that has begun a rekey.
- If there is a delay between a rekey and the transmission of outbound
ESP traffic using the new SPI, and during this delay inbound ESP traffic
using the new SPI is received, there will be no masq table entry and the
inbound traffic may be discarded or misrouted as above. The ideal solution
for this would be for the masqueraded IPSec host (which will probably be a
Windows IPSec client) to ping the remote gateway over the tunnel
immediately upon completion of the rekey. This would minimize the delay
between rekeying and outbound ESP traffic with the new SPI. Running a ping
process on the client is also a possible workaround for this problem,
though the ping interval should be fairly short. Unfortunately,
this introduces otherwise unnecessary network traffic.
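The first failure mode, and the effect of raising the INIT lifetime, can be illustrated with a toy timeline. This is a rough sketch under stated assumptions, not a model of the patch's real timers: the function name, the 10-second lifetime, and the one-retry-per-second behaviour of the second host are all invented for the example.

```python
INIT_LIFETIME = 10  # seconds; an assumed value, not the patch's actual setting


def route_first_reply(reply_delay, init_lifetime=INIT_LIFETIME):
    """Return which masqueraded host receives the remote host's first reply.

    Host A sends its initial packet at t=0. Host B retries its own initial
    packet once per second and takes over the outstanding slot as soon as
    the current entry has expired. The reply to A arrives at t=reply_delay.
    The result is "A", "B", or None if no entry is outstanding on arrival.
    """
    outstanding = {"host": "A", "expires": init_lifetime}
    for t in range(1, reply_delay):
        if outstanding["expires"] <= t:
            # The outstanding entry timed out, so B's initial packet is accepted.
            outstanding = {"host": "B", "expires": t + init_lifetime}
    if outstanding["expires"] <= reply_delay:
        return None  # nothing outstanding: the reply cannot be associated
    # The reply is associated with whichever entry is outstanding on arrival.
    return outstanding["host"]
```

With these assumed numbers, `route_first_reply(3)` associates the reply with host A, while `route_first_reply(12)` hands it to host B, which is the misrouting described above. Calling `route_first_reply(12, init_lifetime=15)` keeps the reply with A, at the cost of blocking B for longer.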
The best solution is to have some way to preload the masq table with the
properly associated out-SPI/in-SPI pair or some other mapping of
remote_host + inbound_SPI to masqueraded_host. This cannot be done by
inspecting the ISAKMP key exchange, as it is encrypted. It may be possible
to use Host-NAT to communicate with the masqueraded IPSec host and request
notification of SPI information once it has been negotiated. This is being
investigated. If this is implemented, it will probably appear in the
2.2.x series patches but not in the 2.0.x series.
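As a sketch of what such a preloaded mapping might look like, the following Python fragment keys the table on remote_host + inbound_SPI. Everything here is hypothetical: no notification mechanism exists yet, so the `preload()` hook, the class name, and the addresses in the usage note are assumptions made only to show the lookup.

```python
class PreloadedSpiMap:
    """Maps (remote_host, inbound_SPI) to the masqueraded host that owns it."""

    def __init__(self):
        self._owners = {}

    def preload(self, remote_host, inbound_spi, masq_host):
        # Hypothetical hook: called when a masqueraded host reports a negotiated SPI.
        if inbound_spi == 0:
            raise ValueError("SPI value zero is reserved")
        self._owners[(remote_host, inbound_spi)] = masq_host

    def route(self, remote_host, inbound_spi):
        # Returns None when nothing was preloaded; the caller would then
        # discard the packet or fall back to serialization.
        return self._owners.get((remote_host, inbound_spi))
```

After `preload("192.0.2.1", 0x900, "10.0.0.2")`, a call to `route("192.0.2.1", 0x900)` would return the masqueraded host directly, with no need to guess from the order in which packets arrive.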