Efficient way to store and forward based on 5 tuple data

I am making a Ryu application which has to keep track of network “conversations”, that is to say, the bidirectional L4 flows. The issue is that while there are many ways to do this, efficiency is a big problem.

Problems:

  • Bidirectional data must be easy to look up. A naive approach would be to hash the 5-tuple and perform lookups against the hash. However, each conversation would then produce two different hashes, one per direction, because the src/dst IP and port swap places.
  • Another approach would be to store IP/port combinations in a table and check against them. That requires extracting those values for every packet and then accounting for directionality, which seems messy.

Ideal

What would be ideal is a hash that in some way accounts for directionality. That way you still have a hashtable, but the hashtable matches whether the traffic is:

SRCIP: A, DSTIP: B, SRCPORT: A, DSTPORT: B

or

SRCIP: B, DSTIP: A, SRCPORT: B, DSTPORT: A

Answer

I used Corelight’s pycommunityid. Given a flow tuple, it generates a Community ID: a version-prefixed, base64-encoded SHA-1 hash of the tuple. The endpoints are ordered before hashing, which takes care of the bidirectional problem described above (see the example below).

import communityid

cid = communityid.CommunityID()

# Forward direction: 127.0.0.1:1234 -> 10.0.0.1:80
tpl = communityid.FlowTuple.make_tcp('127.0.0.1', '10.0.0.1', 1234, 80)
print(cid.calc(tpl))

# Reverse direction: 10.0.0.1:80 -> 127.0.0.1:1234
tpl = communityid.FlowTuple.make_tcp('10.0.0.1', '127.0.0.1', 80, 1234)
print(cid.calc(tpl))

Output:

1:mgRgpIZSu0KHDp/QrtcWZpkJpMU=
1:mgRgpIZSu0KHDp/QrtcWZpkJpMU=

As the output shows, flow direction does not matter: the same pair of IP/port endpoints produces the same hash either way, so the ID can serve directly as the hashtable key for a conversation.
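
If adding a dependency is not an option, the same idea can be sketched with the standard library alone. This is not the Community ID algorithm, just a plain canonical ordering of the two endpoints before they are used as a dict key. The names conversation_key and packet_counts are hypothetical and chosen only for illustration.

```python
import ipaddress
from collections import defaultdict


def conversation_key(src_ip, dst_ip, src_port, dst_port, proto):
    """Return a direction-independent key for a 5-tuple (hypothetical helper)."""
    # Represent each endpoint as (packed address bytes, port) so it sorts reliably.
    a = (ipaddress.ip_address(src_ip).packed, src_port)
    b = (ipaddress.ip_address(dst_ip).packed, dst_port)
    # Put the smaller endpoint first so A->B and B->A give the same tuple.
    lo, hi = sorted((a, b))
    return (proto, lo, hi)


# Illustrative per-conversation state: a packet counter keyed by conversation.
packet_counts = defaultdict(int)

fwd = conversation_key('127.0.0.1', '10.0.0.1', 1234, 80, 6)  # 6 = TCP
rev = conversation_key('10.0.0.1', '127.0.0.1', 80, 1234, 6)

packet_counts[fwd] += 1
packet_counts[rev] += 1

print(fwd == rev)
print(packet_counts[fwd])
```

Because the key is an ordinary tuple, Python's built-in dict hashing handles the lookup, and both directions land in the same entry. The trade-off against Community ID is that this key is only meaningful inside the application, whereas a Community ID can be correlated with other tools that compute it.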

User contributions licensed under: CC BY-SA