NetFlow Collection on Debian GNU/Linux using flow-tools, flowscan and CUFlow
by Adam Armstrong, adama@jml.net
10 May 2005
Version 0.1
i. Changelog
0.1 2005-05-10 Initial Document Version
1. Introduction
This document walks you through the installation of a NetFlow collector and statistics processor using a Debian GNU/Linux server, flow-tools, flowscan and CUFlow. The setup will allow you to monitor which specific hosts, subnets, TCP/UDP ports/services, TOS classes or IP protocols are using bandwidth on your network, for billing, diagnostic or informational purposes, along with a complete historical record.

Example flow-tools + flowscan + CUFlow output
1.1. Copyright and License
This document is Copyright 2005 by Adam Armstrong. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is available at http://www.gnu.org/copyleft/fdl.html
1.2. About the author
The author is originally from a small town called Morpeth in the northern part of England.
You can find me at http://www.memetic.org or adama@memetic.org
1.3. Acknowledgements
I would like to thank all the people who've written NetFlow docs before, and Cisco for completely breaking the NetFlow implementation in the 65/7600!
2. Requirements
The following is needed to use this HOWTO:
- A server running Debian GNU/Linux Sarge (3.1)
- A NetFlow capable router (Cisco 7206VXRs will be used in this example)
2.1 The Server
In order to store and archive your captured flows, you'll need a very large amount of disk space. Pushing an average of 3Mbit/s of traffic, I generate 75MB of flow files per day. I'd recommend a decent amount of RAM and CPU power, perhaps a P4 2.4GHz and 512MB with a 160GB disk for a company pushing 4-5Mbit/s.
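As a rough back-of-the-envelope check of the figures above, the following sketch estimates how many days of flow files fit on a given disk. It assumes that flow file volume scales linearly with traffic, which is only an approximation, and the function name flow_retention_days is made up for this example.

```shell
#!/bin/sh
# Rough retention estimate: days of flow files that fit on a disk.
# Assumption: flow file size scales linearly with average traffic.
# Usage: flow_retention_days DISK_GB MB_PER_DAY
flow_retention_days() {
    # Integer arithmetic; 1GB is taken as 1024MB.
    echo $(( $1 * 1024 / $2 ))
}

# 75MB/day at 3Mbit/s scales to roughly 125MB/day at 5Mbit/s.
flow_retention_days 160 125
```

Even at the higher rate, a 160GB disk dedicated to flow files would hold several years of history, so the disk recommendation above leaves plenty of headroom for the rrd files and the rest of the system.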
2.2 The Router
Any router capable of outputting NetFlow v5 flows will be sufficient. I'm using Cisco 7206VXRs in my example, but any router from the 8xx series up to GSRs and beyond will output NetFlow data.
2.3 The Operating System
For the purposes of this HOWTO, I'm going to be using Debian GNU/Linux Sarge. There are two reasons for this: firstly, it's easy to install and maintain; secondly, the deb-based nature of the distribution means that we can cut down on the size of the HOWTO by using Debian's packages where possible.
3. Installation
3.1 Initial System Preparation
Start with a clean install of Debian Sarge connected to the internet. Become root and perform any updates and install the SSH server first. Agree to any questions it asks during the install.
adama@server:~$ su -
Password:
server:~# apt-get update
server:~# apt-get dist-upgrade
server:~# apt-get install ssh
We also need to download and unpack the CUFlow module.
server:~# wget http://www.columbia.edu/acis/networks/advanced/CUFlow/CUFlow-1.5.tgz
server:~# tar zxvf CUFlow-1.5.tgz
3.2 Package Installation
Now we need to install the packages we'll be using for the installation. Again agree to any questions asked during the install.
server:~# apt-get install flow-tools flowscan rrdtool apache
This step will install a large number of packages and may take a long time, depending upon the specifications of your server and the speed of your internet connection.
We need to copy some files into place for CUFlow to work.
server:~# sed 's/${FindBin::Bin}\/CUFlow\.cf/\/etc\/flowscan\/CUFlow\.cf/' \
/root/CUFlow-1.5/CUFlow.pm > /usr/share/perl5/CUFlow.pm
server:~# mv /root/CUFlow-1.5/CUFlow.cf /etc/flowscan
This will also modify the CUFlow perl module so that it knows where to find its config file.
3.3 Creating the directories
Now we need to create the directories required for the setup. I use /var/netflow for netflow data and rrd files. If you've used a large disk in your server, most of it should be mounted as /var!
server:~# mkdir /var/netflow/{,rrd,flows,flowtemp,scoreboard}
This step will create the /var/netflow directory and the subdirectories rrd, flows, flowtemp and scoreboard.
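The brace expansion used above is a bash feature. If you'd rather not depend on it, a more portable sketch of the same step could look like the following. The function name make_netflow_dirs and its optional base-directory argument are illustrative only; mkdir -p makes it safe to run more than once.

```shell
#!/bin/sh
# Create the netflow directory tree without relying on bash brace expansion.
# Usage: make_netflow_dirs [BASE]   (BASE defaults to /var/netflow)
make_netflow_dirs() {
    base=${1:-/var/netflow}
    for sub in rrd flows flowtemp scoreboard; do
        # -p creates missing parents and does not fail if the directory exists
        mkdir -p "$base/$sub" || return 1
    done
}

# Usage (as root): make_netflow_dirs
```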
3.4 Creating the scripts
Now we need to create some scripts.
This script exports the flow-tools files so they can be used by flowscan. It'll be run by flow-capture each time it finishes writing a flow file, exporting the flow to /var/netflow/flowtemp so it can be read by flowscan.
Create the script as /usr/bin/flow-exporter and chmod it to 755. The heredoc delimiter is quoted ('EOF') so that the shell doesn't expand the Perl variables while writing the file.
server:~# cat > /usr/bin/flow-exporter << 'EOF'
#!/usr/bin/perl
# Called by flow-capture (-R) with the name of each completed flow file.
$file = $ARGV[0];
if ( $file =~ /.*ft-v05\.(\d\d\d\d)-(\d\d)-(\d\d)\.(\d\d)(\d\d)(\d\d)/ ){
    # Build the flows.YYYYMMDD_HH:MM:SS name that flowscan will pick up.
    $cflowfile = "flows.".$1.$2.$3."_".$4.":".$5.":".$6;
    $command = "/usr/bin/flow-export -f0 < $file > /var/netflow/flowtemp/$cflowfile";
    print "$command\n";
    system($command);
}else{print "File $file didn't match\n";}
EOF
server:~# chmod 755 /usr/bin/flow-exporter
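To see what the exporter's regular expression does to a file name, here is a small shell sketch of the same renaming using sed. The function name cflow_name and the sample file name are only examples; the real script does this in Perl.

```shell
#!/bin/sh
# Map a flow-capture file name to the flows.YYYYMMDD_HH:MM:SS name
# that flow-exporter writes into flowtemp.
# Prints nothing if the name does not match.
cflow_name() {
    printf '%s\n' "$1" | sed -n \
        's/.*ft-v05\.\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)\.\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\).*/flows.\1\2\3_\4:\5:\6/p'
}

# Sample name, for illustration only
cflow_name /var/netflow/flows/ft-v05.2005-05-10.212500+0100
# prints: flows.20050510_21:25:00
```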
4. Configuration
4.1 Flow-tools Configuration
We need to configure flow-tools to accept NetFlow data from our router and output it to the correct place, at the correct intervals and in the correct format. To achieve this, create a configuration file for flow-tools.
server:~# cat > /etc/flow-tools/flow-capture.conf << EOF
-w /var/netflow/flows 0/0/9990 -S5 -V5 -E1G -n 287 -N 0 -R /usr/bin/flow-exporter
EOF
The Debian startup script will basically execute /usr/bin/flow-capture, appending the line from the config file.
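The -n option tells flow-capture how many times per day to start a new file, so -n 287 should give 288 files a day, or one every five minutes. The arithmetic can be checked with a quick sketch (the function name rotation_seconds is made up for this example):

```shell
#!/bin/sh
# Seconds covered by each flow file for a given flow-capture -n value.
# Assumption: -n N results in N+1 files per day.
# Usage: rotation_seconds N
rotation_seconds() {
    echo $(( 86400 / ($1 + 1) ))
}

rotation_seconds 287   # prints 300 (5 minutes)
rotation_seconds 95    # prints 900 (15 minutes)
```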
4.2 Flowscan Configuration
We need to configure flowscan to use CUFlow and read the flow files from /var/netflow/flowtemp/.
server:~# cat > /etc/flowscan/flowscan.cf << EOF
FlowFileGlob /var/netflow/flowtemp/flows.*:*[0-9]
ReportClasses CUFlow
WaitSeconds 30
Verbose 1
EOF
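If you want to check which file names the FlowFileGlob pattern will pick up, the same pattern can be tried in a shell case statement. This is only an approximation, since flowscan does its globbing in Perl rather than in the shell, and the function name matches_flowfileglob is hypothetical.

```shell
#!/bin/sh
# Test a bare file name (no directory) against the pattern used in
# FlowFileGlob above. Returns 0 on a match, 1 otherwise.
matches_flowfileglob() {
    case "$1" in
        flows.*:*[0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   matches_flowfileglob "flows.20050510_21:25:00" && echo "would be processed"
```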
Next we need to configure CUFlow for our network. You should modify this example to include your network address and any subnets, protocols, services and AS numbers you want to monitor.
server:~# cat > /etc/flowscan/CUFlow.cf << EOF
Subnet 192.168.0.0/24
Network 192.168.0.128/28 SomeNetwork
Network 192.168.0.0/25,192.168.0.192/27 SomeOtherCustomer
OutputDir /var/netflow/rrd
Multicast
Scoreboard 25 /var/netflow/scoreboard /var/netflow/scoreboard/toptalkers.html
AggregateScore 25 /var/netflow/rrd/agg.dat /var/netflow/scoreboard/overall.html
Service 20-21/tcp ftp
Service 22/tcp ssh
Service 23/tcp telnet
Service 25/tcp smtp
Service 53/udp,53/tcp dns
Service 80/tcp http
Service 110/tcp pop3
Service 119/tcp nntp
Service 143/tcp imap
Service 443/tcp https
Service 3306/tcp,3306/udp mysql
Service 1433/tcp,1433/udp,1434/tcp,1434/udp ms-sql
Service 3389/tcp,3389/udp ms-rdp
Service 5900/tcp,5900/udp vnc
Protocol 1 icmp
Protocol 4 ipinip
Protocol 6 tcp
Protocol 17 udp
Protocol 47 gre
Protocol 50 esp
Protocol 51 ah
TOS 0 normal
TOS 1-255 other
ASNumber 444 Some-ISP
ASNumber 555 Another-ISP
ASNumber 666 Some-Peer
EOF
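When adapting the Subnet and Network lines to your own addressing, it can help to confirm that an address really falls inside the range you think it does. The following is a minimal sketch for IPv4 only; the function names are made up for this example, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if IP lies inside NETWORK/PREFIX, 1 otherwise.
# Usage: ip_in_cidr IP NETWORK/PREFIX
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# Usage:
#   ip_in_cidr 192.168.0.130 192.168.0.128/28 && echo "inside SomeNetwork"
```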
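Next we'll create a script to start and stop the flowscan daemon and link it to runlevel 3. (Debian boots into runlevel 2 by default, so link it into /etc/rc2.d instead if you haven't changed that.)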
server:~# cat > /etc/init.d/flowscan << 'EOF'
#!/bin/sh
# description: Start FlowScan
case "$1" in
'start')
    # Run flowscan in the background, logging to /var/log/flowscan
    cd /var/netflow/ ; /usr/bin/flowscan >>/var/log/flowscan 2>&1 </dev/null &
    touch /var/lock/flowscan.1
    ;;
'stop')
    # Only removes the lock file; the flowscan process itself is not killed
    rm -f /var/lock/flowscan.1
    ;;
*)
    echo "Usage: $0 { start | stop }"
    ;;
esac
exit 0
EOF
server:~# chmod 755 /etc/init.d/flowscan
server:~# ln -s /etc/init.d/flowscan /etc/rc3.d/S21flowscan
4.3 Web Configuration
We need to configure Apache to serve the CUFlow CGI and scoreboard files.
server:~# ln -s /var/netflow/scoreboard /var/www/netflow-scoreboard
server:~# sed 's/\/cflow\/reports\/rrds/\/var\/netflow\/rrd/' \
/root/CUFlow-1.5/CUGrapher.pl > /usr/lib/cgi-bin/CUGrapher.pl
server:~# chmod 755 /usr/lib/cgi-bin/CUGrapher.pl
4.4 Router Configuration
We need to configure our routers to output flows. In this example, FastEthernet 0/0 is the interface connected to our upstream provider on each router. The NetFlow collector's IP address has been blanked out as x.x.x.x for you to fill in. We're going to make the flows' source address the loopback address of the router; you can omit that part if your router doesn't use any loopback addresses (it probably should!). CEF should already be enabled on any Cisco router doing NetFlow.
routera# conf t
routera(config)# ip flow-cache timeout active 1
routera(config)# ip flow-export source Loopback0
routera(config)# ip flow-export version 5 origin-as
routera(config)# ip flow-export destination x.x.x.x 9990
routera(config)# interface fastethernet0/0
routera(config-if)# ip route-cache flow
routera(config-if)# exit
routera(config)# exit
routera#
5. Starting it up
5.1 Starting the daemons
Now that it's all configured, we'll start the daemons.
server:~# /etc/init.d/flow-tools start
server:~# /etc/init.d/flowscan start
You should be able to see in /var/log/syslog and /var/log/flowscan that it's all working. It'll take around 5 minutes before flow-tools processes a complete flow file for flowscan to analyse (there should be about 10 lots of "sleep 30...").
server:~# tail /var/log/syslog
May 10 17:40:42 server flow-capture[6671]: setsockopt(size=4194304)
May 10 17:40:44 server flow-capture[6671]: New exporter: time=1115746844 src_ip=127.0.0.1 dst_ip=127.0.0.1 d_version=5
server:~# tail /var/log/flowscan
sleep 30...
2005/05/10 21:30:04 working on file /var/netflow/flowtemp/flows.20050510_21:25:00...
2005/05/10 21:30:06 flowscan-1.020 CUFlow: Cflow::find took 2 wallclock secs ( 2.41 usr + 0.00 sys = 2.41 CPU) for 622050 flow file bytes, flow hit ratio: 10197/11310
2005/05/10 21:30:06 flowscan-1.020 CUFlow: report took 0 wallclock secs ( 0.00 usr 0.00 sys + 0.10 cusr 0.04 csys = 0.14 CPU)
server:~#
If it's all running you should see temporary files and 'ft' files in /var/netflow/flows and rrd files in /var/netflow/rrd.
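A quick way to check this without listing the directories by hand is sketched below. The function names count_files and netflow_status are made up for this example, and it assumes completed flow files start with ft-v05. as in the exporter script earlier.

```shell
#!/bin/sh
# Count regular files under DIR whose names match PATTERN.
# Usage: count_files DIR PATTERN
count_files() {
    find "$1" -type f -name "$2" 2>/dev/null | wc -l | tr -d ' '
}

# Print a short summary of what the collector has produced so far.
# Usage: netflow_status [BASE]   (BASE defaults to /var/netflow)
netflow_status() {
    base=${1:-/var/netflow}
    echo "completed flow files: $(count_files "$base/flows" 'ft-v05.*')"
    echo "rrd files:            $(count_files "$base/rrd" '*.rrd')"
}

# Usage: netflow_status
```

If both counts are still zero well after the first five minutes, check /var/log/syslog and /var/log/flowscan as described above.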
6. Using it
6.1 Using the web interface
Now that it's all running you can visit http://yourserver/cgi-bin/CUGrapher.pl and http://yourserver/netflow-scoreboard/ to see the generated stats.