Detecting Data Leaks

This month I had the opportunity to work with a customer that wanted to use our malware incident response system for detecting data leaks. I put together a solution that met the customer’s goals and thought it would make a good reference for anyone who needs to detect data leakage.

Detecting Data Leaks – Saved Report

The first step was to create a saved report in Scrutinizer called “Exceeded 5 MB in 5”. Below you can see that I excluded a bunch of traffic that I knew would trigger an indicator of compromise (IOC) event and ultimately lead to false positives. I later added a few more exclusions, one of which was an autonomous system.

[Image: detecting data leakage]

In the above, I also specified the traffic direction:

  • traffic coming from internal addresses (i.e. source) is included
  • traffic going to internal addresses (i.e. destination) is excluded

These settings made sure that we were only looking at internal traffic headed to the Internet. In the bottom left of the image above, you will see a threshold of 5 MB. This saved report runs every 5 minutes, and any host that sends more than 5 MB in 5 minutes triggers an indicator of compromise (IOC) event. A second process then counts the IOCs.
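The per-interval check can be pictured with a small Perl sketch. This is not how Scrutinizer implements saved reports; the helper name `hosts_over_threshold`, the flow-record layout, and the sample addresses and byte counts are all assumptions made for illustration, and 5 MB is taken to mean 5,000,000 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: given the flow records for one 5-minute interval,
# return the source hosts whose total bytes exceed the threshold.
sub hosts_over_threshold {
    my ($flows, $threshold_bytes) = @_;
    my %bytes_by_src;
    $bytes_by_src{ $_->{src} } += $_->{bytes} for @$flows;
    my @over = grep { $bytes_by_src{$_} > $threshold_bytes } sort keys %bytes_by_src;
    return @over;
}

# Illustrative flows (documentation address ranges, made-up byte counts)
my @interval = (
    { src => '192.0.2.10', dst => '198.51.100.7', bytes => 4_000_000 },
    { src => '192.0.2.10', dst => '198.51.100.7', bytes => 2_500_000 },
    { src => '192.0.2.11', dst => '198.51.100.9', bytes => 1_200_000 },
);

# Each host returned here would produce one IOC event for this interval
my @ioc_hosts = hosts_over_threshold(\@interval, 5_000_000);
print "IOC: $_\n" for @ioc_hosts;
```

Only the first host crosses the threshold, because its two flows add up to 6.5 MB within the interval, while the second host stays at 1.2 MB.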

Detecting Data Leakage – Scheduled Process

The above saved report is going to create a lot of IOCs, and you want that. By themselves, however, they won’t trigger a Security Event alarm. I wrote a simple program in Perl that I scheduled to execute every 5 minutes. Here is what it does:

  • Reviews only the “Exceeded 5 MB in 5” IOC events
  • Groups by source IP address and destination IP address (configurable)
  • Discards IOC events that are older than 24 hours (configurable)
  • Adds up the “sum_octetdeltacount” values from each IOC event to create a “Total Byte Exfil” value
  • Posts a “Data Leak” Security Event in the Alarms tab if the “Total Byte Exfil” value is above 100 MB (configurable)
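The steps above can be sketched without a database, using a plain list of events. This is only a sketch under stated assumptions: the helper `find_data_leaks`, the event fields (`src`, `dst`, `epoch`, `message`), and the sample data are hypothetical, the message is assumed to carry the byte count as a plain integer after `sum_octetdeltacount : `, and 100 MB is taken to mean 100,000,000 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: total the bytes per source/destination pair over a
# look-back window and return the pairs whose total exceeds the limit.
sub find_data_leaks {
    my (%arg) = @_;
    my $cutoff = $arg{now} - $arg{window_hours} * 3600;
    my %total;
    for my $event (@{ $arg{events} }) {
        next unless index($event->{message}, $arg{report}) >= 0;  # only this report's IOCs
        next if $event->{epoch} < $cutoff;                         # older than the window
        next unless $event->{message} =~ /sum_octetdeltacount : (\d+)/;
        $total{"$event->{src},$event->{dst}"} += $1;               # running "Total Byte Exfil"
    }
    my %leaks;
    for my $pair (keys %total) {
        $leaks{$pair} = $total{$pair} if $total{$pair} > $arg{limit_bytes};
    }
    return \%leaks;
}

# Illustrative events; the message layout is an assumption
my $now = 1_700_000_000;
my $msg = 'Exceeded 5 MB in 5 sum_octetdeltacount : ';
my @events = (
    { src => '192.0.2.10', dst => '198.51.100.7', epoch => $now - 3_600,  message => $msg . '40000000' },
    { src => '192.0.2.10', dst => '198.51.100.7', epoch => $now - 7_200,  message => $msg . '40000000' },
    { src => '192.0.2.10', dst => '198.51.100.7', epoch => $now - 36_000, message => $msg . '40000000' },
    { src => '192.0.2.10', dst => '198.51.100.7', epoch => $now - 90_000, message => $msg . '50000000' },
    { src => '192.0.2.11', dst => '198.51.100.9', epoch => $now - 600,    message => $msg . '60000000' },
    { src => '192.0.2.11', dst => '198.51.100.9', epoch => $now - 600,
      message => 'Some other report sum_octetdeltacount : 90000000' },
);

my $leaks = find_data_leaks(
    events => \@events, now => $now, report => 'Exceeded 5 MB in 5',
    window_hours => 24, limit_bytes => 100_000_000,
);
printf "Data Leak: %s Total Byte Exfil: %d\n", $_, $leaks->{$_} for sort keys %$leaks;
```

In this sample, the first pair is reported with a total of 120,000,000 bytes. Its fourth event is 25 hours old and is discarded, and the second pair stays under the limit because the event from the other report is ignored.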

In short, the above detects end users uploading more than 100 MB to an unapproved Internet host within a 24-hour period. Below you can see that there have been 105 events where the threshold of the saved report named “Exceeded 5 MB in 5” was exceeded by 19 different internal hosts. These 105 events are what the scheduled process counts. If the above criteria are met, a “Data Leak” Security Event is posted. Notice that 5 hosts triggered it. I tested this by simply uploading files from the 5 end systems displayed.

[Image: detecting data leaks]

Regardless of whether they upload the data in 15 minutes or 15 hours, the above setup detects it and alerts on it, and it is customizable!

  • If you want to detect malware that only uploads 2-3 MB in a 5-minute period, you can tweak the code.
  • If you want to trigger on a total of over 250 MB and increase the time frame to 48 hours, it can easily be done.
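To see what a given tuning implies, the arithmetic can be worked out in a few lines of Perl. The helper `tuning_summary` and its parameter names are hypothetical. The sketch assumes that an IOC is only raised when a host sends more than the per-interval threshold, so a host that stays at or below that threshold in every interval never contributes to the total.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: derive what a tuning means in practice.
sub tuning_summary {
    my (%cfg) = @_;
    my $intervals = $cfg{window_hours} * 60 / $cfg{interval_minutes};
    return {
        # number of report runs that fit in the look-back window
        intervals_in_window => $intervals,
        # this many IOC events are always enough to cross the total,
        # since each one represents more than the per-interval threshold
        max_events_needed   => ceil($cfg{total_mb} / $cfg{per_interval_mb}),
        # upper bound on what a host could send in the window while
        # never exceeding the per-interval threshold
        max_unseen_mb       => $intervals * $cfg{per_interval_mb},
    };
}

my $default = tuning_summary(interval_minutes => 5, per_interval_mb => 5,
                             total_mb => 100, window_hours => 24);
my $tweaked = tuning_summary(interval_minutes => 5, per_interval_mb => 2,
                             total_mb => 250, window_hours => 48);

for my $s ($default, $tweaked) {
    printf "intervals: %d, events needed: %d, unseen ceiling: %d MB\n",
        @{$s}{qw(intervals_in_window max_events_needed max_unseen_mb)};
}
```

With the settings from this post, 20 IOC events are always enough to reach the 100 MB total. Lowering the per-interval threshold makes slower uploads visible, at the cost of more IOC events to count.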

Detecting Data Exfiltration

There are many ways that electronic data can be leaked out of your company. DNS is a big risk for data leaks, which is why our FlowPro Defender looks at Fully Qualified Domain Name (FQDN) requests, monitors NXDomain messages, and considers several other factors often indicative of data exfiltration. My point is that we have to go about detecting data leakage in several different ways.

Here’s the code for the scheduled process:

use strict;
use warnings;
use DBI ();
use Net::Syslog;

# Configurable settings
my $report    = 'Exceeded 5 MB in 5';  # saved report name to match
my $hours     = 24;                    # look-back window in hours
my $threshold = 100 * 1024 * 1024;     # 100 MB, assuming sum_octetdeltacount is reported in bytes

my $dbh = DBI->connect("DBI:mysql:database=plixer;host=;port=3306", "root", "root", { RaiseError => 1 });
# Match this report's IOC events from the last $hours hours only
my $recent = "message LIKE ? AND epoch > UNIX_TIMESTAMP(DATE_ADD(NOW(), INTERVAL -$hours HOUR))";

# One row per source/destination pair that has matching IOC events
my $sth = $dbh->prepare(qq{SELECT inet_b2a(violator_address) AS violator, inet_b2a(destination_address) AS destination FROM plixer.alm_bulletin_board_data WHERE $recent GROUP BY violator_address, destination_address});
$sth->execute("%$report%");
while (my $alarmref = $sth->fetchrow_hashref()) {
    my ($src, $dst) = @{$alarmref}{qw(violator destination)};
    # Add up the bytes reported by every IOC event for this pair
    my $sthbytecount = $dbh->prepare(qq{SELECT message FROM plixer.alm_bulletin_board_data WHERE $recent AND inet_b2a(violator_address) = ? AND inet_b2a(destination_address) = ?});
    $sthbytecount->execute("%$report%", $src, $dst);
    my $byteTotal = 0;
    while (my $byteref = $sthbytecount->fetchrow_hashref()) {
        my (undef, $bytemessage) = split(/sum_octetdeltacount : /, $byteref->{'message'});
        next unless defined $bytemessage;
        my ($bytes) = split(/ /, $bytemessage);
        $byteTotal += $bytes;
    }
    next unless $byteTotal > $threshold;

    my $syslogMessage = "Source IP: $src\nDestination IP: $dst\nTotal Byte Exfil: $byteTotal\n";
    print "$src,$dst, Total Byte Exfil: $byteTotal\n";
    # Append any FQDNs that the source resolved to this destination
    my $sth_dns = $dbh->prepare(qq{SELECT fqdn FROM plixer.domain_index WHERE requester_id = inet_a2b(?) AND resolved_to_id = inet_a2b(?) AND last_seen > UNIX_TIMESTAMP(DATE_ADD(NOW(), INTERVAL -$hours HOUR))});
    $sth_dns->execute($src, $dst);
    while (my $dnsref = $sth_dns->fetchrow_hashref()) {
        $syslogMessage .= "$dnsref->{'fqdn'}\n";
    }
    # Send the "Data Leak" event as a syslog message
    my $s = Net::Syslog->new(Name => "DATAEXFIL", Facility => "emergency", Priority => "info", SyslogPort => 514, SyslogHost => "localhost");
    $s->send($syslogMessage);
}
$dbh->disconnect;

IMPORTANT NOTE: The above code does not include my latest update, which makes sure that the same event doesn’t continuously get posted to the Alarm tab. I had to add more complicated logic to accomplish this. Also, the script above looks at IP pairs (source and destination); however, it can easily be modified to look at a single source with multiple destinations or multiple sources with a single destination. Just contact our team for the latest code or if you need help setting this up.
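For readers who want to experiment before getting the latest code, here is one possible shape for both ideas. It is not the updated logic mentioned in the note; the helpers `group_key` and `new_alerts`, the event fields, and the suppression window are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build the grouping key for an IOC event.
#   'pair'        => one total per source/destination pair
#   'source'      => one total per source, across all destinations
#   'destination' => one total per destination, across all sources
sub group_key {
    my ($event, $mode) = @_;
    return $event->{src} if $mode eq 'source';
    return $event->{dst} if $mode eq 'destination';
    return "$event->{src},$event->{dst}";
}

# Hypothetical helper: return only the keys that have not been alerted on
# within the suppression window, and record the time of each new alert.
sub new_alerts {
    my ($keys, $last_alert, $now, $suppress_seconds) = @_;
    my @fresh;
    for my $key (@$keys) {
        my $seen = $last_alert->{$key};
        next if defined $seen && $now - $seen < $suppress_seconds;
        $last_alert->{$key} = $now;
        push @fresh, $key;
    }
    return @fresh;
}

my $event = { src => '192.0.2.10', dst => '198.51.100.7' };
my $key   = group_key($event, 'pair');

# In a scheduled script this hash would have to persist between runs
my %last_alert;
my @first  = new_alerts([$key], \%last_alert, 1_000, 86_400);
my @second = new_alerts([$key], \%last_alert, 1_300, 86_400);
print scalar(@first), " alert(s) on the first run, ", scalar(@second), " on the second\n";
```

The second run, five minutes later, posts nothing because the same pair was already reported inside the 24-hour suppression window. Switching the mode to `source` or `destination` changes only the key, so the summing logic can stay the same.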