Fun with Network Forensics: Discovering a Rogue Bridge

Introduction

This is a short write-up on some interesting things I found while completing a midterm project for a Network Forensics class I took last year. My network forensics group decided to map the traffic for contemporary Windows-based denial of service vulnerabilities. Our project used a live network of volunteer hosts connected to the university network, and we worked from NetFlow data collected with flow-tools. While searching for possible exploits I found a hidden network bridge. The bridge used a non-human host registered to a roaming port in a networking closet. The host was eventually found to be running a rogue process which proxied connections from an external residence onto campus. A malicious user could have used this bridge to proxy requests from their home through the university.

Motivation

The results presented here were motivated by a general interest in watching Windows Vista/7 SMB 2.0 buffer overflow traffic on TCP port 445. TCP port 445 is used by file sharing services, and it was safe to assume that most hosts on the university network participated in file sharing. While looking for possible flooding attempts, we recognized file sharing servers and investigated their network communication. Collecting flow data required consent from each user. For our project we obtained consent from everyone on our subnet, and the flow data for that subnet was forwarded to us. One of the flagged file servers we observed stood out because it did not exhibit desktop-like behavior.

Tools

The forensics project used NetFlow data generated by Cisco edge devices owned by the university’s Information Technology department. The NetFlow generator was given a filter to redirect all data from the volunteer subnet, also configured as a VLAN, to a collector located in my dorm room. The filter used by the Campus IT department was part of the flow-tools project. The forensics project also used flow-tools to collect the flow data. FlowViewer was used to visualize and generate reports from the data. All the data presented in this post was collected over a 10-day period (October 8th to October 18th, 2009).

Observations

Here is some background information about the university residential network. Each host has a non-NAT IPv4 address from the university's class B network. All non-established incoming connections are blocked by default and only opened by request through the Information Technology department. This is important because only established connections may contain a source IP outside the class B network. For reference, let's call the network 192.168/16.
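To make the firewall rule concrete, here is a minimal Python sketch of the default policy as described above. The function names and the `established` flag are hypothetical, and 192.168.0.0/16 is only the stand-in network used in this post. The real campus firewall configuration is not something I have, so treat this as a model of the rule, not the rule itself.

```python
import ipaddress

# Stand-in for the university's class B network (anonymized in this post).
CAMPUS_NET = ipaddress.ip_network("192.168.0.0/16")

def is_campus(ip):
    """True if the address belongs to the stand-in campus network."""
    return ipaddress.ip_address(ip) in CAMPUS_NET

def allowed_by_default(src_ip, dst_ip, established):
    """Model of the default policy: traffic arriving from outside the campus
    network is dropped unless it belongs to an established connection.
    Ports opened by request through IT are deliberately not modeled."""
    inbound_from_outside = is_campus(dst_ip) and not is_campus(src_ip)
    if inbound_from_outside:
        return established
    return True
```

Under this model, a flow with an outside source address can only appear if a campus host opened the connection first, which is the property the rest of this post leans on.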

Figure 1: Aggregation of hosts using port 445

While observing live traffic for two weeks we did not detect any Windows Vista/7 SMB 2.0 attacks. Unfortunately we could only record flow data from wired connections, and most of the students use laptops connected to the university’s wireless network. Fortunately, we observed enough anomalous behavior to satisfy a midterm report. The discovery of the network bridge began with what appeared to be a scan of one host (Figure 1). Let's call this host 192.168.200.100. The other hosts are slightly blurred to suggest non-importance. The actual IPs are meaningless and anonymized. When viewing an aggregated list of destinations with port 445 (Microsoft Directory Services) I located a machine with a high flow count relative to its packet count. This meant that one machine in particular received 81 connections, each with a small payload. I first thought it was a scanning attempt, but since the aggregation is over a single port that doesn’t make much sense (unless someone is constantly scanning port 445).
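The aggregation I describe was done in FlowViewer, but the idea can be sketched in a few lines of Python. This is a minimal illustration, not the report I actually ran: the record fields (`dst_ip`, `dst_port`, `packets`, `bytes`) and the sample values are assumptions made for the example and do not reflect the flow-tools export format.

```python
from collections import defaultdict

def aggregate_by_destination(flows, port=445):
    """Sum flows, packets, and bytes per destination IP for one destination port."""
    totals = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for f in flows:
        if f["dst_port"] != port:
            continue
        t = totals[f["dst_ip"]]
        t["flows"] += 1
        t["packets"] += f["packets"]
        t["bytes"] += f["bytes"]
    return dict(totals)

def rank_by_flow_packet_ratio(totals):
    """Sort destinations so hosts receiving many small flows come first."""
    return sorted(
        totals.items(),
        key=lambda kv: kv[1]["flows"] / max(kv[1]["packets"], 1),
        reverse=True,
    )

# Hypothetical records: 81 tiny flows to one host, one large transfer to another.
sample_flows = (
    [{"dst_ip": "192.168.200.100", "dst_port": 445, "packets": 2, "bytes": 96}] * 81
    + [{"dst_ip": "192.168.200.7", "dst_port": 445, "packets": 900, "bytes": 1200000}]
    + [{"dst_ip": "192.168.200.7", "dst_port": 80, "packets": 10, "bytes": 4000}]
)

ranked = rank_by_flow_packet_ratio(aggregate_by_destination(sample_flows))
```

With this made-up data, the host receiving many two-packet flows ranks first, while a host doing a real file transfer (one flow, many packets) sinks to the bottom.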

Well, it turns out that someone was constantly scanning Mr. 200.100 for an open 445 port. Assuming this host is a file server, the scan-like behavior makes sense: it matches the connections made by clients that have the file share mapped. A client will periodically send polling messages to determine whether the file server is still online, typically when the user views the mapped location in their file explorer. (It's sticking out like a sore thumb because it's a file server that is performing no file transfers.)

After performing some forensics on the host I had more evidence suggesting a local file server. All hosts, except for 192.168.200.100, generated an enormous number of web traffic flows. The destination and source ports for flows involving the host are shown in Figure 2.

Figure 2b: Source ports used in flows with source as 192.168.200.100

Figure 2a: Destination ports used in flows with source as 192.168.200.100

This is almost certainly a non-human host, as it makes a trivial number of calls to ports {80, 443, 587, 110}, etc. (Figure 2). The high-numbered ports represent clients connecting to running services. Furthermore, the dominant source port is 139, Windows File Sharing. But what service is making calls from port 33709? The port falls in IANA's registered range but has no registration entry for TCP or UDP. Linux does, however, include 33709 in its ephemeral port range (32768-61000). Ephemeral ports are chosen at random, so if this is a Linux-based operating system, why is 192.168.200.100 choosing 33709 so often? It is more likely that the host has a socket bound to 33709.
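The reasoning about ephemeral ports can be expressed as a small check. The sketch below is an illustration only: the 32768-61000 range is the one cited above, while the function names, the 5% threshold, and the sample port list are assumptions I picked for the example.

```python
from collections import Counter

# Linux ephemeral range as cited in this post: 32768-61000 inclusive.
LINUX_EPHEMERAL_RANGE = range(32768, 61001)

def is_linux_ephemeral(port):
    """True if the port falls inside the cited Linux ephemeral range."""
    return port in LINUX_EPHEMERAL_RANGE

def overused_ephemeral_ports(src_ports, min_share=0.05):
    """Return ephemeral-range source ports that account for at least
    `min_share` of all ephemeral-range flows. Randomly chosen ports should
    each hold a tiny share, so a large share hints at a bound socket."""
    counts = Counter(p for p in src_ports if is_linux_ephemeral(p))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {p: n for p, n in counts.items() if n / total >= min_share}

# Hypothetical source ports seen for one host: 33709 repeats, the rest do not.
observed = [33709] * 40 + list(range(40000, 40060)) + [139] * 500
suspects = overused_ephemeral_ports(observed)
```

Port 139 is ignored because it is outside the ephemeral range, and only 33709 is flagged. A repeat offender like this behaves like a listening port, not like a port picked at random for an outgoing connection.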

Investigating Port 33709

The first step was to find some information about port 33709. Although it is a high-numbered port, it could still be favored by a well-known service. Unfortunately there is little information about the port, beside the obvious fact that Linux may pick it as a local port for outgoing traffic. This means that the host is most likely making connections to 33709 or has made one long connection from 33709, perhaps for a software update. The latter is possible because of how flow data is recorded. The timeout for a single flow is 30 minutes. If a connection lasts longer than 30 minutes, a new flow record is started. A long-lived connection with 33709 as a source port will therefore show up as a large number of 33709 flows. I looked at what was using port 33709:
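To see why one long connection can look like many flows, here is a simplified Python model of the 30-minute timeout. It is a sketch under assumptions: real exporters also apply inactivity timeouts and other expiry rules, and the function name and time units (seconds) are mine.

```python
ACTIVE_TIMEOUT_SECONDS = 30 * 60  # the 30-minute flow timeout mentioned above

def split_into_flow_records(start, end, timeout=ACTIVE_TIMEOUT_SECONDS):
    """Return the (start, end) spans of the flow records that a single
    connection would produce if a new record begins every `timeout` seconds."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    records = []
    t = start
    while t < end:
        nxt = min(t + timeout, end)
        records.append((t, nxt))
        t = nxt
    # A zero-length connection still produces one record.
    return records or [(start, end)]

# A hypothetical two-hour connection from port 33709.
two_hour_connection = split_into_flow_records(0, 2 * 60 * 60)
```

In this model a two-hour connection yields four records that all share the same source port, so a single long-lived session can inflate the flow count for 33709 without any new connections being made.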

Figure 4: The intersection of ports from flows containing port 33709

Figure 3: Hosts selecting 33709 as a destination port

It seems that two hosts used 33709 as a destination port. Here (Figure 3) we see a non-192.168 IP, which is a host outside of the campus network. That is not too interesting on its own, as it makes sense if an ephemeral port was used during a connection to the outside host. I then visualized the intersection of source and destination ports, that is, the destination ports used in flows where 33709 was the source (Figure 4). This revealed destination port 53 (DNS) along with two ephemeral ports of no particular significance. Now this is interesting, as the intersection implies that where port 33709 was used as a source, {53, 51457, 56466} were used as destinations. This is only allowed (via the campus firewall) for internal connections.
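The port intersection above is just a filter over flow records. Here is a minimal Python sketch of that step, followed by the lookup of which hosts sit behind a given port pair. The record layout and function names are hypothetical, and 203.0.113.5 is a documentation-range placeholder for the external host, not its real address.

```python
def destination_ports_for_source(flows, src_port):
    """Distinct destination ports seen in flows that use `src_port` as source."""
    return sorted({f["dst_port"] for f in flows if f["src_port"] == src_port})

def endpoints_for_port_pair(flows, src_port, dst_port):
    """Distinct (source IP, destination IP) pairs for one port combination."""
    return sorted({
        (f["src_ip"], f["dst_ip"])
        for f in flows
        if f["src_port"] == src_port and f["dst_port"] == dst_port
    })

# Hypothetical flow records modeled on the ports discussed above.
sample_flows = [
    {"src_ip": "192.168.200.100", "src_port": 33709, "dst_ip": "192.168.1.21", "dst_port": 53},
    {"src_ip": "192.168.200.100", "src_port": 33709, "dst_ip": "203.0.113.5", "dst_port": 51457},
    {"src_ip": "192.168.200.100", "src_port": 33709, "dst_ip": "203.0.113.5", "dst_port": 56466},
    {"src_ip": "192.168.200.100", "src_port": 139, "dst_ip": "192.168.200.7", "dst_port": 49200},
]

paired_ports = destination_ports_for_source(sample_flows, 33709)
dns_endpoints = endpoints_for_port_pair(sample_flows, 33709, 53)
```

Running the first function narrows the search to a handful of port pairs, and the second turns each pair back into hosts, which is how a report over millions of flows shrinks to a few addresses worth a closer look.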

Following this discovery, I tried to match these port permutations to hosts. I knew that either the source or destination had to match on 192.168.200.100 and 33709. I can show that 192.168.1.21 is a DNS server via Figure 5.

Figure 5: Flows using source 53 and destination 33709

This reveals two pieces of information: the remaining connections to the outside host revealed in Figure 3 are suspicious, and 192.168.200.100 is most likely a Linux host. The external host is being connected to over high-numbered ports {51457, 56466}. It may be the case that the external host is listening on these ports and the Linux OS chose 33709 as an ephemeral port. It would make much more sense if the opposite were true and Mr. 200.100 was acting as a server listening on port 33709 with 67.81.x.x as its sole client.

But how is a host outside of the campus network, which is allowed no incoming non-established connections, connecting to Mr. 200.100 on port 33709? The only way this could be possible is if 192.168.200.100 is initiating the connection. The next bit of information I uncovered revealed how:

Figure 6: Hosts connecting to 192.168.200.100

This statistical report (Figure 6) lists all flows where 192.168.200.100 was a destination. The report reveals quite a bit about what the host had been doing. Most obviously, it was running Fedora Linux. This confirms the initial assumption, based on the ephemeral port choice during the DNS query, that the host in question was running Linux. The second piece of information is that it was making many requests to an Optimum Online subscriber (67.81.x.x). Third, the host may have had Hamachi installed. Hamachi is a lightweight VPN capable of creating a virtual LAN across multiple NAT-firewalled LANs [4]. It seems like something was using the non-human file server to tunnel IP traffic, perhaps a certain Optimum Online subscriber? I ran a geo-IP lookup on the Optimum user and it turns out they live somewhere near the university.

Conclusions

We did not detect any threats to our volunteer live network via the exploits our group was monitoring. However, I observed interesting behavior from a non-human host. After further investigation of the host (192.168.200.100) I confirmed a Hamachi VPN and a zombie SSH connection. The host was being used as a tunnel to connect the LAN of the file server, the university network, to the LAN of a residential Optimum Online subscriber.

Obviously this conclusion could have been reached by generating the report shown in Figure 6 directly. However, the important part of forensics is that you do not know what you're looking for. There were too many local hosts to run reports on each, and doing so would typically reveal millions of flows. Running reports for low-flow, high-time hosts might also have revealed 192.168.200.100, but most of the time hosts on the network are DHCPing, and although they are alive for only a short time, their flow-time ratio is still high. Coming full circle from identifying anomalous activity to arriving at understandable and explainable results was what made the report successful.

Taking this class gave me a good taste of some network forensic investigation techniques. Although hours were spent looking for unknown data, a lot of good came from it. I also have a final report which I intend to turn into a blog post. With that I have a few helpful scripts to visualize flow data in Excel (at the time my statistics package for MATLAB was expired).