TCP Connection Analysis with Tycho

August 28 2020

by Sebastian Manns

Network analysis is an important and interesting part of malware analysis. Malware very often communicates with so-called command-and-control servers, from which it receives instructions, exchanges keys, or loads new functionality in the form of payloads. If you want to analyze unknown malware, a good first step is to find out whether it connects to a server.

In this blog article I will show you how to quickly and easily create a small network analysis tool for TCP connections with Tycho. The goal is to detect when a process connects to a server, find out the address of the server, and report what data is exchanged.

What is Required

For illustrative purposes it is sufficient to work with a standalone configuration. That means I use a local server on the same computer as the client to check the functionality of the tool.

We need:

  • A working Tycho setup as described here
  • A client and a server, to establish a connection and transmit data. We will intercept this connection in order to analyze it.

Netcat, a simple cross-platform tool, serves as both client and server. With netcat you can easily establish a local TCP connection and transfer data. The client is the process that we are going to analyze with Tycho.

All right, let’s go!

We will start by refreshing some TCP socket basics. Then we do our connection analysis in two parts: part one identifies the destination address and port of a new client connection, and part two finds out what content the client sends.

Socket Basics

Windows uses sockets for communication over a network. Both the client and the server must have created a socket. Using the IP address and the port of the server, the client can connect to the server. After the connection is established, data can be transmitted in both directions.

For a better understanding, I will briefly walk through snippets from a simple C program that show how a client initializes a socket and transmits data.

The snippets are from Windows Dev Center and the complete code can be found here.

First, the data has to be initialized: the message and the buffer size are defined, and the socket is initialized.

The next step is to create a socket for communication.

After the process has created a socket, it uses the API function connect to connect it to the remote side.

Next, the API function send is used to send the previously defined message to the server.

Finally, the client receives the response from the server and the connection is terminated.
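
The same sequence of steps can be sketched in Python. This is not the Winsock C code referred to above, just a minimal, self-contained illustration of the flow (create, connect, send, receive, close). The echo server, the message text, and the function names are assumptions made for this example.

```python
import socket
import threading


def run_echo_server(listener):
    """Stand-in server: accept one connection and echo back what arrives."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(conn.recv(512))


def client_roundtrip(host, port, message, bufsize=512):
    """Client side: create a socket, connect, send, receive, close."""
    # Creating the socket corresponds to the socket() step.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((host, port))   # connect to the remote side
        sock.sendall(message)        # send the prepared message
        reply = sock.recv(bufsize)   # receive the response
    # Leaving the with-block closes the socket and ends the connection.
    return reply


# Local server on the same machine, as in the standalone setup of this article.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen(1)
server_port = listener.getsockname()[1]

server = threading.Thread(target=run_echo_server, args=(listener,))
server.start()
reply = client_roundtrip("127.0.0.1", server_port, b"this is a test")
server.join()
listener.close()
print(reply)
```

The connect and send steps are the ones that become visible as system calls in the two parts that follow.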

Part One: Get Remote Address and Port

One way to continue would be to set breakpoints on the API functions, but that would be rather complicated. With Tycho there is a simpler way. API calls use system calls internally so that the operating system can talk to the actual hardware, so we can simply use the system call tracking function to get the important information. As demonstrated in other blog entries, analyzing system calls is one of Tycho’s strengths.

We build on this Tycho functionality and intercept NtDeviceIoControlFile, which is documented here, to get the IP address and port of the server as well as the transmitted data. NtDeviceIoControlFile is not only used for network connections but also for many other things, e.g. to find out the size of the hard disk, as demonstrated in the blog article Reverse Engineering with Tycho.

The task for which NtDeviceIoControlFile was called can be determined from the IOCTL code. The IOCTL code is an input parameter of this system call, so we can readily retrieve it with Tycho. To find out which IOCTLs are used for network operations, we can look at the Dr. Memory project, which contains a rich collection of open source system call analysis data that we can use. Of particular interest is IOCTL 0x12007, which stands for AFD_CONNECT.

The acronym AFD stands for Ancillary Function Driver. This driver handles the communication with the low-level functions of tcpip.sys, the primary driver for managing network connectivity.

This means that when the system call NtDeviceIoControlFile is called successfully with this IOCTL code, a connection to a server is established.

Now I will show code snippets that demonstrate how to use Tycho to read the server address and the transmitted data, using only information from NtDeviceIoControlFile.

Important: the following code is not complete. It lacks the initialization steps for Tycho as well as exception handling, and it is intended only as an illustrative example for this blog post. The complete script can be found as a Tycho recipe in the tycho-recipes repository.

First we set a system call breakpoint on NtDeviceIoControlFile with the IOCTL for AFD_CONNECT. With that, we already know when the client establishes a connection.

The next task is to find out the IP address and port of the server. The GitHub repository of the Dr. Memory project again tells us that, in this case, the input buffer of NtDeviceIoControlFile is an AFD_CONNECT_INFO structure, which is described here.

This structure is defined as follows:

typedef struct  _AFD_CONNECT_INFO {
    BOOLEAN           UseSAN;
    ULONG             Root;
    ULONG             Unknown;
    SOCKADDR          RemoteAddress;
} AFD_CONNECT_INFO , *PAFD_CONNECT_INFO ;

Looking at this structure, RemoteAddress is clearly the interesting field, so we have to find out the layout behind SOCKADDR. For IPv4 connections this is the sockaddr_in layout, which is available directly from Windows Dev Center and looks like this:

typedef struct sockaddr_in {
  short          sin_family;
  USHORT         sin_port;
  IN_ADDR        sin_addr;
  CHAR           sin_zero[8];
} SOCKADDR_IN, *PSOCKADDR_IN;

Now we have found the information we were looking for and we can extend the code example from above:

Summary Part One

The data structure AFD_CONNECT_INFO can be found in the input buffer of NtDeviceIoControlFile. There, we have to look at offset 24, where RemoteAddress starts. It is important to remember that we are dealing with network data, meaning the port and address we read are big-endian. You can see this in the struct.unpack calls: normally the first character of the format string is a <, which indicates that the data is little-endian, but here it is a !, which means that the data is read in network (big-endian) byte order. For more information about the struct.unpack function, refer to the Python documentation.

Counted from the start of RemoteAddress, the port (sin_port, 2 bytes long) is at offset 2 and the IP address (sin_addr, 4 bytes long) is at offset 4.
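
The offsets summarized above can be tried out in plain Python without Tycho. The following is a minimal sketch that parses a raw input buffer; the function name and the hand-built sample buffer are assumptions made for illustration, and offset 24 is simply the value stated in this article.

```python
import ipaddress
import struct

# Offset of RemoteAddress inside the input buffer, as stated in the article.
REMOTE_ADDRESS_OFFSET = 24


def parse_afd_connect_info(buf):
    """Return (family, ip, port) from a raw AFD_CONNECT_INFO input buffer."""
    base = REMOTE_ADDRESS_OFFSET
    # sin_family is stored in host byte order (little-endian on x86/x64).
    (family,) = struct.unpack_from("<h", buf, base)
    # sin_port (offset 2) and sin_addr (offset 4) are in network byte order,
    # hence the "!" prefix instead of "<".
    port, addr = struct.unpack_from("!HI", buf, base + 2)
    return family, str(ipaddress.IPv4Address(addr)), port


# Hand-built sample: 24 bytes of header, then a sockaddr_in for 127.0.0.1:4444.
sample = (
    bytes(24)
    + struct.pack("<h", 2)                        # sin_family = AF_INET (2)
    + struct.pack("!H", 4444)                     # sin_port
    + ipaddress.IPv4Address("127.0.0.1").packed   # sin_addr
    + bytes(8)                                    # sin_zero
)
print(parse_afd_connect_info(sample))  # (2, '127.0.0.1', 4444)
```

In the real script the buffer would come from the intercepted system call instead of being built by hand.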

So part one is done. Tycho allows us to find out with very little effort to which server a process has connected.

Part Two: Get Transmit Data

What you might be interested in next is the data being sent to the server.

Finding the point when a transmission starts is very simple, as the NtDeviceIoControlFile system call is used again. This time the IOCTL code is 0x1201F, which stands for AFD_SEND and is responsible for sending data to the server. You can again check the definition of AFD_SEND in the Dr. Memory project.
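
Since both events arrive through the same system call, a script has to tell them apart by IOCTL code. A minimal sketch of such a lookup, using only the two codes named in this article, could look like this; the names classify_ioctl and IOCTL_NAMES are made up for the example and are not part of Tycho.

```python
# IOCTL codes named in this article.
AFD_CONNECT = 0x12007
AFD_SEND = 0x1201F

IOCTL_NAMES = {
    AFD_CONNECT: "AFD_CONNECT",
    AFD_SEND: "AFD_SEND",
}


def classify_ioctl(ioctl_code):
    """Return the AFD operation name, or None for codes we do not monitor."""
    return IOCTL_NAMES.get(ioctl_code)


# The last value stands for any unrelated IOCTL code.
for code in (0x12007, 0x1201F, 0x70000):
    print(hex(code), classify_ioctl(code))
```

Everything that maps to None can be ignored, which keeps the unrelated uses of NtDeviceIoControlFile out of the log.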

This time the input buffer of NtDeviceIoControlFile has a different layout, as it contains an AFD_SEND_INFO struct. You can find a matching definition of this structure in the Dr. Memory project source.

It looks like this:

typedef struct  _AFD_SEND_INFO {
  PAFD_WSABUF       BufferArray;
  ULONG             BufferCount;
  ULONG             AfdFlags;
  ULONG             TdiFlags;
} AFD_SEND_INFO , *PAFD_SEND_INFO ;

At this point we need to take a closer look at PAFD_WSABUF, because the structure it points to contains the number of transmitted bytes and a pointer to the data in memory. The documentation can be found here.

The WSABUF struct is quite simple:

typedef struct _WSABUF {
  ULONG len;
  CHAR  *buf;
} WSABUF, *LPWSABUF;

With this information, we can extend the script to get the transmitted data. First we get the input buffer (AFD_SEND_INFO):

Then we read 8 bytes at offset 0 of AFD_SEND_INFO to get the PAFD_WSABUF pointer.

Finally, we only have to read the size of the transmitted data at offset 0 of the WSABUF structure and, at offset 8, the pointer to where the data is located.

Summary Part Two

The input buffer is an AFD_SEND_INFO structure this time. At offset 0 it contains a pointer to the WSABUF structure, in which we find the number of bytes sent at offset 0 and the pointer to those bytes at offset 8.
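
The pointer chain described above can be sketched in plain Python. This is not the Tycho script: the read_memory callable stands in for whatever mechanism reads target memory, and FakeMemory, the addresses, and the payload are all invented for the example. The sketch assumes a 64-bit target with little-endian pointers and reads only the first WSABUF entry.

```python
import struct


def read_send_payload(input_buffer, read_memory):
    """Follow AFD_SEND_INFO -> WSABUF -> data and return the sent bytes."""
    # Offset 0 of AFD_SEND_INFO: 8-byte pointer to the WSABUF array.
    (wsabuf_ptr,) = struct.unpack_from("<Q", input_buffer, 0)
    wsabuf = read_memory(wsabuf_ptr, 16)
    # WSABUF: length at offset 0 (4 bytes), data pointer at offset 8 (8 bytes).
    (length,) = struct.unpack_from("<I", wsabuf, 0)
    (data_ptr,) = struct.unpack_from("<Q", wsabuf, 8)
    return read_memory(data_ptr, length)


class FakeMemory:
    """Toy address space used only to exercise the parser."""

    def __init__(self):
        self.regions = {}

    def write(self, address, data):
        self.regions[address] = data

    def read(self, address, size):
        for base, data in self.regions.items():
            if base <= address and address + size <= base + len(data):
                start = address - base
                return data[start:start + size]
        raise ValueError("unmapped address: %#x" % address)


payload = b"hello server"
memory = FakeMemory()
memory.write(0x2000, payload)                                    # the data
memory.write(0x1000, struct.pack("<I4xQ", len(payload), 0x2000)) # the WSABUF
input_buffer = struct.pack("<QIII", 0x1000, 1, 0, 0)             # AFD_SEND_INFO

print(read_send_payload(input_buffer, memory.read))  # b'hello server'
```

Passing the memory reader in as a parameter keeps the parsing logic independent of how the target memory is actually accessed.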

That is all there is to it: we have now created a network monitoring script for a process.

Demo

Here is a short demo on YouTube that shows the script in action. It shows a typical Tycho setup, with the analyst’s system (left), on which the script is running, and the target system (right), where the monitored processes run.

Summary

With Tycho it is very easy to analyze TCP connections. To get information such as the remote address, port, and transmitted data, you don’t need to deal with different API calls in a complicated way; you only need one system call, NtDeviceIoControlFile. To understand the system call semantics, we studied structure definitions that are openly available in the Microsoft docs and in open source projects. In the end, we only had to turn the collected knowledge into a Python script. And because the Pytycho library abstracts away all the complicated operating system details for us, this was a very simple task.

But this is only the beginning. With the knowledge gained and this script as a basis, other functions could be implemented. For example, it would be possible to actively manipulate the transmitted data or to integrate block- and allow-lists for defined addresses/ports to log specific information.
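
As a rough idea of what the list-based extension could look like, here is a small sketch that classifies a connection by address and port. It is only one possible design, not part of the script from this article; the function name, the list contents, and the three result labels are assumptions.

```python
import ipaddress


def classify_connection(ip, port, allow_networks, block_networks, block_ports):
    """Return 'blocked', 'allowed', or 'unknown' for a remote endpoint."""
    addr = ipaddress.ip_address(ip)
    # Block rules win over allow rules.
    if port in block_ports or any(addr in net for net in block_networks):
        return "blocked"
    if any(addr in net for net in allow_networks):
        return "allowed"
    return "unknown"


# Hypothetical lists: loopback is allowed, a documentation range is blocked.
allow = [ipaddress.ip_network("127.0.0.0/8")]
block = [ipaddress.ip_network("203.0.113.0/24")]
block_ports = {4444}

print(classify_connection("127.0.0.1", 8080, allow, block, block_ports))
print(classify_connection("203.0.113.7", 443, allow, block, block_ports))
print(classify_connection("198.51.100.1", 443, allow, block, block_ports))
```

The result could then decide whether a connection is logged in detail, logged briefly, or ignored.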

