File tracking with Tycho

2019-08-20

by Sebastian Manns

Before diving deep into the analysis of unknown malware, some basic knowledge about its behavior is required. As a starting point, it is useful to observe the files the malware touches and changes. Tycho can help to automate the observation of file creation and modification, giving the malware analyst a good overview of its behavior. In this blog entry, I will show you how to build a file tracker with Tycho.

Preparation

How is a file opened, created, read, written or deleted on Windows?

Windows uses seven system calls for these tasks.

  1. NtOpenFile
  2. NtCreateFile
  3. NtWriteFile
  4. NtReadFile
  5. NtClose
  6. NtDeleteFile
  7. NtSetInformationFile

Whenever a program touches a file, it first calls NtOpenFile to open an existing file or NtCreateFile to create a new one. A return value of 0 indicates that the system call succeeded. In that case, a new file handle has been stored in the FileHandle output parameter. From then on, the file handle serves as the reference to the opened file. Once the handle exists, NtReadFile or NtWriteFile is called with it to read or write the file respectively. NtSetInformationFile also takes the handle and can be used to delete the file. In the last step, the system call NtClose releases the file handle. NtDeleteFile is a legacy system call that uses the ObjectAttributes to delete a file forcefully without requiring a file handle.
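The roles described above can be summarized in a small lookup. This is only an illustrative sketch: the set names and the handle_role() helper are my own and are not part of Tycho or Windows.

```python
# Hypothetical grouping of the seven system calls by how they relate to a
# file handle. The set names are illustrative and not part of any API.
PRODUCES_HANDLE = {"NtOpenFile", "NtCreateFile"}
CONSUMES_HANDLE = {"NtReadFile", "NtWriteFile", "NtSetInformationFile", "NtClose"}
NEEDS_NO_HANDLE = {"NtDeleteFile"}


def handle_role(syscall_name):
    """Return how the given system call relates to a file handle."""
    if syscall_name in PRODUCES_HANDLE:
        return "produces"
    if syscall_name in CONSUMES_HANDLE:
        return "consumes"
    if syscall_name in NEEDS_NO_HANDLE:
        return "none"
    return "untracked"
```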

What is required:

For this task, we need a typical Tycho setup and a program whose file activity we want to observe, e.g. Pafish. It helps to have read my previous blog entry, which explains how to read information from a system call with Tycho.

Tycho implementation

The following sections explain what a Tycho script looks like that tracks the file interaction of a Windows process. Selected system call handling functions are explained in detail.

File handle management

In the first section, I described how Windows modifies files with system calls. To allow the file tracker to track files, we have to associate files with file handles. To maintain these associations, we can use a Python dictionary that maps file handles to file system paths. This dictionary is called handle_dict and is passed as a parameter to the handle_* functions that are described in the next section. To retrieve the file handle and the file path, we use Tycho's system call interpretation. For NtOpenFile, the cropped interpretation output looks like this:

{'after': {'FileHandle': {'type': 'HANDLE*', 'value': 0x8F258}},
 'before': {'ObjectAttributes': {'tracked': {'type': 'OBJECT_ATTRIBUTES',
                                             'value': {'ObjectName': {'tracked': {'type': 'UNICODE_STRING',
                                                                                  'value': {'content': u'\\??\\C:\\Windows'}},
                                                                      'type': 'UNICODE_STRING*',
                                                                      'value': 0x8F1C8}}},
                                 'type': 'OBJECT_ATTRIBUTES*',
                                 'value': 586216}},
 'name': 'NtOpenFile',
 'return_value': 0L}

In the example, the after parameter contains a pointer to the FileHandle, and the before parameter contains the file path under the key content.
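To make the nesting easier to follow, here is a small sketch that rebuilds the cropped output from above as a plain Python dictionary and walks to the two values of interest. The variable names interpretation, handle_pointer and object_name are chosen only for illustration.

```python
# The cropped NtOpenFile interpretation from above, rebuilt as a plain dict.
interpretation = {
    "after": {"FileHandle": {"type": "HANDLE*", "value": 0x8F258}},
    "before": {
        "ObjectAttributes": {
            "tracked": {
                "type": "OBJECT_ATTRIBUTES",
                "value": {
                    "ObjectName": {
                        "tracked": {
                            "type": "UNICODE_STRING",
                            "value": {"content": "\\??\\C:\\Windows"},
                        },
                        "type": "UNICODE_STRING*",
                        "value": 0x8F1C8,
                    }
                },
            },
            "type": "OBJECT_ATTRIBUTES*",
            "value": 586216,
        }
    },
    "name": "NtOpenFile",
    "return_value": 0,
}

# 'after' only holds the address at which the new handle was stored.
handle_pointer = interpretation["after"]["FileHandle"]["value"]

# 'before' holds the already decoded path under the key 'content'.
object_name = interpretation["before"]["ObjectAttributes"]["tracked"]["value"]["ObjectName"]
file_path = object_name["tracked"]["value"]["content"]
```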

System call handling

Each selected system call is evaluated by its own handle_* function, where * stands for the name of the system call. Overall, these functions are very similar: they take the same parameters, update handle_dict accordingly and print all relevant data. Because NtCreateFile and NtOpenFile produce a new file handle, they return it in an output parameter. All other file-related system calls need a file handle to operate on and therefore receive it as an input parameter. The handler functions must extract the file handle accordingly, depending on the kind of system call.
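How the handle_* functions are invoked is not shown in this post. One possible arrangement, sketched here under my own assumptions, is a dispatch table keyed by the system call name. HANDLERS and dispatch() are hypothetical names, and the two handlers are stubs that stand in for the real functions.

```python
def handle_nt_open_file(sys_call_param, process, ts, hd):
    """Stub standing in for the real handler."""
    return "open"


def handle_nt_write_file(sys_call_param, process, ts, hd):
    """Stub standing in for the real handler."""
    return "write"


# Hypothetical dispatch table: system call name -> handler function.
HANDLERS = {
    "NtOpenFile": handle_nt_open_file,
    "NtWriteFile": handle_nt_write_file,
}


def dispatch(sys_call_param, process, ts, hd):
    """Call the handler registered for this system call, if there is one."""
    handler = HANDLERS.get(sys_call_param["name"])
    if handler is None:
        return None  # this system call is not tracked
    return handler(sys_call_param, process, ts, hd)
```

Because every handler shares the same four parameters, adding support for another system call only requires one more entry in the table.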

The following code snippet shows the implementation of the handle_nt_open_file() function.

import struct


def handle_nt_open_file(sys_call_param, process, ts, hd):
    """ Check if the system call was executed successfully. Save the name and
    the handle given by the system call in the handle_dict dictionary. Print
    the handle, the time when the file was opened and the path to the file.

    Keyword arguments:
    sys_call_param : The return value from handle_syscall_event with
                     in / out_parameters, return_value and name
    process        : tycho process
    ts             : timestamp
    hd             : handle_dict with key: handle value: file path
    """

    name = sys_call_param["name"]
    before = sys_call_param["before"]
    after = sys_call_param["after"]
    s_return = sys_call_param["return_value"]

    if name == "NtOpenFile" and s_return == 0:

        # 'after' only contains the address of the handle, so the value
        # itself has to be read from the memory of the target process.
        h_address = after["FileHandle"]["value"]
        new_handle = struct.unpack("<Q", process.read_linear(h_address, 8))[0]
        file_path = before["ObjectAttributes"]["tracked"]["value"]\
            ["ObjectName"]["tracked"]["value"]["content"]
        hd[new_handle] = file_path
        print_event(name, ts, new_handle, hd[new_handle])

The system call interpretation provides us with the address of the file handle. We use Tycho's process.read_linear() function to extract the actual value from the memory of our target process.

h_address = after["FileHandle"]["value"]
new_handle = struct.unpack("<Q", process.read_linear(h_address, 8))[0]
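The unpacking step can be tried without a Tycho setup. The sketch below uses FakeProcess, a hypothetical stand-in that I made up for illustration. It assumes that read_linear() returns the raw bytes at the given address, which is what the use with struct.unpack() above suggests.

```python
import struct


class FakeProcess:
    """Hypothetical stand-in for a Tycho process, backed by a bytes buffer."""

    def __init__(self, base, memory):
        self.base = base
        self.memory = memory

    def read_linear(self, address, size):
        # Return `size` bytes starting at the given linear address.
        offset = address - self.base
        return self.memory[offset:offset + size]


# Pretend the kernel stored the handle value 176 at address 0x8F258.
process = FakeProcess(0x8F258, struct.pack("<Q", 176))

h_address = 0x8F258
# "<Q" means little-endian unsigned 64-bit integer, i.e. 8 bytes.
new_handle = struct.unpack("<Q", process.read_linear(h_address, 8))[0]
```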

The file path itself is automatically provided by the system call interpretation and can be retrieved the following way.

file_path = before["ObjectAttributes"]["tracked"]["value"]["ObjectName"]["tracked"]["value"]["content"]

Finally, we add the new key (new_handle) / value (file_path) pair to the handle_dict and print the output.

hd[new_handle] = file_path
print_event(name, ts, new_handle, hd[new_handle])
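The print_event() helper is not listed in this post. A minimal sketch that reproduces the column layout of the listing in the Result section could look as follows. It assumes that ts is a datetime object, which may differ from the timestamp type Tycho actually provides, and format_event() is a hypothetical helper added to keep the formatting testable.

```python
from datetime import datetime


def format_event(name, ts, handle, file_path):
    """Build one output line: time, handle, system call name and file path."""
    return "[{:%H:%M:%S}]: handle: {:>6} {:<20} : {}".format(
        ts, handle, name, file_path)


def print_event(name, ts, handle, file_path):
    """Print one tracked file event."""
    print(format_event(name, ts, handle, file_path))


# Only the time of day is used; the date is arbitrary.
line = format_event("NtCreateFile", datetime(2019, 8, 20, 11, 16, 26),
                    176, "pafish.log")
```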

As another example, here is the implementation of handle_nt_write_file().

def handle_nt_write_file(sys_call_param, process, ts, hd):
    """ Check if the system call was executed successfully and if the handle
    is present in handle_dict. Print the handle, the time when the file
    was written and the path to the file.

    Keyword arguments:
    sys_call_param : The return value from handle_syscall_event with
                     in / out_parameters, return_value and name
    process        : tycho process
    ts             : timestamp
    hd             : handle_dict with key: handle value: file path
    """

    name = sys_call_param["name"]
    before = sys_call_param["before"]
    s_return = sys_call_param["return_value"]

    if name == "NtWriteFile" and s_return == 0:
        # The handle is an input parameter, so its value is available directly.
        file_handle = before["FileHandle"]["value"]
        if file_handle in hd:
            print_event(name, ts, file_handle, hd[file_handle])

Because NtWriteFile receives the file handle as an input parameter, we can directly retrieve its value from the system call interpretation.

Next, we test whether this file_handle already exists in handle_dict. If it does, the file_handle and the corresponding file path are printed.

For the other system calls, the behavior of the handle functions is similar.
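As an illustration of such a similar handler, here is a sketch of what handle_nt_close() might look like. It is not the original implementation. It assumes that the interpretation exposes the handle under the key Handle, following the parameter name in the NtClose prototype, and it removes the entry from handle_dict because Windows reuses handle values. print_event() is replaced by a stub that records events so the example runs on its own.

```python
events = []


def print_event(name, ts, handle, file_path):
    """Stub that records events instead of printing them."""
    events.append((name, ts, handle, file_path))


def handle_nt_close(sys_call_param, process, ts, hd):
    """Report the close and drop the handle from handle_dict."""
    name = sys_call_param["name"]
    before = sys_call_param["before"]
    s_return = sys_call_param["return_value"]

    if name == "NtClose" and s_return == 0:
        # Assumption: the handle is found under the key "Handle".
        file_handle = before["Handle"]["value"]
        if file_handle in hd:
            print_event(name, ts, file_handle, hd[file_handle])
            # The handle value may be reused for a different file later.
            del hd[file_handle]


handle_dict = {176: "pafish.log"}
close_call = {"name": "NtClose",
              "before": {"Handle": {"value": 176}},
              "return_value": 0}
handle_nt_close(close_call, None, "11:16:27", handle_dict)
```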

Result

After a few minutes we get a result like this for Pafish:

[11:16:26]: handle:    176 NtCreateFile         : pafish.log
[11:16:27]: handle:    176 NtWriteFile          : pafish.log
[11:16:27]: handle:    176 NtClose              : pafish.log
[11:16:27]: handle:    176 NtCreateFile         : hi_CPU_VM_rdtsc_force_vm_exit
[11:16:27]: handle:    176 NtClose              : hi_CPU_VM_rdtsc_force_vm_exit
[11:16:29]: handle:    176 NtCreateFile         : pafish.log
[11:16:30]: handle:    176 NtWriteFile          : pafish.log
[11:16:30]: handle:    176 NtClose              : pafish.log
[11:16:30]: handle:    176 NtCreateFile         : hi_sandbox_mouse_act
[11:16:31]: handle:    176 NtClose              : hi_sandbox_mouse_act
[11:16:32]: handle:    208 NtOpenFile           : \??\C:\
[11:16:32]: handle:    208 NtClose              : \??\C:\
[11:16:33]: handle:    208 NtCreateFile         : hi_sandbox_drive_size2
[11:16:34]: handle:    208 NtClose              : hi_sandbox_drive_size2
[11:16:34]: handle:    208 NtCreateFile         : pafish.log
[11:16:35]: handle:    208 NtWriteFile          : pafish.log
[11:16:35]: handle:    208 NtClose              : pafish.log
[11:16:35]: handle:    208 NtCreateFile         : hi_sandbox_NumberOfProcessors_less_2_raw
[11:16:36]: handle:    208 NtClose              : hi_sandbox_NumberOfProcessors_less_2_raw
[11:16:36]: handle:    208 NtCreateFile         : pafish.log
[11:16:36]: handle:    208 NtWriteFile          : pafish.log
[11:16:37]: handle:    208 NtClose              : pafish.log
[11:16:37]: handle:    208 NtCreateFile         : hi_sandbox_NumberOfProcessors_less_2_GetSystemInfo
[11:16:37]: handle:    208 NtClose              : hi_sandbox_NumberOfProcessors_less_2_GetSystemInfo
[11:16:37]: handle:    212 NtCreateFile         : \??\Nsi
[11:16:38]: handle:    216 NtOpenFile           : \??\C:\Windows\SysWOW64\dhcpcsvc6.DLL
[11:16:39]: handle:    216 NtClose              : \??\C:\Windows\SysWOW64\dhcpcsvc6.DLL
[11:16:39]: handle:    264 NtOpenFile           : \??\C:\Windows\SysWOW64\dhcpcsvc.DLL
[11:16:39]: handle:    264 NtClose              : \??\C:\Windows\SysWOW64\dhcpcsvc.DLL
...

This output makes it easy to spot correlations or specific behavior, e.g.:

  • deletion of files with certain extensions
  • creation of deep file paths
  • newly created files
  • downloaded files

Summary

With small changes, you can easily extend the functionality of this file tracker, e.g.:

  • Output the content passed to NtWriteFile to read files that existed only for a short time.
  • Limit the analysis to specific folders or files.
  • Add system calls to see which registry changes have been made.
  • Add a function that creates a log file.
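As an example of the second extension, the following sketch filters events by path prefix. The is_watched() helper and the WATCHED list are hypothetical names. The comparison ignores case, on the assumption that the traced file system treats paths case-insensitively.

```python
def is_watched(file_path, watched_prefixes):
    """Return True if file_path starts with one of the watched prefixes."""
    normalized = file_path.lower()
    return any(normalized.startswith(prefix.lower())
               for prefix in watched_prefixes)


# Hypothetical configuration: only report files below SysWOW64.
WATCHED = ["\\??\\C:\\Windows\\SysWOW64\\"]
```

A handler could then call is_watched(file_path, WATCHED) before print_event() and skip all other paths.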
