Before diving deep into the analysis of unknown malware, some basic knowledge about its behavior is required. As a starting point, it is useful to observe the files the malware touches and changes. Tycho can help to automate the observation of file creation and modification, giving the malware analyst a good overview of its behavior. In this blog entry, I will show you how to build a file tracker with Tycho.
Preparation
How is a file opened, created, read, written or deleted on Windows?
Windows uses seven system calls for these tasks. Whenever a program touches a file, NtOpenFile or NtCreateFile is used to open an existing file or to create a new one. A successful system call can be identified by a return value of 0. On success, a new file handle is returned through the FileHandle parameter. The file handle serves as a reference to the corresponding opened file whenever the file is used.
After the file handle is created, NtReadFile or NtWriteFile is called with the file's handle to read or write the file, respectively. NtSetInformationFile also uses the handle to delete a file. In the last step, the system call NtClose releases the file handle. NtDeleteFile is a legacy system call that uses the ObjectAttributes to delete a file forcefully without requiring a file handle.
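The roles of the seven system calls can be summarised in a small lookup table. This is only an illustrative sketch: the names SYSCALL_ROLES and handle_direction are hypothetical and not part of Tycho.

```python
# Hypothetical lookup table: which of the seven system calls produce a handle,
# which consume one, and which work without a handle at all.
SYSCALL_ROLES = {
    "NtCreateFile": "produces",
    "NtOpenFile": "produces",
    "NtReadFile": "consumes",
    "NtWriteFile": "consumes",
    "NtSetInformationFile": "consumes",
    "NtClose": "consumes",
    "NtDeleteFile": "none",
}


def handle_direction(syscall_name):
    """Return 'produces', 'consumes' or 'none'; None for calls we do not track."""
    return SYSCALL_ROLES.get(syscall_name)


print(handle_direction("NtOpenFile"))
print(handle_direction("NtDeleteFile"))
```

A table like this makes it explicit where a tracker has to look for the handle: in the output parameters for the producing calls and in the input parameters for the consuming ones.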
What is required:
For this task, we need a typical Tycho setup and a program whose file modifications we want to observe, e.g. Pafish. It helps to have read my previous blog entry to understand how to read information from a system call using Tycho.
Tycho implementation
The following sections explain what a Tycho script that tracks the file interactions of a Windows process looks like. The system call handling functions chosen as examples are explained in detail.
File handle management
In the first section, I described how Windows modifies files with system calls. To allow the file tracker to track files, we have to associate files with file handles. To maintain these associations, we can use a Python dictionary that maps file handles to file system paths. This dictionary is called handle_dict and is passed as a parameter to the handle_* functions described in the next section. To retrieve the file handle and the file path, we use Tycho’s system call interpretation.
For NtOpenFile, the cropped interpretation output looks like this:
{'after': {'FileHandle': {'type': 'HANDLE*', 'value': 0x8F258}},
 'before': {'ObjectAttributes': {'tracked': {'type': 'OBJECT_ATTRIBUTES',
                                             'value': {'ObjectName': {'tracked': {'type': 'UNICODE_STRING',
                                                                                  'value': {'content': u'\\??\\C:\\Windows'}},
                                                                      'type': 'UNICODE_STRING*',
                                                                      'value': 0x8F1C8}}},
                                 'type': 'OBJECT_ATTRIBUTES*',
                                 'value': 586216i}},
 'name': 'NtOpenFile',
 'return_value': 0L}
In the example, you can see that the after parameter contains a pointer to the FileHandle, and the before parameter holds the file path under the key content.
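To make the nesting easier to follow, the two lookups can be wrapped in small helpers. This is a minimal sketch operating on a plain dictionary shaped like the cropped output above. The helper names extract_handle_pointer and extract_file_path are hypothetical, and the sample uses plain integers instead of the original literals.

```python
def extract_handle_pointer(interpretation):
    """Return the address at which the new file handle is stored."""
    return interpretation["after"]["FileHandle"]["value"]


def extract_file_path(interpretation):
    """Return the path from the tracked OBJECT_ATTRIBUTES structure."""
    attributes = interpretation["before"]["ObjectAttributes"]["tracked"]["value"]
    return attributes["ObjectName"]["tracked"]["value"]["content"]


# Sample dictionary mirroring the cropped NtOpenFile interpretation.
sample = {
    "after": {"FileHandle": {"type": "HANDLE*", "value": 0x8F258}},
    "before": {"ObjectAttributes": {
        "tracked": {"type": "OBJECT_ATTRIBUTES",
                    "value": {"ObjectName": {
                        "tracked": {"type": "UNICODE_STRING",
                                    "value": {"content": "\\??\\C:\\Windows"}},
                        "type": "UNICODE_STRING*",
                        "value": 0x8F1C8}}},
        "type": "OBJECT_ATTRIBUTES*",
        "value": 586216}},
    "name": "NtOpenFile",
    "return_value": 0,
}

print(hex(extract_handle_pointer(sample)))
print(extract_file_path(sample))
```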
System call handling
Selected system calls are evaluated individually by a dedicated handle_* function, where * refers to the name of the system call. Overall, these functions are very similar: they take the same parameters, update handle_dict accordingly and print all relevant data. Because NtCreateFile and NtOpenFile produce a new file handle, they have it as an output parameter. All other file-related system calls require a file handle to operate and therefore receive it as an input parameter. The system call handler functions need to extract the file handle correctly depending on the kind of system call.
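Since all handlers share the same parameters, one way to wire them together is a dictionary from system call name to handler function. The following is only a sketch of that idea: dispatch and record_call are hypothetical names, and how the events actually reach such a function in a Tycho script is not covered here.

```python
def dispatch(sys_call_param, process, ts, hd, handlers):
    """Call the handler registered for this system call, if any.

    Returns True when a handler ran and False when the call is not tracked.
    """
    handler = handlers.get(sys_call_param["name"])
    if handler is None:
        return False
    handler(sys_call_param, process, ts, hd)
    return True


# Stand-in handler used only to demonstrate the dispatch; a real script would
# register handle_nt_open_file, handle_nt_write_file and so on instead.
seen = []


def record_call(sys_call_param, process, ts, hd):
    seen.append(sys_call_param["name"])


handlers = {"NtOpenFile": record_call, "NtWriteFile": record_call}
```

Keeping the mapping in one place means that supporting a further system call only requires writing its handler and adding one dictionary entry.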
The following code snippet shows the implementation of the handle_nt_open_file() function.
import struct

def handle_nt_open_file(sys_call_param, process, ts, hd):
    """ Check if the system call was executed successfully. Save the name and
    the handle given by the system call in the handle_dict dictionary. Print
    the handle, the time when the file was opened and the path to the file.
    Keyword arguments:
    process : tycho process
    sys_call_param : The return value from handle_syscall_event with
                     in / out_parameters, return_value and name
    ts : timestamp
    hd : handle_dict with key: handle value: file path
    """
    name = sys_call_param["name"]
    before = sys_call_param["before"]
    after = sys_call_param["after"]
    s_return = sys_call_param["return_value"]
    if name == "NtOpenFile" and s_return == 0:
        # The interpretation yields a pointer; read the 8-byte handle it points to.
        h_address = after["FileHandle"]["value"]
        new_handle = struct.unpack("<Q", process.read_linear(h_address, 8))[0]
        file_path = before["ObjectAttributes"]["tracked"]["value"]\
            ["ObjectName"]["tracked"]["value"]["content"]
        # Remember which path this handle refers to.
        hd[new_handle] = file_path
        print_event(name, ts, new_handle, hd[new_handle])
The system call interpretation provides us with the address of the file handle. We use Tycho's process.read_linear() function to extract the actual value from the memory of our target process.
h_address = after["FileHandle"]["value"]
new_handle = struct.unpack("<Q", process.read_linear(h_address, 8))[0]
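The decoding step can be tried without a Tycho setup. In the sketch below, FakeProcess is a hypothetical stand-in that only mimics the read_linear(address, size) call as it is used above. It is not Tycho's process class. The format string "<Q" tells struct to interpret the 8 bytes as one little-endian unsigned 64-bit integer.

```python
import struct


class FakeProcess:
    """Hypothetical stand-in for a Tycho process; serves bytes from a dict."""

    def __init__(self, memory):
        self.memory = memory  # maps an address to the bytes stored there

    def read_linear(self, address, size):
        return self.memory[address][:size]


def read_handle(process, h_address):
    """Read 8 bytes at h_address and decode them as a little-endian uint64."""
    return struct.unpack("<Q", process.read_linear(h_address, 8))[0]


# Place the handle value 176 at the address from the example output.
process = FakeProcess({0x8F258: struct.pack("<Q", 176)})
print(read_handle(process, 0x8F258))
```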
The file path itself is automatically provided by the system call interpretation and can be retrieved the following way.
file_path = before["ObjectAttributes"]["tracked"]["value"]["ObjectName"]["tracked"]["value"]["content"]
Finally, we add the new key (new_handle) / value (file_path) pair to the handle_dict and print the output.
hd[new_handle] = file_path
print_event(name, ts, new_handle, hd[new_handle])
As another example, here is the implementation of handle_nt_write_file().
def handle_nt_write_file(sys_call_param, process, ts, hd):
    """ Check if the system call was executed successfully and if the handle
    is present in handle_dict. Print the handle, the time when the file
    was written and the path to the file.
    Keyword arguments:
    process : tycho process
    sys_call_param : The return value from handle_syscall_event with
                     in / out_parameters, return_value and name
    ts : timestamp
    hd : handle_dict with key: handle value: file path
    """
    name = sys_call_param["name"]
    before = sys_call_param["before"]
    s_return = sys_call_param["return_value"]
    if name == "NtWriteFile" and s_return == 0:
        # The handle is an input parameter, so its value is available directly.
        file_handle = before["FileHandle"]["value"]
        if file_handle in hd:
            print_event(name, ts, file_handle, hd[file_handle])
Because NtWriteFile receives the file handle as an input parameter, we can retrieve its value directly from the system call interpretation. We then test whether this file_handle already exists in handle_dict. If it is present, the file_handle and the corresponding file path are printed.
For the other system calls, the behavior of the handle functions is similar.
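As one possible shape for such a handler, an NtClose variant might look like the sketch below. It rests on two assumptions: that the interpretation exposes the argument under the key Handle (the parameter name in the NtClose prototype), and that forgetting the handle on close is desired, since Windows may reuse a handle value for a different file. The print_event function here is a stand-in written for this example, because the original helper's body is not shown.

```python
def print_event(name, ts, handle, path):
    """Stand-in for the print_event helper, whose real body is not shown."""
    print("[{}]: handle: {} {} : {}".format(ts, handle, name, path))


def handle_nt_close(sys_call_param, process, ts, hd):
    """Report the close of a tracked handle and forget the handle afterwards."""
    if sys_call_param["name"] != "NtClose" or sys_call_param["return_value"] != 0:
        return
    # Assumption: the interpretation exposes NtClose's argument as "Handle".
    file_handle = sys_call_param["before"]["Handle"]["value"]
    if file_handle in hd:
        print_event("NtClose", ts, file_handle, hd[file_handle])
        # Drop the entry: the same number may later refer to another file.
        del hd[file_handle]


handle_dict = {176: "pafish.log"}
close_event = {"name": "NtClose", "return_value": 0,
               "before": {"Handle": {"type": "HANDLE", "value": 176}},
               "after": {}}
handle_nt_close(close_event, None, "11:16:27", handle_dict)
```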
Result
After a few minutes we get a result like this for Pafish:
[11:16:26]: handle: 176 NtCreateFile : pafish.log
[11:16:27]: handle: 176 NtWriteFile : pafish.log
[11:16:27]: handle: 176 NtClose : pafish.log
[11:16:27]: handle: 176 NtCreateFile : hi_CPU_VM_rdtsc_force_vm_exit
[11:16:27]: handle: 176 NtClose : hi_CPU_VM_rdtsc_force_vm_exit
[11:16:29]: handle: 176 NtCreateFile : pafish.log
[11:16:30]: handle: 176 NtWriteFile : pafish.log
[11:16:30]: handle: 176 NtClose : pafish.log
[11:16:30]: handle: 176 NtCreateFile : hi_sandbox_mouse_act
[11:16:31]: handle: 176 NtClose : hi_sandbox_mouse_act
[11:16:32]: handle: 208 NtOpenFile : \??\C:\
[11:16:32]: handle: 208 NtClose : \??\C:\
[11:16:33]: handle: 208 NtCreateFile : hi_sandbox_drive_size2
[11:16:34]: handle: 208 NtClose : hi_sandbox_drive_size2
[11:16:34]: handle: 208 NtCreateFile : pafish.log
[11:16:35]: handle: 208 NtWriteFile : pafish.log
[11:16:35]: handle: 208 NtClose : pafish.log
[11:16:35]: handle: 208 NtCreateFile : hi_sandbox_NumberOfProcessors_less_2_raw
[11:16:36]: handle: 208 NtClose : hi_sandbox_NumberOfProcessors_less_2_raw
[11:16:36]: handle: 208 NtCreateFile : pafish.log
[11:16:36]: handle: 208 NtWriteFile : pafish.log
[11:16:37]: handle: 208 NtClose : pafish.log
[11:16:37]: handle: 208 NtCreateFile : hi_sandbox_NumberOfProcessors_less_2_GetSystemInfo
[11:16:37]: handle: 208 NtClose : hi_sandbox_NumberOfProcessors_less_2_GetSystemInfo
[11:16:37]: handle: 212 NtCreateFile : \??\Nsi
[11:16:38]: handle: 216 NtOpenFile : \??\C:\Windows\SysWOW64\dhcpcsvc6.DLL
[11:16:39]: handle: 216 NtClose : \??\C:\Windows\SysWOW64\dhcpcsvc6.DLL
[11:16:39]: handle: 264 NtOpenFile : \??\C:\Windows\SysWOW64\dhcpcsvc.DLL
[11:16:39]: handle: 264 NtClose : \??\C:\Windows\SysWOW64\dhcpcsvc.DLL
...
This is an easy way to spot correlations or specific behavior, e.g.:
- deletion of files with certain extensions
- creation of deep file paths
- newly created files
- downloaded files
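Checks like these can be run on the tracker's output after the fact. The following sketch parses lines in the format shown above and collects every path passed to NtCreateFile. The regular expression and the function name created_files are assumptions made for this example. Note that NtCreateFile can also open an existing file, so the result lists candidates rather than files that were certainly new.

```python
import re

# Matches lines such as "[11:16:26]: handle: 176 NtCreateFile : pafish.log".
LINE_RE = re.compile(r"^\[(?P<time>[\d:]+)\]: handle: (?P<handle>\d+) "
                     r"(?P<syscall>\w+) : (?P<path>.*)$")


def created_files(lines):
    """Return paths passed to NtCreateFile, in first-seen order, no duplicates."""
    seen = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("syscall") == "NtCreateFile":
            path = match.group("path")
            if path not in seen:
                seen.append(path)
    return seen


log = [
    "[11:16:26]: handle: 176 NtCreateFile : pafish.log",
    "[11:16:27]: handle: 176 NtWriteFile : pafish.log",
    "[11:16:27]: handle: 176 NtCreateFile : hi_CPU_VM_rdtsc_force_vm_exit",
    "[11:16:29]: handle: 176 NtCreateFile : pafish.log",
    "[11:16:32]: handle: 208 NtOpenFile : \\??\\C:\\",
]
print(created_files(log))
```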
Summary
By applying small changes, you can easily extend the functionality of this file tracker, e.g.:
- Output the content of NtWriteFile calls and read files that only existed for a short time.
- Limit the analysis to specific folders or files.
- Add system calls to see which registry changes have been made.
- Add a function to create a log file.
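Limiting the analysis to specific folders, for instance, could be done with a simple prefix check before a path is stored in handle_dict. This is a minimal sketch: is_tracked_path and WATCHED are hypothetical names, and the comparison is case-insensitive on the assumption that the paths refer to an ordinary Windows volume.

```python
def is_tracked_path(file_path, prefixes):
    """Return True if file_path starts with one of the prefixes (ignoring case)."""
    lowered = file_path.lower()
    return any(lowered.startswith(prefix.lower()) for prefix in prefixes)


# Example: only follow files below SysWOW64, written as they appear in the output.
WATCHED = ["\\??\\C:\\Windows\\SysWOW64\\"]

print(is_tracked_path("\\??\\C:\\Windows\\SysWOW64\\dhcpcsvc.DLL", WATCHED))
print(is_tracked_path("pafish.log", WATCHED))
```

A handler would call such a check right after extracting file_path and skip the dictionary update when it returns False, so that later calls on the same handle are ignored as well.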