In the last article, we showed how to use the Tycho Python API to interrupt a process running in an unpatched Windows system on top of the Cyberus virtualization platform before it executes specific system calls. This time, we implement a short but useful script that logs which files a process of our choice accesses.
Prerequisites
We need the same setup as in the last article, "Windows system call parameter analysis":
- The typical Cyberus Setup. If you are not familiar with it, please have a look at our blog article Fun with Python and Tycho.
- An executable to analyse. Let's use pafish again.
List of accessed files and devices
In the previous blog post, we used the NtCreateFile system call to show how easy it is to extract Windows system call arguments from a running process.
This time, we take a closer look at individual arguments in order to extract the actual paths of the files that are opened for reading or writing.
Doing this for every NtCreateFile system call lets us collect a list of all files and devices accessed while our executable pafish.exe runs.
Please be aware that files can also be mapped into memory using different system calls. For the sake of simplicity, we concentrate on NtCreateFile in this blog post.
Before we know what exactly our script needs to do, we need to research Windows data structures a bit:
NtCreateFile data structure research
In order to understand how to get file/device paths out of calls to NtCreateFile, we need to have a look at its signature.
It is documented on MSDN:
NTSTATUS NtCreateFile(
    _Out_ PHANDLE FileHandle,
    _In_ ACCESS_MASK DesiredAccess,
    _In_ POBJECT_ATTRIBUTES ObjectAttributes,
    _Out_ PIO_STATUS_BLOCK IoStatusBlock,
    _In_opt_ PLARGE_INTEGER AllocationSize,
    _In_ ULONG FileAttributes,
    _In_ ULONG ShareAccess,
    _In_ ULONG CreateDisposition,
    _In_ ULONG CreateOptions,
    _In_ PVOID EaBuffer,
    _In_ ULONG EaLength
);
While studying the types of the input parameters, we find that file and device names are stored in the OBJECT_ATTRIBUTES structure (POBJECT_ATTRIBUTES is a type alias for a pointer to an OBJECT_ATTRIBUTES value).
It is also documented on MSDN:
typedef struct _OBJECT_ATTRIBUTES {
    ULONG Length;
    HANDLE RootDirectory;
    PUNICODE_STRING ObjectName;
    ULONG Attributes;
    PVOID SecurityDescriptor;
    PVOID SecurityQualityOfService;
} OBJECT_ATTRIBUTES, *POBJECT_ATTRIBUTES;
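To make the layout more tangible, here is a minimal sketch that unpacks an OBJECT_ATTRIBUTES value from raw bytes with Python's struct module. It assumes the 64-bit layout (8-byte handles and pointers, with alignment padding after each ULONG); the helper name parse_object_attributes and the sample pointer value are made up for illustration and are not part of the Tycho API.

```python
import struct
from collections import namedtuple

# Assumption: 64-bit guest layout. "4x" marks the alignment padding that
# follows each 4-byte ULONG before the next 8-byte member.
OBJECT_ATTRIBUTES_FMT = "<I4xQQI4xQQ"

ObjectAttributes = namedtuple("ObjectAttributes", [
    "Length", "RootDirectory", "ObjectName",
    "Attributes", "SecurityDescriptor", "SecurityQualityOfService",
])


def parse_object_attributes(raw):
    """Unpack the raw bytes of one OBJECT_ATTRIBUTES structure."""
    return ObjectAttributes(*struct.unpack(OBJECT_ATTRIBUTES_FMT, raw))


size = struct.calcsize(OBJECT_ATTRIBUTES_FMT)
# Hypothetical sample: ObjectName points at 0x7FFE1000,
# Attributes is OBJ_CASE_INSENSITIVE (0x40).
sample = struct.pack(OBJECT_ATTRIBUTES_FMT, size, 0, 0x7FFE1000, 0x40, 0, 0)
attrs = parse_object_attributes(sample)
print(size, hex(attrs.ObjectName))
```

The ObjectName member is the one that matters for the rest of this article, since it points to the structure that holds the path.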
We quickly find that the ObjectName member, a pointer to a UNICODE_STRING, leads us to the file/device path, which is stored as a wide string in guest memory.
In contrast to structures, strings have a dynamic size, so they have to be handled differently.
Therefore, our custom handler needs to know and respect the string size while extracting it from memory.
Luckily, the MSDN documentation tells us that a UNICODE_STRING structure contains both a pointer to the wide string and its actual length in bytes:
typedef struct _LSA_UNICODE_STRING {
    USHORT Length;
    USHORT MaximumLength;
    PWSTR Buffer;
} LSA_UNICODE_STRING, *PLSA_UNICODE_STRING, UNICODE_STRING, *PUNICODE_STRING;
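The two-step read this implies (first the fixed-size header, then exactly Length bytes from Buffer) can be sketched without a guest system. The example below again assumes the 64-bit layout; FakeMemory, build_demo_memory and read_unicode_string are hypothetical helpers that only simulate guest memory and do not belong to the Tycho API.

```python
import struct

# Assumption: 64-bit layout. Two USHORTs, 4 bytes of alignment padding,
# then the 8-byte Buffer pointer.
UNICODE_STRING_FMT = "<HH4xQ"


class FakeMemory(object):
    """Hypothetical stand-in for guest memory: one flat region at a base address."""

    def __init__(self, base, data):
        self.base = base
        self.data = bytes(data)

    def read(self, address, size):
        offset = address - self.base
        if offset < 0 or offset + size > len(self.data):
            raise ValueError("read outside of the fake memory region")
        return self.data[offset:offset + size]


def read_unicode_string(memory, address):
    """Read the header first, then exactly Length bytes from Buffer."""
    raw = memory.read(address, struct.calcsize(UNICODE_STRING_FMT))
    length, _maximum_length, buffer_ptr = struct.unpack(UNICODE_STRING_FMT, raw)
    # Length counts bytes, not characters, and excludes any terminator.
    return memory.read(buffer_ptr, length).decode("utf-16-le")


def build_demo_memory(base, text):
    """Place a UNICODE_STRING header directly in front of its character data."""
    encoded = text.encode("utf-16-le")
    header_size = struct.calcsize(UNICODE_STRING_FMT)
    header = struct.pack(UNICODE_STRING_FMT, len(encoded),
                         len(encoded) + 2, base + header_size)
    return FakeMemory(base, header + encoded + b"\x00\x00")


demo = build_demo_memory(0x1000, u"\\??\\PhysicalDrive0")
print(read_unicode_string(demo, 0x1000))
```

Reading Length bytes rather than scanning for a terminator matters because the buffer of a UNICODE_STRING is not guaranteed to be null-terminated.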
Implementation
Now, let's implement what we learned. The first imports are similar to the previous post so we will skip their explanation:
import time
import pprint
from pyTycho.syscall_interpreter import interpret_execute_syscall
from pyTycho import tycho
Instead of running a few lines of Python code in the interactive Python shell as in the previous articles, we are writing a standalone script this time. The whole script is accessible here on GitHub.
Additionally, we need some other features from the pyTycho.syscall_interpreter module, which we need for carving additional types and values out of system calls.
The information obtained this way can be post-processed by registering our own custom handler functions for the system call interpretation phase.
from pyTycho.syscall_interpreter import pointer_tracking_enabled_by_type
from pyTycho.syscall_interpreter import type_specific_handlers
The first few lines are largely the same as in the last article.
We obtain the service object that allows us to talk to the Tycho service, then we open a process handle to pafish.exe before it is actually started.
That process handle allows us to interrupt pafish (or any other app whose executable name we put into the variable file_name) at the moment it tries to execute its first instruction.
Then we print a wait message until we see our app being scheduled:
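# Name of the executable we want to observe
file_name = "pafish.exe"

service = tycho()
process_handle = service.open_process(file_name)
process_handle.set_break_on_start(True)
# Poll once per second until the process shows up
while not process_handle.is_running():
    time.sleep(1)
    print("{} is currently not running".format(file_name))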
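The loop above waits forever if the executable is never started. As a small sketch of one possible refinement, the polling can be wrapped in a helper with a timeout. The function wait_until and its parameters are hypothetical and not part of pyTycho; the sleep and clock arguments exist only so that the helper can be tested without real delays.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.time):
    """Poll predicate() until it returns True or timeout seconds have passed.

    Returns True if the predicate became true, False on timeout.
    """
    deadline = clock() + timeout
    while not predicate():
        if clock() >= deadline:
            return False
        sleep(interval)
    return True


# Stand-in predicate; in the script this would be process_handle.is_running
print(wait_until(lambda: True))
```

In the script, the call would then read wait_until(process_handle.is_running), with the caller deciding what to do when it returns False.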
System call interpretation handler
The Cyberus system call interpretation engine can automatically extract pointer target values for system call arguments from the guest system.
In order to make use of this feature, we need to enable system call breakpointing for the NtCreateFile system call and add the pointer type POBJECT_ATTRIBUTES to the pointer tracking list:
breakpoint = process_handle.get_syscall_breakpoint()
breakpoint.add_syscall_whitelist(tycho.syscalls.NtCreateFile)
breakpoint.add_syscall_whitelist(tycho.syscalls.NtTerminateProcess)
breakpoint.enable()
pointer_tracking_enabled_by_type.append("POBJECT_ATTRIBUTES")
We also added NtTerminateProcess to the breakpoint whitelist. The explanation for this follows later.
In the second step, we want to carve only the file name string.
This means that we need to access the UNICODE_STRING member of the OBJECT_ATTRIBUTES structure that the input arguments point to.
At this point we need to write our own custom handler function, which we then register as a callback in the system call interpretation library. The library assumes the following signature:
def our_callback(process_handle, argument_representation):
    # ...
    pass
The first parameter is the Tycho process handle.
The second parameter is a tuple that contains the type name of the parameter (in our case UNICODE_STRING) as well as a Python dictionary with the carved member values of the type instance:
(
    UNICODE_STRING, {
        "Length" : (ULONG, <length of string>),
        "MaximumLength" : (ULONG, <maximum length>),
        "Buffer" : (PVOID, <pointer to buffer>),
    }
)
Our callback function does not need to return anything.
The implementation shall just extract the file/device path and add it to a global list:
list_of_files = []

def extract_string(process, object_representation):
    typ, value = object_representation
    # Only handle UNICODE_STRING instances whose members were carved
    if typ == "UNICODE_STRING" and "Buffer" in value and "Length" in value:
        _, length = value["Length"]    # string length in bytes
        _, address = value["Buffer"]   # address of the wide string in guest memory
        filename = process.read_linear(address, length)
        # Windows wide strings are UTF-16LE without a byte order mark
        list_of_files.append(filename.decode("utf-16-le"))
First, we check whether we got the right type.
Then we obtain both the pointer to the wide string and its length.
Finally, we use that information to read the path out of guest memory with the process.read_linear function.
Since this is a wide string, we need to decode it before appending it to our global file list.
Having the handler implemented, we can now register it in the system call interpretation library:
type_specific_handlers.append(("UNICODE_STRING", extract_string))
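A handler of this shape can be tried out without a running guest. The sketch below is only an illustration of the callback pattern: FakeProcess, handlers and dispatch are hypothetical stand-ins, and nothing here claims to mirror how pyTycho dispatches handlers internally. The representation tuple follows the format described above.

```python
collected = []


def extract_string(process, object_representation):
    """Same shape as the handler in this article: decode one UNICODE_STRING."""
    typ, value = object_representation
    if typ == "UNICODE_STRING" and "Buffer" in value and "Length" in value:
        _, length = value["Length"]
        _, address = value["Buffer"]
        raw = process.read_linear(address, length)
        collected.append(raw.decode("utf-16-le"))


class FakeProcess(object):
    """Hypothetical stand-in for a Tycho process handle, backed by a dict."""

    def __init__(self, regions):
        self.regions = regions  # maps start address -> bytes

    def read_linear(self, address, length):
        return self.regions[address][:length]


# Hypothetical stand-in for the type_specific_handlers list
handlers = [("UNICODE_STRING", extract_string)]


def dispatch(process, representation):
    """Call every handler registered for the representation's type name."""
    type_name = representation[0]
    for registered_type, handler in handlers:
        if registered_type == type_name:
            handler(process, representation)


path = u"\\??\\C:\\Windows\\SysWOW64\\rsaenh.dll".encode("utf-16-le")
process = FakeProcess({0x2000: path + b"\x00\x00"})
dispatch(process, ("UNICODE_STRING", {
    "Length": ("ULONG", len(path)),
    "MaximumLength": ("ULONG", len(path) + 2),
    "Buffer": ("PVOID", 0x2000),
}))
dispatch(process, ("ULONG", 42))  # ignored: no handler registered for ULONG
print(collected)
```

Keeping the handler free of any direct dependency on the service object is what makes this kind of offline check possible.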
At this point, we have performed all necessary preparation.
We can now write a loop that extracts path information out of every NtCreateFile system call:
while True:
    process_handle.resume()
    thread_handle = process_handle.wait_for_breakpoint()
    syscall = interpret_execute_syscall(process_handle, thread_handle)
    if syscall["num"] == tycho.syscalls.NtTerminateProcess:
        break

pprint.pprint(list_of_files)
A look at the abort condition reveals why we also whitelisted the NtTerminateProcess system call: we use it to detect the termination of pafish.
After pafish has terminated, we print the list of file and device paths.
Running the whole script shows that some files were accessed more than once:
[u'pafish.log',
u'pafish.log',
u'pafish.log',
u'pafish.log',
u'hi_CPU_VM_rdtsc_force_vm_exit',
u'pafish.log',
u'hi_sandbox_mouse_act',
u'\\??\\PhysicalDrive0',
u'\\??\\Nsi',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\??\\VBoxMiniRdrDN',
u'\\??\\pipe\\VBoxMiniRdDN',
u'\\??\\VBoxTrayIPC',
u'\\??\\pipe\\VBoxTrayIPC',
u'\\??\\C:\\Windows\\SysWOW64\\de-DE\\MPR.DLL.mui',
u'\\??\\C:\\Windows\\Globalization\\Sorting\\sortdefault.nls',
u'\\??\\C:\\Windows\\SysWOW64\\rsaenh.dll',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\DEVICE\\NETBT_TCPIP_{D0A4D4B8-574B-4FC2-939C-13AE21F36507}',
u'\\DEVICE\\NETBT_TCPIP_{846EE342-7039-11DE-9D20-806E6F6E6963}',
u'\\DEVICE\\NETBT_TCPIP_{D535A6F8-90AF-4DC7-B511-4DD493AD6F6F}',
u'\\??\\HGFS',
u'\\??\\vmci',
u'pafish.log']
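Since many paths repeat, a short post-processing step makes the result easier to read. The following sketch counts how often each path occurs using collections.Counter; the function name summarize_paths and the shortened sample list are illustrative only, and in the script the input would be list_of_files.

```python
from collections import Counter


def summarize_paths(paths):
    """Return (path, count) pairs, most frequently accessed first."""
    return Counter(paths).most_common()


# Shortened, illustrative sample; the script would pass list_of_files instead
sample = [u"pafish.log", u"\\??\\Nsi", u"pafish.log",
          u"\\??\\HGFS", u"pafish.log"]

for path, count in summarize_paths(sample):
    print(u"{:>3}  {}".format(count, path))
```

Comparing paths exactly as logged keeps the sketch simple; whether differently cased spellings of the same object should be merged is a separate decision.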
Conclusion
The Tycho Python API is capable of extracting system call information from a running process. While doing that, it can also extract additional data using user-provided callback functions.
Based on such insights, users can implement their own high-level breakpoints, e.g. ones that stop a process when it...
- accesses certain paths
- reads/writes specific files
- attempts to communicate with (specific) hosts from the network (for example by interpreting NtDeviceIoControlFile system calls)
- ...
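As a rough sketch of the first idea, the decision whether a carved path should stop the process can be a plain predicate over a watchlist. The function matches_watchlist and the patterns are hypothetical; where such a check would be called inside the breakpoint loop is left open here.

```python
import fnmatch


def matches_watchlist(path, patterns):
    """Return True if path matches any shell-style pattern, ignoring case."""
    lowered = path.lower()
    # fnmatch has no escape character, so backslashes are matched literally
    return any(fnmatch.fnmatchcase(lowered, pattern.lower())
               for pattern in patterns)


# Hypothetical watchlist: raw disk access and VirtualBox-related objects
watchlist = [u"\\??\\PhysicalDrive*", u"*\\VBox*"]

for candidate in [u"\\??\\PhysicalDrive0",
                  u"\\??\\pipe\\VBoxTrayIPC",
                  u"pafish.log"]:
    print(candidate, matches_watchlist(candidate, watchlist))
```

Note that "?" is a single-character wildcard in these patterns, so the "\??\" prefix also matches any two characters in that position, and "[" starts a character class.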
Of course it is also possible to not only log system call parameters, but also to fake the return values or even deflect the whole call.