ptrace

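On *nix systems, ptrace() offers a unique capability: it lets one process inspect and control the execution of another, including reading and writing its registers and memory and intercepting its signals. A tracing relationship is established either between a process and a child it created with fork(), or by attaching to a process that is already running. The call is used mostly to build debuggers and process-tracing tools.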

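There is not much online documentation on how to use ptrace(), perhaps because it is one of the more awkward system calls on Unix-like systems (it is not actually specified by POSIX). If you have never used it before, you are in for a "memorable" experience. The man page is not bad, but it is short on detail.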

The prototype from the documentation:

#include <sys/ptrace.h>
 
long ptrace (enum __ptrace_request request,
             pid_t pid,
             void *addr,
             void *data);

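Parameters:

enum __ptrace_request request: the ptrace operation to perform
pid_t pid: the PID of the target process
void *addr: the address in the target's memory that some operations read from or write to
void *data: the data, or a pointer to a buffer for the data, that some operations read or write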

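ptrace returns a long. For most requests, 0 means success and -1 means failure, with errno set to the cause. For requests that read data, the return value is the data fetched from the target, so a return of -1 alone does not prove that an error occurred.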

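As you can see, this is not a straightforward or simple system call. How you use it depends on what you need, and the inputs and outputs have many special cases to consider. Let's start with how ptrace operations interact with a child task.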

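A traced child is in one of two basic states: stopped or running. ptrace operations cannot be performed on a running child, so one of the following must happen first:

The child stops on its own
The parent stops the child manually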

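Normally a process stops after receiving SIGSTOP (the 'T' state). A traced child, however, stops on receipt of any signal except SIGKILL, including signals it has chosen to ignore. Once wait() reports the stop, the parent has time to perform ptrace operations, and then uses ptrace to tell the child to continue, either delivering the signal that caused the stop or suppressing it.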

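If the parent wants the child to stop (for example, a debugger reacting to user input), it can send SIGSTOP in the usual way. Any otherwise unused signal except SIGKILL would also work, but it is best to avoid unusual signals, which only create ambiguity. It is important to make sure the child has actually stopped before performing an operation; otherwise ptrace returns -1 with errno set to ESRCH ("No such process").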

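To summarize, the child cycles through stopping, being ptrace()-ed, and running as follows:

1. The child is running
2. The child stops after receiving a signal (SIGSTOP, SIGTRAP, or another)
3. The parent learns of the stop through wait()
4. The parent performs its ptrace operations
5. The parent resumes the child with a ptrace request such as PTRACE_CONT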

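Any ptrace operation attempted outside step 4 will fail. Make sure you have been notified that the child is stopped before you try to use ptrace. As mentioned above, wait() is how you retrieve the child's state: just as with an ordinary forked process, the tracer calls wait() and decodes the returned status to determine what happened. In practice waitpid() is often simpler, because it lets you name exactly which task to wait for, which matters when several tasks or threads are being traced at once.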
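The stop/wait/ptrace/resume cycle described above can be sketched roughly as follows. This is a minimal illustration rather than a complete tracer: the helper name trace_one_stop and the exit code 7 are made up for the example, and the child stops itself with raise(SIGSTOP) simply to have a predictable stop to wait for.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs one stop/inspect/resume cycle on a forked
   child and returns the child's exit code, or -1 on error. */
int trace_one_stop(void)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced, then stop so the parent can inspect us. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);   /* step 2: stop on a signal */
        _exit(7);         /* reached only after the parent resumes us */
    }

    int status;

    /* Step 3: wait for this specific child to report the stop. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Step 4: the child is stopped, so ptrace requests are valid here. */

    /* Step 5: resume; NULL data means the SIGSTOP is not delivered. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        return -1;

    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling trace_one_stop() from a test program should give back the child's exit code once the whole cycle has completed.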

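Now let's look at some more interesting ptrace code. I'll give a short snippet for each operation. Any NULL argument is one that the operation does not use. First up are the requests that start and stop the tracing of a child.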

PTRACE_TRACEME

long ret = ptrace (PTRACE_TRACEME, 0, NULL, NULL);

This is the only ptrace operation used by the child. Its purpose is to indicate that the child task is to be traced by its parent and to grant the parent the necessary ptrace permissions. The pid argument is ignored for this request; the tracer is always the child's parent. As soon as the child calls any of the exec() functions, it receives a SIGTRAP and stops until the tracing parent allows it to continue. The parent must wait for this event before performing any ptrace operations, including configuration through PTRACE_SETOPTIONS.
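Here is a rough sketch of that handshake, assuming /bin/sh exists on the system. The helper name stop_signal_at_exec is invented for this example; it reports which signal stopped the child at exec() so the behaviour can be observed.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the signal that stopped the child when it
   called exec(), or -1 on error. */
int stop_signal_at_exec(void)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        /* Assumption: /bin/sh is available; any program would do. */
        execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
        _exit(127);       /* reached only if exec failed */
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    int sig = WSTOPSIG(status);

    /* The child is stopped at exec: this is the safe point for
       PTRACE_SETOPTIONS and any other request. */

    /* Let the program run to completion, resuming it after any later stop
       (signals are suppressed here to keep the sketch short). */
    ptrace(PTRACE_CONT, child, NULL, NULL);
    while (waitpid(child, &status, 0) != -1 && WIFSTOPPED(status))
        ptrace(PTRACE_CONT, child, NULL, NULL);

    return sig;
}
```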

PTRACE_ATTACH

long ret = ptrace (PTRACE_ATTACH, target_pid, NULL, NULL);

A task uses this request when it wants to trace another task that is already running. For most purposes, the process identified by target_pid then behaves like a child of the tracing task: the tracer is notified of its events and can wait() on it, although getppid() in the target still reports the original parent. By and large, the situation created by PTRACE_ATTACH is equivalent to what would have happened if the target had called PTRACE_TRACEME itself.

An important note: this operation sends a SIGSTOP to the target process, and the target has not necessarily stopped by the time the call returns. As usual, the parent must wait() on target_pid after this call, before doing any other work, to make sure the child has properly stopped.
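A small sketch of attaching, waiting for the stop, and detaching again. To keep it self-contained, the "already running" target is just a forked child sitting in pause(); attaching to an unrelated process works the same way but may be refused by the system's ptrace restrictions. The name attach_and_detach is hypothetical.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attaches to a running process, returns the signal
   that stopped it, then detaches. Returns -1 on error. */
int attach_and_detach(void)
{
    pid_t target = fork();
    if (target == -1)
        return -1;

    if (target == 0) {
        for (;;)
            pause();      /* stand-in for an already-running process */
    }

    int sig = -1;
    int status;

    if (ptrace(PTRACE_ATTACH, target, NULL, NULL) != -1) {
        /* The SIGSTOP has been sent, but the target may not have stopped
           yet; wait for the stop before doing anything else. */
        if (waitpid(target, &status, 0) != -1 && WIFSTOPPED(status))
            sig = WSTOPSIG(status);

        /* Other ptrace requests would go here. */

        ptrace(PTRACE_DETACH, target, NULL, NULL);
    }

    /* Clean up the stand-in process. */
    kill(target, SIGKILL);
    waitpid(target, NULL, 0);
    return sig;
}
```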

PTRACE_CONT

long ret = ptrace (PTRACE_CONT, target_pid, NULL, 0);

This is the request you will use each time wait() reports that the child has stopped on a signal and you want it running again. If the data field is anything other than zero or SIGSTOP, ptrace treats it as a signal number to deliver to the child. This is how a signal that stopped the child and notified the parent actually gets delivered to the child afterward. For common signals like SIGTRAP, you probably won't want to do this. However, if you'd like to see whether the child handles SIGUSR1 properly, this is one way to go about it.
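To make the SIGUSR1 case concrete, here is a sketch in which the child raises SIGUSR1 and the parent decides whether to pass it on. The names forward_sigusr1, on_usr1 and the exit code 42 are invented for the example.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;
}

/* Hypothetical helper: the child raises SIGUSR1 and exits with 42 if its
   handler ran, 0 otherwise. `deliver` selects whether the parent forwards
   the signal. Returns the child's exit code, or -1 on error. */
int forward_sigusr1(int deliver)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        signal(SIGUSR1, on_usr1);
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGUSR1);   /* the child stops here before the handler runs */
        _exit(got_usr1 ? 42 : 0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* The signal number goes in the data argument; zero suppresses it. */
    long sig = deliver ? WSTOPSIG(status) : 0;
    if (ptrace(PTRACE_CONT, child, NULL, (void *)sig) == -1)
        return -1;

    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With deliver set to 1 the handler should run in the child; with 0 the signal is swallowed by the tracer and the child never sees it.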

PTRACE_DETACH

long ret = ptrace (PTRACE_DETACH, child_pid, NULL, 0);

Ends the tracing relationship between the parent and the child and, if the parent had attached to the child, "re-parents" the child back to its original parent process. The child is then resumed, much as PTRACE_CONT would resume it.

Now that we've covered the basics of how to get tracing up and running, let's get to some of the more interesting stuff.

PTRACE_PEEKTEXT | PTRACE_PEEKDATA

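errno = 0;  /* clear errno first: -1 may be a legitimate word value */
long word = ptrace (PTRACE_PEEKDATA, child_pid, addr, NULL);
if (word == -1 && errno != 0)
  fail ();  /* a real error, not a stored value of -1 */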

On GNU/Linux systems, the text and data address spaces are shared, so these two request codes can be used interchangeably here; on other UNIX platforms that may not be the case. The purpose of this request is to read words from the child task's address space and inspect their values. I mentioned above that peek operations need a little extra care when detecting errors, and the snippet above outlines it briefly. ptrace returns -1 on error, but -1 may also be the value stored at the given address. errno must therefore be cleared before the call and checked afterward to confirm that an error really happened.

The utility of this request is obvious: reading values from memory addresses in another task's address space. In GDB, for instance, printing variables and setting breakpoints both rely on it.
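As a sketch of the errno dance, the helper below wraps a peek so that a stored value of -1 and a real failure can be told apart. All names here (sample_word, peek_word, demo_peek) are hypothetical. The demo relies on the fact that a forked child has a copy of the parent's globals at the same addresses, so the parent can use its own &sample_word as the address to read in the child.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative global, deliberately -1 to exercise the errno check. */
long sample_word = -1;

/* Hypothetical wrapper: stores the word at addr in *out and returns 0,
   or returns -1 if the peek really failed. */
int peek_word(pid_t pid, void *addr, long *out)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}

/* Forks a traced child, peeks one word from it, then cleans up. */
int demo_peek(void *addr, long *out)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);   /* stop so the parent may peek */
        _exit(0);
    }

    int status;
    int rc = -1;
    if (waitpid(child, &status, 0) != -1 && WIFSTOPPED(status))
        rc = peek_word(child, addr, out);

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return rc;
}
```

Peeking at &sample_word should succeed and yield -1, while peeking at an unmapped address such as NULL should be reported as a failure.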

PTRACE_POKETEXT | PTRACE_POKEDATA

long ret = ptrace (PTRACE_POKEDATA, child_pid, addr, new_val);

The poke requests do the opposite of the peek requests: they write arbitrary values into the memory space of the child task. This is useful if you'd like to examine how the child's behavior changes given different parameters, or for debugging tasks such as inserting breakpoints. This is turning into a pretty long post, but I can cover how to insert breakpoints into a child task's address space in a later blog post.
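A minimal sketch of changing a child's behavior with a poke: the child exits with whatever value sits in a global, and the parent overwrites that global while the child is stopped. The names exit_code_word and demo_poke are made up for the example, and the variable is a long so that it matches the word size that a poke writes.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative global; volatile so the child re-reads it after resuming. */
volatile long exit_code_word = 1;

/* Hypothetical helper: overwrites exit_code_word in the child with
   new_val and returns the child's exit code, or -1 on error. */
int demo_poke(long new_val)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);              /* stop so the parent may poke */
        _exit((int)exit_code_word);  /* sees the value the parent wrote */
    }

    int status;
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* For poke requests the data argument carries the word itself. */
    if (ptrace(PTRACE_POKEDATA, child, (void *)&exit_code_word,
               (void *)new_val) == -1)
        return -1;

    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        return -1;
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Only the child's copy of the variable changes; the parent's own exit_code_word is untouched.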

PTRACE_SINGLESTEP

long ret = ptrace (PTRACE_SINGLESTEP, child_pid, NULL, NULL);

The single-step request is effectively several operations batched into one. A PTRACE_SINGLESTEP request executes a single instruction in the child task, then stops the child and notifies the parent with a SIGTRAP. How the kernel arranges for exactly one instruction to run is architecture-specific: x86 uses the CPU's trap flag, while some other architectures set and remove a temporary breakpoint. This can be used to step slowly through a program's execution and to support the other ptrace operations above. Think "stepi" from GDB.
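A sketch of stepping a child a few instructions at a time. The helper name count_single_steps is hypothetical. The child spins in a busy loop after its initial stop so there is always another user-space instruction to execute; a child blocked in a system call would not return from a single step until that call completed.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: single-steps a child up to max_steps times and
   returns how many steps ended in a SIGTRAP stop, or -1 on error. */
int count_single_steps(int max_steps)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        for (;;) {
            /* busy loop: keeps executing user-space instructions */
        }
    }

    int status;
    int steps = 0;

    if (waitpid(child, &status, 0) != -1 && WIFSTOPPED(status)) {
        while (steps < max_steps) {
            /* Run exactly one instruction, then wait for the trap. */
            if (ptrace(PTRACE_SINGLESTEP, child, NULL, NULL) == -1)
                break;
            if (waitpid(child, &status, 0) == -1)
                break;
            if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
                break;
            steps++;
        }
    } else {
        steps = -1;
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return steps;
}
```

A real tool would read the registers between steps, for example to print the instruction pointer after each one.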

PTRACE_GETREGS | PTRACE_SETREGS

#include <sys/user.h>
struct user_regs_struct regs;
long ret = ptrace (PTRACE_GETREGS, child_pid, NULL, &regs);
#ifdef __x86_64__
regs.rip = 0xdeadbeef;
#elif defined __i386__
regs.eip = 0xdeadbeef;
#endif
ret = ptrace (PTRACE_SETREGS, child_pid, NULL, &regs);

These ptrace requests read and write the general-purpose register values of the child process. The above example does three things:

Reads the values of all general-purpose registers associated with child_pid
Sets the instruction pointer in the user_regs_struct structure to a not-so-random address
Writes the edited user_regs_struct back to the child, which will likely crash when it resumes because of the new instruction pointer

Similar functionality is available for the floating-point registers through PTRACE_GETFPREGS and PTRACE_SETFPREGS.
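For a gentler variation that does not crash the child, the sketch below reads the registers and writes them back unchanged, so the child should carry on normally. It is restricted to x86-64 because the field names in user_regs_struct differ between architectures; the helper name regs_round_trip and the return codes are invented for the example.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the child's exit code (5) if the register
   round trip worked, -1 on error, -2 on an unsupported architecture. */
int regs_round_trip(void)
{
#if defined(__x86_64__)
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(5);
    }

    int status;
    int rc = -1;
    struct user_regs_struct regs;

    if (waitpid(child, &status, 0) != -1 && WIFSTOPPED(status)
        /* Read every general-purpose register of the stopped child. */
        && ptrace(PTRACE_GETREGS, child, NULL, &regs) != -1
        /* A live process has a non-zero instruction and stack pointer. */
        && regs.rip != 0 && regs.rsp != 0
        /* Write the same values back; a debugger would edit them first. */
        && ptrace(PTRACE_SETREGS, child, NULL, &regs) != -1
        && ptrace(PTRACE_CONT, child, NULL, NULL) != -1
        && waitpid(child, &status, 0) != -1 && WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return rc;
#else
    return -2;   /* register layout differs on other architectures */
#endif
}
```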