So far, we've implemented reporting for
Python exceptions that bubble
out of the main loop,
C++ exceptions that bubble into Python (and then
out of the main loop), and
structured exceptions that bubble into
Python (and then out of the main loop). This is a fairly
comprehensive set of failure conditions, but there's still a big piece
missing from our reporting.
Imagine that you implement this error reporting and have customers try
the new version of your software. You'll soon have a collection of
crash reports, and one thing will stand out clearly. Without the
context in which crashes happened (call stacks, variable values,
perhaps log files), it's very hard to determine their cause(s). And
without determining their cause(s), it's very hard to fix them.
Reporting log files is easy enough. Just attach them to the error
report. You may need to deal with privacy concerns or limit the size
of the log files that get uploaded, but those are straightforward
problems.
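As a minimal sketch of the size limit, assuming the most recent log entries are the most useful ones, you could upload only the tail of the file. The function name `read_log_tail` and the 64 KB default are illustrative choices, not part of any existing reporting code:

```python
import os


def read_log_tail(path, max_bytes=64 * 1024):
    """Return at most the last max_bytes bytes of the file at path."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Start max_bytes before the end, or at the beginning for small files.
        f.seek(max(0, size - max_bytes))
        return f.read()
```

The result may begin in the middle of a line, and nothing here scrubs private data. Both would need handling before a real upload.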
Because Python has batteries included, grabbing the call stack from a
Python exception is trivial. Just take a quick look at the traceback
module.
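For instance, here is a small sketch that turns an exception into report text with `traceback.format_exception`. The names `format_crash_report` and `install_crash_reporter` are hypothetical, and writing to stderr stands in for whatever upload mechanism your reporter uses:

```python
import sys
import traceback


def format_crash_report(exc_type, exc_value, exc_tb):
    """Render an exception and its call stack as text for a crash report."""
    return "".join(traceback.format_exception(exc_type, exc_value, exc_tb))


def install_crash_reporter():
    """Route exceptions that escape the main loop through the formatter."""
    def hook(exc_type, exc_value, exc_tb):
        # Placeholder: a real reporter would attach this text to the upload.
        sys.stderr.write(format_crash_report(exc_type, exc_value, exc_tb))
    sys.excepthook = hook
```

Calling `install_crash_reporter()` once at startup would cover uncaught exceptions. Inside an `except` block, `format_crash_report(*sys.exc_info())` gives the same text for an exception you caught yourself.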
Structured exceptions are a little harder. The structure of a call
stack on x86 is machine- and sometimes compiler-dependent.
Fortunately, Microsoft provides an API to dump the relevant process
state to a file such that it can be opened in Visual Studio or WinDbg,
which will let you view the stack trace and select other data. These
files are called minidumps, and they're pretty small. Just call
MiniDumpWriteDump, passing the exception's EXCEPTION_POINTERS through
its ExceptionParam argument, and submit the generated file with your
crash report.
Grabbing a call stack from C++ exceptions is even harder, and maybe
not desired. If you regularly use C++ exceptions for communicating
errors from C++ to Python, it's probably too expensive to grab a call
stack or write a minidump every single time. However, if you want to
do it anyway, here's one way.
C++ exceptions are implemented on top of the Windows kernel's
structured exception machinery. Using the try and
catch statements in your C++ code causes the compiler to
generate SEH code behind the scenes. However, by the time your C++
catch clauses run, the stack has already been unwound.
Remember that SEH has three passes: first it runs filter expressions until it
finds one that can handle the exception; then it unwinds the stack
(destroying any objects allocated on the stack); finally it runs the
actual exception handler. Your C++ exception handler runs as the last stage,
which means the stack has already been unwound, which means you can't
get an accurate call stack from the exception handler. However, we
can use SEH to grab a call stack at the point where the exception was
thrown, before we handle it...
First, let's determine the SEH exception code of C++ exceptions
(WARNING, this code is compiler-dependent):
#include <windows.h>
#include <cstdio>
#include <exception>

int main() {
    DWORD code;
    __try {
        throw std::exception();
    }
    __except (code = GetExceptionCode(), EXCEPTION_EXECUTE_HANDLER) {
        // With MSVC this prints E06D7363.
        printf("%lX\n", code);
    }
}
Once we have that, we can write our exception-catching function like
this:
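#include <windows.h>
#include <cstdio>
#include <stdexcept>

// SEH code that MSVC uses for C++ exceptions, as printed by the program above.
const DWORD CPP_EXCEPTION_CODE = 0xE06D7363;

void throw_cpp_exception() {
    throw std::runtime_error("hi");
}

bool writeMiniDump(const EXCEPTION_POINTERS* ep) {
    // ...
    return true;
}

void catch_seh_exception() {
    __try {
        throw_cpp_exception();
    }
    // The filter runs before the stack is unwound, so the dump captures the
    // throw site. EXCEPTION_CONTINUE_SEARCH then lets the C++ catch in main()
    // handle the exception as usual.
    __except (
        (CPP_EXCEPTION_CODE == GetExceptionCode()) && writeMiniDump(GetExceptionInformation()),
        EXCEPTION_CONTINUE_SEARCH
    ) {
    }
}

int main() {
    try {
        catch_seh_exception();
    }
    catch (const std::exception& e) {
        printf("%s\n", e.what());
    }
}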
Now we've got call stacks and program state for C++, SEH, and Python
exceptions, which makes fixing reported crashes dramatically easier.
Next time I'll go into more detail about how C++ stack traces work,
and we'll see if we can grab them more efficiently.