Peter Fargas
Independent Research & Prototypisation
https://informatik-handwerk.de
Leipzig, Germany
Release date: July 2014

Thoughts on Stacktrace and other Process Introspection


When it comes to process introspection, that is, snapshots in time, I made the following observations:

Execution paths

By current industry standard, the complete execution path under introspection is divided into two parts (the second may be absent): the "normal" control-flow path and the exception return path. Currently, the return path usually has a minimal branching factor, branching by failure type if at all, but this is only due to a limited understanding of language features.

Both should be recorded: the first is traditionally called the stacktrace, the second is usually emulated via exception repacking. A common format for both (and, especially for exception repacking, one without repetition) is often needed so that logs can be cleaned up and analysed without further tools or reformatting.
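One way to picture such a common format is the following minimal Python sketch. It is an illustration under assumptions, not a reference implementation: the helper names (exception_chain, record_paths, parse, load) are hypothetical, frames are reduced to bare function names, and "without repetition" is interpreted as listing the frame that repacked an exception only once.

```python
import traceback


def exception_chain(exc):
    """Return the repacked exceptions, outermost first, root cause last."""
    chain = []
    while exc is not None and exc not in chain:
        chain.append(exc)
        exc = exc.__cause__ or exc.__context__
    return chain


def record_paths(exc):
    """Return (call_path, return_path) as two lists of plain strings."""
    chain = exception_chain(exc)
    call_path = []
    for link in chain:
        frames = traceback.extract_tb(link.__traceback__)
        for index, frame in enumerate(frames):
            # The function that repacked an exception opens the inner
            # traceback and closes the outer one; list it only once.
            if index == 0 and call_path and call_path[-1] == frame.name:
                continue
            call_path.append(frame.name)
    # The return path lists one entry per repacking step, root cause first.
    return_path = [f"{type(link).__name__}: {link}" for link in reversed(chain)]
    return call_path, return_path


# Hypothetical example code that repacks a low-level failure.
def parse(text):
    raise ValueError("bad input")


def load(text):
    try:
        return parse(text)
    except ValueError as err:
        raise RuntimeError("load failed") from err
```

Catching the RuntimeError raised by load("x") and passing it to record_paths yields a call path ending in load, parse and a return path of ValueError followed by RuntimeError, both in the same flat list-of-strings form.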

The coarseness of the paths is at the level of "variable space" (when thinking in blocks*[Java has finer scope]) or of "function calls" (when thinking in connectivity). A finer resolution could be "if branching" and "loop unfolding"; a coarser view could be on objects, packages/namespaces, or transactions. Observe that this resolution may depend on categorically different notions: ignoring or taking into account language features or constructs, the code/file structure, but the logical level can play a role as well. There is another dimension that should be noted: the recorded control-flow path forgets the depths it has already traversed (tasks completed). This is somewhat similar to overlooking branch decisions: if logging happens inside some sub-block, the location information reveals the sub-block; past it, the information ceases to be implicit, although this implicitness is of a somewhat vaguer kind than at function-call resolution.

Since the resolution needed often depends on the individual introspection case, there are various other tools for achieving the desired coarseness, which is not at all uniform throughout the inspected space. These facilities include code stepping, breakpoints, ad-hoc output to the console, and variable-space inspection. For automated logging, even without knowledge of the various logical viewpoints onto the code, logs can be simplified by very simple means: successively increasing the resolution towards the point under inspection.
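The idea of successively increasing resolution can be sketched as follows. This is only one possible reading: the function graded_trace, its parameters fine and medium, and the dotted frame names are assumptions made for illustration. Frames far from the point under inspection are reduced to their top-level package, nearer ones to their module, and only the nearest keep function name and line.

```python
def graded_trace(frames, fine=2, medium=2):
    """Render a call path with resolution growing towards its last frame.

    frames: list of (dotted_name, line) tuples, outermost call first.
    """
    rendered = []
    total = len(frames)
    for index, (name, line) in enumerate(frames):
        distance = total - 1 - index  # 0 for the frame under inspection
        parts = name.split(".")
        if distance < fine:
            label = f"{name}:{line}"               # full detail
        elif distance < fine + medium:
            label = ".".join(parts[:-1]) or name   # module level
        else:
            label = parts[0]                       # package level
        # Neighbouring frames that collapse to the same label are merged.
        if not rendered or rendered[-1] != label:
            rendered.append(label)
    return rendered


example = [
    ("app.main.run", 10),
    ("app.main.dispatch", 22),
    ("app.db.query.build", 5),
    ("app.db.query.escape", 40),
    ("app.db.driver.send", 7),
    ("app.db.driver.write", 91),
]
print(graded_trace(example))
# ['app', 'app.db.query', 'app.db.driver.send:7', 'app.db.driver.write:91']
```

Six frames shrink to four entries without any knowledge of the logical structure of the code; only the distance to the inspected point is used.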

Turning to a completely different paradigm: within an execution path there is data or information that participates in the control flow on different levels. Upon failures, depending on the data or control-flow disruptions, the execution path either keeps or loses its stability, anywhere from slowly degrading to suddenly affected. Often it is beneficial to continue an execution as far as possible. The use of notice, warning, and error logging levels went about halfway there: such logging is used as independent channels without binding to a specific cause. Traces of execution paths that collect such deterioration are valuable for speeding up debugging (especially the primary kind: dynamic insight into and correction of freshly created code) as well as for deeper insight into the propagation paths of tainted data, improving code stability, security insights, etc.
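A small Python sketch may make the difference between independent log channels and cause-bound deterioration traces more concrete. Everything named here (the class DegradationTrace, its methods note and chain, and the example functions parse_age and discount) is hypothetical and only illustrates one possible design: each event remembers where it was recorded and, optionally, which earlier event caused it, while execution continues.

```python
import inspect


class DegradationTrace:
    """Collects degradation events along one execution path."""

    def __init__(self):
        self.events = []

    def note(self, level, message, cause=None):
        """Record an event at the caller's location and return its id."""
        caller = inspect.stack()[1].function
        self.events.append(
            {"level": level, "where": caller, "message": message, "cause": cause}
        )
        return len(self.events) - 1

    def chain(self, event_id):
        """Follow cause links back to the event that started the degradation."""
        path = []
        while event_id is not None:
            event = self.events[event_id]
            path.append(f"{event['level']}@{event['where']}: {event['message']}")
            event_id = event["cause"]
        return list(reversed(path))


def parse_age(raw, trace):
    try:
        return int(raw), None
    except ValueError:
        # Continue with a fallback value instead of aborting, but remember why.
        return 0, trace.note("warning", f"unparsable age {raw!r}, using 0")


def discount(age, taint, trace):
    if taint is not None:
        # The result is computed from degraded data; bind it to its cause.
        trace.note("notice", "discount computed from fallback age", cause=taint)
    return 0.5 if age >= 65 else 0.0
```

After calling parse_age("abc", trace) and then discount with the returned values, trace.chain(1) lists the warning from parse_age followed by the notice from discount, so the propagation of the tainted value can be read directly from the trace.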

Export of variables

As mentioned, a certain variable space is associated with execution paths. This space (be it understood as a composite, as single variables, or as well-defined subparts of them) changes over time in various dimensions. The logical type as well as the language type can change, the value can change, and the structure can change.

Towards finer resolution, the structure is often of a more static nature; this reflects the notion that computers are especially well suited for mass operations on commonly structured space. *[It should be understood that either the space itself or the view of that space is what is so suitably common. I would also like to mention here that I seldom distinguish between arrays of different lengths: arrays are either empty or filled, and the filling has a common type.] The variable space at a certain moment thus rather strongly reflects the position inside the code. This leads to the following insight: between two points in time, the difference in structure can be computed relatively easily, reducing information overload. Searching for properties where data structure does not mimic code structure could be further helpful (the size of arrays, length of strings, and magnitude of numbers probably being among them).
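The structural difference between two points in time can be sketched in Python as follows. The function names shape_of and shape_diff and the path notation ("$", ".key", "[]") are assumptions chosen for this illustration. In line with the note above, all elements of an array share one path, so arrays of different lengths are not distinguished, and pure value changes are ignored.

```python
def shape_of(value, path="$", shape=None):
    """Map every structural path to the set of type names found there."""
    shape = {} if shape is None else shape
    shape.setdefault(path, set()).add(type(value).__name__)
    if isinstance(value, dict):
        for key, item in value.items():
            shape_of(item, f"{path}.{key}", shape)
    elif isinstance(value, (list, tuple)):
        # All elements share one path: only the type of the filling matters.
        for item in value:
            shape_of(item, f"{path}[]", shape)
    return shape


def shape_diff(before, after):
    """Return {path: (old_types, new_types)} for paths whose typing changed."""
    old, new = shape_of(before), shape_of(after)
    diff = {}
    for path in sorted(old.keys() | new.keys()):
        if old.get(path) != new.get(path):
            diff[path] = (sorted(old.get(path, ())), sorted(new.get(path, ())))
    return diff


before = {"user": {"id": 7, "tags": ["a", "b"]}, "total": 10}
after = {"user": {"id": "7", "tags": ["a", "b", "c"]}, "total": 12, "error": None}
print(shape_diff(before, after))
# {'$.error': ([], ['NoneType']), '$.user.id': (['int'], ['str'])}
```

Only the new key and the type change of the id are reported; the longer array and the changed total do not appear, which keeps the output small.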

Especially in type-benevolent (loosely typed) languages, the type is an important factor determining which execution path will be taken. The export of variables should thus always include not only the structure of the content but also the type or encoding of structural features. A simple linearization of the structure that reveals the typing of terminals (possibly depth-limited) can be helpful for a quick overview, for inspection, and for finding semantically uncommon events or turning points. As with structure, differential comparison is a valuable tool.
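Such a linearization could look like the following Python sketch. The function name linearize, the output notation, and the default depth are assumptions made for illustration only. Terminals are replaced by their type names, arrays are summarised by the distinct types of their filling, and anything below the depth limit is elided.

```python
def linearize(value, depth=3):
    """Render the structure of value on one line, terminals shown as type names."""
    if isinstance(value, dict):
        if not value:
            return "{}"
        if depth == 0:
            return "{...}"  # depth limit reached, contents elided
        inner = ", ".join(
            f"{key}: {linearize(item, depth - 1)}" for key, item in value.items()
        )
        return "{" + inner + "}"
    if isinstance(value, (list, tuple)):
        if not value:
            return "[]"
        if depth == 0:
            return "[...]"
        kinds = []
        for item in value:
            kind = linearize(item, depth - 1)
            if kind not in kinds:  # keep each distinct filling type once
                kinds.append(kind)
        return "[" + " | ".join(kinds) + "]"
    return type(value).__name__


record = {"id": "7", "tags": ["a", "b"], "meta": {"age": 3, "nested": {"x": 1.5}}}
print(linearize(record, depth=2))
# {id: str, tags: [str], meta: {age: int, nested: {...}}}
print(linearize([1, "1", 2]))
# [int | str]
```

The second call shows the kind of semantically uncommon event the text refers to: a numeric array that suddenly contains a string becomes visible at a glance.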

Extending this to the contents themselves, comparison of various kinds, structural pruning, and enrichment with delta values can be incorporated into process traces, both as information and as (preprocessing) assistance for depth limitation, cutting out test-covered sub-paths, etc.