Thursday, 11 March 2010

Visua Basic Profiler

Profiling can be as complicated as you want it to be. Or as simple as you need.

While everyone else is struggling with Numega and other monsters you caan be quietly productive just by adding a simple module and a few calls to the parts that look as though they need improvement.

FOr most of the problems that most of us have we don't need sophisticated analysis tools and we don't need nanosecond precision. In fact timing precision of a few milliseconds is good enough for most profiling tasks even if the things being timed are of shorter duration than that. The reason is that if the routine kis very short and executed just once then it doesn't need optimizing whereas if it is executed very frequently then the average time will still be non zero because sometimes the counter will tick during the routine. Also it is frequently the cas ethat you have a process that takes, say, five seconds and you need to get it down to two seocnds. This means that even the ten millisecond resolution of gettickcount is good enough to tell you where the bottleneck is.

SO on to the tools.

Requirements

  • Must be able to profile a program that consists of multiple dlls.
  • Analysis need not be real time.
  • Instrumentation code can be added by hand.
  • Analysis must be able show total time spent in each instrumented routine, time spent in the routines it calls, mean, max, min.
  • Analyser must be interactive so that the use can exlore the results
  • A VB IDE add-in would be a good way to add instrumentation code: see VbIde.

Implementation

Data Collection

Create a simple ActiveX executable that exports the following methods:

TraceEnter
Logs the time at which a routine is entered along with the routine name and any extra string data provided by the caller.
TraceExit
Logs the tme at which the routine exits.
SaveLog
Appends the log to a text file and clears the log memory.

Even without any special analysis software you can use the log to identify bottlenecks.

Analysis

A simple UI that displays the log as a tree view. Selecting a node should display the details.

One problem that the analyser must solve is that a routine that exits by raising an error will not have recorded the exit time. Apartial, and Ithink usable, solution is simply to interpolate the time by using the same time as the exit time of thye calling routine.

A more srious problem isn thyat it will appear from the log that the routine following the one that bexited by raising an errorn was actually called fromm the failed routine when in fact it was called by the caller of the failed routine.

A log looks something like this:

+a
  +b
    +c
    -c
    +d
    -d
  -b
-a

A calls b which calls c and then d. No analysis problems there. But if c fails it will not record its exit and the log will look like:

+a
  +b
    +c
      +d
      -d
    -b
  -a

I think that the solution might be to read the log backwards. Because entry and exit must be balanced and an entry log never goes missing we can detect the problem if we read from the end because we see that b exited so that means that the previous line must be either something else that exited that was called from b or it must be the entry of b itself. So when we encounter the entry log for c preceding the entry log for d we know that c must have failed because at that point the only legal item is an exit from something called by b or the entry log for b.

So we can interpolate an exit log for c and give it the same time as the entry of d. This will of course overestimate the time in c; this may not matter as it will draw attention to that routine and perhaps cause the programmer to look more closely at its behaviour.

Statistics

Once you have read the log and created objects for each of the calls it is a simple matter to collect the times for all calls to a given routine. This can be summarised in a list view. The list view control also lets you sort by column so you can find the most often called routine, the one that uses the most time, etc.

Coverage

It should be quite straightforard to compare the log with the source code and list the procedures that have not been executed. This can help you create both unit and integration tests.

Integration

Adding the Trace calls is tedious and time consuming. It is also very easy to use the wrong name in the calls resulting in meaningless or misleading logs. These problems are easily avoided by a simple Add-In that add the necessary code and removes it when asked.

A further convenience is to wrap the calls in compile time if statements:

  #If DoTrace Then
    TraceEnter "Main"
  #End If

This allows the tracing to be turned of by defining DoTrace=0 in the properties of the project that you are tracing.

The Add-In can be further refined to add tracing only to modules and procedures that meet certain requirements. Some obvious candidates are to not add trace calls to shared code nor to routines that are short wrappers around other calls ; shared code is code that is not in the same folder sub-tree as the VBP file.

No comments:

Post a Comment

Blog Archive

Followers