User:Andrew Nikitin/TimePlot

From J Wiki

Plotting time based signals

It was something of a revelation to me that there are at least two different kinds of plots, and that the methods and techniques used for one kind are not readily applicable to the other.

The first kind plots the dependence of one variable on another. Usually it is assumed that there is some kind of functional dependence, possibly hidden by the presence of noise. An example is a plot of temperature versus altitude. Plot is perfectly suitable for these plots in most cases.

The second kind is a plot of one or several possibly related variables versus time. In this case there is no functional dependency between time and the variables. I have found that Plot, used as is, does not perform all the functions needed.

This article is an attempt to spell out the requirements for such plotting. I see it as an application that possibly uses the Plot component.

Why don't you use X?

There are many programs for plotting data, both commercial and free. I have yet to see one that is suitable for plotting time based data.

There are specialized (and pretty expensive) packages for data acquisition and processing that include some sort of plotting facilities. All of them, while still usable, lack certain features, which makes using them a pain.

Having said that, if you know of a program that implements the features listed below, please tell me.

  • One feature that is a trait of a good plot component is to display adjustable "smart" time axis. J Plot is still lacking this feature. This is one of the reasons to develop in-house plot components as was in my case once. -- Oleg Kobchenko <<DateTime(2007-01-07T07:49:36Z)>>

Assumptions

Collected data represents the inputs, internal states and outputs of some device or group of devices over time.

Each data stream is associated with a single variable. For each data point a timestamp (time since the beginning of the recording) can be obtained, either stored directly with the data or inferred.

Data may be collected at different time rates or even asynchronously.

Several data streams collected during a single session represent the behaviour of a device or group of devices and their interaction with the environment. The amount of data can be quite large: say, 500 variables collected every 20 ms for several hours (just an example, not an upper limit).

There can be more than one session. This means that once in a while we repeat the data acquisition process with the same or a slightly different set of variables, and over time several such sessions accumulate. Sometimes there is a need to pick up an old recording and look at it again.

The final plot does not contain the entire time range and all of the variables. Instead, only "interesting" time ranges and "relevant" variables are included. Determining such ranges and sets is an integral part of time based plotting.

Features

This proposed Plot-based signal plotting application should have the following features.

Metadata

Usually collected data comes in the form of data files created by data acquisition software. These files may or may not contain metadata: date and time of a session, site where the session was performed, names of the people involved, versions of hardware and firmware, comments. There should be a way to specify and store all of this and more in some kind of session configuration file. The same config file should also list all the data files associated with the session. Since data files may accidentally be renamed or moved to a different place, some kind of file invariant such as size, CRC32 or MD5 may be stored to assist the search if such a need arises.
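As a rough illustration of the file invariant idea, the sketch below computes the size and CRC32 of a file's contents with the foreigns 1!:1 (read file) and 128!:3 (CRC). The verb name fileinv and the file name in the usage comment are made up for this example; a session config file would store the resulting pair next to the data file name.

```j
NB. Sketch only: (size , CRC) of a character list.
NB. 128!:3 is J's built-in CRC foreign; fileinv is a hypothetical name.
fileinv =: # , 128!:3

NB. Usage, assuming a data file called session1.dat exists:
NB.   fileinv 1!:1 <'session1.dat'
```

Comparing the stored pair against fileinv of each candidate file would be enough to recognize a renamed or moved file in most practical cases, although a CRC is not a guarantee of identity.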

Synchronization of several sources

A typical session contains just one file, but on rare yet not negligible occasions there is more than one. For example, two different data acquisition systems may be used in parallel during the same session. Or one of these systems was stopped, the data was saved, and recording then started again in a new file. Another example: data is collected at slow speed, but once in a while, when it seems appropriate, the experimenter hits the Trigger button on the oscilloscope and gets extra fine resolution of a few selected signals for a short time.

In all these cases each data file will contain timestamps relative to the beginning of that file, and there should be a way to adjust them to bring the file in sync with the rest of the files.
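One possible way to find the adjustment, sketched below, assumes that both files happen to record the same boolean trigger signal, so that its first rising edge can serve as a common reference. This is only one option (the offset could just as well be entered by hand); the names t1, s1, t2, s2, edge, off and t2s and the sample values are invented for the example.

```j
NB. Hypothetical data: timestamps in ms from the start of each file,
NB. and the same boolean trigger signal recorded by both systems.
t1 =: 0 20 40 60
s1 =: 0 0 1 1
t2 =: 0 1 2 3
s2 =: 0 1 1 1

NB. x: timestamps, y: boolean signal. Time of the first sample where y is 1.
NB. Fails with an index error if y never becomes 1.
edge =: 4 : 'x {~ y i. 1'

off =: (t1 edge s1) - t2 edge s2   NB. shift that puts file 2 on file 1's clock
t2s =: off + t2                    NB. file 2 timestamps relative to the start of file 1
```

Here off is 39 and t2s is 39 40 41 42, so the trigger edge falls at 40 ms in both files.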

Signal plotting

The main goal of this application is to plot signals as functions of time, so this should be one of the features. There should be a way to specify color, line style and thickness. Every plotting program has this ability to some extent; I just spell it out here for completeness.

In addition to these known ways to plot signals, it may be interesting to consider nonstandard modes aimed at plotting discrete signals with a relatively narrow range. Examples would be boolean variables or state variables (with, say, 4 states).

Boolean variables may be plotted as a strip that has a dark color when the signal is low and a light color when the signal is high. The goal here is to use as little of the precious screen space as possible while keeping the signal visible.

  • consider "stick" plot type -- Oleg Kobchenko <<DateTime(2007-01-07T07:49:36Z)>>

State variables may have a symbol associated with each state. Every time the state changes, the application plots this symbol (on the same level).

Bitmask variables may need yet another mode of plotting. While it is possible to turn a single bitmask variable into separate bits, it may be more convenient to have a special mode.

  • these could be transformed at application (or adapter) level to something understood by plot component. For example, change of state can be represented as x-y plot of type marker.
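A minimal sketch of such a transformation at application level, with made-up sample data: the first part reduces a state variable to the points where the state changes (candidates for marker plotting), the second splits a bitmask variable into one boolean row per bit. The names t, s, chg, tm, sm, m and bits are hypothetical.

```j
NB. Hypothetical sample: timestamps (ms) and a state variable
t =: 0 20 40 60 80 100
s =: 0 0 2 2 2 1

chg =: I. 1 , 2 ~:/\ s   NB. indices where the state differs from the previous sample
tm  =: chg { t           NB. marker times: 0 40 100
sm  =: chg { s           NB. state entered at each marker: 0 2 1

NB. Hypothetical 4-bit mask variable, one value per sample
m    =: 5 3 12
bits =: |: (4#2) #: m    NB. 4 rows (most significant bit first), one column per sample
```

Each row of bits could then be drawn as its own boolean strip, and tm with sm could be passed on as the x and y of a marker plot.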

"Step" plotting

Some signals are naturally discrete. For example, the commanded state of an on/off solenoid can be either "on" or "off". When Plot plots such a signal in the vicinity of a transition, it interpolates, giving nonsensical values of 0.1, 0.2, etc. The alternative is to plot the signal in "dot" mode, but this is not always convenient. So there is a need for a third setting: plot the signal as a curve, but when values between timestamps are needed, use the value from the earlier timestamp rather than an interpolated value.

  • this is not very clear: may use stick again for on/off as 0/1 or hide Y axis altogether -- Oleg Kobchenko <<DateTime(2007-01-07T07:49:36Z)>>
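To make the "step" idea concrete, here is a small sketch that converts a sampled signal into a polyline that holds each value until the next timestamp. It works by duplicating points before plotting, so an ordinary line plot then shows vertical transitions instead of slanted ones. The verb names stepx and stepy are invented here.

```j
NB. Duplicate every point, then drop one end, so that each value
NB. is held until the next timestamp.
stepx =: }.@(2&#)   NB. t0 t1 t1 t2 t2 ...
stepy =: }:@(2&#)   NB. y0 y0 y1 y1 y2 ...

NB. Hypothetical on/off signal sampled at 0, 20 and 40 ms
t =: 0 20 40
y =: 1 0 1
```

With this data, stepx t is 0 20 20 40 40 and stepy y is 1 1 0 0 1. Assuming the plot package is loaded with load 'plot', the pair could be displayed with plot (stepx t) ; stepy y.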

More than 2 vertical axes

Usually plotted variables have different physical meanings and different ranges. For example, we may want to plot a boolean variable together with a value from a pressure sensor ranging from 0 to 1000 kPa and some resistance ranging from 0 to 1e7 Ohm. When plotted on the same scale, at least 2 of these 3 signals will be undecipherable. Using 2 scales does not solve the problem, since whichever two signals you scale right, the third will be severely out of range or out of resolution. There should be a way to assign individual scaling to every signal.

This raises the question of how to visualize those scales. One of the solutions I have seen in existing software is to draw all these scales side by side on the left of the plot. If there are 50 variables, there would be 50 scales drawn, leaving very little space to draw data. Some way to disable unnecessary scales and to stack several scales on top of each other would be nice.

While we are dealing with scales, it may be worth implementing non-linear scales and non-linear scaling.

  • J Plot has two Y axes, so it can accommodate 2 scales. More scales can be fitted too, but the data needs to be pre-scaled in the application. Visualizing is hence possible for two of the scales; to show other scales, consider switching them into view (with possible re-scaling of the tertiary data) using application controls on the form -- Oleg Kobchenko <<DateTime(2007-01-07T07:49:36Z)>>
  • How exactly do you fit more than 2 scales? Do you have an example? -- Andrew Nikitin <<DateTime(2007-01-07T23:38:08Z)>>
  • Elementary,
   rescale=: ( (>./ (],-) <./)@[ p. ((>./-<./)%~(-<./))@] )"1
   plot2y=: 4 : ' pd''show''[pd"1 y[pd''y2axis''[pd x[pd''new'' '
   ({. plot2y }.) (2&{. , 1&{ rescale 2&}.) 2 3 4 5 6^~/ 3 * (i.%<:) 101
Timeplot2.png Timeplot1.png
result of above hypothetical multi-axis

-- Oleg Kobchenko <<DateTime(2007-01-08T05:38:48Z)>>

Zoom In/Zoom out/Scrolling

Before plotting we need to find an interesting spot. This calls for ways to zoom into the data and then scroll left and right and watch carefully.

Zooming in and out in time based plots should affect only the X scale (which should probably be called the T scale in this context). I see almost no situation where zooming the Y axis is beneficial for time based plots, yet too many plotting packages with interactive capabilities allow proportional zoom for X and Y and do not allow zoom for X alone.
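At the data level, zooming and scrolling in time only amounts to selecting the samples inside a time window and leaving the values untouched. A minimal sketch follows; the verb name window and the sample data are hypothetical, and a single signal is assumed.

```j
NB. x: lo,hi time limits.  y: timestamps ; values (boxed pair).
NB. Returns the boxed pair restricted to lo <= t <= hi.
window =: 4 : 0
  'lo hi' =. x
  't v' =. y
  m =. (t >: lo) *. t <: hi
  (m # t) ; m # v
)

NB. Hypothetical signal sampled every 20 ms
t =: 0 20 40 60 80
v =: 1 2 3 4 5
z =: 20 60 window t ; v
```

Here z holds the timestamps 20 40 60 and the values 2 3 4. Scrolling is then a matter of shifting lo and hi by the same amount, and zooming a matter of changing their difference.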

Signal database

It takes time to properly select scaling, colors, line style and thickness for different variables. It would be a shame to do it again when you need to plot similar variables again. While many plotting programs allow you to save settings, none of them allow you to conveniently reuse settings from different plots. As mentioned, there is more than one session, so it makes sense to maintain some kind of signal database that, for each signal (identified by name), contains color, scaling, curve mode etc. When a signal is added from the pool of signals available in the data file, the application would look up this database and assign starting values for scaling etc.

This signal database should be independent of the session config file and can be used with many sessions.
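As an illustration only, such a database could be as simple as a boxed table with one row per signal, plus a lookup verb that falls back to default settings for unknown names. The layout (name ; color ; scale range ; curve mode), the sample entries and the names sigdb, default and lookup are all assumptions made for this sketch.

```j
NB. Hypothetical database: one row per signal, name ; color ; scale range ; curve mode
sigdb =: ,: 'pressure' ; 'red' ; 0 1000 ; 'line'
sigdb =: sigdb , 'solenoid' ; 'blue' ; 0 1 ; 'step'

NB. Settings used for a signal that is not in the database yet
default =: '' ; 'black' ; 0 1 ; 'line'

NB. y: signal name. Returns the matching row, or the defaults with the name filled in.
lookup =: 3 : 0
  i =. ({."1 sigdb) i. < y
  if. i < # sigdb do. i { sigdb else. (< y) 0} default end.
)
```

For example, lookup 'solenoid' returns the stored row, while lookup 'resistance' returns the default settings under the new name. Since the table does not refer to any particular session, it could be saved in its own file and shared between sessions.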

Slices

There should be a way to display a "slice": the exact values of some variables at a given time. Existing software implements it as a "cursor", a vertical line that can be moved left and right and that shows the values of selected variables at this time in a table.
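Because streams may be sampled at different rates, the cursor time will rarely coincide with a timestamp. The sketch below assumes the simplest rule, taking the most recent sample at or before the cursor, and applies it to one stream; the verb name slice and the data are hypothetical, and timestamps are assumed to be ascending.

```j
NB. x: cursor time.  y: timestamps ; values (boxed pair).
NB. Returns the value of the latest sample with timestamp <= x
NB. (or the first sample if the cursor is before the recording).
slice =: 4 : 0
  't v' =. y
  i =. 0 >. <: +/ t <: x
  i { v
)

NB. Hypothetical stream sampled every 20 ms
t =: 0 20 40 60
v =: 10 11 12 13
```

With this data, 45 slice t ; v gives 12. Filling the cursor table would mean applying slice with the same cursor time to each selected stream in turn.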

Commenting

There should be some way to include textual comments on the plot. This usually looks like some explanatory text and an arrow that points to one or more points on one or more curves. The text box and arrows should be displayed only when the specified points are in view, possibly with some minimal required level of detail.

Searching

There is often a need to search the existing archive of measurements for some recently discovered phenomenon. The first step in such a search is to select those data files (sessions) that contain the necessary variables and were recorded under plausible conditions. The second step would be to search each individual dataset for given events. "Events" are usually complex combinations of constraints and conditions that involve precedence, rate of change, and values of the individual variables. Something along the lines of "when measured pressure does not change fast enough for 50 ms after commanded pressure changed by more than 100 kPa in less than a second". There is no way of formalizing such queries, short of writing a verb that scans through the data in search of the condition. Some kind of simple, convenient, format-independent API should be developed to access the data.

  • this is an interesting feature of plot visualisation. While the searching itself is non-plot, displaying of search results with possibility of pagination, stepping are again interactive features -- Oleg Kobchenko <<DateTime(2007-01-07T07:49:36Z)>>
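For a flavour of what such a scanning verb might look like, here is a much simplified version of the pressure example. It assumes both signals are sampled at the same instants, treats a "command change" as a jump between two consecutive samples, and checks the measured value a fixed number of samples later. The verb name stuck, its argument layout and the data are all invented for this sketch.

```j
NB. x: jump threshold , response threshold , lookahead in samples
NB. y: 2-row table, commanded values over measured values
NB. Returns sample indices where the command jumped by more than the jump
NB. threshold but the measurement moved by less than the response threshold
NB. over the lookahead (shortened at the end of the data).
stuck =: 4 : 0
  'jump resp n' =. x
  'cmd meas' =. y
  j =. >: I. jump < | 2 -~/\ cmd
  d =. | (meas {~ (<: # meas) <. j + n) - j { meas
  (resp > d) # j
)

NB. Hypothetical recording, pressures in kPa
cmd  =: 0 0 200 200 200 200 200 200 0 0 0 0
meas =: 0 0 0 5 10 12 100 195 195 120 40 5
hits =: 100 50 3 stuck cmd ,: meas
```

Here the command jumps at samples 2 and 8, but only the first jump is reported in hits, because the measured pressure has moved by just 12 kPa three samples later. The resulting indices are what an interactive front end could page or step through.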

Comments

  • It's rare or only for simple cases that a plot component can display data "as is". Typically a separation between plot component and application specific functionality is sought. If application requirements are not compatible with being an extension to low-level features of the plot component, then another component is used, such as even isigraph. See more comments inline. Things marked as "non-plot" are good for general information, but should be distilled for the purpose of this discussion and reformulated in terms of the plot component visibility. -- Oleg Kobchenko <<DateTime(2007-01-07T07:49:36Z)>>
  • Thanks for comments. Want to clarify one confusion here. In my world Plot as is is perfectly normal "application". I can get some data from modeling or experiment, then pull it into J then plot it and see if I can see anything useful in it and if I do, the picture from Plot can be used in presentation. Except, when the data is time based -- then I cannot use plot for the reasons I listed. Here I am not talking about changing plot class to do what I want, rather about making an "application" (if you prefer to call it this way) that heavily relies on Plot for curve display and does what I want. In this context, I considered your "non-plot!" comments as non-value added and removed them. Sorry. -- Andrew Nikitin <<DateTime(2007-01-07T23:38:08Z)>>
  • You are in denial, or one of us at least. You want to think of J Plot as an application. Whereas it's not. It's a component. To be able to tailor it to specific needs, an application needs to be created. As for non-plot issues: it is a way to focus on the point of a general solution, so that it is useful outside of the scope of a particular use case. -- Oleg Kobchenko <<DateTime(2007-01-08T05:38:48Z)>>