To gain qualitative intuition about a dataset, it is common to visualize
trajectories for a few samples.
tunacell provides a
matplotlib-based framework to visualize timeseries as well as the underlying
colony/lineage structures arising from dividing cells.
In order for the colour-code to work properly, matplotlib must be updated to a version >=2.
In this document we will describe how to use the set of tools defined in tunacell.plotting.samples.
We already saw in the 10 minute tutorial a simple plot of length vs. time in a colony from our numerical simulations. Here we will review the basics of plotting small samples in a few test cases.
If you cloned the
tunacell repository, there are two ways to run
the following tutorial quickly.
You may run the script
plotting-samples.py with the following command:
python plotting-samples.py -i --seed 951
The seed is used to select the same samples as those printed below.
Alternatively it can be run from the root folder using the Makefile:
If you execute one of the commands above, there is no need to run the commands below. Follow the command line explanations and cross-reference them with the following commands to understand how it works. If you didn’t execute the commands above, you can run the commands below sequentially.
For plotting demonstration, we will create a numerically simulated experiment, where the dynamics is sampled on a time interval short enough for the colonies to be of reasonable size. Call from a terminal:
tunasimu -l simushort --stop 120 --seed 167389
In a Python script/shell, we load data with the usual:
import numpy as np

from tunacell import Experiment, Parser, Observable, FilterSet
from tunacell.filters.cells import FilterCellIDparity
from tunacell.plotting.samples import SamplePlot

exp = Experiment('~/tmptunacell/simushort')
parser = Parser(exp)
np.random.seed(seed=951)  # optional: fix the seed to match samples/plots below
parser.add_sample(10)

# define a condition
even = FilterCellIDparity('even')
condition = FilterSet(filtercell=even)

# define observables
length = Observable(name='length', raw='exp_ou_int')
ou = Observable(name='growth-rate', raw='ou')
We have defined two observables and one condition used as a toy example.
With these preliminary lines, we are ready to plot timeseries. The main object
to call is
SamplePlot, which accepts the following parameters:
- samples, an iterable over the samples to plot (colonies or lineages in the examples of this document),
- the Parser instance used to parse data,
- the list of conditions (optional).
We already saw how to define instances of the class
Samples can be chosen explicitly or drawn at random from the experiment. We will
review below the different cases with concrete examples from our settings.
We have 10 samples in our
parser, which have been chosen randomly.
Remember that they can also be specified explicitly with the container and
cell identifiers. Once stored in the parser object, they can be addressed by
their index in the table; to check the table of samples, call:
If you used the default settings, you should observe:
index    container      cell
-------  -------------  ------
0        container_015  3
1        container_087  14
2        container_002  6
3        container_012  12
4        container_096  15
5        container_040  8
6        container_088  14
7        container_007  1
8        container_042  2
9        container_013  5
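The table is reproducible only when the random generator is seeded before the samples are drawn. The following minimal sketch illustrates the idea with the standard library alone. The helper draw_samples and the pool of (container, cell) pairs are hypothetical stand-ins; they do not reproduce how tunacell selects its samples.

```python
import random


def draw_samples(pool, size, seed=None):
    """Return `size` distinct items drawn from `pool` (hypothetical helper)."""
    # A private generator keeps the global random state untouched
    rng = random.Random(seed)
    return rng.sample(pool, size)


# Hypothetical pool of (container label, cell identifier) pairs
pool = [("container_{:03d}".format(i), j)
        for i in range(1, 101) for j in range(1, 17)]

first = draw_samples(pool, 10, seed=951)
second = draw_samples(pool, 10, seed=951)

# The same seed gives the same selection, in the same order
print(first == second)
```

Without a seed, two calls would in general return different selections, which is why the table above can differ from what you observe.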
We start from the basic example initiated in the 10 minute tutorial:
colony = parser.get_colony(0) # any index between 0 and 9 would do
and we call our plotting environment:
colplt = SamplePlot([colony, ], parser=parser, conditions=[condition, ])
The first argument is the list of sample(s) to be plotted;
the following arguments are passed by keyword, which makes them more explicit:
the parser, then the conditions. Conditions must be given as a list of
FilterSet instances (the list can be left empty).
We start with the default settings and will inspect the role of each parameter:
The figure is stored as the
fig attribute of colplt:
colplt.fig.show() # in non-interactive mode, colplt.fig in interactive mode
This kind of plot should be produced:
The default settings for a colony plot display:
- one lineage per row (it comes from keyword parameter superimpose='none'),
- cell identifiers on top of each cell,
- container and colony root identifiers when they change,
- vertical lines to follow divisions.
Data points are represented by plain markers (show_markers=True),
with underlying, transparent connecting lines as a visual aid.
Title of plot is made from the
The first feature we explore is how to visualize whether samples satisfy a given
condition. To do so, use the
report_condition keyword parameter:
Conditions are labeled according to their representation, which is why we used repr(condition).
The fig attribute should store the following result:
Colour can be changed for distinct cells, lineages, colonies, or containers (given in order of priority), or not changed at all.
Changing cell colour
colplt.make_plot(length, report_condition=repr(condition), change_cell_color=True)
Changing lineage colour
colplt.make_plot(length, report_condition=repr(condition), change_lineage_color=True)
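The priority order stated above (cells, then lineages, then colonies, then containers) can be pictured with the following minimal sketch. It is an illustration only, not tunacell's implementation: colour_level is a hypothetical helper, change_container_color is an assumed keyword name inferred from the pattern of the other three, and the reading that the finest enabled level wins is our interpretation of "order of priority".

```python
def colour_level(change_cell_color=False, change_lineage_color=False,
                 change_colony_color=False, change_container_color=False):
    """Return the level at which colour changes, or None (hypothetical helper)."""
    # Levels listed in decreasing priority, as stated in the text
    flags = [
        ("cell", change_cell_color),
        ("lineage", change_lineage_color),
        ("colony", change_colony_color),
        ("container", change_container_color),  # assumed keyword name
    ]
    for level, enabled in flags:
        if enabled:
            return level
    return None  # no flag set: colour does not change


# Two flags are set; the one with higher priority is retained
print(colour_level(change_lineage_color=True, change_colony_color=True))  # prints: lineage
```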
The default setting is not to superimpose lineages. It is possible to change
this behaviour by changing the
superimpose keyword parameter. Some
keywords are reserved:
- 'none': do not superimpose timeseries,
- 'all': superimpose all timeseries into a single row plot,
- 'colony': superimpose all timeseries from the same colony, thereby making as many rows as there are different colonies in the list of samples,
- 'container': same as 'colony', at the container level,
and when an integer is given, each row will be filled with at most that number of lineages.
For example, if we superimpose at most 3 lineages:
colplt.make_plot(length, report_condition=repr(condition), change_lineage_color=True, superimpose=3)
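To make the row layout concrete, here is a minimal sketch of how lineages could be distributed over rows for each value of superimpose. It is not tunacell's code: group_rows is a hypothetical helper, lineages are represented by plain dictionaries, and filling rows with consecutive lineages in the integer case is an assumption.

```python
def group_rows(lineages, superimpose="none"):
    """Group lineages into plot rows (hypothetical sketch).

    Each lineage is a dict with 'container', 'colony' and 'label' keys.
    """
    lineages = list(lineages)
    if superimpose == "none":
        return [[lin] for lin in lineages]  # one lineage per row
    if superimpose == "all":
        return [lineages]  # a single row
    if superimpose in ("colony", "container"):
        rows = {}
        for lin in lineages:
            if superimpose == "container":
                key = (lin["container"],)
            else:
                # a colony is identified within its container
                key = (lin["container"], lin["colony"])
            rows.setdefault(key, []).append(lin)
        return list(rows.values())
    if isinstance(superimpose, int) and superimpose > 0:
        # at most `superimpose` consecutive lineages per row
        return [lineages[i:i + superimpose]
                for i in range(0, len(lineages), superimpose)]
    raise ValueError("unknown superimpose value: {!r}".format(superimpose))


# Hypothetical sample: 4 lineages in colony 1, 3 in colony 2, same container
lineages = [{"container": "container_015", "colony": 1 if i < 4 else 2, "label": i}
            for i in range(7)]

print([len(row) for row in group_rows(lineages, superimpose=3)])  # prints: [3, 3, 1]
```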
So far our sample was a single colony. It is possible to plot multiple colonies in the same plot, given as an iterable over colonies:
splt = SamplePlot(parser.iter_colonies(mode='samples', size=2), parser=parser, conditions=[condition, ])
splt.make_plot(length, report_condition=repr(condition), change_colony_color=True)
Here we iterated over colonies from the samples defined in parser (mode='samples').
Now we will switch to the other observable,
ou, which is the instantaneous growth rate:
splt.make_plot(ou, report_condition=repr(condition), change_colony_color=True, superimpose=2)
We can also iterate over unselected samples; in that case iteration goes through container files:
splt = SamplePlot(parser.iter_colonies(size=5), parser=parser, conditions=[condition, ])
splt.make_plot(ou, report_condition=repr(condition), change_colony_color=True, superimpose=2)
To get an idea of the divergence of growth rate, it is better to plot all timeseries in a single row plot. We mask markers and set the transparency to better distinguish individual timeseries:
splt.make_plot(ou, change_colony_color=True, superimpose='all', show_markers=False, alpha=.6)
Instead of a colony, or an iterable over colonies, one can use a lineage or an iterable over lineages as argument of the plotting environment:
splt = SamplePlot(parser.iter_lineages(size=10), parser=parser, conditions=[condition, ])
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True, superimpose='all', alpha=.6)
One can add expectation values for the mean and the variance; they are drawn as a line for the mean, together with the +/- standard deviation range.
From the numerical simulation metadata, it is possible to compute the mean value and the variance of the process:
md = parser.experiment.metadata
# ou expectation values
ref_mean = float(md.target)
ref_var = float(md.noise)/(2 * float(md.spring))
and then to plot it to check how our timeseries compare to these theoretical values:
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True, superimpose='all', alpha=.5, show_markers=False, ref_mean=ref_mean, ref_var=ref_var)
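The formula used for ref_var is the stationary variance of an Ornstein-Uhlenbeck process. As a sanity check, the sketch below simulates such a process with the standard library and compares sample statistics with the reference values. It assumes the dynamics dx = -spring * (x - target) * dt + sqrt(noise) * dW, which is consistent with ref_var = noise / (2 * spring) but is not taken from tunacell's simulation code; the parameter values and the helper simulate_ou are hypothetical.

```python
import math
import random


def simulate_ou(target, spring, noise, dt=0.01, steps=200000, seed=0):
    """Euler-Maruyama integration of an Ornstein-Uhlenbeck process (sketch)."""
    rng = random.Random(seed)
    x = target  # start at the stationary mean
    scale = math.sqrt(noise * dt)  # amplitude of one noise increment
    values = []
    for _ in range(steps):
        x += -spring * (x - target) * dt + scale * rng.gauss(0.0, 1.0)
        values.append(x)
    return values


# Hypothetical parameter values, standing in for md.target, md.spring, md.noise
target, spring, noise = 1.0, 1.0, 2.0
ref_mean = target
ref_var = noise / (2 * spring)

values = simulate_ou(target, spring, noise)
sample_mean = math.fsum(values) / len(values)
sample_var = math.fsum((v - sample_mean) ** 2 for v in values) / len(values)

print("mean: reference {:.3f}, sample {:.3f}".format(ref_mean, sample_mean))
print("variance: reference {:.3f}, sample {:.3f}".format(ref_var, sample_var))
```

With a long enough trajectory, the sample values should be close to the reference ones; short timeseries such as those plotted above can deviate noticeably.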
We will review the computation of the statistics in the next document, but we
will assume it has been performed for our observable ou. The
data_statistics option is used to display results of statistics, which
is useful when no theoretical values exist (most of the time):
splt.make_plot(ou, report_condition=repr(condition), change_lineage_color=True, superimpose='all', alpha=.5, show_markers=False, data_statistics=True)
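To picture what a mean line with a standard deviation range estimated from data represents, here is a minimal sketch that computes both at each time point across a few timeseries. It is an illustration only: tunacell relies on its own statistics computed over the experiment, pointwise_statistics is a hypothetical helper, and dividing by the number of series (population variance) is an assumption.

```python
import math


def pointwise_statistics(timeseries):
    """Mean and standard deviation at each time point (hypothetical helper).

    `timeseries` is a list of equal-length lists sampled at the same times.
    """
    means, stds = [], []
    for values in zip(*timeseries):  # values of all series at one time point
        mean = math.fsum(values) / len(values)
        var = math.fsum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        stds.append(math.sqrt(var))
    return means, stds


# Three toy timeseries sampled at the same three time points
series = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
]
means, stds = pointwise_statistics(series)

# Lower and upper bounds of the mean +/- one standard deviation range
lower = [m - s for m, s in zip(means, stds)]
upper = [m + s for m, s in zip(means, stds)]
print(means)
```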