05. Analyzing Data
"Along with the BioTracker comes a python tool named "Data Analysis Module" to evaluate CSV output of the trackers. The simplistic user interface and provides an easy way to calculate a number of metrics for arbitrary CSV files by using user annotated columns, which is done automatically for BioTracker conformate output. It can then calculate metrics like speed, inter-individual distance and transfer entropy for all individuals, pairwise respectively and write them back to new CSV files. There is also the option to filter arbitrary columns.
Loading a CSV file for analysis
- "Browse" will open a file selction dialogue which allows the user to choose a raw data file. Only accepted type is CSV.
- "File deliminator" column seperator of the CSV file. By default this is a comma. Other accepted seperator are semicolon or tab.
- With "Skip rows" the user can choose to skip the first n rows of the raw data file. By default this value is set to 0 (no skip). Lines beginning with # are interpreted as comments and will always be skipped. Also any column names or headers will be ignored as well as rows with missing values. Missing values means, if there is less values than indicated by the header. Empty, i.e. no text between two separators, is not considered missing.
- If "view" is unchecked the data display is skipped and data is loaded automatically using the latest defined column numbers/names. This option should only be used when analyzing csv files with the same column order because loading a csv file with the wrong column order may cause errors. It is generally recommended to have 'view' checked and not skip the column selection.
- "Load" opens a display of the raw data if ‘view’ is checked, where the user can assign names to the columns (see 1.1). If ‘view’ is unchecked the data is loaded directly to the interface.
Selecting the format of the CSV file.
1.1. Display section: shows the raw data, excluding comment lines, headers, and rows the user decided to skip. Only the first N lines will be shown to avoid delays.
1.2. Column selection area: allows the user to assign columns of the original data file to the relevant variables. In the current example this means that the frame numbers are in column 1, timestamps in milliseconds in column 2, the x-position component of the tracked animal in column 7, etc. To change these settings see 1.3.1 and 1.3.2.
1.3. Change and Finalize area:
- "Add parameters" opens a menu where the time and angle format can be set (see 1.3.1)
- "Add/Remove Agents" opens a menu where agent can be added removed and renamed
- "OK" Loads the selected columns for analysis, saves settings and returns to the main window.
Setting miscallenous options for the to be loaded CSV file
1.3.1
Check boxes allow the user to set the desired time format (datetime, milliseconds or seconds) and angle format (degrees (deg) or radians (rad)). Additional categories can also be selected but are currently not processed. Clicking "Apply" saves these settings and returns to the table view.
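To make the effect of these format settings concrete, here is a minimal sketch of how the three time formats and two angle formats could be normalised to seconds and radians. The function names are hypothetical, and the datetime branch assumes ISO 8601 strings measured relative to the first sample; the module's actual parsing may differ.

```python
import math
from datetime import datetime

def to_seconds(values, time_format):
    """Convert raw time values to seconds for the given format."""
    if time_format == "seconds":
        return [float(v) for v in values]
    if time_format == "milliseconds":
        return [float(v) / 1000.0 for v in values]
    if time_format == "datetime":
        # Assumption: ISO 8601 strings; time is counted from the first sample.
        stamps = [datetime.fromisoformat(v) for v in values]
        return [(s - stamps[0]).total_seconds() for s in stamps]
    raise ValueError("unknown time format: " + time_format)

def to_radians(values, angle_format):
    """Convert raw angle values to radians for the given format."""
    if angle_format == "rad":
        return [float(v) for v in values]
    if angle_format == "deg":
        return [math.radians(float(v)) for v in values]
    raise ValueError("unknown angle format: " + angle_format)

from_ms = to_seconds(["0", "40", "80"], "milliseconds")
from_dt = to_seconds(["2018-02-22 10:00:00.000", "2018-02-22 10:00:00.040"], "datetime")
angles = to_radians(["0", "90", "180"], "deg")
```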
Selecting amount and names of agents.
1.3.2
The line called "Agents" allows the user to specify the number of agents to be analysed. Clicking the "Change" button enables the user to give meaningful names to these agents; otherwise they will have the default names agent0, agent1, etc. Clicking "OK" saves the changes and returns to the table window.
View for changing start and stop time of observed data.
This window displays the start time, stop time and duration of the experiment (in seconds and frames) and allows the user to change the relevant time range. Independent of the selected time format, the display here only uses seconds (s).
"Change" opens a window where start and stop time can be adjusted with a slider:
Example of plotting an agent's speed over time
Displays the minimal and maximal x and y values of the raw data and allows the user to change the relevant region. All displayed values are interpreted as cm.
"Change" allows the user to edit the boundary values, e.g. to perform the analysis in a subregion.
"Add Subregions" opens a window where subregions of the main area can be defined. For those areas the same analysis is performed as for the main region, and the results appear as additional rows in info.csv (see Results section).
"Select Filter" allows to select a filter (currently only median filter with k =5 ).
"Apply Smoothing" applied the selected filter component to the x and y component of each agent.
Plots of the data can be generated with respect to the currently selected spatial and temporal boundaries. The plots are shown in a separate window and can be edited and saved from there.
The 1st dropdown menu allows the user to select the type of plot to be generated. Available options are: "Trajectory", "Timeline", "Histogramm" and "Boxplot".
The content of the 2nd dropdown menu depends on the selection in the first. For example, timelines are available for the parameters "speed", "distance" and "angle".
"Inspect" opens a display window with the desired plot.
"Options" opens a window with three tabs: “Plots”, “Folders” and “Transfer Entropy”
- “Plots” allows the user to select plots that will be automatically saved with the results files. Options are all, none or individual selection
- “Folders” provides an overview of currently used default folders e.g. for data or results
- “Transfer Entropy” allows the user to select whether she wants to calculate Transfer Entropy for the currently selected data. Since this calculation is only applicable for two agents and can require considerable time and computing power, it must be explicitly selected by the user.
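For readers unfamiliar with the measure, transfer entropy from a source series Y to a target series X quantifies how much knowing Y's past reduces uncertainty about X's next value beyond what X's own past already provides. The sketch below is a simple plug-in estimate for discrete series with a history length of one step. It is illustrative only and is not the module's estimator; continuous quantities such as speed would first have to be discretised, which is not shown.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_now = Counter(zip(target[:-1], source[:-1]))   # (x_t, y_t)
    pairs_next = Counter(zip(target[1:], target[:-1]))   # (x_{t+1}, x_t)
    singles = Counter(target[:-1])                       # x_t
    n = len(target) - 1
    te = 0.0
    for (x_next, x_now, y_now), count in triples.items():
        p_joint = count / n
        p_full = count / pairs_now[(x_now, y_now)]              # p(x_next | x_now, y_now)
        p_self = pairs_next[(x_next, x_now)] / singles[x_now]   # p(x_next | x_now)
        te += p_joint * math.log2(p_full / p_self)
    return te

# Toy data: the follower copies the leader's state with a delay of one frame.
rng = random.Random(0)
leader = [rng.randint(0, 1) for _ in range(2000)]
follower = [0] + leader[:-1]
te_forward = transfer_entropy(leader, follower)    # leader -> follower
te_backward = transfer_entropy(follower, leader)   # follower -> leader
```

In this toy setup the estimate from leader to follower is close to one bit, while the reverse direction is close to zero, which is the asymmetry the measure is meant to reveal.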
"Save" opens a folder selection dialogue allowing the user to specify a results directory. A folder called "BioTrackerAnalysis" + current date / number ( e.g BioTrackerAnalysis_2018_02_22/008) will be created. This files in this folder are (1) timelines.csv (2) info.csv (3) plots. For further description see below.
Info.csv contains basic data and parameters concerning the experiment setup, single agents and pairs of agents. In the following these results are described in more detail:
1.a. General information
Key | Interpretation |
---|---|
Source | Name of the original datafile |
x_min | Minimum value of all x-positions |
x_max | Maximum value of all x-positions |
y_min | Minimum value of all y-positions |
y_max | Maximum value of all y-positions |
start | Start time in seconds |
stop | Stop time in seconds |
filtered | ‘True’ or ‘False’ depending on whether filtering was performed |
1.b. Agent-specific information
Key | Interpretation |
---|---|
trajectory_length | Total length of trajectory covered |
speed_mean | Mean of agent’s speed |
speed_var | Variance of agent’s speed |
speed_min | Minimum value of agent’s speed |
speed_25% | 25th percentile of agent’s speed |
speed_median | Median of agent’s speed (i.e. 50th percentile) |
speed_75% | 75th percentile of agent’s speed |
speed_max | Maximum value of agent’s speed |
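As an illustration of what these keys summarise, the following sketch computes them from one agent's positions and speeds with the standard library. It rests on assumptions that are not confirmed by the module: trajectory length is taken as the sum of straight-line steps between consecutive frames, the variance is the population variance, and percentiles use linear interpolation.

```python
import math
import statistics

def agent_summary(xs, ys, speeds):
    """Summary statistics for one agent, keyed like the rows in info.csv."""
    # Assumption: trajectory length is the sum of straight-line frame-to-frame steps.
    steps = [math.hypot(x1 - x0, y1 - y0)
             for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])]
    # Assumption: percentiles use linear interpolation including both end points.
    q25, q50, q75 = statistics.quantiles(speeds, n=4, method="inclusive")
    return {
        "trajectory_length": sum(steps),
        "speed_mean": statistics.fmean(speeds),
        # Assumption: population variance; the module may use the sample variance.
        "speed_var": statistics.pvariance(speeds),
        "speed_min": min(speeds),
        "speed_25%": q25,
        "speed_median": q50,
        "speed_75%": q75,
        "speed_max": max(speeds),
    }

summary = agent_summary(
    xs=[0.0, 3.0, 3.0, 6.0],
    ys=[0.0, 4.0, 4.0, 8.0],
    speeds=[1.0, 2.0, 3.0, 4.0, 5.0],
)
```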
1.c. Information about pairs of agents
Key | Interpretation |
---|---|
dist_mean | Mean of distance |
dist_var | Variance of distance |
dist_min | Minimum of distance |
dist_25% | 25th percentile of distance |
dist_median | Median of distance (i.e. 50th percentile) |
dist_75% | 75th percentile of distance |
dist_max | Maximum value of distance |
closer_5cm_(s) | Time the agents spent closer to each other than a threshold distance (default: 5, 10, 15 and 20 cm) |
closer_5cm_(%) | Percentage of the selected time range that agents spent closer to each other than a threshold distance (same default values as above) |
Correlation of speeds | Not yet implemented |
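The two "closer than" entries can be pictured as follows. This is a sketch under assumptions: each frame is credited with the interval up to the next frame, the comparison is strict, and the key names for thresholds other than 5 cm simply follow the pattern of the documented 5 cm key.

```python
def time_closer_than(distances, seconds, threshold_cm):
    """Return (time in seconds, percentage of the range) spent below the threshold."""
    closer = 0.0
    # Assumption: a frame counts for the interval until the next frame.
    for d, t0, t1 in zip(distances, seconds, seconds[1:]):
        if d < threshold_cm:
            closer += t1 - t0
    total = seconds[-1] - seconds[0]
    percent = 100.0 * closer / total if total > 0 else 0.0
    return closer, percent

distances = [4.0, 6.0, 3.0, 12.0, 4.0]   # pairwise distance per frame, in cm
seconds = [0.0, 1.0, 2.0, 3.0, 4.0]      # timestamp per frame

info = {}
for thr in (5, 10, 15, 20):              # default thresholds from the table
    secs, pct = time_closer_than(distances, seconds, thr)
    info[f"closer_{thr}cm_(s)"] = secs   # key pattern for 10/15/20 cm is assumed
    info[f"closer_{thr}cm_(%)"] = pct
```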
- Timelines.csv contains the following values for each frame in the selected spatial and temporal range (as determined by x_min, x_max, y_min, y_max, start and stop in info.csv):
Key | Interpretation |
---|---|
frame | Number of current frame |
time | Timestamp of current frame in the original format |
seconds | Timestamp of current frame in seconds |
agent_x | X-coordinate of agent’s position (using the agent’s given name) |
agent_y | Y-coordinate of agent’s position (using the agent’s given name) |
agent_angle | Current angle of agent in the selected format, i.e. rad or deg |
agent_vx | X-component of agent’s velocity |
agent_vy | Y-component of agent’s velocity |
agent_speed | Agent’s speed, calculated as agent_vx² + agent_vy² |
agent1/agent2_dist | Distance between two agents, calculated via sqrt((agent1_x-agent2_x)²)+sqrt((agent1_y-agent2_y)²) |
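For orientation, here is a sketch of how per-frame velocity, speed and pairwise distance are commonly derived from positions and timestamps. It assumes forward differences for the velocity and the standard Euclidean norm for speed and distance. The formulas in the table above are written differently, so values produced by this sketch should be compared against the tool's actual output before being relied on.

```python
import math

def velocities(values, seconds):
    """Forward-difference velocity component; one value fewer than frames."""
    return [(v1 - v0) / (t1 - t0)
            for v0, v1, t0, t1 in zip(values, values[1:], seconds, seconds[1:])]

seconds = [0.0, 0.5, 1.0]
a_x, a_y = [0.0, 1.5, 3.0], [0.0, 2.0, 4.0]   # hypothetical agent "a"
b_x, b_y = [3.0, 1.5, 3.0], [4.0, 2.0, 0.0]   # hypothetical agent "b"

a_vx = velocities(a_x, seconds)
a_vy = velocities(a_y, seconds)
# Assumption: speed is the Euclidean norm of the velocity vector.
a_speed = [math.hypot(vx, vy) for vx, vy in zip(a_vx, a_vy)]
# Assumption: distance is the Euclidean distance between the two positions.
ab_dist = [math.hypot(ax - bx, ay - by)
           for ax, ay, bx, by in zip(a_x, a_y, b_x, b_y)]
```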