To view the visualization you need Python 3 with python libraries pandas, plotly. You can install those using pip3 install pandas plotly
. Then you can evaluate all org-babel blocks doing C-c C-v C-b. Interactive graphic will appear on a browser window.
In this example, we will insert all our data on an org-table and pass it as a variable to pandas. To make the example as simple as possible, we will use the minimal number of columns and use the same names as the export created by org-batch-agenda-csv
macro.
date | volume | notes | liter_per_day | eur_day |
[2015-12-13 Sun 17:14] | 32.5809 | unclear history | 222 | |
[2016-04-08 Fri 16:41] | 60.5927 | unclear history | 239 | |
[2016-04-16 Sat 13:07] | 62.4217 | unclear history | 233 | |
[2016-05-01 Sun 12:59] | 66.8509 | 295 | ||
[2016-06-01 Wed 18:26] | 74.1914 | 235 | ||
[2016-07-02 Sat 16:44] | 80.8214 | 214 | ||
[2016-08-01 Mon 19:17] | 87.3304 | 216 |
Column names have been simplified. This line breaks the import: “|------------------------+---------+-----------------+---------------+---------+” So we delete it!
After checking a couple of Stack Overflow articles, it seems quite simple to import a Org-table into pandas passing it as a variable to the pd.DataFrame function.
Given we are not reading using pd.read_csv to read the CSV, we must convert date and agenda-day to dates manually.
import pandas as pd
timeline = pd.DataFrame(tbl[1:], columns=tbl[0])
# Trying to use this function: timeline['date'] = pd.to_datetime(timeline['date'])
# Returns dateutil.parser._parser.ParserError: Unknown string format: [2015-12-13 Sun 17:14]
# Let's remove the brackets first
timeline['date'] = timeline['date'].str.strip('[]')
# Now, we can convert it to a datetime64 type
timeline['date'] = pd.to_datetime(timeline['date'])
timeline['date'].dtype
Workaround: Org doesn’t export end date, so we rename date as start_date, then add a day for the end_date.
Limitation: Dates must comply with ISO 8601 format. Right now, we can’t add any kind of intervals for times or dates. Pandas will complain and it won’t work.
timeline.rename(columns={'date':'start_date'}, inplace=True)
timeline['end_date'] = timeline['start_date'] + pd.Timedelta(days=1)
# TODO Debug output to be removed in the future
timeline.head()
We will use python library plotly express to display a timeline that includes all given events. Visualization will open in a browser window.
# From plotly documentation
import plotly.express as px
fig = px.timeline(timeline, x_start = 'start_date', x_end ='end_date',y = 'liter_per_day',
title='Org timeline viewer - test2',
hover_data=['start_date','volume','liter_per_day'])
fig.update_xaxes(rangeslider_visible=True)
fig.show()
None
For your particular use case, it might make more sense to use a line graph:
# From plotly documentation
import plotly.express as px
timeline = timeline.sort_values(by='start_date')
fig = px.line(timeline, x = 'start_date', y = 'volume',
title='Volume per date - linegraph - test2',
hover_data=['start_date','volume','liter_per_day'], markers=True)
fig.update_xaxes(rangeslider_visible=True)
fig.show()
None
Or a combination of linegraph plus histogram such as https://plotly.com/python/time-series/#summarizing-timeseries-data-with-histograms
This version is a draft of how a generic function could work, using YASnippet to retrieve parameters.
To test this, you need to create a new snippet using M-x yas-new-snippet
, save it into your snippet library and the execute it using M-x yas-insert-snippet
.
----- Copy from here -----
# Configuration parameters
# start_date_column: ${1:start_date}
# end_date_column: ${2:end_date}
# value_column: ${3:value_column}
# graph_title: ${4:title}
import plotly.express as px
timeline = timeline.sort_values(by='start_date')
fig = px.timeline(timeline, x_start = '$1', x_end ='$2',y = 'value_column',
title='$4',
hover_data=['$1','$3'])
fig.update_xaxes(rangeslider_visible=True)
fig.show()
----- To here -----