Description
Software version info
numpy 1.20.3
pandas 1.3.3
bokeh 2.3.3
holoviews 1.14.6
datashader 0.13.0
Description of expected behavior and the observed behavior
I want to make a scatterplot (colored by category) in which sparse points stay visible even when other regions have a much higher density. I think ds.any() should be the way to go in this case. Unfortunately, when I use dynspread on this plot, the points disappear and the whole plot that datashader produces gets a strange background color. (Interestingly, this color is not always the same.)
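To illustrate why an any-style aggregation seems like the right fit here, below is a minimal pure-Python sketch. It is not datashader's implementation: the `bin_points` helper, the toy data, and the linear alpha scaling are all assumptions made for illustration (datashader's actual normalization and minimum alpha differ). It only shows that with counts an isolated point gets a tiny weight next to a dense cell, whereas with "any" every occupied cell gets the same weight.

```python
from collections import Counter


def bin_points(points, width, height, x_range, y_range):
    """Hypothetical helper: count points per cell of a width x height grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    counts = Counter()
    for x, y in points:
        # Map each coordinate to a cell index, clamping the upper edge.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        counts[(row, col)] += 1
    return counts


# Toy data (assumed): 1000 points piled into one cell plus one isolated point.
points = [(0.5, 0.5)] * 1000 + [(9.5, 9.5)]
counts = bin_points(points, 10, 10, (0, 10), (0, 10))

# Count-style aggregation with linear scaling (an assumption, for simplicity):
# the isolated point ends up almost transparent.
peak = max(counts.values())
alpha_count = {cell: n / peak for cell, n in counts.items()}

# any-style aggregation: every occupied cell gets the same full weight.
alpha_any = {cell: 1.0 for cell in counts}

print(alpha_count[(9, 9)])  # 0.001
print(alpha_any[(9, 9)])    # 1.0
```

In this sketch the isolated point at (9.5, 9.5) has weight 0.001 under counting but 1.0 under "any", which is the behavior I am hoping to get from ds.by(color, ds.any()).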
Have a look at the following example:
Complete, minimal, self-contained example code that reproduces the issue
import numpy as np
import pandas as pd
import holoviews as hv
hv.extension('bokeh')
import datashader as ds
from datashader.colors import Sets1to3
from holoviews.operation.datashader import datashade,dynspread,spread
# np.nan is used instead of the np.NaN alias, which newer NumPy releases no longer provide
raw_data = [
    ('Alice', 60, 'London', 5),
    ('Bob', 14, 'Delhi', 7),
    ('Charlie', 66, np.nan, 11),
    ('Dave', np.nan, 'Delhi', 15),
    ('Eveline', 33, 'Delhi', 4),
    ('Fred', 32, 'New York', np.nan),
    ('George', 95, 'Paris', 11),
]
# Create a DataFrame object
df = pd.DataFrame(raw_data, columns=['Name', 'Age', 'City', 'Experience'])
df['City']=pd.Categorical(df['City'])
x='Age'
y='Experience'
color='City'
cats=df[color].cat.categories
# Make dummy-points (currently the only way to make a legend: https://holoviews.org/user_guide/Large_Data.html)
color_key=[(name,color) for name, color in zip(cats,Sets1to3)]
color_points = hv.NdOverlay({n: hv.Points([0,0], label=str(n)).opts(color=c,size=0) for n,c in color_key})
# Create the plot with datashader
points = hv.Points(df, [x, y], label="%s vs %s" % (x, y))  # .redim.range(Age=(0,90), Experience=(0,14))
# Left plot: default per-category count aggregator; right plot: per-category ds.any()
datashaded1 = datashade(points, aggregator=ds.by(color)).opts(width=550, height=480)
datashaded2 = datashade(points, aggregator=ds.by(color, ds.any())).opts(width=550, height=480)
dynspread(datashaded1) * color_points + dynspread(datashaded2) * color_points
# spread(datashade(points, aggregator=ds.by(color, ds.any())).opts(width=550, height=480)) * color_points
We get the following result: