If you already have an older version of DataJoint installed using `pip`, upgrade with
```bash
pip3 install --upgrade datajoint
```
## Python Native Blobs

For the v0.12 release, the variable `enable_python_native_blobs` can be
safely enabled for improved blob support of Python datatypes if all of the
following are true:

  * This is a new DataJoint installation / pipeline(s)
  * You have not used DataJoint prior to v0.12 with your pipeline(s)
  * You do not share blob data between Python and Matlab

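If all of the above hold, the flag can be switched on in the client
configuration. A minimal sketch, assuming the setting is exposed through
`dj.config` (persisting it with `save_local` is optional):

```python
import datajoint as dj

# Assumption: enable_python_native_blobs is toggled via dj.config.
dj.config['enable_python_native_blobs'] = True

# Optionally persist the choice so future sessions pick it up as well.
dj.config.save_local()
```
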
Otherwise, please read the following carefully:

DataJoint v0.12 expands the blob serialization mechanism with improved
support for complex native Python datatypes, such as dictionaries and
lists of strings.

Prior to DataJoint v0.12, certain Python native datatypes such as
dictionaries were 'squashed' into numpy structured arrays when saved into
blob attributes. This facilitated easier data sharing between Matlab
and Python for certain record types. However, it created a discrepancy
between insert and fetch datatypes which could cause problems in other
portions of users' pipelines.

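As an illustration of the discrepancy, consider a hypothetical table with a
`longblob` attribute (the schema, table, and attribute names here are made
up for the example):

```python
import datajoint as dj

schema = dj.schema('tutorial_blobs')  # hypothetical schema name


@schema
class Session(dj.Manual):
    definition = """
    session_id : int
    ---
    params : longblob
    """


Session.insert1(dict(session_id=1, params={'rate': 30.0, 'label': 'test'}))

# Prior to v0.12, the dict was squashed on insert, so the fetched value
# comes back as a numpy structured array rather than a dict.
fetched = (Session & 'session_id = 1').fetch1('params')
```

Under v0.12 with `enable_python_native_blobs` enabled, the same fetch would
instead return the original `dict`.
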
For v0.12, it was decided to remove the type squashing behavior and
instead create a separate storage encoding which improves support for
storing native Python datatypes in blobs without squashing them into
numpy structured arrays. However, this change creates a compatibility
problem for pipelines which previously relied on the type squashing
behavior, since records saved via the old squashing format will continue
to fetch as structured arrays, whereas new records inserted in DataJoint
0.12 with `enable_python_native_blobs` will be returned as the
appropriate native Python type (dict, etc.). Read support for Python
native blobs is also not yet implemented in DataJoint for Matlab.

To prevent data from being stored in mixed formats within a table across
upgrades from previous versions of DataJoint, the
`enable_python_native_blobs` flag was added as a temporary guard measure
for the 0.12 release. This flag will trigger an exception if any of the
ambiguous cases are encountered during inserts, in order to allow testing
and migration of pre-0.12 pipelines to 0.12 in a safe manner.

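With the flag left at its default, an insert that hits one of the ambiguous
cases is expected to be rejected. A rough sketch of what that might look
like, reusing the hypothetical `Session` table from the sketch above and
assuming the error surfaces as a `DataJointError`:

```python
import datajoint as dj

try:
    # A plain dict is one of the ambiguous native types guarded in 0.12.
    Session.insert1(dict(session_id=2, params={'rate': 60.0}))
except dj.DataJointError as err:
    # The error is expected to point at enable_python_native_blobs; enable
    # the flag only after reviewing the migration strategies below.
    print(err)
```
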
The exact process to update a specific pipeline will vary depending on
the situation, but generally the following strategies may apply:

  * Alter code to directly store numpy structured arrays or plain
    multidimensional arrays. This strategy is likely the best one for
    tables requiring compatibility with Matlab.
  * Adjust code to deal with both structured array and native fetched
    data (see the first sketch below). In this case, insert logic is not
    adjusted, but downstream consumers are adjusted to handle records
    saved under the old and new schemes.
  * Manually convert data using fetch/insert into a fresh schema (see the
    second sketch below). In this approach, DataJoint's
    `create_virtual_module` functionality would be used in conjunction
    with a fetch/convert/insert loop to update the data to the new
    native blob functionality.
  * Drop/recompute imported/computed tables to ensure they are in the new
    format.
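
A first sketch, for the second strategy: normalize fetched blob values so
downstream code sees a `dict` regardless of which encoding a record was
written with. The helper name and the assumed shape of old squashed
records are illustrative only:

```python
import numpy as np


def as_dict(value):
    """Hypothetical helper: present old squashed records and new native
    blobs under a single dict representation."""
    if isinstance(value, dict):
        # Written under the 0.12 native blob encoding.
        return value
    if isinstance(value, np.ndarray) and value.dtype.names is not None:
        # Written under the pre-0.12 squashed encoding; assume a single
        # record and unpack the structured array's fields.
        row = value.reshape(-1)[0]
        return {name: row[name] for name in value.dtype.names}
    return value
```

The helper can then wrap fetch calls, e.g.
`as_dict((Session & 'session_id = 1').fetch1('params'))` with the table
from the earlier sketch.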
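
A second sketch, for the fetch/convert/insert strategy: read rows from the
existing schema through a virtual module, convert any squashed blobs, and
insert them into a freshly declared schema. All schema and table names
here are placeholders, and `as_dict` is the helper from the first sketch:

```python
import datajoint as dj

dj.config['enable_python_native_blobs'] = True

# Read-only handle on the existing (pre-0.12) schema.
legacy = dj.create_virtual_module('legacy', 'old_pipeline')

# Freshly declared schema and table for the converted data.
new_schema = dj.schema('new_pipeline')


@new_schema
class Session(dj.Manual):
    definition = """
    session_id : int
    ---
    params : longblob
    """


for row in legacy.Session.fetch(as_dict=True):
    row['params'] = as_dict(row['params'])  # helper from the first sketch
    Session.insert1(row)
```
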
As always, be sure that your data is safely backed up before modifying any
important DataJoint schema or records.

## Documentation and Tutorials
A number of labs are currently adopting DataJoint and we are quickly getting the documentation in shape in February 2017.