Releases: bullet-db/bullet-storm
Extra submit API
This release adds a new `StormUtils.submit` overload that also takes an `org.apache.storm.Config` object, so you can inject a modified Config when submitting a custom topology.
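A minimal sketch of how the new overload might be used. The exact parameter list of `StormUtils.submit` is an assumption here; check the `StormUtils` class for the real signature:

```java
import org.apache.storm.Config;
import com.yahoo.bullet.storm.BulletStormConfig;
import com.yahoo.bullet.storm.StormUtils;

public class CustomSubmitter {
    public static void main(String[] args) throws Exception {
        BulletStormConfig config = new BulletStormConfig("bullet_settings.yaml");
        // Build a Storm Config by hand and tweak it before submission.
        Config stormConfig = new Config();
        stormConfig.setNumWorkers(4);
        stormConfig.setDebug(false);
        // The new overload accepts the modified Storm Config in addition
        // (argument order/shape assumed for illustration).
        StormUtils.submit(config, stormConfig);
    }
}
```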
1.0 First Major Release - Storm 2.0, Expressions, no JSON, PubSub changes, Storage, Replay!
This release updates Bullet Storm to the major 1.0 release and is intended to be used with other components that are 1.0+. It updates Bullet Core to 1.2.0 and Bullet DSL to 1.0.0.
There are a lot of minor bug fixes and major interface changes. The following highlights the most important changes.
- Bullet Storm no longer provides a Storm 0.10 artifact. This release migrates to Storm 2.2, and going forward only Storm 2.0+ will be supported. The artifact should still work on Storm 1.0 clusters, but it definitely will not on older ones.
- Query parsing, and the errors related to it, no longer happen in Bullet Core and therefore no longer happen in the backend. Only the BQL interface is supported, and BQL parsing and error handling are done at the API layer. All queries entering the backend are assumed to be valid and executable. Runtime errors associated with a query can still occur, but they are handled appropriately. The various query defaults formerly provided to the backend are now provided to the Web Service instead and applied when the BQL query is parsed.
- The PubSub no longer transfers JSON queries and instead sends serialized binary objects for queries.
- A Storage layer has been added that can be configured in the Web Service and in the backend. This layer lets you store queries from the Web Service, which can then be re-read for resiliency. In the future, this layer will be used for more, such as state for intermediate results! You can configure this by providing `bullet.storage.class.name` with the path to the specific StorageManager of your choice. At this point, we do not have any practical storages implemented other than a few memory-based ones for testing (these obviously do not communicate across the API, so they are not practically useful). We plan on adding common storages such as MySQL, Redis, etc. shortly!
- You can choose to configure replay, the mechanism by which the storage is read to re-fetch queries (only queries at this time) when a component like the Filter or the Join bolt fails and requests its queries. You can also use this to repopulate your topology automatically with existing queries when it is restarted or the cluster is upgraded. There is an admin endpoint on the API that can be triggered to force a replay in the backend if necessary. The various configurations for enabling and tuning this (including replay compression, replay timeouts, replay batch sizes, replay bolt memory and CPU needs, and new metrics associated with replays) can be found in the settings.
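To make the storage and replay features above concrete, a settings fragment might look like the following. The storage class shown is one of the in-memory testing implementations (its package path is an assumption), and the replay keys are hypothetical stand-ins; consult the settings file shipped with this release for the real names and defaults:

```yaml
# Storage: plug in a StorageManager by class name. MemoryStorageManager
# is a testing-only, in-memory implementation; its package path is assumed.
bullet.storage.class.name: "com.yahoo.bullet.storage.MemoryStorageManager"

# Replay: the key names below are hypothetical examples, not the actual
# settings -- see the shipped settings file for the real configuration.
bullet.topology.replay.bolt.enable: true
bullet.topology.replay.batch.size: 10000
bullet.topology.replay.bolt.memory.on.heap.load: 256.0
```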
0.9.1 (Storm 1.0) Stability and pluggability reconciliation with Bullet DSL 0.1.2
Reconciles the Fat Jar usage with Bullet DSL. This release uses the stable release of DSL.
0.9.1 (Storm 0.10) Stability and pluggability reconciliation with Bullet DSL 0.1.2
This release uses the latest stable release of Bullet DSL 0.1.2. Bullet Storm 0.10 no longer depends on the fat jar for DSL.
Storm versions below 1.2.2 do not allow you to add arbitrary jars when submitting a topology (unless they are part of extlib and global to the Storm cluster). It is therefore pointless to use the fat jar, since for these versions you will need to build an Uber jar anyway. You can then include the right BulletConnector/BulletRecordConverter/BulletDeserializer pluggable dependencies.
0.9.0 (Storm 1.0) DSL!
This release integrates Bullet Storm with Bullet DSL - the project that lets you plug your data into Bullet without having to write code! The DSL works by having you specify a Connector to read from your data source and a Converter to convert the read data into BulletRecords, the typed container format understood by Bullet. You can also optionally provide a Deserializer to deserialize the data read by the connector. You can read more about which connectors and converters are currently supported over at the Bullet DSL project.
To integrate DSL with Bullet Storm, the following changes were made:
- A new DSLSpout and DSLBolt were added. You enable Bullet DSL in Bullet Storm by enabling the DSLSpout, and you can optionally enable the DSLBolt as well. If you do, the BulletConnector you choose will run in the DSLSpout and the BulletRecordConverter will run in the separate DSLBolt. If you don't enable the DSLBolt, both run in the DSLSpout; use that mode if your reading and converting are light. You can also enable the BulletDeserializer, which runs wherever the BulletConnector runs.
- Originally, Bullet Storm let you plug in an arbitrary Spout by passing your Spout implementation as a command-line argument to the main class, com.yahoo.bullet.storm.Topology. This is now specified in the BulletStormConfig file instead, to keep all your settings in one place. You can also configure other settings related to that Spout there.
- You can still plug in an arbitrary topology that produces BulletRecords to the FilterBolt by wiring it up yourself and using the StormUtils submission methods as before.
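As a rough sketch of the do-it-yourself route described above. `MyRecordSpout` is a hypothetical spout of yours, and the exact `StormUtils.submit` signature and wiring contract are assumptions; check `StormUtils` and the FilterBolt's declared inputs for the real contract:

```java
import org.apache.storm.topology.TopologyBuilder;
import com.yahoo.bullet.storm.BulletStormConfig;
import com.yahoo.bullet.storm.StormUtils;

public class CustomDataTopology {
    public static void main(String[] args) throws Exception {
        BulletStormConfig config = new BulletStormConfig("bullet_settings.yaml");
        TopologyBuilder builder = new TopologyBuilder();
        // MyRecordSpout is a hypothetical spout that emits BulletRecords.
        builder.setSpout("record-source", new MyRecordSpout(), 10);
        // Hand the partially wired topology to Bullet, which attaches the
        // FilterBolt and the rest of the Bullet components (signature assumed).
        StormUtils.submit(config, "record-source", builder);
    }
}
```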
The following new settings were added to support all these. For more information, see the settings:
```yaml
bullet.topology.dsl.spout.enable: false
bullet.topology.dsl.spout.parallelism: 10
bullet.topology.dsl.spout.cpu.load: 50.0
bullet.topology.dsl.spout.memory.on.heap.load: 256.0
bullet.topology.dsl.spout.memory.off.heap.load: 160.0
bullet.topology.dsl.bolt.enable: false
bullet.topology.dsl.bolt.parallelism: 10
bullet.topology.dsl.bolt.cpu.load: 50.0
bullet.topology.dsl.bolt.memory.on.heap.load: 256.0
bullet.topology.dsl.bolt.memory.off.heap.load: 160.0
bullet.topology.dsl.deserializer.enable: false
bullet.topology.bullet.spout.class.name:
bullet.topology.bullet.spout.args: []
bullet.topology.bullet.spout.parallelism: 10
bullet.topology.bullet.spout.cpu.load: 50.0
bullet.topology.bullet.spout.memory.on.heap.load: 256.0
bullet.topology.bullet.spout.memory.off.heap.load: 160.0
```

Note that to plug DSL into Storm versions below 1.2.2, you have to build an Uber jar containing the relevant Bullet DSL dependencies (excluding the others) for your data source AND the relevant PubSub you are using. You don't have to write code, but you still need to build this artifact. This is because Storm 1.2 and below do not let you submit anything other than an Uber jar containing all the dependencies, unless it is deployed cluster-wide.
For Storm versions 1.2.2 and higher, you can pass additional jars to the topology launching command using --artifacts. You would pass the Bullet DSL fat jar (containing only the required Bullet DSL dependencies and not the pluggable ones), the fat jar for your PubSub of choice, and the relevant pluggable jars for your DSL connector and converters.
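A hypothetical launch command for that setup might look like the following. The jar names, the config flag, and the flag semantics are illustrative assumptions; note that in some Storm versions `--jars` takes local jar paths while `--artifacts` takes Maven coordinates, so check your Storm version's CLI documentation:

```shell
# Illustrative only: jar names are placeholders and flag usage is assumed.
storm jar bullet-storm-fat.jar com.yahoo.bullet.storm.Topology \
    --jars "bullet-dsl-fat.jar,bullet-kafka-pubsub-fat.jar,my-connector.jar" \
    --bullet-conf ./bullet_settings.yaml
```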
0.9.0 (Storm 0.10) DSL!
This release integrates Bullet Storm with Bullet DSL - the project that lets you plug your data into Bullet without having to write code! The DSL works by having you specify a Connector to read from your data source and a Converter to convert the read data into BulletRecords, the typed container format understood by Bullet. You can also optionally provide a Deserializer to deserialize the data read by the connector. You can read more about which connectors and converters are currently supported over at the Bullet DSL project.
To integrate DSL with Bullet Storm, the following changes were made:
- A new DSLSpout and DSLBolt were added. You enable Bullet DSL in Bullet Storm by enabling the DSLSpout, and you can optionally enable the DSLBolt as well. If you do, the BulletConnector you choose will run in the DSLSpout and the BulletRecordConverter will run in the separate DSLBolt. If you don't enable the DSLBolt, both run in the DSLSpout; use that mode if your reading and converting are light. You can also enable the BulletDeserializer, which runs wherever the BulletConnector runs.
- Originally, Bullet Storm let you plug in an arbitrary Spout by passing your Spout implementation as a command-line argument to the main class, com.yahoo.bullet.storm.Topology. This is now specified in the BulletStormConfig file instead, to keep all your settings in one place. You can also configure other settings related to that Spout there.
- You can still plug in an arbitrary topology that produces BulletRecords to the FilterBolt by wiring it up yourself and using the StormUtils submission methods as before.
The following new settings were added to support all these. For more information, see the settings:
```yaml
bullet.topology.dsl.spout.enable: false
bullet.topology.dsl.spout.parallelism: 10
bullet.topology.dsl.bolt.enable: false
bullet.topology.dsl.bolt.parallelism: 10
bullet.topology.bullet.spout.class.name:
bullet.topology.bullet.spout.args: []
bullet.topology.bullet.spout.parallelism: 10
```

Note that to plug DSL into Bullet Storm 0.10, you have to build an Uber jar containing the relevant Bullet DSL dependencies (excluding the others) for your data source AND the relevant PubSub you are using. You don't have to write code, but you still need to build this artifact. This is because Storm 1.2 and below do not let you submit anything other than an Uber jar containing all the dependencies, unless it is deployed cluster-wide.
0.8.5 (Storm 1.0) Updates Core to 0.6.4
0.8.5 (Storm 0.10) Updates Core to 0.6.4
0.8.4 (Storm 0.10) Partitioning! Updates Core to 0.6.2
This release migrates to bullet-core-0.6.2 and adds partitioning capability to the FilterBolt. Enable the bullet-core partitioning setting and pick your class (currently the SimpleEqualityPartitioner is available) or implement your own.
One new setting has been added to configure the FilterBolt to control how frequently the partitioning stats are logged. See bullet.topology.filter.bolt.stats.report.ticks in the settings.
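For illustration, enabling partitioning might look like the fragment below. The bullet-core key names and the partitioner's package path are assumptions based on Bullet Core conventions; verify them against the bullet-core 0.6.2 settings:

```yaml
# Key names and class path below are assumed, not confirmed -- check
# the bullet-core 0.6.2 defaults for the real settings.
bullet.query.partitioner.enable: true
bullet.query.partitioner.class.name: "com.yahoo.bullet.querying.partitioning.SimpleEqualityPartitioner"
bullet.equality.partitioner.fields.to.partition.on: ["id"]
# How often (in ticks) the FilterBolt logs partitioning stats.
bullet.topology.filter.bolt.stats.report.ticks: 20
```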
0.8.4 (Storm 1.0) Partitioning! Updates Core to 0.6.2
This release migrates to bullet-core-0.6.2 and adds partitioning capability to the FilterBolt. Enable the bullet-core partitioning setting and pick your class (currently the SimpleEqualityPartitioner is available) or implement your own.
One new setting has been added to configure the FilterBolt to control how frequently the partitioning stats are logged. See bullet.topology.filter.bolt.stats.report.ticks in the settings.