Releases: bullet-db/bullet-storm

0.6.2 (Storm 1.0) Fat jar for DRPC PubSub use in Web Service

25 Oct 19:47

This release builds a fat jar with the DRPC PubSub dependencies, which you need only if you're plugging the DRPC PubSub into your Web Service. You don't need it for running the Storm backend, since there you'll be building a fat jar anyway with your data source plugged in. The classifier to use is fat.

0.6.2 (Storm 0.10) Fat jar for DRPC PubSub use in Web Service

25 Oct 19:46

This release builds a fat jar with the DRPC PubSub dependencies, which you need only if you're plugging the DRPC PubSub into your Web Service. You don't need it for running the Storm backend, since there you'll be building a fat jar anyway with your data source plugged in. The classifier to use is fat.

0.6.1 (Storm 1.0) DRPC PubSub

18 Oct 22:34

This adds back the original Storm DRPC implementation of Bullet, redone on top of the new PubSub architecture. In particular, in the QUERY_SUBMISSION context, the publisher writes a PubSubMessage to Storm DRPC over HTTP (asynchronously) and the subscriber reads the response back through the Storm DRPC HTTP channel. The payloads are JSON representations of the PubSubMessage.

To use this, set your context accordingly and set:
bullet.pubsub.class.name: "com.yahoo.bullet.storm.drpc.DRPCPubSub"

Alternatively, if you load BulletStormConfig (for example, when building a fat jar with your data-reading component plugged in), it uses the default settings. Take a look at the new settings that have been added; the explanations for what they do are provided above them, as usual.
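
For illustration, a minimal settings sketch for the Web Service (QUERY_SUBMISSION) side might look like the following. Other than bullet.pubsub.class.name, the setting names and values here are assumptions patterned on the existing settings; consult this release's bullet_defaults.yaml for the authoritative names and defaults.

# Hedged sketch of a DRPC PubSub configuration; names below, other than
# bullet.pubsub.class.name, are assumptions. Verify against bullet_defaults.yaml.
bullet.pubsub.context.name: "QUERY_SUBMISSION"
bullet.pubsub.class.name: "com.yahoo.bullet.storm.drpc.DRPCPubSub"
# Assumed settings pointing at your Storm DRPC servers over HTTP
bullet.pubsub.storm.drpc.servers:
  - "drpc-server-1.example.com"
  - "drpc-server-2.example.com"
bullet.pubsub.storm.drpc.function: "bullet-query"
bullet.pubsub.storm.drpc.http.port: "3774"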

This release also lowers the default parallelisms for the query spouts and result bolts.

0.6.1 (Storm 0.10) DRPC PubSub

18 Oct 22:33

This adds back the original Storm DRPC implementation of Bullet, redone on top of the new PubSub architecture. In particular, in the QUERY_SUBMISSION context, the publisher writes a PubSubMessage to Storm DRPC over HTTP (asynchronously) and the subscriber reads the response back through the Storm DRPC HTTP channel. The payloads are JSON representations of the PubSubMessage.

To use this, set your context accordingly and set:
bullet.pubsub.class.name: "com.yahoo.bullet.storm.drpc.DRPCPubSub"

Alternatively, if you load BulletStormConfig (for example, when building a fat jar with your data-reading component plugged in), it uses the default settings. Take a look at the new settings that have been added; the explanations for what they do are provided above them, as usual. See the sketch under the Storm 1.0 release of 0.6.1 above for an illustration.

This release also lowers the default parallelisms for the query spouts and result bolts.

0.6.0 (Storm 1.0) Use Bullet PubSub

30 Aug 21:23

Pre-release

This release changes Bullet Storm to use our new architecture:

PubSub architecture

Since we do not have a PubSub implementation at this moment, you cannot use this release.

Stay tuned for the release of bullet-kafka, a PubSub implementation using Kafka, and for our addition of a PubSub using Storm DRPC (which is how bullet-storm works right now).

This release removes the DRPCSpout spout, the PrepareRequest bolt, and the ReturnResults bolt from the Topology. Instead, a QuerySpout spout takes the place of the DRPCSpout spout and the PrepareRequest bolt, while a ResultBolt bolt takes the place of the ReturnResults bolt. These two new components read queries from and write results to the PubSub implementation using our PubSub interfaces. Here is a full list of all the major changes:

  1. Removed DRPCSpout spout, PrepareRequest bolt. Replaced with QuerySpout spout.
  2. Removed ReturnResults bolt. Replaced with ResultBolt bolt.
  3. Messages read from QuerySpout are expected to be instances of PubSubMessage with the query ID and the body.
  4. The ID used to identify queries is now a String (a UUID) generated in the Web Service.
  5. The Return information used in the Join Bolt is now a Metadata instance that is PubSub specific and used by the ResultBolt bolt and your PubSub instance to route the result accordingly.
  6. The submit method now instantiates a PubSub based on your Config.
  7. The bullet.topology.function: "tracer" setting has been removed. It will be re-added when we add the DRPC PubSub version.
  8. All the old settings relating to the parallelism, CPU, and memory of the DRPCSpout spout, PrepareRequest bolt, and ReturnResults bolt have been removed and replaced with the new components' equivalents. See settings for details; a hedged sketch of the new settings follows this list.
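
As a rough illustration of item 8, the new components' settings might look like this in your YAML file. These setting names are assumptions patterned on the removed DRPC-era settings, and the values are placeholders; see this release's settings documentation for the actual names and defaults.

# Assumed setting names and placeholder values; verify against bullet_defaults.yaml
bullet.topology.query.spout.parallelism: 2
bullet.topology.query.spout.cpu.load: 20.0
bullet.topology.query.spout.memory.on.heap.load: 256.0
bullet.topology.query.spout.memory.off.heap.load: 160.0
bullet.topology.result.bolt.parallelism: 2
bullet.topology.result.bolt.cpu.load: 20.0
bullet.topology.result.bolt.memory.on.heap.load: 256.0
bullet.topology.result.bolt.memory.off.heap.load: 160.0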

You may implement a PubSub and try it out; however, if you want the end-to-end solution to work, stay tuned for bullet-service to release its corresponding changes to use the PubSub.

0.6.0 (Storm 0.10) Use Bullet PubSub

30 Aug 21:24

Pre-release

This release changes Bullet Storm to use our new architecture:

PubSub architecture

Since we do not have a PubSub implementation at this moment, you cannot use this release.

Stay tuned for the release of bullet-kafka, a PubSub implementation using Kafka, and for our addition of a PubSub using Storm DRPC (which is how bullet-storm works right now).

This release removes the DRPCSpout spout, the PrepareRequest bolt, and the ReturnResults bolt from the Topology. Instead, a QuerySpout spout takes the place of the DRPCSpout spout and the PrepareRequest bolt, while a ResultBolt bolt takes the place of the ReturnResults bolt. These two new components read queries from and write results to the PubSub implementation using our PubSub interfaces. Here is a full list of all the major changes:

  1. Removed DRPCSpout spout, PrepareRequest bolt. Replaced with QuerySpout spout.
  2. Removed ReturnResults bolt. Replaced with ResultBolt bolt.
  3. Messages read from QuerySpout are expected to be instances of PubSubMessage with the query ID and the body.
  4. The ID used to identify queries is now a String (a UUID) generated in the Web Service.
  5. The Return information used in the Join Bolt is now a Metadata instance that is PubSub specific and used by the ResultBolt bolt and your PubSub instance to route the result accordingly.
  6. The submit method now instantiates a PubSub based on your Config.
  7. The bullet.topology.function: "tracer" setting has been removed. It will be re-added when we add the DRPC PubSub version.
  8. All the old settings relating to the parallelisms of the DRPCSpout spout, PrepareRequest bolt, and ReturnResults bolt have been removed and replaced with the new components' equivalents. See settings for details.

You may implement a PubSub and try it out; however, if you want the end-to-end solution to work, stay tuned for bullet-service to release its corresponding changes to use the PubSub.

0.5.0 (Storm 1.0) Removing Bullet Core

27 Jun 21:16

This release moves the core Bullet logic that is not Storm specific out of Bullet Storm and into its own package and artifact. It is also available through JCenter if you need it. This is the first step toward letting us implement Bullet on other Stream Processors. It also lets us more easily reuse core Bullet logic in other places.

From a user's point of view, the main class has moved from com.yahoo.bullet.Topology to com.yahoo.bullet.storm.Topology. You will have to point to this new location if you're launching Bullet from the command line, passing your Spout implementation.

The configuration has also been broken up. com.yahoo.bullet.BulletConfig is now available in bullet-core and only manages settings for bullet-core. A new com.yahoo.bullet.storm.BulletStormConfig, available from this release, holds the additional Storm-specific settings. You can still use a single YAML file to manage all your settings, but you should pass it to BulletStormConfig instead. All the methods in com.yahoo.bullet.storm.Topology now use BulletStormConfig instead of BulletConfig. If you were using your own custom IMetricsConsumer, the register call that previously required a BulletConfig now requires a BulletStormConfig as well. Please change it accordingly.
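
For example, a single YAML file passed to BulletStormConfig can still carry both kinds of settings. The specific entries below are an illustrative sketch, not an exhaustive or authoritative list:

# bullet-core settings (managed by BulletConfig under the hood)
bullet.query.max.duration: 120000
bullet.query.aggregation.max.size: 512
# Storm-specific settings (managed by BulletStormConfig)
bullet.topology.name: "bullet-topology"
bullet.topology.workers: 20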

Issues

#31

0.5.0 (Storm 0.10) Removing Bullet Core

27 Jun 21:15

This release moves the core Bullet logic that is not Storm specific out of Bullet Storm and into its own package and artifact. It is also available through JCenter if you need it. This is the first step toward letting us implement Bullet on other Stream Processors. It also lets us more easily reuse core Bullet logic in other places.

From a user's point of view, the main class has moved from com.yahoo.bullet.Topology to com.yahoo.bullet.storm.Topology. You will have to point to this new location if you're launching Bullet from the command line, passing your Spout implementation.

The configuration has also been broken up. com.yahoo.bullet.BulletConfig is now available in bullet-core and only manages settings for bullet-core. A new com.yahoo.bullet.storm.BulletStormConfig, available from this release, holds the additional Storm-specific settings. You can still use a single YAML file to manage all your settings, but you should pass it to BulletStormConfig instead. All the methods in com.yahoo.bullet.storm.Topology now use BulletStormConfig instead of BulletConfig. If you were using your own custom IMetricsConsumer, the register call that previously required a BulletConfig now requires a BulletStormConfig as well. Please change it accordingly.

Issues

#31

0.4.3 (Storm 1.0) DISTRIBUTION point generation rounding and a latency metric

10 Jun 00:20

This release adds rounding of the generated points for the DISTRIBUTION aggregation as a topology-wide setting. You can now use bullet.query.aggregation.distribution.generated.points.rounding to specify the maximum number of decimal places generated points can have. This is primarily to round off the small deltas that appear when Bullet generates floating-point numbers: no longer will you see 0.799999999999 for a QUANTILE point when you were expecting 0.8. Note that for queries that specify points directly, the value of each point (interpreted as a floating-point number) is used as-is, and this setting will not round it off.

This release also adds a new metric that can be emitted from the Filter Bolt component. If the Storm tuples sent to the Filter Bolt from your Data Source component have an Object after the BulletRecord, it will be interpreted as a Long timestamp. The Filter Bolt will then subtract this timestamp from the current time right before it acks the tuple and add the difference to a running average. This average is emitted as the bullet_filter_latency metric (in ms). If you set this timestamp to the time at which you read the record, the latency metric essentially tells you the time taken for your record to be processed up to and including the Filter Bolt. You can then use the metric to see how your latencies go up as you add more and more simultaneous queries, helping you identify bottlenecks.

This release also changes the default value of bullet.record.inject.timestamp.key from bullet_filter_timestamp to bullet_project_timestamp to more accurately reflect what the timestamp represents.

See bullet_defaults.yaml for the additions and documentation to use them.
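
As a sketch, the relevant entries in bullet_defaults.yaml look roughly like this; the rounding value shown is an example, not necessarily the shipped default:

# Maximum number of decimal places allowed in generated DISTRIBUTION points
bullet.query.aggregation.distribution.generated.points.rounding: 6
# New default for the injected timestamp key
bullet.record.inject.timestamp.key: "bullet_project_timestamp"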

Issues

#28

0.4.3 (Storm 0.10) DISTRIBUTION point generation rounding and a latency metric

10 Jun 00:19

This release adds rounding of the generated points for the DISTRIBUTION aggregation as a topology-wide setting. You can now use bullet.query.aggregation.distribution.generated.points.rounding to specify the maximum number of decimal places generated points can have. This is primarily to round off the small deltas that appear when Bullet generates floating-point numbers: no longer will you see 0.799999999999 for a QUANTILE point when you were expecting 0.8. Note that for queries that specify points directly, the value of each point (interpreted as a floating-point number) is used as-is, and this setting will not round it off.

This release also adds a new metric that can be emitted from the Filter Bolt component. If the Storm tuples sent to the Filter Bolt from your Data Source component have an Object after the BulletRecord, it will be interpreted as a Long timestamp. The Filter Bolt will then subtract this timestamp from the current time right before it acks the tuple and add the difference to a running average. This average is emitted as the bullet_filter_latency metric (in ms). If you set this timestamp to the time at which you read the record, the latency metric essentially tells you the time taken for your record to be processed up to and including the Filter Bolt. You can then use the metric to see how your latencies go up as you add more and more simultaneous queries, helping you identify bottlenecks.

This release also changes the default value of bullet.record.inject.timestamp.key from bullet_filter_timestamp to bullet_project_timestamp to more accurately reflect what the timestamp represents.

See bullet_defaults.yaml for the additions and documentation to use them.

Issues

#28