Twitter "spritzer" Firehose Factory named "twitzer".
Builds a Firehose that emits a stream of
??
with timestamps along with ??.
The generated tuples have the form (timestamp, ????)
where the timestamp is taken from the Twitter event.
Example spec file:
Example query using POST to /druid/v2/?w (where w is an arbitrary parameter and the date and time
are in UTC):
Notes on the twitter.com HTTP (REST) API: v1.0 will be disabled around 2013-03, so v1.1 should be
used; twitter4j 3.0 (not yet released) will support the v1.1 API.
Specifically, we should be using https://stream.twitter.com/1.1/statuses/sample.json
See: http://jira.twitter4j.org/browse/TFJ-186
Notes on JSON parsing: as of twitter4j 2.2.x, the JSON parser has some bugs (e.g., Status.toString()
can throw number format exceptions), so it might be necessary to extract the raw JSON and process it
separately. If so, set twitter4j.jsonStoreEnabled=true and look at DataObjectFactory#getRawJSON();
com.fasterxml.jackson.databind.ObjectMapper should be used to parse it.
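
A minimal sketch of the raw-JSON route described above, assuming Jackson 2.x is on the classpath. The class name RawStatusParser, its methods, and the sample status string are hypothetical and only for illustration; in the firehose the string would instead come from DataObjectFactory#getRawJSON() with the JSON store enabled.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;

/** Hypothetical helper (not part of the firehose): parses one raw status JSON string. */
public class RawStatusParser {
    // ObjectMapper is thread-safe once configured, so a single shared instance is enough.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    /** Parses the raw JSON of a single status into a tree, bypassing twitter4j's own parser. */
    public static JsonNode parse(String rawJson) throws IOException {
        return MAPPER.readTree(rawJson);
    }

    /** Returns the named top-level field as text, or "" if the field is absent. */
    public static String field(JsonNode status, String name) {
        // path() returns a "missing" node instead of null, so this never throws for absent fields.
        return status.path(name).asText();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a hand-written sample standing in for the output of getRawJSON().
        String raw = "{\"created_at\":\"Wed Aug 27 13:08:45 +0000 2008\",\"text\":\"hello\"}";
        JsonNode status = parse(raw);
        System.out.println(field(status, "created_at") + " | " + field(status, "text"));
    }
}
```

Reading fields from the tree rather than binding to a fixed class keeps the sketch tolerant of fields that are missing or that change between API versions.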