public abstract class PrefetchableTextFilesFirehoseFactory<T> extends AbstractTextFilesFirehoseFactory<T>
On the first call to connect(StringInputRowParser, File), it caches objects on local disk
up to maxCacheCapacityBytes. These caches are NOT deleted until the process terminates, and thus can be reused by
future reads.
When the amount of fetched data remaining to be read drops below prefetchTriggerBytes, a background prefetch thread automatically starts fetching the remaining
objects.
If fetching an object fails, the fetch is retried up to maxFetchRetry times.
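The retry behavior bounded by maxFetchRetry can be illustrated with a small, self-contained sketch. The class name FetchRetrySketch, the Fetcher interface, and the choice of one initial attempt plus maxFetchRetry retries are assumptions made for illustration. They are not taken from this class's actual implementation, which may also wait between attempts.

```java
import java.io.IOException;

// Hypothetical illustration only; not part of the documented API.
public class FetchRetrySketch {

    /** A fetch operation that may fail with an IOException. */
    @FunctionalInterface
    public interface Fetcher<R> {
        R fetch() throws IOException;
    }

    /**
     * Runs the fetch once and, if it throws an IOException, retries it
     * up to maxFetchRetry more times. The last failure is rethrown
     * when every attempt has failed.
     */
    public static <R> R fetchWithRetry(Fetcher<R> fetcher, int maxFetchRetry) throws IOException {
        if (maxFetchRetry < 0) {
            throw new IllegalArgumentException("maxFetchRetry must be >= 0");
        }
        IOException last = null;
        // Assumption: attempt 0 is the initial try, attempts 1..maxFetchRetry are retries.
        for (int attempt = 0; attempt <= maxFetchRetry; attempt++) {
            try {
                return fetcher.fetch();
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

A caller would wrap the download of a single object in a Fetcher and pass the configured retry limit, so a transient failure does not fail the whole read.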
This implementation can be useful when reading input objects is expensive, as when reading from AWS S3, because
batch tasks like IndexTask or HadoopIndexTask can read the whole data twice, once to determine partition specs and once to
generate segments, if the intervals of the GranularitySpec are not specified.
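If a fetched file is already available on local disk, the firehose returns a LineIterator reading that file.
If an object has to be fetched first, the firehose returns a LineIterator only when the
download operation has finished successfully.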
When prefetching is disabled, the firehose returns a LineIterator which directly reads the stream opened by
AbstractTextFilesFirehoseFactory.openObjectStream(T). If an IOException occurs, it is thrown and the read fails.

| Constructor and Description |
|---|
| PrefetchableTextFilesFirehoseFactory(Long maxCacheCapacityBytes, Long maxFetchCapacityBytes, Long prefetchTriggerBytes, Long fetchTimeout, Integer maxFetchRetry) |
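As a rough illustration of how the maxFetchCapacityBytes and prefetchTriggerBytes constructor parameters might interact, the sketch below models the decision to start a background prefetch. The class PrefetchTriggerSketch and its methods are hypothetical. Two behaviors are assumed here and not confirmed by this page: that a maxFetchCapacityBytes of zero disables prefetching, and that the comparison against prefetchTriggerBytes is inclusive.

```java
// Hypothetical illustration only; not part of the documented API.
public class PrefetchTriggerSketch {
    private final long maxFetchCapacityBytes;
    private final long prefetchTriggerBytes;
    // Bytes fetched to local disk that have not been read yet.
    private long fetchedBytes;

    public PrefetchTriggerSketch(long maxFetchCapacityBytes, long prefetchTriggerBytes) {
        this.maxFetchCapacityBytes = maxFetchCapacityBytes;
        this.prefetchTriggerBytes = prefetchTriggerBytes;
    }

    /** Assumption: a fetch capacity of zero means prefetching is turned off. */
    public boolean isPrefetchEnabled() {
        return maxFetchCapacityBytes > 0;
    }

    /** True when the unread fetched data has shrunk to the trigger threshold or below. */
    public boolean shouldStartPrefetch() {
        return isPrefetchEnabled() && fetchedBytes <= prefetchTriggerBytes;
    }

    /** Records that an object of the given size was fetched to local disk. */
    public void onFetched(long bytes) {
        fetchedBytes += bytes;
    }

    /** Records that a fetched object of the given size was fully read. */
    public void onConsumed(long bytes) {
        fetchedBytes -= bytes;
    }
}
```

With a capacity of 100 bytes and a trigger of 50 bytes, this model asks for a prefetch while 50 bytes or fewer of fetched data remain unread, and stays idle while more than that is still waiting on disk.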
| Modifier and Type | Method and Description |
|---|---|
| Firehose | connect(StringInputRowParser firehoseParser, File temporaryDirectory): Initialization method that connects up the fire hose. |
| long | getFetchTimeout() |
| long | getMaxCacheCapacityBytes() |
| long | getMaxFetchCapacityBytes() |
| int | getMaxFetchRetry() |
| long | getPrefetchTriggerBytes() |
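Methods inherited from class AbstractTextFilesFirehoseFactory: initObjects, openObjectStream, wrapObjectStream
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface FirehoseFactory: connect

public long getMaxCacheCapacityBytes()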
public long getMaxFetchCapacityBytes()
public long getPrefetchTriggerBytes()
public long getFetchTimeout()
public int getMaxFetchRetry()
public Firehose connect(StringInputRowParser firehoseParser, File temporaryDirectory) throws IOException
Description copied from interface: FirehoseFactory
PrefetchableTextFilesFirehoseFactory may use a temporary directory to cache data in it.
Specified by: connect in interface FirehoseFactory<StringInputRowParser>
Overrides: connect in class AbstractTextFilesFirehoseFactory<T>
Parameters:
firehoseParser - an input row parser
temporaryDirectory - a directory where temporary files are stored
Throws: IOException

Copyright © 2011–2018. All rights reserved.