public class HadoopDruidIndexerConfig extends Object
Modifier and Type | Class and Description |
---|---|
static class |
HadoopDruidIndexerConfig.IndexJobCounters |
Modifier and Type | Field and Description |
---|---|
static String |
CONFIG_PROPERTY |
static HadoopKerberosConfig |
HADOOP_KERBEROS_CONFIG |
static IndexIO |
INDEX_IO |
static IndexMerger |
INDEX_MERGER |
static IndexMergerV9 |
INDEX_MERGER_V9 |
static Charset |
JAVA_NATIVE_CHARSET |
static com.fasterxml.jackson.databind.ObjectMapper |
JSON_MAPPER |
static com.google.common.base.Joiner |
TAB_JOINER |
static com.google.common.base.Splitter |
TAB_SPLITTER |
Constructor and Description |
---|
HadoopDruidIndexerConfig(HadoopIngestionSpec spec) |
Modifier and Type | Method and Description |
---|---|
org.apache.hadoop.mapreduce.Job |
addInputPaths(org.apache.hadoop.mapreduce.Job job)
Job instance should have Configuration set (by calling
addJobProperties(Job)
or via injected system properties) before this method is called. |
void |
addJobProperties(org.apache.hadoop.mapreduce.Job job) |
static HadoopDruidIndexerConfig |
fromConfiguration(org.apache.hadoop.conf.Configuration conf) |
static HadoopDruidIndexerConfig |
fromDistributedFileSystem(String path) |
static HadoopDruidIndexerConfig |
fromFile(File file) |
static HadoopDruidIndexerConfig |
fromMap(Map<String,Object> argSpec) |
static HadoopDruidIndexerConfig |
fromSpec(HadoopIngestionSpec spec) |
static HadoopDruidIndexerConfig |
fromString(String str) |
com.google.common.base.Optional<Iterable<Bucket>> |
getAllBuckets() |
com.google.common.base.Optional<Bucket> |
getBucket(InputRow inputRow)
Get the proper bucket for some input row.
|
String |
getDataSource() |
GranularitySpec |
getGranularitySpec() |
IndexSpec |
getIndexSpec() |
List<org.joda.time.Interval> |
getInputIntervals() |
com.google.common.base.Optional<List<org.joda.time.Interval>> |
getIntervals() |
long |
getMaxPartitionSize() |
InputRowParser |
getParser() |
PartitionsSpec |
getPartitionsSpec() |
PathSpec |
getPathSpec() |
HadoopIngestionSpec |
getSchema() |
com.google.common.base.Optional<Set<org.joda.time.Interval>> |
getSegmentGranularIntervals() |
HadoopyShardSpec |
getShardSpec(Bucket bucket) |
int |
getShardSpecCount(Bucket bucket) |
Long |
getTargetPartitionSize() |
String |
getWorkingPath() |
void |
intoConfiguration(org.apache.hadoop.mapreduce.Job job) |
boolean |
isBuildV9Directly() |
boolean |
isCombineText() |
boolean |
isDeterminingPartitions() |
boolean |
isForceExtendableShardSpecs() |
boolean |
isIgnoreInvalidRows() |
boolean |
isOverwriteFiles() |
boolean |
isUpdaterJobSpecSet() |
org.apache.hadoop.fs.Path |
makeDescriptorInfoDir() |
org.apache.hadoop.fs.Path |
makeDescriptorInfoPath(DataSegment segment) |
org.apache.hadoop.fs.Path |
makeGroupedDataDir() |
org.apache.hadoop.fs.Path |
makeIntermediatePath()
Make the intermediate path for this job run.
|
org.apache.hadoop.fs.Path |
makeIntervalInfoPath() |
org.apache.hadoop.fs.Path |
makeSegmentPartitionInfoPath(org.joda.time.Interval bucketInterval) |
void |
setGranularitySpec(GranularitySpec granularitySpec) |
void |
setShardSpecs(Map<Long,List<HadoopyShardSpec>> shardSpecs) |
void |
setVersion(String version) |
void |
verify() |
public static final String CONFIG_PROPERTY
public static final Charset JAVA_NATIVE_CHARSET
public static final com.google.common.base.Splitter TAB_SPLITTER
public static final com.google.common.base.Joiner TAB_JOINER
public static final com.fasterxml.jackson.databind.ObjectMapper JSON_MAPPER
public static final IndexIO INDEX_IO
public static final IndexMerger INDEX_MERGER
public static final IndexMergerV9 INDEX_MERGER_V9
public static final HadoopKerberosConfig HADOOP_KERBEROS_CONFIG
public HadoopDruidIndexerConfig(HadoopIngestionSpec spec)
public static HadoopDruidIndexerConfig fromSpec(HadoopIngestionSpec spec)
public static HadoopDruidIndexerConfig fromMap(Map<String,Object> argSpec)
public static HadoopDruidIndexerConfig fromFile(File file)
public static HadoopDruidIndexerConfig fromString(String str)
public static HadoopDruidIndexerConfig fromDistributedFileSystem(String path)
public static HadoopDruidIndexerConfig fromConfiguration(org.apache.hadoop.conf.Configuration conf)
public HadoopIngestionSpec getSchema()
public PathSpec getPathSpec()
public String getDataSource()
public GranularitySpec getGranularitySpec()
public void setGranularitySpec(GranularitySpec granularitySpec)
public PartitionsSpec getPartitionsSpec()
public IndexSpec getIndexSpec()
public boolean isOverwriteFiles()
public boolean isIgnoreInvalidRows()
public void setVersion(String version)
public void setShardSpecs(Map<Long,List<HadoopyShardSpec>> shardSpecs)
public com.google.common.base.Optional<List<org.joda.time.Interval>> getIntervals()
public boolean isDeterminingPartitions()
public Long getTargetPartitionSize()
public boolean isForceExtendableShardSpecs()
public long getMaxPartitionSize()
public boolean isUpdaterJobSpecSet()
public boolean isCombineText()
public InputRowParser getParser()
public HadoopyShardSpec getShardSpec(Bucket bucket)
public int getShardSpecCount(Bucket bucket)
public boolean isBuildV9Directly()
public org.apache.hadoop.mapreduce.Job addInputPaths(org.apache.hadoop.mapreduce.Job job) throws IOException
Job instance should have Configuration set (by calling
addJobProperties(Job)
or via injected system properties) before this method is called. The PathSpec
may create objects which depend on the values of these configurations.
Parameters: job
Throws: IOException
public com.google.common.base.Optional<Bucket> getBucket(InputRow inputRow)
Get the proper bucket for some input row.
Parameters: inputRow - an InputRow
public com.google.common.base.Optional<Set<org.joda.time.Interval>> getSegmentGranularIntervals()
public List<org.joda.time.Interval> getInputIntervals()
public String getWorkingPath()
public org.apache.hadoop.fs.Path makeIntermediatePath()
public org.apache.hadoop.fs.Path makeSegmentPartitionInfoPath(org.joda.time.Interval bucketInterval)
public org.apache.hadoop.fs.Path makeIntervalInfoPath()
public org.apache.hadoop.fs.Path makeDescriptorInfoDir()
public org.apache.hadoop.fs.Path makeGroupedDataDir()
public org.apache.hadoop.fs.Path makeDescriptorInfoPath(DataSegment segment)
public void addJobProperties(org.apache.hadoop.mapreduce.Job job)
public void intoConfiguration(org.apache.hadoop.mapreduce.Job job)
public void verify()
Copyright © 2011–2017. All rights reserved.