A platform for high-performance distributed tool and library development written in C++. It can be deployed in two different cluster modes: standalone or distributed. API for v0.5.0, released on June 13, 2018.
#include <TupleSetJobStageBuilder.h>
Private Attributes | |
std::string | jobId |
int | jobStageId |
std::string | sourceTupleSetName |
std::string | targetTupleSetName |
std::string | targetComputationName |
Handle< ComputePlan > | computePlan |
std::vector< std::string > | buildTheseTupleSets |
std::string | outputTypeName |
Handle< SetIdentifier > | sourceContext |
Handle< SetIdentifier > | combinerContext |
Handle< SetIdentifier > | sinkContext |
Handle< Map< String, String > > | hashSetsToProbe |
bool | isBroadcasting |
bool | isRepartitioning |
bool | isRepartitionJoin |
bool | isRepartitionVector |
bool | isProbing |
AllocatorPolicy | policy |
bool | inputAggHashOut |
bool | isCollectAsMap |
int | numNodesToCollect |
Definition at line 29 of file TupleSetJobStageBuilder.h.
pdb::TupleSetJobStageBuilder::TupleSetJobStageBuilder | ( | ) |
Initializes the builder
Definition at line 23 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::addHashSetToProbe | ( | const std::string & | outputName, |
const std::string & | hashSetName | ||
) |
Add a hash set to probe
outputName | - the output name of the computation |
hashSetName | - the name of the hash set |
Definition at line 72 of file TupleSetJobStageBuilder.cc.
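The following is a minimal sketch of how the hash-set registration could be modeled. It does not use the PDB headers: `ProbeTableSketch`, `probes` and `probeTableDemo` are hypothetical names, `std::map` stands in for `Handle< Map< String, String > >`, and storing the entry under `outputName` is an assumption about the implementation rather than something this reference states.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the builder's hashSetsToProbe member.
// std::map replaces Handle< Map< String, String > > so the sketch
// compiles without the PDB headers.
struct ProbeTableSketch {
    std::map<std::string, std::string> hashSetsToProbe;

    // Assumed behavior: record which hash set the given output probes.
    void addHashSetToProbe(const std::string &outputName,
                           const std::string &hashSetName) {
        hashSetsToProbe[outputName] = hashSetName;
    }

    // Hypothetical helper: true if a hash set was registered for this output.
    bool probes(const std::string &outputName) const {
        return hashSetsToProbe.count(outputName) > 0;
    }
};

// Registers two illustrative entries and returns how many are stored.
inline std::size_t probeTableDemo() {
    ProbeTableSketch sketch;
    sketch.addHashSetToProbe("joinOut_1", "hashSet_A");
    sketch.addHashSetToProbe("joinOut_2", "hashSet_B");
    assert(sketch.probes("joinOut_1"));
    return sketch.hashSetsToProbe.size();
}
```

The output and hash set names used in `probeTableDemo` are made up for illustration only.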
void pdb::TupleSetJobStageBuilder::addTupleSetToBuildPipeline | ( | const std::string & | buildMe | ) |
Add a tuple set to the list of tuple sets that build up the pipeline
buildMe | the tuple set name |
Definition at line 68 of file TupleSetJobStageBuilder.cc.
Handle< TupleSetJobStage > pdb::TupleSetJobStageBuilder::build | ( | ) |
Returns the built TupleSetJobStage
Definition at line 144 of file TupleSetJobStageBuilder.cc.
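The setter-then-build pattern documented on this page can be sketched as follows. This is an illustration only: `JobStageSketch` and `JobStageBuilderSketch` are simplified, hypothetical stand-ins for `TupleSetJobStage` and `TupleSetJobStageBuilder`, `std::shared_ptr` replaces `Handle`, only a handful of the setters are mirrored, and the internal behavior of the real `build()` is not described in this reference.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for pdb::TupleSetJobStage (hypothetical fields).
struct JobStageSketch {
    std::string jobId;
    int jobStageId = 0;
    std::string sourceTupleSetName;
    std::string targetTupleSetName;
    std::vector<std::string> buildTheseTupleSets;
    bool isProbing = false;
};

// Simplified stand-in for pdb::TupleSetJobStageBuilder.
class JobStageBuilderSketch {
public:
    void setJobId(const std::string &jobId) { stage.jobId = jobId; }
    void setJobStageId(int jobStageId) { stage.jobStageId = jobStageId; }
    void setSourceTupleSetName(const std::string &name) { stage.sourceTupleSetName = name; }
    void setTargetTupleSetName(const std::string &name) { stage.targetTupleSetName = name; }
    void addTupleSetToBuildPipeline(const std::string &buildMe) {
        stage.buildTheseTupleSets.push_back(buildMe);
    }
    void setProbing(bool isProbing) { stage.isProbing = isProbing; }

    // Copies the collected settings into a freshly allocated stage object.
    std::shared_ptr<JobStageSketch> build() const {
        return std::make_shared<JobStageSketch>(stage);
    }

private:
    JobStageSketch stage;
};

// Configures a builder with illustrative values and builds one stage.
inline std::shared_ptr<JobStageSketch> buildExampleStage() {
    JobStageBuilderSketch builder;
    builder.setJobId("job_42");
    builder.setJobStageId(0);
    builder.setSourceTupleSetName("inputData");
    builder.setTargetTupleSetName("selected");
    builder.addTupleSetToBuildPipeline("inputData");
    builder.addTupleSetToBuildPipeline("selected");
    builder.setProbing(false);
    std::shared_ptr<JobStageSketch> built = builder.build();
    assert(built != nullptr);
    return built;
}
```

All identifiers and values passed to the setters in `buildExampleStage` are invented for the example.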
const std::string & pdb::TupleSetJobStageBuilder::getLastSetThatBuildsPipeline | ( | ) | const |
Returns the last set that builds the pipeline. TODO: this does not belong in a builder; see how we can remove it.
Definition at line 140 of file TupleSetJobStageBuilder.cc.
Handle< SetIdentifier > pdb::TupleSetJobStageBuilder::getSourceSetIdentifier | ( | ) |
Returns the identifier of the source set. TODO: this does not belong in a builder; see how we can remove it.
Definition at line 132 of file TupleSetJobStageBuilder.cc.
const std::string & pdb::TupleSetJobStageBuilder::getSourceTupleSetName | ( | ) | const |
Returns the source tuple set name. TODO: this does not belong in a builder; see how we can remove it.
Definition at line 136 of file TupleSetJobStageBuilder.cc.
bool pdb::TupleSetJobStageBuilder::isPipelineProbing | ( | ) |
Will this pipeline probe a hash set?
Definition at line 128 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setAllocatorPolicy | ( | AllocatorPolicy | policy | ) |
Sets the allocation policy of the computation
policy |
Definition at line 96 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setBroadcasting | ( | bool | broadcastOrNot | ) |
True if we are running a pipeline with a broadcast sink, false otherwise
broadcastOrNot | - the value |
Definition at line 108 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setCollectAsMap | ( | bool | collectAsMapOrNot | ) |
If this parameter is set to true, the results of a pipeline with a shuffle sink will be sent to nodes 0 through (numNodesToCollect - 1)
collectAsMapOrNot | - the value |
Definition at line 120 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setCombiner | ( | Handle< SetIdentifier > | combinerContext | ) |
Sets a combiner for this tuple stage
combinerContext |
Definition at line 84 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setComputePlan | ( | const Handle< ComputePlan > & | plan | ) |
Sets the compute plan of the job stage we are building
plan | - ComputePlan generated from input computations and the input TCAP string |
Definition at line 52 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setInputAggHashOut | ( | bool | inputAggHashOut | ) |
True if the input is a result of an aggregation
inputAggHashOut | - true if it is, false otherwise |
Definition at line 116 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setJobId | ( | const std::string & | jobId | ) |
The id of the job this job stage belongs to
jobId | - string identifier of the job |
Definition at line 44 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setJobStageId | ( | int | jobStageId | ) |
Sets the id of this job stage
jobStageId | - the id that uniquely identifies this stage within the current job |
Definition at line 48 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setNumNodesToCollect | ( | int | numNodesToCollect | ) |
If isCollectAsMap is set to true, this parameter sets the number of nodes we want to send the shuffle sink data to.
numNodesToCollect | - the value |
Definition at line 124 of file TupleSetJobStageBuilder.cc.
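The interaction between the collect-as-map flag and the node count can be illustrated with a small sketch. `collectTargets` is a hypothetical helper, not part of the documented API; it only encodes the statement that results go to nodes 0 through (numNodesToCollect - 1) when the flag is set. What happens when the flag is false is not described here, so the sketch simply returns an empty list in that case.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: lists the node ids that would receive the shuffle sink
// output when collect-as-map is enabled. Returns an empty vector when the flag
// is false, because this reference does not describe that case.
inline std::vector<int> collectTargets(bool isCollectAsMap, int numNodesToCollect) {
    std::vector<int> nodes;
    if (!isCollectAsMap) {
        return nodes;
    }
    for (int node = 0; node < numNodesToCollect; ++node) {
        nodes.push_back(node);  // nodes 0 .. numNodesToCollect - 1
    }
    return nodes;
}

// Small self-check of the documented range.
inline bool collectTargetsDemo() {
    std::vector<int> nodes = collectTargets(true, 3);
    assert(nodes.size() == 3);
    return nodes.front() == 0 && nodes.back() == 2;
}
```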
void pdb::TupleSetJobStageBuilder::setOutputTypeName | ( | const std::string & | outputTypeName | ) |
The type associated with the sink context. By default, the type is "IntermediateData".
outputTypeName | the type name |
Definition at line 88 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setProbing | ( | bool | isProbing | ) |
Sets whether this pipeline is probing a hash set
isProbing | true if it is, false otherwise |
Definition at line 92 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setRepartition | ( | bool | repartitionOrNot | ) |
True if we are running a pipeline with a shuffle sink
repartitionOrNot | - the value |
Definition at line 112 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setRepartitionJoin | ( | bool | repartitionJoinOrNot | ) |
This is true if we are running a pipeline with a hash partition sink for JoinMaps
repartitionJoinOrNot |
Definition at line 100 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setRepartitionVector | ( | bool | repartitionVectorOrNot | ) |
This is true if we are running a pipeline with a hash partition sink for Vectors
repartitionVectorOrNot |
Definition at line 104 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setSinkContext | ( | const Handle< SetIdentifier > & | sinkContext | ) |
Sets the set identifier of the output. This is used by the
sinkContext | - the set identifier |
Definition at line 80 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setSourceContext | ( | const Handle< SetIdentifier > & | sourceContext | ) |
Sets the set identifier of the source set. This is used by the
sourceContext | - the set identifier |
Definition at line 76 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setSourceTupleSetName | ( | const std::string & | sourceTupleSetSpecifier | ) |
Sets the source tuple set
sourceTupleSetSpecifier | - the tuple set we use for the source |
Definition at line 56 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setTargetComputationName | ( | const std::string & | targetComputationSpecifier | ) |
Sets the target computation specifier. TODO: this is used in the execution engine only to set [NumNodes, NumPartitions, BatchSize] for aggregation or join.
targetComputationSpecifier | - the name of the computation |
Definition at line 64 of file TupleSetJobStageBuilder.cc.
void pdb::TupleSetJobStageBuilder::setTargetTupleSetName | ( | const std::string & | targetTupleSetName | ) |
Sets the target tuple set
targetTupleSetName | - the tuple set we use for the sink |
Definition at line 60 of file TupleSetJobStageBuilder.cc.
std::vector< std::string > pdb::TupleSetJobStageBuilder::buildTheseTupleSets | private |
Tuple sets produced by atomic computations we want to build this pipeline
Definition at line 244 of file TupleSetJobStageBuilder.h.
Handle< SetIdentifier > pdb::TupleSetJobStageBuilder::combinerContext | private |
The set identifier of the combiner set. This is used by the
Definition at line 261 of file TupleSetJobStageBuilder.h.
Handle< ComputePlan > pdb::TupleSetJobStageBuilder::computePlan | private |
The ComputePlan generated from input computations and the input TCAP string
Definition at line 238 of file TupleSetJobStageBuilder.h.
Handle< Map< String, String > > pdb::TupleSetJobStageBuilder::hashSetsToProbe | private |
Hash sets to probe in the current stage.
Definition at line 272 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::inputAggHashOut | private |
Aggregation output should not be kept across stages; if an aggregation has more than one consumer, we need to materialize the aggregation results.
Definition at line 311 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::isBroadcasting | private |
This is true if we are running a pipeline with a broadcast sink
Definition at line 277 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::isCollectAsMap | private |
If this parameter is set to true, the results of a pipeline with a shuffle sink will be sent to nodes 0 through (numNodesToCollect - 1)
Definition at line 317 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::isProbing | private |
True if this pipeline is probing a hash set
Definition at line 300 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::isRepartitioning | private |
True if we are running a pipeline with a shuffle sink
Definition at line 282 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::isRepartitionJoin | private |
This is true if we are running a pipeline with a hash partition sink for JoinMaps
Definition at line 288 of file TupleSetJobStageBuilder.h.
bool pdb::TupleSetJobStageBuilder::isRepartitionVector | private |
This is true if we are running a pipeline with a hash partition sink for Vectors
Definition at line 294 of file TupleSetJobStageBuilder.h.
std::string pdb::TupleSetJobStageBuilder::jobId | private |
The id of the job this job stage belongs to
Definition at line 212 of file TupleSetJobStageBuilder.h.
int pdb::TupleSetJobStageBuilder::jobStageId | private |
The id of this job stage. It uniquely identifies this stage within this job
Definition at line 217 of file TupleSetJobStageBuilder.h.
int pdb::TupleSetJobStageBuilder::numNodesToCollect | private |
If isCollectAsMap is set to true, this parameter sets the number of nodes we want to send the shuffle sink data to.
Definition at line 323 of file TupleSetJobStageBuilder.h.
std::string pdb::TupleSetJobStageBuilder::outputTypeName | private |
The type of the output. If it is intermediate data between stages, it is set to "IntermediateData".
Definition at line 249 of file TupleSetJobStageBuilder.h.
AllocatorPolicy pdb::TupleSetJobStageBuilder::policy | private |
The allocation policy of the computation
Definition at line 305 of file TupleSetJobStageBuilder.h.
Handle< SetIdentifier > pdb::TupleSetJobStageBuilder::sinkContext | private |
The set identifier of the output. This is used by the
Definition at line 267 of file TupleSetJobStageBuilder.h.
Handle< SetIdentifier > pdb::TupleSetJobStageBuilder::sourceContext | private |
The set identifier of the source set. This is used by the
Definition at line 255 of file TupleSetJobStageBuilder.h.
std::string pdb::TupleSetJobStageBuilder::sourceTupleSetName | private |
The tuple set we use for the source
Definition at line 222 of file TupleSetJobStageBuilder.h.
std::string pdb::TupleSetJobStageBuilder::targetComputationName | private |
The target computation specifier. TODO: this is used in the execution engine only to set [NumNodes, NumPartitions, BatchSize] for aggregation or join.
Definition at line 233 of file TupleSetJobStageBuilder.h.
std::string pdb::TupleSetJobStageBuilder::targetTupleSetName | private |
The tuple set we use for the sink
Definition at line 227 of file TupleSetJobStageBuilder.h.