A platform for high-performance distributed tool and library development written in C++. It can be deployed in two different cluster modes: standalone or distributed. API for v0.5.0, released on June 13, 2018.
pdb::TupleSetJobStageBuilder Class Reference

#include <TupleSetJobStageBuilder.h>

Public Member Functions

 TupleSetJobStageBuilder ()
 
void setJobId (const std::string &jobId)
 
void setJobStageId (int jobStageId)
 
void setComputePlan (const Handle< ComputePlan > &plan)
 
void setSourceTupleSetName (const std::string &sourceTupleSetSpecifier)
 
void setTargetTupleSetName (const std::string &targetTupleSetName)
 
void setTargetComputationName (const std::string &targetComputationSpecifier)
 
void addTupleSetToBuildPipeline (const std::string &buildMe)
 
void addHashSetToProbe (const std::string &outputName, const std::string &hashSetName)
 
void setSourceContext (const Handle< SetIdentifier > &sourceContext)
 
void setSinkContext (const Handle< SetIdentifier > &sinkContext)
 
void setOutputTypeName (const std::string &outputTypeName)
 
void setProbing (bool isProbing)
 
void setAllocatorPolicy (AllocatorPolicy policy)
 
void setRepartitionJoin (bool repartitionJoinOrNot)
 
void setRepartitionVector (bool repartitionVectorOrNot)
 
void setBroadcasting (bool broadcastOrNot)
 
void setRepartition (bool repartitionOrNot)
 
void setCollectAsMap (bool collectAsMapOrNot)
 
void setInputAggHashOut (bool inputAggHashOut)
 
void setNumNodesToCollect (int numNodesToCollect)
 
void setCombiner (Handle< SetIdentifier > combinerContext)
 
bool isPipelineProbing ()
 
Handle< SetIdentifier > getSourceSetIdentifier ()
 
const std::string & getSourceTupleSetName () const
 
const std::string & getLastSetThatBuildsPipeline () const
 
Handle< TupleSetJobStage > build ()
 

Private Attributes

std::string jobId
 
int jobStageId
 
std::string sourceTupleSetName
 
std::string targetTupleSetName
 
std::string targetComputationName
 
Handle< ComputePlan > computePlan
 
std::vector< std::string > buildTheseTupleSets
 
std::string outputTypeName
 
Handle< SetIdentifier > sourceContext
 
Handle< SetIdentifier > combinerContext
 
Handle< SetIdentifier > sinkContext
 
Handle< Map< String, String > > hashSetsToProbe
 
bool isBroadcasting
 
bool isRepartitioning
 
bool isRepartitionJoin
 
bool isRepartitionVector
 
bool isProbing
 
AllocatorPolicy policy
 
bool inputAggHashOut
 
bool isCollectAsMap
 
int numNodesToCollect
 

Detailed Description

Definition at line 29 of file TupleSetJobStageBuilder.h.

Constructor & Destructor Documentation

pdb::TupleSetJobStageBuilder::TupleSetJobStageBuilder ( )

Initializes the builder

Definition at line 23 of file TupleSetJobStageBuilder.cc.

Member Function Documentation

void pdb::TupleSetJobStageBuilder::addHashSetToProbe ( const std::string &  outputName,
const std::string &  hashSetName 
)

Add a hash set to probe

Parameters
outputName- the output name of the computation
hashSetName- the name of the hash set

Definition at line 72 of file TupleSetJobStageBuilder.cc.
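
As a rough illustration of the bookkeeping behind addHashSetToProbe, the sketch below keeps a map from an output name to a hash set name. It is a minimal stand-in, not the pdb implementation: the class ProbeTableSketch, its helper methods hasEntryFor, hashSetFor, and size, and the use of std::map are all hypothetical, and keying the map by the output name is an assumption based only on the parameter order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in; the documented member is a
// Handle< Map< String, String > >, not a std::map.
class ProbeTableSketch {
public:
    // Assumption: entries are keyed by the computation's output name.
    void addHashSetToProbe(const std::string &outputName,
                           const std::string &hashSetName) {
        hashSetsToProbe[outputName] = hashSetName;
    }

    // True if a hash set was registered for the given output name.
    bool hasEntryFor(const std::string &outputName) const {
        return hashSetsToProbe.count(outputName) != 0;
    }

    // Returns the registered hash set name; the caller must check
    // hasEntryFor first (enforced here with an assert).
    const std::string &hashSetFor(const std::string &outputName) const {
        auto it = hashSetsToProbe.find(outputName);
        assert(it != hashSetsToProbe.end() && "no hash set for this output");
        return it->second;
    }

    std::size_t size() const { return hashSetsToProbe.size(); }

private:
    std::map<std::string, std::string> hashSetsToProbe;
};
```

Registering the same output name twice overwrites the earlier entry in this sketch; whether the real builder behaves that way is not stated in this reference.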

void pdb::TupleSetJobStageBuilder::addTupleSetToBuildPipeline ( const std::string &  buildMe)

Add a tuple set to the list of tuple sets that build up the pipeline

Parameters
buildMe- the tuple set name

Definition at line 68 of file TupleSetJobStageBuilder.cc.

Handle< TupleSetJobStage > pdb::TupleSetJobStageBuilder::build ( )

Returns the built TupleSetJobStage

Returns
the TupleSetJobStage

Definition at line 144 of file TupleSetJobStageBuilder.cc.
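
The setter-then-build flow this class follows can be sketched with self-contained stand-in types. Everything below is hypothetical and only mirrors a few of the documented setter names: StageSketch, StageBuilderSketch, and makeExampleStage are not pdb types, the sketch returns the stage by value rather than as a Handle< TupleSetJobStage >, and the check that a job id was set is an addition of the sketch, not documented behavior.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the built stage; NOT pdb::TupleSetJobStage.
struct StageSketch {
    std::string jobId;
    int jobStageId = -1;
    std::string sourceTupleSetName;
    std::string targetTupleSetName;
    std::vector<std::string> buildTheseTupleSets;
    std::string outputTypeName = "IntermediateData";  // default named in the docs
    bool isProbing = false;
};

// Hypothetical builder mirroring a few of the documented setter names.
class StageBuilderSketch {
public:
    void setJobId(const std::string &id) { stage.jobId = id; }
    void setJobStageId(int id) { stage.jobStageId = id; }
    void setSourceTupleSetName(const std::string &name) {
        stage.sourceTupleSetName = name;
    }
    void setTargetTupleSetName(const std::string &name) {
        stage.targetTupleSetName = name;
    }
    void addTupleSetToBuildPipeline(const std::string &buildMe) {
        stage.buildTheseTupleSets.push_back(buildMe);
    }
    void setOutputTypeName(const std::string &name) { stage.outputTypeName = name; }
    void setProbing(bool probing) { stage.isProbing = probing; }

    // Assumption: returns a copy by value; the documented build() returns
    // a Handle< TupleSetJobStage >. The job id check exists only here.
    StageSketch build() const {
        assert(!stage.jobId.empty() && "set a job id before building");
        return stage;
    }

private:
    StageSketch stage;
};

// Configures a stage step by step, then builds it. All values are made up.
inline StageSketch makeExampleStage() {
    StageBuilderSketch builder;
    builder.setJobId("example-job");
    builder.setJobStageId(0);
    builder.setSourceTupleSetName("inputData");
    builder.addTupleSetToBuildPipeline("inputData");
    builder.addTupleSetToBuildPipeline("filtered");
    builder.setTargetTupleSetName("filtered");
    return builder.build();
}
```

Fields that are never set keep their defaults, so the example stage reports an output type of "IntermediateData" and is not probing.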

const std::string & pdb::TupleSetJobStageBuilder::getLastSetThatBuildsPipeline ( ) const

TODO: this is not something that should be in a builder; see how we can remove it. Returns the last set that builds the pipeline.

Returns
the name of the set

Definition at line 140 of file TupleSetJobStageBuilder.cc.

Handle< SetIdentifier > pdb::TupleSetJobStageBuilder::getSourceSetIdentifier ( )

TODO: this is not something that should be in a builder; see how we can remove it. Returns the identifier of the source set.

Returns
the set

Definition at line 132 of file TupleSetJobStageBuilder.cc.

const std::string & pdb::TupleSetJobStageBuilder::getSourceTupleSetName ( ) const

TODO: this is not something that should be in a builder; see how we can remove it. Returns the source tuple set name.

Returns
the source tuple set name

Definition at line 136 of file TupleSetJobStageBuilder.cc.

bool pdb::TupleSetJobStageBuilder::isPipelineProbing ( )

Will this pipeline probe a hash set?

Returns
true if the pipeline will probe a hash set

Definition at line 128 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setAllocatorPolicy ( AllocatorPolicy  policy)

The allocation policy of the computation

Parameters
policy- the allocation policy

Definition at line 96 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setBroadcasting ( bool  broadcastOrNot)

True if we are running a pipeline with a broadcast sink, false otherwise

Parameters
broadcastOrNot- the value

Definition at line 108 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setCollectAsMap ( bool  collectAsMapOrNot)

If this parameter is set to true, the results of a pipeline with a shuffle sink will be sent to nodes 0 through (numNodesToCollect - 1).

Parameters
collectAsMapOrNot- the value

Definition at line 120 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setCombiner ( Handle< SetIdentifier >  combinerContext)

Sets a combiner for this tuple stage

Parameters
combinerContext- the set identifier of the combiner set

Definition at line 84 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setComputePlan ( const Handle< ComputePlan > &  plan)

Sets the compute plan of the job stage we are building

Parameters
plan- ComputePlan generated from input computations and the input TCAP string

Definition at line 52 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setInputAggHashOut ( bool  inputAggHashOut)

True if the input is a result of an aggregation

Parameters
inputAggHashOut- true if it is, false otherwise

Definition at line 116 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setJobId ( const std::string &  jobId)

The id of the job this job stage belongs to

Parameters
jobId- string identifier of the job

Definition at line 44 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setJobStageId ( int  jobStageId)

Sets the id of this job stage

Parameters
jobStageId- the id that uniquely identifies this stage within the current job

Definition at line 48 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setNumNodesToCollect ( int  numNodesToCollect)

If isCollectAsMap is set to true, this parameter sets the number of nodes to which the shuffle sink data is sent.

Parameters
numNodesToCollect- the value

Definition at line 124 of file TupleSetJobStageBuilder.cc.
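
To make the interplay of isCollectAsMap and numNodesToCollect concrete, the following sketch lists the node indices 0 through (numNodesToCollect - 1) that would receive the shuffle sink results. The helper collectTargetNodes is hypothetical and not part of pdb; it only spells out the index range stated in the descriptions and says nothing about how the engine actually routes data.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of pdb: returns the node indices
// 0 through (numNodesToCollect - 1), i.e. the nodes that receive the
// shuffle sink results when isCollectAsMap is true.
inline std::vector<int> collectTargetNodes(int numNodesToCollect) {
    assert(numNodesToCollect >= 0 && "node count cannot be negative");
    std::vector<int> nodes(static_cast<std::size_t>(numNodesToCollect));
    std::iota(nodes.begin(), nodes.end(), 0);  // fills 0, 1, ..., n - 1
    return nodes;
}
```

For example, collectTargetNodes(3) yields the indices 0, 1, and 2, and a count of 0 yields an empty list.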

void pdb::TupleSetJobStageBuilder::setOutputTypeName ( const std::string &  outputTypeName)

The type associated with the sink context. By default, the type is "IntermediateData".

Parameters
outputTypeName- the type name

Definition at line 88 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setProbing ( bool  isProbing)

Sets whether this pipeline is probing a hash set

Parameters
isProbing- true if it is, false otherwise

Definition at line 92 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setRepartition ( bool  repartitionOrNot)

True if we are running a pipeline with a shuffle sink

Parameters
repartitionOrNot- the value

Definition at line 112 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setRepartitionJoin ( bool  repartitionJoinOrNot)

This is true if we are running a pipeline with a hash partition sink for JoinMaps

Parameters
repartitionJoinOrNot- the value

Definition at line 100 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setRepartitionVector ( bool  repartitionVectorOrNot)

This is true if we are running a pipeline with a hash partition sink for Vectors

Parameters
repartitionVectorOrNot- the value

Definition at line 104 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setSinkContext ( const Handle< SetIdentifier > &  sinkContext)

Sets the set identifier of the output. This is used by the FrontendQueryTestServer to create the output set.

See Also
FrontendQueryTestServer
Parameters
sinkContext- the set identifier

Definition at line 80 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setSourceContext ( const Handle< SetIdentifier > &  sourceContext)

Sets the set identifier of the source set. This is used by the FrontendQueryTestServer to get the info about the source set.

See Also
FrontendQueryTestServer
Parameters
sourceContext- the set identifier

Definition at line 76 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setSourceTupleSetName ( const std::string &  sourceTupleSetSpecifier)

Sets the source tuple set

Parameters
sourceTupleSetSpecifier- the tuple set we use for the source

Definition at line 56 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setTargetComputationName ( const std::string &  targetComputationSpecifier)

Sets the target computation specifier. TODO: this is used in the execution engine only to set [NumNodes, NumPartitions, BatchSize] for aggregation or join.

Parameters
targetComputationSpecifier- the name of the computation

Definition at line 64 of file TupleSetJobStageBuilder.cc.

void pdb::TupleSetJobStageBuilder::setTargetTupleSetName ( const std::string &  targetTupleSetName)

Sets the target tuple set

Parameters
targetTupleSetName- the tuple set we use for the sink

Definition at line 60 of file TupleSetJobStageBuilder.cc.

Member Data Documentation

std::vector<std::string> pdb::TupleSetJobStageBuilder::buildTheseTupleSets
private

Tuple sets, produced by atomic computations, that we want to use to build this pipeline

See Also
pdb::ComputePlan::buildPipeline

Definition at line 244 of file TupleSetJobStageBuilder.h.

Handle<SetIdentifier> pdb::TupleSetJobStageBuilder::combinerContext
private

The set identifier of the combiner set. This is used by the FrontendQueryTestServer to create the combiner set (where we put the result of the combiner).

See Also
FrontendQueryTestServer

Definition at line 261 of file TupleSetJobStageBuilder.h.

Handle<ComputePlan> pdb::TupleSetJobStageBuilder::computePlan
private

The ComputePlan generated from input computations and the input TCAP string

Definition at line 238 of file TupleSetJobStageBuilder.h.

Handle<Map<String, String> > pdb::TupleSetJobStageBuilder::hashSetsToProbe
private

Hash sets we want to probe in the current stage.

Definition at line 272 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::inputAggHashOut
private

Aggregation output should not be kept across stages; if an aggregation has more than one consumer, we need to materialize the aggregation results.

Definition at line 311 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::isBroadcasting
private

This is true if we are running a pipeline with a broadcast sink

Definition at line 277 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::isCollectAsMap
private

If this parameter is set to true, the results of a pipeline with a shuffle sink will be sent to nodes 0 through (numNodesToCollect - 1).

Definition at line 317 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::isProbing
private

True if this pipeline is probing a hash set

Definition at line 300 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::isRepartitioning
private

True if we are running a pipeline with a shuffle sink

Definition at line 282 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::isRepartitionJoin
private

This is true if we are running a pipeline with a hash partition sink for JoinMaps

Definition at line 288 of file TupleSetJobStageBuilder.h.

bool pdb::TupleSetJobStageBuilder::isRepartitionVector
private

This is true if we are running a pipeline with a hash partition sink for Vectors

Definition at line 294 of file TupleSetJobStageBuilder.h.

std::string pdb::TupleSetJobStageBuilder::jobId
private

The id of the job this job stage belongs to

Definition at line 212 of file TupleSetJobStageBuilder.h.

int pdb::TupleSetJobStageBuilder::jobStageId
private

The id of this job stage. It uniquely identifies this stage within this job

Definition at line 217 of file TupleSetJobStageBuilder.h.

int pdb::TupleSetJobStageBuilder::numNodesToCollect
private

If isCollectAsMap is set to true, this parameter sets the number of nodes to which the shuffle sink data is sent.

Definition at line 323 of file TupleSetJobStageBuilder.h.

std::string pdb::TupleSetJobStageBuilder::outputTypeName
private

The type of the output. If it is intermediate data between stages, it is set to "IntermediateData".

Definition at line 249 of file TupleSetJobStageBuilder.h.

AllocatorPolicy pdb::TupleSetJobStageBuilder::policy
private

The allocation policy of the computation

Definition at line 305 of file TupleSetJobStageBuilder.h.

Handle<SetIdentifier> pdb::TupleSetJobStageBuilder::sinkContext
private

The set identifier of the output. This is used by the FrontendQueryTestServer to create the output set.

See Also
FrontendQueryTestServer

Definition at line 267 of file TupleSetJobStageBuilder.h.

Handle<SetIdentifier> pdb::TupleSetJobStageBuilder::sourceContext
private

The set identifier of the source set. This is used by the FrontendQueryTestServer to get the info about the source set.

See Also
FrontendQueryTestServer

Definition at line 255 of file TupleSetJobStageBuilder.h.

std::string pdb::TupleSetJobStageBuilder::sourceTupleSetName
private

The tuple set we use for the source

Definition at line 222 of file TupleSetJobStageBuilder.h.

std::string pdb::TupleSetJobStageBuilder::targetComputationName
private

The target computation specifier. TODO: this is used in the execution engine only to set [NumNodes, NumPartitions, BatchSize] for aggregation or join.

Definition at line 233 of file TupleSetJobStageBuilder.h.

std::string pdb::TupleSetJobStageBuilder::targetTupleSetName
private

The tuple set we use for the sink

Definition at line 227 of file TupleSetJobStageBuilder.h.


The documentation for this class was generated from the following files: