A platform for high-performance distributed tool and library development written in C++. It can be deployed in two different cluster modes: standalone or distributed. API for v0.5.0, released on June 13, 2018.
PartitionedFile Class Reference

#include <PartitionedFile.h>


Public Member Functions

 PartitionedFile (NodeID nodeId, DatabaseID dbId, UserTypeID typeId, SetID setId, string metaPartitionPath, vector< string > dataPartitionPaths, pdb::PDBLoggerPtr logger, size_t pageSize)
 
 PartitionedFile (NodeID nodeId, DatabaseID dbId, UserTypeID typeId, SetID setId, string metaPartitionPath, pdb::PDBLoggerPtr logger)
 
 ~PartitionedFile ()
 
bool openAll () override
 
bool openMeta ()
 
bool openData ()
 
bool openDataDirect ()
 
bool closeAll () override
 
bool closeDirect ()
 
void clear () override
 
int appendPage (FilePartitionID partitionId, PDBPagePtr page) override
 
int appendPageDirect (FilePartitionID partitionId, PDBPagePtr page)
 
int writeMeta () override
 
int updateMeta () override
 
size_t loadPage (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length) override
 
size_t loadPageDirect (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length)
 
size_t loadPageFromCurPos (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length)
 
PageID loadPageId (FilePartitionID partitionId, unsigned int pageSeqInPartition)
 
PageID loadPageIdFromCurPos (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length)
 
unsigned int getAndSetNumFlushedPages () override
 
unsigned int getNumFlushedPages () override
 
PageID getLastFlushedPageID () override
 
PageID getLatestPageID () override
 
NodeID getNodeId () override
 
DatabaseID getDbId () override
 
UserTypeID getTypeId () override
 
SetID getSetId () override
 
size_t getPageSize () override
 
size_t getPageSizeInMeta () override
 
FileType getFileType () override
 
PartitionedFileMetaDataPtr getMetaData ()
 
void initializeDataFiles ()
 
void setDataPartitionPaths (const vector< string > &dataPartitionPaths)
 
void buildMetaDataFromMetaPartition (SharedMemPtr shm)
 
unsigned int getNumPartitions ()
 
- Public Member Functions inherited from PDBFileInterface
virtual ~PDBFileInterface ()
 

Protected Member Functions

int writeData (FILE *file, void *data, size_t length)
 
int writeDataDirect (int handle, void *data, size_t length)
 
int seekPage (FILE *file, unsigned int pageSeqInPartition)
 
int seekPageDirect (int handle, unsigned int pageSeqInPartition)
 
int seekPageSizeInMeta ()
 
int seekNumFlushedPagesInMeta ()
 
int seekNumFlushedPagesInPartitionMeta (FilePartitionID partitionId)
 

Private Attributes

pthread_mutex_t fileMutex
 
FILE * metaFile = nullptr
 
vector< FILE * > dataFiles
 
vector< int > dataHandles
 
string metaPartitionPath
 
vector< string > dataPartitionPaths
 
pdb::PDBLoggerPtr logger = nullptr
 
PartitionedFileMetaDataPtr metaData = nullptr
 
NodeID nodeId
 
DatabaseID dbId
 
UserTypeID typeId
 
SetID setId
 
size_t pageSize = 0
 
bool usingDirect
 
bool cleared
 

Detailed Description

This class implements the PDBFileInterface as a partitioned file. When PartitionedFile is used, each set is flushed to multiple file partitions in the configured data directories, and the metadata is flushed to a partition in the meta directory.

Meta partition format:

  • MetaSize
  • FileType
  • Version
  • PageSize
  • NumFlushedPages
  • LatestPageID
  • NumPartitions
  • PartitionID for 1st partition
  • NumFlushedPages in 1st partition
  • Length of the path to 1st partition
  • Path to 1st partition
  • PartitionID for 2nd partition
  • NumFlushedPages in 2nd partition
  • Length of the path to 2nd partition
  • Path to 2nd partition ...
  • PartitionID for PageID 1
  • PageSeqIDInPartition for PageID 1
  • PartitionID for PageID 2
  • PageSeqIDInPartition for PageID 2
  • PartitionID for PageID 3
  • PageSeqIDInPartition for PageID 3 ...
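
The field order above can be illustrated with a small serialization sketch. This is not the PartitionedFile implementation: the struct names, the put() helper, and every field width are assumptions chosen for illustration, MetaSize is assumed to count itself, and the per-page index that follows the partition entries is omitted.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical types: the real field widths come from the PDB typedefs
// (FileType, PageID, FilePartitionID, ...), which this page does not state.
struct PartitionEntrySketch {
    std::uint32_t partitionId;
    std::uint32_t numFlushedPages;
    std::string path;
};

struct MetaHeaderSketch {
    std::uint32_t fileType;
    std::uint32_t version;
    std::uint64_t pageSize;
    std::uint32_t numFlushedPages;
    std::uint32_t latestPageId;
    std::vector<PartitionEntrySketch> partitions;
};

// Append the raw bytes of a fixed-width value to the buffer.
template <typename T>
void put(std::vector<char>& out, const T& value) {
    const char* p = reinterpret_cast<const char*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Write the fields in the documented order. MetaSize comes first, so a
// placeholder is written and patched once the total length is known.
std::vector<char> encodeMetaSketch(const MetaHeaderSketch& m) {
    std::vector<char> out;
    put<std::uint64_t>(out, 0);  // MetaSize placeholder
    put(out, m.fileType);
    put(out, m.version);
    put(out, m.pageSize);
    put(out, m.numFlushedPages);
    put(out, m.latestPageId);
    put<std::uint32_t>(out, static_cast<std::uint32_t>(m.partitions.size()));
    for (const PartitionEntrySketch& p : m.partitions) {
        put(out, p.partitionId);
        put(out, p.numFlushedPages);
        put<std::uint64_t>(out, p.path.size());  // length of the path
        out.insert(out.end(), p.path.begin(), p.path.end());
    }
    const std::uint64_t total = out.size();
    std::memcpy(out.data(), &total, sizeof(total));
    return out;
}
```

Writing the path length before the path is what lets a reader skip over variable-length entries without scanning for a terminator.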

Data partition format:

  • 1st pageId
  • 1st page in the partition
  • 2nd pageId
  • 2nd page in the partition
  • ...
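
If every record in a data partition is a pageId followed by a fixed-size page, the position of a page follows from simple arithmetic. The sketch below shows that calculation under several assumptions that this page does not confirm: the id is 4 bytes wide (SketchPageId is a hypothetical alias), pageSeqInPartition is zero-based, and the partition has no header before the first record.

```cpp
#include <cstddef>
#include <cstdint>

using SketchPageId = std::uint32_t;  // hypothetical width for PageID

// Byte offset of the pageId field that precedes page number `seq`.
constexpr std::size_t recordOffsetSketch(std::size_t pageSize, unsigned int seq) {
    return static_cast<std::size_t>(seq) * (sizeof(SketchPageId) + pageSize);
}

// Byte offset of the page data itself, just past the pageId field.
constexpr std::size_t pageDataOffsetSketch(std::size_t pageSize, unsigned int seq) {
    return recordOffsetSketch(pageSize, seq) + sizeof(SketchPageId);
}
```

A seekPage()-style helper would position the file at pageDataOffsetSketch(pageSize, seq), while a loadPageId()-style helper would read from recordOffsetSketch(pageSize, seq).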

Definition at line 69 of file PartitionedFile.h.

Constructor & Destructor Documentation

PartitionedFile::PartitionedFile ( NodeID  nodeId,
DatabaseID  dbId,
UserTypeID  typeId,
SetID  setId,
string  metaPartitionPath,
vector< string >  dataPartitionPaths,
pdb::PDBLoggerPtr  logger,
size_t  pageSize 
)

Create a new PartitionedFile instance.

Definition at line 38 of file PartitionedFile.cc.

PartitionedFile::PartitionedFile ( NodeID  nodeId,
DatabaseID  dbId,
UserTypeID  typeId,
SetID  setId,
string  metaPartitionPath,
pdb::PDBLoggerPtr  logger 
)

Initialize a PartitionedFile instance from existing meta data.

Creates a simple PartitionedFile instance. Information is added to the instance later by parsing existing meta data; otherwise the instance is initialized with nullptrs.

Definition at line 87 of file PartitionedFile.cc.

PartitionedFile::~PartitionedFile ( )

Destructor. It does NOT delete on-disk files.

Definition at line 110 of file PartitionedFile.cc.

Member Function Documentation

int PartitionedFile::appendPage ( FilePartitionID  partitionId,
PDBPagePtr  page 
)
overridevirtual

Append a page to the partition identified by partitionId. Returns the PageSeqInPartition on success, or -1 on failure.

Implements PDBFileInterface.

Definition at line 266 of file PartitionedFile.cc.
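
The return convention (sequence number on success, -1 on failure) can be sketched with an in-memory stand-in. PartitionSketch and appendPageSketch are hypothetical names; the real method writes to disk and takes a PDBPagePtr, and the failure condition shown here (an unknown partition) is only one assumed example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for one data partition: a list of pages.
using PartitionSketch = std::vector<std::vector<char>>;

// Append `page` to the chosen partition. Returns the sequence number of the
// appended page within that partition, or -1 on failure.
int appendPageSketch(std::vector<PartitionSketch>& partitions,
                     std::size_t partitionId,
                     const std::vector<char>& page) {
    if (partitionId >= partitions.size()) {
        return -1;  // assumed failure case: no such partition
    }
    PartitionSketch& partition = partitions[partitionId];
    partition.push_back(page);
    return static_cast<int>(partition.size() - 1);
}
```

Because the return type is a signed int, callers should test for -1 before using the value as an unsigned pageSeqInPartition.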

int PartitionedFile::appendPageDirect ( FilePartitionID  partitionId,
PDBPagePtr  page 
)

Append page using direct I/O

Definition at line 305 of file PartitionedFile.cc.

void PartitionedFile::buildMetaDataFromMetaPartition ( SharedMemPtr  shm)

Set up meta data by parsing meta partition

Meta partition format:

  • Metadata Size
  • FileType
  • Version
  • PageSize
  • NumFlushedPages
  • LatestPageID (added on Mar 21, 2016)
  • NumPartitions
  • PartitionID for 1st partition
  • NumFlushedPages in 1st partition
  • Length of the path to 1st partition
  • Path to 1st partition
  • PartitionID for 2nd partition
  • NumFlushedPages in 2nd partition
  • Length of the path to 2nd partition
  • Path to 2nd partition
  • ...
  • PageId for the 1st page
  • FilePartitionID for the 1st page
  • PageSeqIdInPartition for the 1st page
  • ...
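
The trailing per-page entries can be read back into a lookup table. The sketch below parses consecutive (pageId, partitionId, pageSeqInPartition) triples from a byte buffer; the 4-byte field width and the names IndexSketch and parsePageIndexSketch are assumptions, and the buffer is assumed to start at the first triple.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <utility>
#include <vector>

// pageId -> (partitionId, pageSeqInPartition); hypothetical 4-byte fields.
using IndexSketch =
    std::map<std::uint32_t, std::pair<std::uint32_t, std::uint32_t>>;

// Parse as many complete triples as the buffer holds; a trailing partial
// triple is ignored.
IndexSketch parsePageIndexSketch(const std::vector<char>& bytes) {
    IndexSketch index;
    const std::size_t tripleSize = 3 * sizeof(std::uint32_t);
    for (std::size_t pos = 0; pos + tripleSize <= bytes.size(); pos += tripleSize) {
        std::uint32_t fields[3];
        std::memcpy(fields, bytes.data() + pos, tripleSize);
        index[fields[0]] = std::make_pair(fields[1], fields[2]);
    }
    return index;
}
```

With such a table, locating a page by its PageID reduces to one map lookup followed by a load from the named partition at the stored sequence number.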

Definition at line 672 of file PartitionedFile.cc.

void PartitionedFile::clear ( )
overridevirtual

Delete a file instance.

Implements PDBFileInterface.

Definition at line 236 of file PartitionedFile.cc.

bool PartitionedFile::closeAll ( )
overridevirtual

Close meta partition and all data partitions

Implements PDBFileInterface.

Definition at line 204 of file PartitionedFile.cc.

bool PartitionedFile::closeDirect ( )

Close meta partition and all data partitions using direct I/O

Definition at line 222 of file PartitionedFile.cc.

unsigned int PartitionedFile::getAndSetNumFlushedPages ( )
overridevirtual

Read lastFlushedPageId from the meta partition and set the lastFlushedPageId variable. Also set the numFlushedPages variable to lastFlushedPageId + 1. (This holds for the time being.)

Used when initializing a PartitionedFile instance from a meta partition file on disk: read numFlushedPages from the meta partition and set the numFlushedPages variable. (This holds for the time being.)

Implements PDBFileInterface.

Definition at line 584 of file PartitionedFile.cc.

DatabaseID PartitionedFile::getDbId ( )
overridevirtual

Return DatabaseID of this file

Implements PDBFileInterface.

Definition at line 636 of file PartitionedFile.cc.

FileType PartitionedFile::getFileType ( )
overridevirtual

Return the file type of the file: SequenceFile or PartitionedFile

Implements PDBFileInterface.

Definition at line 658 of file PartitionedFile.cc.

PageID PartitionedFile::getLastFlushedPageID ( )
overridevirtual

Return lastFlushedPageID

Implements PDBFileInterface.

Definition at line 615 of file PartitionedFile.cc.

PageID PartitionedFile::getLatestPageID ( )
overridevirtual

Return latestPageID. (Note: this is actually the largest page ID.)

Implements PDBFileInterface.

Definition at line 622 of file PartitionedFile.cc.

PartitionedFileMetaDataPtr PartitionedFile::getMetaData ( )

Return a smart pointer pointing to the metaData of this PartitionedFile instance.

Definition at line 260 of file PartitionedFile.cc.

NodeID PartitionedFile::getNodeId ( )
overridevirtual

Return NodeID of this file

Implements PDBFileInterface.

Definition at line 629 of file PartitionedFile.cc.

unsigned int PartitionedFile::getNumFlushedPages ( )
overridevirtual

Return numFlushedPages

Implements PDBFileInterface.

Definition at line 608 of file PartitionedFile.cc.

unsigned int PartitionedFile::getNumPartitions ( )

Return the number of data partitions in the file

Definition at line 665 of file PartitionedFile.cc.

size_t PartitionedFile::getPageSize ( )
overridevirtual

Return page size of this file

Implements PDBFileInterface.

Definition at line 803 of file PartitionedFile.cc.

size_t PartitionedFile::getPageSizeInMeta ( )
overridevirtual

Read the meta partition to get, set, and return the pageSize of this file

Implements PDBFileInterface.

Definition at line 814 of file PartitionedFile.cc.

SetID PartitionedFile::getSetId ( )
overridevirtual

Return SetID of this file

Implements PDBFileInterface.

Definition at line 651 of file PartitionedFile.cc.

UserTypeID PartitionedFile::getTypeId ( )
overridevirtual

Return UserTypeID of this file

Implements PDBFileInterface.

Definition at line 644 of file PartitionedFile.cc.

void PartitionedFile::initializeDataFiles ( )

Initialize data files.

Definition at line 836 of file PartitionedFile.cc.

size_t PartitionedFile::loadPage ( FilePartitionID  partitionId,
unsigned int  pageSeqInPartition,
char *  pageInCache,
size_t  length 
)
overridevirtual

Load a page from the partition specified by partitionId, given the page's sequence (ordering) number within that partition. This function is invoked by the PageCache instance; length is the size of the shared memory allocated for this load. The PageCache instance should ensure that length == pageSize. If length > pageSize, some space is wasted; if length < pageSize, some data on the page may not be loaded.

Returns the loaded size.

Implements PDBFileInterface.

Definition at line 483 of file PartitionedFile.cc.
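
The relationship between length and pageSize can be illustrated with a plain memory copy. This is a minimal sketch, not the loadPage implementation: copyPageSketch is a hypothetical helper, and it assumes the number of bytes loaded is the smaller of the two sizes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Copy at most `length` bytes of a `pageSize`-byte page into `pageInCache`
// and return the number of bytes actually copied.
std::size_t copyPageSketch(const char* page, std::size_t pageSize,
                           char* pageInCache, std::size_t length) {
    const std::size_t loaded = std::min(pageSize, length);
    std::memcpy(pageInCache, page, loaded);
    return loaded;
}
```

When length exceeds pageSize, the tail of the destination buffer is left untouched (wasted space); when length is smaller, the tail of the page is never copied.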

size_t PartitionedFile::loadPageDirect ( FilePartitionID  partitionId,
unsigned int  pageSeqInPartition,
char *  pageInCache,
size_t  length 
)

To load page using direct I/O.

Definition at line 505 of file PartitionedFile.cc.

size_t PartitionedFile::loadPageFromCurPos ( FilePartitionID  partitionId,
unsigned int  pageSeqInPartition,
char *  pageInCache,
size_t  length 
)

Similar to loadPage(), except that this method does not seek. It loads sequentially from the current position, so the caller must ensure that the current position is correct.

Definition at line 537 of file PartitionedFile.cc.
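
Sequential loading without seeking can be sketched with the standard C file API. The helper below is hypothetical and assumes the record layout described under "Data partition format" (an id field followed by a fixed-size page) and that the stream is already positioned at the start of a record; idSize is passed in because the width of PageID is not stated on this page.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Read up to `count` consecutive records from the current position of `file`
// without seeking. The id field of each record is read and discarded, and the
// page bytes are appended to `out`. Returns the number of complete pages read.
unsigned int scanPagesSketch(std::FILE* file, std::size_t idSize,
                             std::size_t pageSize, unsigned int count,
                             std::vector<char>& out) {
    std::vector<char> id(idSize);
    std::vector<char> page(pageSize);
    unsigned int loaded = 0;
    for (; loaded < count; ++loaded) {
        if (std::fread(id.data(), 1, idSize, file) != idSize) break;
        if (std::fread(page.data(), 1, pageSize, file) != pageSize) break;
        out.insert(out.end(), page.begin(), page.end());
    }
    return loaded;
}
```

Skipping the seek avoids one positioning call per page during a full scan, at the cost of making the caller responsible for the starting position.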

PageID PartitionedFile::loadPageId ( FilePartitionID  partitionId,
unsigned int  pageSeqInPartition 
)

Load the pageId for a specified page. Returns the pageId if the page exists; otherwise returns (unsigned int)(-1).

Definition at line 527 of file PartitionedFile.cc.
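
The failure value (unsigned int)(-1) is the largest value an unsigned int can hold, so it cannot be tested with a "less than zero" check. A minimal sketch of a caller-side test follows; kNoSuchPageSketch and isValidPageIdSketch are hypothetical names, and PageID is assumed to be comparable with unsigned int.

```cpp
#include <limits>

// Sentinel as documented: (unsigned int)(-1).
constexpr unsigned int kNoSuchPageSketch = static_cast<unsigned int>(-1);

// Converting -1 to an unsigned type wraps to the maximum representable value.
static_assert(kNoSuchPageSketch == std::numeric_limits<unsigned int>::max(),
              "-1 converted to unsigned int is the maximum value");

// True when a value returned by a loadPageId-style call names a real page.
constexpr bool isValidPageIdSketch(unsigned int pageId) {
    return pageId != kNoSuchPageSketch;
}
```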

PageID PartitionedFile::loadPageIdFromCurPos ( FilePartitionID  partitionId,
unsigned int  pageSeqInPartition,
char *  pageInCache,
size_t  length 
)

Similar to loadPageId(), except that this method does not seek. It loads sequentially from the current position, so the caller must ensure that the current position is correct.

Definition at line 558 of file PartitionedFile.cc.

bool PartitionedFile::openAll ( )
overridevirtual

Open meta partition and all data partitions

Implements PDBFileInterface.

Definition at line 196 of file PartitionedFile.cc.

bool PartitionedFile::openData ( )

Open data partitions

Definition at line 146 of file PartitionedFile.cc.

bool PartitionedFile::openDataDirect ( )

Open data partitions using direct I/O

Definition at line 169 of file PartitionedFile.cc.

bool PartitionedFile::openMeta ( )

Open meta partition only

Definition at line 122 of file PartitionedFile.cc.

int PartitionedFile::seekNumFlushedPagesInMeta ( )
protected

Seek to the numFlushedPages field in meta data.

Definition at line 923 of file PartitionedFile.cc.

int PartitionedFile::seekNumFlushedPagesInPartitionMeta ( FilePartitionID  partitionId)
protected

Seek to the numFlushedPages field in partition meta data.

Definition at line 935 of file PartitionedFile.cc.

int PartitionedFile::seekPage ( FILE *  partition,
unsigned int  pageSeqInPartition 
)
protected

Seek to the beginning of the page data for the specified page in the file.

Definition at line 891 of file PartitionedFile.cc.

int PartitionedFile::seekPageDirect ( int  handle,
unsigned int  pageSeqInPartition 
)
protected

Seek to the beginning of the page data of the page specified in the file.

Definition at line 901 of file PartitionedFile.cc.

int PartitionedFile::seekPageSizeInMeta ( )
protected

Seek to the page size field in meta data.

Definition at line 912 of file PartitionedFile.cc.

void PartitionedFile::setDataPartitionPaths ( const vector< string > &  dataPartitionPaths)

Set dataPartitionPaths.

Definition at line 847 of file PartitionedFile.cc.

int PartitionedFile::updateMeta ( )
overridevirtual

Update the meta partition based on counters. This can be used after a batch of flushes. Unlike writeMeta(), updateMeta() updates only a few fields.

Note: this function is buggy; use writeMeta() instead.

Implements PDBFileInterface.

Definition at line 451 of file PartitionedFile.cc.

int PartitionedFile::writeData ( FILE *  file,
void *  data,
size_t  length 
)
protected

Write data specified to the current file position.

Definition at line 855 of file PartitionedFile.cc.

int PartitionedFile::writeDataDirect ( int  handle,
void *  data,
size_t  length 
)
protected

Write data specified to the current file position using direct I/O.

Definition at line 873 of file PartitionedFile.cc.

int PartitionedFile::writeMeta ( )
overridevirtual

Initialize the meta partition with the following format:

  • Metadata Size
  • FileType
  • Version
  • PageSize
  • NumFlushedPages
  • LastFlushedPageID (added on Mar 21, 2016)
  • NumPartitions
  • PartitionID for 1st partition
  • NumFlushedPages in 1st partition
  • Length of the path to 1st partition
  • Path to 1st partition
  • PartitionID for 2nd partition
  • NumFlushedPages in 2nd partition
  • Length of the path to 2nd partition
  • Path to 2nd partition
  • ...
  • PageId for the 1st page
  • PartitionId for the 1st page
  • PageSeqIdInPartition for the 1st page
  • ...

Implements PDBFileInterface.

Definition at line 358 of file PartitionedFile.cc.

Member Data Documentation

bool PartitionedFile::cleared
private

Whether the file is cleared

Definition at line 396 of file PartitionedFile.h.

vector<FILE*> PartitionedFile::dataFiles
private

Data files

Definition at line 340 of file PartitionedFile.h.

vector<int> PartitionedFile::dataHandles
private

Definition at line 341 of file PartitionedFile.h.

vector<string> PartitionedFile::dataPartitionPaths
private

Paths to data partitions

Definition at line 351 of file PartitionedFile.h.

DatabaseID PartitionedFile::dbId
private

DatabaseID of this PartitionedFile instance

Definition at line 371 of file PartitionedFile.h.

pthread_mutex_t PartitionedFile::fileMutex
private

Lock to synchronize delete and append operations

Definition at line 328 of file PartitionedFile.h.

pdb::PDBLoggerPtr PartitionedFile::logger = nullptr
private

Logger instance

Definition at line 356 of file PartitionedFile.h.

PartitionedFileMetaDataPtr PartitionedFile::metaData = nullptr
private

Meta data instance

Definition at line 361 of file PartitionedFile.h.

FILE* PartitionedFile::metaFile = nullptr
private

Meta file

Definition at line 334 of file PartitionedFile.h.

string PartitionedFile::metaPartitionPath
private

Path to meta partition

Definition at line 346 of file PartitionedFile.h.

NodeID PartitionedFile::nodeId
private

NodeID of this PartitionedFile instance

Definition at line 366 of file PartitionedFile.h.

size_t PartitionedFile::pageSize = 0
private

Configured page size.

Definition at line 386 of file PartitionedFile.h.

SetID PartitionedFile::setId
private

SetID of this PartitionedFile instance

Definition at line 381 of file PartitionedFile.h.

UserTypeID PartitionedFile::typeId
private

UserTypeID of this PartitionedFile instance

Definition at line 376 of file PartitionedFile.h.

bool PartitionedFile::usingDirect
private

Using direct I/O or not

Definition at line 391 of file PartitionedFile.h.


The documentation for this class was generated from the following files: