A platform for high-performance distributed tool and library development written in C++. It can be deployed in two different cluster modes: standalone or distributed. API for v0.5.0, released on June 13, 2018.
#include <PartitionedFile.h>
Public Member Functions | |
PartitionedFile (NodeID nodeId, DatabaseID dbId, UserTypeID typeId, SetID setId, string metaPartitionPath, vector< string > dataPartitionPaths, pdb::PDBLoggerPtr logger, size_t pageSize) | |
PartitionedFile (NodeID nodeId, DatabaseID dbId, UserTypeID typeId, SetID setId, string metaPartitionPath, pdb::PDBLoggerPtr logger) | |
~PartitionedFile () | |
bool | openAll () override |
bool | openMeta () |
bool | openData () |
bool | openDataDirect () |
bool | closeAll () override |
bool | closeDirect () |
void | clear () override |
int | appendPage (FilePartitionID partitionId, PDBPagePtr page) override |
int | appendPageDirect (FilePartitionID partitionId, PDBPagePtr page) |
int | writeMeta () override |
int | updateMeta () override |
size_t | loadPage (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length) override |
size_t | loadPageDirect (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length) |
size_t | loadPageFromCurPos (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length) |
PageID | loadPageId (FilePartitionID partitionId, unsigned int pageSeqInPartition) |
PageID | loadPageIdFromCurPos (FilePartitionID partitionId, unsigned int pageSeqInPartition, char *pageInCache, size_t length) |
unsigned int | getAndSetNumFlushedPages () override |
unsigned int | getNumFlushedPages () override |
PageID | getLastFlushedPageID () override |
PageID | getLatestPageID () override |
NodeID | getNodeId () override |
DatabaseID | getDbId () override |
UserTypeID | getTypeId () override |
SetID | getSetId () override |
size_t | getPageSize () override |
size_t | getPageSizeInMeta () override |
FileType | getFileType () override |
PartitionedFileMetaDataPtr | getMetaData () |
void | initializeDataFiles () |
void | setDataPartitionPaths (const vector< string > &dataPartitionPaths) |
void | buildMetaDataFromMetaPartition (SharedMemPtr shm) |
unsigned int | getNumPartitions () |
Public Member Functions inherited from PDBFileInterface | |
virtual | ~PDBFileInterface () |
Protected Member Functions | |
int | writeData (FILE *file, void *data, size_t length) |
int | writeDataDirect (int handle, void *data, size_t length) |
int | seekPage (FILE *file, unsigned int pageSeqInPartition) |
int | seekPageDirect (int handle, unsigned int pageSeqInPartition) |
int | seekPageSizeInMeta () |
int | seekNumFlushedPagesInMeta () |
int | seekNumFlushedPagesInPartitionMeta (FilePartitionID partitionId) |
Private Attributes | |
pthread_mutex_t | fileMutex |
FILE * | metaFile = nullptr |
vector< FILE * > | dataFiles |
vector< int > | dataHandles |
string | metaPartitionPath |
vector< string > | dataPartitionPaths |
pdb::PDBLoggerPtr | logger = nullptr |
PartitionedFileMetaDataPtr | metaData = nullptr |
NodeID | nodeId |
DatabaseID | dbId |
UserTypeID | typeId |
SetID | setId |
size_t | pageSize = 0 |
bool | usingDirect |
bool | cleared |
PartitionedFile implements the PDBFileInterface. When a set is stored as a PartitionedFile, it is flushed to multiple file partitions in the configured data directories, and its metadata is flushed to a partition in the meta directory.
Meta partition format:
Data partition format:
Definition at line 69 of file PartitionedFile.h.
PartitionedFile::PartitionedFile | ( | NodeID | nodeId, |
DatabaseID | dbId, | ||
UserTypeID | typeId, | ||
SetID | setId, | ||
string | metaPartitionPath, | ||
vector< string > | dataPartitionPaths, | ||
pdb::PDBLoggerPtr | logger, | ||
size_t | pageSize | ||
) |
Create a new PartitionedFile instance.
Definition at line 38 of file PartitionedFile.cc.
PartitionedFile::PartitionedFile | ( | NodeID | nodeId, |
DatabaseID | dbId, | ||
UserTypeID | typeId, | ||
SetID | setId, | ||
string | metaPartitionPath, | ||
pdb::PDBLoggerPtr | logger | ||
) |
Initialize a PartitionedFile instance from existing meta data.
Creates a minimal PartitionedFile instance with its remaining members initialized to nullptr; the rest of its information is added later by parsing the existing meta data.
Definition at line 87 of file PartitionedFile.cc.
PartitionedFile::~PartitionedFile | ( | ) |
Destructor. It does NOT delete the on-disk files.
Definition at line 110 of file PartitionedFile.cc.
int PartitionedFile::appendPage | ( | FilePartitionID | partitionId, |
PDBPagePtr | page | ||
) |
override virtual |
Append page to the partition identified by partitionId. Returns the PageSeqInPartition on success, or -1 on failure.
Implements PDBFileInterface.
Definition at line 266 of file PartitionedFile.cc.
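The return convention described above (sequence number on success, -1 on failure) can be illustrated with a minimal standalone sketch. It assumes a partition that holds only back-to-back pages of a fixed size; the real data partition layout is not shown on this page, and appendFixedSizePage is a hypothetical helper, not a member of PartitionedFile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not part of PartitionedFile): appends one fixed-size
// page to an open partition file and returns the page's sequence number
// within that file, or -1 on failure.
// Assumption: the file contains only back-to-back pages of pageSize bytes.
int appendFixedSizePage(std::FILE* file, const char* page, std::size_t pageSize) {
    if (file == nullptr || page == nullptr || pageSize == 0) {
        return -1;
    }
    // Move to the end so the new page follows all existing pages.
    if (std::fseek(file, 0, SEEK_END) != 0) {
        return -1;
    }
    long endOffset = std::ftell(file);
    if (endOffset < 0) {
        return -1;
    }
    if (std::fwrite(page, 1, pageSize, file) != pageSize) {
        return -1;
    }
    // The sequence number is the count of pages that preceded this one.
    return static_cast<int>(endOffset / static_cast<long>(pageSize));
}
```

A caller would treat any negative return value as a failed append and any non-negative value as the page's position for later loads.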
int PartitionedFile::appendPageDirect | ( | FilePartitionID | partitionId, |
PDBPagePtr | page | ||
) |
Append page using direct I/O
Definition at line 305 of file PartitionedFile.cc.
void PartitionedFile::buildMetaDataFromMetaPartition | ( | SharedMemPtr | shm | ) |
Set up meta data by parsing meta partition
Meta partition format:
Definition at line 672 of file PartitionedFile.cc.
void PartitionedFile::clear | ( | ) |
override virtual |
To delete a file instance.
Implements PDBFileInterface.
Definition at line 236 of file PartitionedFile.cc.
bool PartitionedFile::closeAll | ( | ) |
override virtual |
Close meta partition and all data partitions
Implements PDBFileInterface.
Definition at line 204 of file PartitionedFile.cc.
bool PartitionedFile::closeDirect | ( | ) |
Close meta partition and all data partitions using direct I/O
Definition at line 222 of file PartitionedFile.cc.
unsigned int PartitionedFile::getAndSetNumFlushedPages | ( | ) |
override virtual |
Reads lastFlushedPageId from the meta partition and sets the lastFlushedPageId variable. Also sets the numFlushedPages variable to lastFlushedPageId + 1. (This is true for the time being!)
Used when initializing a PartitionedFile instance from the meta partition file on disk: reads numFlushedPages from the meta partition and sets the numFlushedPages variable.
Implements PDBFileInterface.
Definition at line 584 of file PartitionedFile.cc.
DatabaseID PartitionedFile::getDbId | ( | ) |
override virtual |
Return DatabaseID of this file
Implements PDBFileInterface.
Definition at line 636 of file PartitionedFile.cc.
FileType PartitionedFile::getFileType | ( | ) |
override virtual |
To return the file type of the file: SequenceFile or PartitionedFile
Implements PDBFileInterface.
Definition at line 658 of file PartitionedFile.cc.
PageID PartitionedFile::getLastFlushedPageID | ( | ) |
override virtual |
Return lastFlushedPageID
Implements PDBFileInterface.
Definition at line 615 of file PartitionedFile.cc.
PageID PartitionedFile::getLatestPageID | ( | ) |
override virtual |
Return latestPageID. (Note: this is actually the largest page ID.)
Implements PDBFileInterface.
Definition at line 622 of file PartitionedFile.cc.
PartitionedFileMetaDataPtr PartitionedFile::getMetaData | ( | ) |
Return a smart pointer pointing to the metaData of this PartitionedFile instance.
Definition at line 260 of file PartitionedFile.cc.
NodeID PartitionedFile::getNodeId | ( | ) |
override virtual |
Return NodeID of this file
Implements PDBFileInterface.
Definition at line 629 of file PartitionedFile.cc.
unsigned int PartitionedFile::getNumFlushedPages | ( | ) |
override virtual |
Return numFlushedPages
Implements PDBFileInterface.
Definition at line 608 of file PartitionedFile.cc.
unsigned int PartitionedFile::getNumPartitions | ( | ) |
To return the number of data partitions in the file
Definition at line 665 of file PartitionedFile.cc.
size_t PartitionedFile::getPageSize | ( | ) |
override virtual |
Return page size of this file
Implements PDBFileInterface.
Definition at line 803 of file PartitionedFile.cc.
size_t PartitionedFile::getPageSizeInMeta | ( | ) |
override virtual |
Read the meta partition to get and set the pageSize of this file, and return it.
Implements PDBFileInterface.
Definition at line 814 of file PartitionedFile.cc.
SetID PartitionedFile::getSetId | ( | ) |
override virtual |
Return SetID of this file
Implements PDBFileInterface.
Definition at line 651 of file PartitionedFile.cc.
UserTypeID PartitionedFile::getTypeId | ( | ) |
override virtual |
Return UserTypeID of this file
Implements PDBFileInterface.
Definition at line 644 of file PartitionedFile.cc.
void PartitionedFile::initializeDataFiles | ( | ) |
Initialize data files.
Definition at line 836 of file PartitionedFile.cc.
size_t PartitionedFile::loadPage | ( | FilePartitionID | partitionId, |
unsigned int | pageSeqInPartition, | ||
char * | pageInCache, | ||
size_t | length | ||
) |
override virtual |
Load the page with the given sequence (ordering) number from the partition specified by partitionId. This function is invoked by a PageCache instance; length is the size of the shared memory allocated for this load. The PageCache instance should make sure that length == pageSize. If length > pageSize, some space is wasted; if length < pageSize, some data on the page may not get loaded.
Returns the loaded size.
Implements PDBFileInterface.
Definition at line 483 of file PartitionedFile.cc.
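The relationship between length and pageSize described above can be sketched in isolation. The sketch below assumes, purely for illustration, that page n starts at byte offset n * pageSize with no per-partition header; the actual on-disk layout is not documented on this page, and loadFixedSizePage is a hypothetical helper rather than the class's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: seeks to page pageSeq in a file of back-to-back
// pages and reads it into buffer, whose capacity is length bytes.
// Returns the number of bytes actually loaded.
// Assumption: page n begins at offset n * pageSize.
std::size_t loadFixedSizePage(std::FILE* file, unsigned int pageSeq,
                              std::size_t pageSize, char* buffer,
                              std::size_t length) {
    if (file == nullptr || buffer == nullptr) {
        return 0;
    }
    long offset = static_cast<long>(pageSeq) * static_cast<long>(pageSize);
    if (std::fseek(file, offset, SEEK_SET) != 0) {
        return 0;
    }
    // length > pageSize: the tail of the buffer stays unused (wasted space).
    // length < pageSize: only the first length bytes of the page are loaded.
    std::size_t bytesToRead = std::min(length, pageSize);
    return std::fread(buffer, 1, bytesToRead, file);
}
```

Comparing the returned size against pageSize lets a caller detect both a truncated load and a request for a page past the end of the file.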
size_t PartitionedFile::loadPageDirect | ( | FilePartitionID | partitionId, |
unsigned int | pageSeqInPartition, | ||
char * | pageInCache, | ||
size_t | length | ||
) |
To load page using direct I/O.
Definition at line 505 of file PartitionedFile.cc.
size_t PartitionedFile::loadPageFromCurPos | ( | FilePartitionID | partitionId, |
unsigned int | pageSeqInPartition, | ||
char * | pageInCache, | ||
size_t | length | ||
) |
Similar to the method above, except that this method does not seek; it loads sequentially from the current file position, so the caller must make sure the current position is correct.
Definition at line 537 of file PartitionedFile.cc.
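The trade-off of loading from the current position can be seen in a small scan loop: the stream is positioned once, and every later read relies on the position left behind by the previous one. This is a sketch under the assumption of fixed-size, back-to-back pages; loadNextPage and countPagesSequentially are hypothetical names, not members of PartitionedFile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads the next page starting at the stream's current
// position, without seeking. Returns the number of bytes actually read.
std::size_t loadNextPage(std::FILE* file, std::size_t pageSize,
                         char* buffer, std::size_t length) {
    if (file == nullptr || buffer == nullptr || length < pageSize) {
        return 0;
    }
    return std::fread(buffer, 1, pageSize, file);
}

// Positions the stream once at the start, then walks the pages in order.
// Each call to loadNextPage depends on the position left by the previous one.
unsigned int countPagesSequentially(std::FILE* file, std::size_t pageSize) {
    if (file == nullptr || pageSize == 0) {
        return 0;
    }
    if (std::fseek(file, 0, SEEK_SET) != 0) {
        return 0;
    }
    std::vector<char> buffer(pageSize);
    unsigned int count = 0;
    while (loadNextPage(file, pageSize, buffer.data(), buffer.size()) == pageSize) {
        ++count;
    }
    return count;
}
```

If any other code moves the file position between two reads, the next page returned would be wrong, which is why the caller carries the responsibility for the position.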
PageID PartitionedFile::loadPageId | ( | FilePartitionID | partitionId, |
unsigned int | pageSeqInPartition | ||
) |
Load the pageId of the specified page. Returns the pageId if the page exists; otherwise returns (unsigned int)(-1).
Definition at line 527 of file PartitionedFile.cc.
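Because a missing page is reported as (unsigned int)(-1) rather than through a separate status, callers need to compare against that sentinel before using the result. The sketch below shows the pattern with an in-memory list standing in for a partition; the PageID alias, kInvalidPageId, and lookupPageId are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: PageID is an unsigned int, as the (unsigned int)(-1) cast
// in the description suggests.
using PageID = unsigned int;

// (unsigned int)(-1) wraps around to the largest unsigned int value.
constexpr PageID kInvalidPageId = static_cast<PageID>(-1);

// Hypothetical lookup over an in-memory list of page IDs that stands in
// for one partition. Returns the sentinel when pageSeq is out of range.
PageID lookupPageId(const std::vector<PageID>& idsInPartition,
                    unsigned int pageSeq) {
    if (pageSeq >= idsInPartition.size()) {
        return kInvalidPageId;
    }
    return idsInPartition[pageSeq];
}

bool isValidPageId(PageID id) {
    return id != kInvalidPageId;
}
```

One consequence of this convention is that the largest unsigned int value can never be used as a real page ID.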
PageID PartitionedFile::loadPageIdFromCurPos | ( | FilePartitionID | partitionId, |
unsigned int | pageSeqInPartition, | ||
char * | pageInCache, | ||
size_t | length | ||
) |
Similar to loadPageId(), except that this method does not seek; it loads sequentially from the current file position, so the caller must make sure the current position is correct.
Definition at line 558 of file PartitionedFile.cc.
bool PartitionedFile::openAll | ( | ) |
override virtual |
Open meta partition and all data partitions
Implements PDBFileInterface.
Definition at line 196 of file PartitionedFile.cc.
bool PartitionedFile::openData | ( | ) |
Open data partitions
Definition at line 146 of file PartitionedFile.cc.
bool PartitionedFile::openDataDirect | ( | ) |
Open data partitions using direct I/O
Definition at line 169 of file PartitionedFile.cc.
bool PartitionedFile::openMeta | ( | ) |
Open meta partition only
Definition at line 122 of file PartitionedFile.cc.
int PartitionedFile::seekNumFlushedPagesInMeta | ( | ) |
protected |
Seek to the numFlushedPages field in meta data.
Definition at line 923 of file PartitionedFile.cc.
int PartitionedFile::seekNumFlushedPagesInPartitionMeta | ( | FilePartitionID | partitionId | ) |
protected |
Seek to the numFlushedPages field in partition meta data.
Definition at line 935 of file PartitionedFile.cc.
|
protected |
Seek to the beginning of the page data of a page specified in the file.
Seek to the beginning of the page data for a page specified in the file.
Definition at line 891 of file PartitionedFile.cc.
int PartitionedFile::seekPageDirect | ( | int | handle, |
unsigned int | pageSeqInPartition | ||
) |
protected |
Seek to the beginning of the page data of the page specified in the file.
Definition at line 901 of file PartitionedFile.cc.
int PartitionedFile::seekPageSizeInMeta | ( | ) |
protected |
Seek to the page size field in meta data.
Definition at line 912 of file PartitionedFile.cc.
void PartitionedFile::setDataPartitionPaths | ( | const vector< string > & | dataPartitionPaths | ) |
Set dataPartitionPaths.
Definition at line 847 of file PartitionedFile.cc.
int PartitionedFile::updateMeta | ( | ) |
override virtual |
Update the meta partition based on counters. This can be used after a batch of flushing. Unlike writeMeta(), updateMeta() updates only a few fields.
Note: this function is buggy; please use writeMeta() instead.
Implements PDBFileInterface.
Definition at line 451 of file PartitionedFile.cc.
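The distinction between rewriting the whole meta partition and updating only a few fields can be illustrated with a toy layout. The two-field format below is entirely hypothetical (the real meta partition format is not reproduced on this page), and the function names are illustrative; the sketch shows the general technique of an in-place field update, not the behavior of updateMeta() itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical two-field layout, for illustration only:
//   bytes 0..7   pageSize         (uint64_t)
//   bytes 8..11  numFlushedPages  (uint32_t)
struct DemoMeta {
    std::uint64_t pageSize;
    std::uint32_t numFlushedPages;
};

constexpr long kNumFlushedPagesOffset = sizeof(std::uint64_t);

// Full rewrite: every field is written again from the start of the file.
bool writeWholeMeta(std::FILE* file, const DemoMeta& meta) {
    if (file == nullptr || std::fseek(file, 0, SEEK_SET) != 0) return false;
    if (std::fwrite(&meta.pageSize, sizeof(meta.pageSize), 1, file) != 1) return false;
    if (std::fwrite(&meta.numFlushedPages, sizeof(meta.numFlushedPages), 1, file) != 1) return false;
    return std::fflush(file) == 0;
}

// Partial update: seek to one counter and overwrite only that field.
bool updateFlushedCount(std::FILE* file, std::uint32_t numFlushedPages) {
    if (file == nullptr || std::fseek(file, kNumFlushedPagesOffset, SEEK_SET) != 0) return false;
    if (std::fwrite(&numFlushedPages, sizeof(numFlushedPages), 1, file) != 1) return false;
    return std::fflush(file) == 0;
}

// Reads both fields back from the start of the file.
bool readMeta(std::FILE* file, DemoMeta& meta) {
    if (file == nullptr || std::fseek(file, 0, SEEK_SET) != 0) return false;
    if (std::fread(&meta.pageSize, sizeof(meta.pageSize), 1, file) != 1) return false;
    return std::fread(&meta.numFlushedPages, sizeof(meta.numFlushedPages), 1, file) == 1;
}
```

A partial update touches less data but depends on the field offsets staying in sync with the writer of the full format, which is one way such a function can become fragile.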
int PartitionedFile::writeData | ( | FILE * | file, |
void * | data, | ||
size_t | length | ||
) |
protected |
Write data specified to the current file position.
Definition at line 855 of file PartitionedFile.cc.
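int PartitionedFile::writeDataDirect | ( | int | handle, |
void * | data, | ||
size_t | length | ||
) |
protected |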
Write data specified to the current file position using direct I/O.
Definition at line 873 of file PartitionedFile.cc.
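This page does not say how the direct I/O path is implemented. If it relies on the Linux O_DIRECT open flag (an assumption), transfers generally need the buffer address, the transfer length, and the file offset to be aligned to the device's logical block size. The sketch below covers only the buffer side of that requirement; the 4096-byte alignment and both helper names are assumptions, not values taken from PartitionedFile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free

// Assumed alignment; the real requirement depends on the device and
// filesystem and is not stated on this page.
constexpr std::size_t kAssumedAlignment = 4096;

// Rounds length up to the next multiple of alignment.
std::size_t roundUpToAlignment(std::size_t length, std::size_t alignment) {
    return (length + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: allocates a buffer whose address and size are both
// multiples of alignment. Returns nullptr on failure. Release with free().
void* allocateAlignedBuffer(std::size_t length, std::size_t alignment) {
    void* buffer = nullptr;
    std::size_t paddedLength = roundUpToAlignment(length, alignment);
    if (posix_memalign(&buffer, alignment, paddedLength) != 0) {
        return nullptr;
    }
    return buffer;
}
```

A page size that is already a multiple of the alignment needs no padding, which is one reason page sizes for direct I/O are usually chosen as multiples of the block size.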
int PartitionedFile::writeMeta | ( | ) |
override virtual |
Initialize the meta partition, with the following format:
Implements PDBFileInterface.
Definition at line 358 of file PartitionedFile.cc.
bool PartitionedFile::cleared |
private |
Whether the file is cleared
Definition at line 396 of file PartitionedFile.h.
vector< FILE * > PartitionedFile::dataFiles |
private |
Data files
Definition at line 340 of file PartitionedFile.h.
vector< int > PartitionedFile::dataHandles |
private |
Definition at line 341 of file PartitionedFile.h.
|
private |
Paths to data partitions
Definition at line 351 of file PartitionedFile.h.
|
private |
DatabaseID of this PartitionedFile instance
Definition at line 371 of file PartitionedFile.h.
|
private |
Lock to synchronize delete and append operations
Definition at line 328 of file PartitionedFile.h.
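How a single pthread_mutex_t can serialize delete and append operations is sketched below with a scoped guard. DemoPageStore and ScopedLock are hypothetical stand-ins, and the rule that append fails once the store has been cleared is an assumption made for the example, not documented behavior of PartitionedFile.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// Locks the mutex on construction and unlocks it on destruction, so every
// return path releases the lock.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t* mutex) : mutex_(mutex) {
        pthread_mutex_lock(mutex_);
    }
    ~ScopedLock() { pthread_mutex_unlock(mutex_); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    pthread_mutex_t* mutex_;
};

// Hypothetical in-memory stand-in for a file whose append and clear
// operations must not interleave.
class DemoPageStore {
public:
    DemoPageStore() { pthread_mutex_init(&mutex_, nullptr); }
    ~DemoPageStore() { pthread_mutex_destroy(&mutex_); }
    DemoPageStore(const DemoPageStore&) = delete;
    DemoPageStore& operator=(const DemoPageStore&) = delete;

    // Returns the new page's sequence number, or -1 if the store was cleared
    // (an assumed rule for this example).
    int append(int page) {
        ScopedLock lock(&mutex_);
        if (cleared_) {
            return -1;
        }
        pages_.push_back(page);
        return static_cast<int>(pages_.size()) - 1;
    }

    // Removes all pages and marks the store as cleared.
    void clear() {
        ScopedLock lock(&mutex_);
        pages_.clear();
        cleared_ = true;
    }

private:
    pthread_mutex_t mutex_;
    std::vector<int> pages_;
    bool cleared_ = false;
};
```

Holding the same lock in both operations means an append either completes before the clear starts or observes the cleared state, never a half-deleted one.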
|
private |
Logger instance
Definition at line 356 of file PartitionedFile.h.
|
private |
Meta data instance
Definition at line 361 of file PartitionedFile.h.
|
private |
Meta file
Definition at line 334 of file PartitionedFile.h.
|
private |
Path to meta partition
Definition at line 346 of file PartitionedFile.h.
|
private |
NodeID of this PartitionedFile instance
Definition at line 366 of file PartitionedFile.h.
|
private |
Configured page size.
Definition at line 386 of file PartitionedFile.h.
|
private |
SetID of this PartitionedFile instance
Definition at line 381 of file PartitionedFile.h.
|
private |
UserTypeID of this PartitionedFile instance
Definition at line 376 of file PartitionedFile.h.
|
private |
Using direct I/O or not
Definition at line 391 of file PartitionedFile.h.