Virtual File System – Part 2

Internal Data Structures

FileSystem Interface
    class FileSystem
    {
    public:

        /* Public method interface */

    private:

        // Internal file system data types.
        struct FileHeader;
        struct DirEntry;
        struct DirFileEntry;
        class DataFilter;
        class ZLibTest;    // Test filter using zlib
        class IReader;
        class IWriter;
        class FileReader;
        class MemoryReader;
        class FileWriter;
        class MemoryWriter;

        // Internal bookkeeping typedefs for generating a PAK file.
        typedef SmartPointer< File >                            FilePointer;
        typedef SmartPointer< DataFilter >                      FilterPointer;
        typedef String                                          FilePath;
        typedef String                                          DirPath;
        typedef String                                          Directory;
        typedef SmartPointer< FilePath >                        FilePathPointer;
        typedef SmartPointer< DirPath >                         DirPathPointer;
        typedef SmartPointer< Directory >                       DirPointer;
        typedef List< FilePathPointer >                         FilePathList;
        typedef List< DirPathPointer >                          DirPathList;
        typedef List< DirPointer >                              DirList;
        typedef SmartPointer< DirFileEntry >                    FileEntryPointer;
        typedef PSX_Pair< FileEntryPointer, DirPathPointer >    FileEntryPair;
        typedef SmartPointer< DirEntry >                        DirEntryPointer;
        typedef List< DirEntryPointer >                         PAKGenDirList;
        typedef List< FileEntryPair >                           PAKGenFilePairList;
        typedef List< FileEntryPointer >                        PAKGenFileList;
        typedef PSX_Pair< FILTER_TYPE, FilterPointer >          FilterPair;
        typedef Map< FILTER_TYPE, FilterPointer >               FilterMap;
        typedef PSX_Pair< String, FileEntryPointer >            FileMapPair;
        typedef Map< String, FileEntryPointer >                 FileEntryMap;

        // Internal functions used to manage Pulse file data
    };

Okay, I think I can hear you screaming now! Calm down, take a deep breath and relax: most of these internal data structures are just small containers with one or two methods consisting of a few straightforward lines of code. First, let’s concentrate on the internal data structures.

  • FileHeader : This structure contains all the important high-level information of a PAK file.

    // NOTE: This is exactly 56 bytes in size. We can get away with this w/o
    // doing a #pragma pack(1).
    struct FileSystem::FileHeader
    {
        Signature   m_signature;        // 16-byte GUID signature check.
        Char        m_ID[4];            // 3-letter file format ID for format check.
        DWORD       m_version;          // Version of this Pulse File.
        WORD        m_numDirs;          // Number of directories.
        WORD        m_numFiles;         // Number of files.
        I32         m_filterBitField;   // Max of 32 possible filter algorithms to choose from.
        SIZE_T64    m_size;             // Size of the file.
        POS_T64     m_dirDiskStart;
        POS_T64     m_fileDiskStart;

        FileHeader( void )
        {
            PSX_ZeroMem( this, sizeof( FileHeader ) );
        }

        void WriteData( FileSystem::IWriter *pWriter );
        void ReadData( FileSystem::IReader *pReader );
    };

Most of the data members are self-explanatory. The 16-byte Signature data type is just a structure that stores a GUID (Globally Unique Identifier). It is used as a signature check to make sure that the file being opened is truly our version of a PAK file. You can read more about Microsoft’s GUID here:
http://msdn.microsoft.com/en-us/library/aa373931(VS.85).aspx
http://en.wikipedia.org/wiki/Globally_Unique_Identifier
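
To make the size claim and the signature check concrete, here is a minimal sketch. It assumes the engine types map onto fixed-width standard types (Signature as 16 raw bytes, DWORD as 32 bits, WORD as 16 bits, SIZE_T64 and POS_T64 as 64 bits). HeaderSketch, kExpectedSignature and HasValidSignature are hypothetical names for illustration, not part of the actual FileSystem class, and the GUID value is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Signature type: a GUID stored as 16 raw bytes.
struct Signature
{
    std::uint8_t m_bytes[16];
};

// Hypothetical mirror of FileHeader using fixed-width standard types.
struct HeaderSketch
{
    Signature     m_signature;      // 16 bytes
    char          m_ID[4];          //  4 bytes
    std::uint32_t m_version;        //  4 bytes
    std::uint16_t m_numDirs;        //  2 bytes
    std::uint16_t m_numFiles;       //  2 bytes
    std::int32_t  m_filterBitField; //  4 bytes
    std::uint64_t m_size;           //  8 bytes
    std::uint64_t m_dirDiskStart;   //  8 bytes
    std::uint64_t m_fileDiskStart;  //  8 bytes
};

// Every member already sits on its natural alignment, so the compiler is not
// expected to add padding and the struct comes to 56 bytes without packing.
static_assert( sizeof( HeaderSketch ) == 56, "unexpected header size" );

// Made-up signature value, for illustration only.
static const Signature kExpectedSignature =
{
    { 0x50, 0x55, 0x4C, 0x53, 0x45, 0x2D, 0x50, 0x41,
      0x4B, 0x2D, 0x44, 0x45, 0x4D, 0x4F, 0x21, 0x21 }
};

// Returns true only when the header carries the signature we expect.
bool HasValidSignature( const HeaderSketch &header )
{
    return std::memcmp( &header.m_signature, &kExpectedSignature,
                        sizeof( Signature ) ) == 0;
}
```

A loader would call something like HasValidSignature right after reading the header and refuse to continue if it returns false.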

  • DirEntry : The PAK file’s internal format contains two tables. The first one stores the directory entries, while the second one stores the file entries. DirEntry simply stores the directory name for now; I can’t think of anything else we need to add here right now.
    struct FileSystem::DirEntry
    {
        struct PAKData
        {
            WORD m_nameLen;
        };

        String m_name;
        PAKData m_PAKData;

        void WriteData( FileSystem::IWriter *pWriter );
        void ReadData( FileSystem::IReader *pReader );
    };

There is one odd thing here, though. We have m_name, which stores the name, and m_nameLen, which stores the length of the directory name and is, interestingly, encapsulated inside a structure called PAKData. The reason for the separate structure has to do with how we read and write our data from and to a PAK file. We want to minimize read/write calls by reading or writing all of the bytes in one go where possible. m_name isn’t included because of its internal data structure: the String class dynamically allocates memory for storing and manipulating its string. If we simply read or wrote it directly, we would be copying its internal pointer rather than the characters; after loading, that pointer would point at some random memory and could possibly crash your application. Although a struct PAKData holding a single member is kind of redundant, it is still helpful in case we want to add more info in the future. The next structure shows how struct PAKData is used effectively in DirFileEntry.
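
The split between the fixed-size PAKData block and the variable-length name can be sketched as follows. This is only an illustration under assumptions: it uses std::string and standard streams in place of the engine's String, IWriter and IReader, and DirEntrySketch and RoundTripName are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for DirEntry built on standard library types.
struct DirEntrySketch
{
    struct PAKData
    {
        std::uint16_t m_nameLen;
    };

    std::string m_name;
    PAKData     m_PAKData = PAKData();

    void WriteData( std::ostream &out )
    {
        // Names longer than 65535 characters would not fit in m_nameLen;
        // this sketch does not handle that case.
        m_PAKData.m_nameLen = static_cast< std::uint16_t >( m_name.size() );

        // One call for the whole fixed-size block...
        out.write( reinterpret_cast< const char * >( &m_PAKData ), sizeof( PAKData ) );
        // ...then the characters themselves, never the string object.
        out.write( m_name.data(), m_PAKData.m_nameLen );
    }

    void ReadData( std::istream &in )
    {
        in.read( reinterpret_cast< char * >( &m_PAKData ), sizeof( PAKData ) );
        m_name.resize( m_PAKData.m_nameLen );
        in.read( &m_name[0], m_PAKData.m_nameLen );
    }
};

// Writes an entry into an in-memory stream and reads it back.
std::string RoundTripName( const std::string &name )
{
    DirEntrySketch written;
    written.m_name = name;

    std::stringstream stream( std::ios::in | std::ios::out | std::ios::binary );
    written.WriteData( stream );

    DirEntrySketch loaded;
    loaded.ReadData( stream );
    return loaded.m_name;
}
```

The length has to be read first because it tells the reader how many name bytes follow; that is why m_nameLen lives in the fixed-size part.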

  • DirFileEntry : This structure contains all the important information about a file stored in a PAK file.
    struct FileSystem::DirFileEntry
    {
        //#pragma pack( 1 ) // Needed to pack this for direct read and write
        // I am trusting the data alignment and size of this struct to the compiler.
        // With 8-byte SIZE_T64 and 4-byte DWORD members, this struct should have a size of
        // 40 bytes (a multiple of 8), so that we won't suffer any performance penalty
        // from misaligned reads in memory.
        struct PAKData
        {
            SIZE_T64    m_size;
            SIZE_T64    m_compressedSize;
            SIZE_T64    m_diskStart;
            DWORD       m_filterBit;
            DWORD       m_pathIndex;        // Points to the position of the dirpath entries
            DWORD       m_nameLen;
            BYTE        _padd[4];

            PAKData( void ) { PSX_ZeroMem( this, sizeof( PAKData ) ); }
        };
        //#pragma pack()

        String  m_name;
        PAKData m_PAKData;

        void WriteData( FileSystem::IWriter *pWriter );
        void ReadData( FileSystem::IReader *pReader );
    };

Notice that all members except m_name are stored inside struct PAKData. Instead of writing each data member separately, we can simply do something like this:
    fstream.write( reinterpret_cast< const char * >( &m_PAKData ), sizeof( PAKData ) );
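
Writing a struct as one block of bytes is only safe when its layout is what we think it is, so it can help to let the compiler check that. The sketch below assumes SIZE_T64 is a 64-bit integer, DWORD a 32-bit integer and BYTE a single byte; FilePAKDataSketch and CopyThroughBytes are hypothetical names. Under those assumptions the fields add up to 3 × 8 + 3 × 4 + 4 = 40 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical mirror of DirFileEntry::PAKData using fixed-width types.
struct FilePAKDataSketch
{
    std::uint64_t m_size;
    std::uint64_t m_compressedSize;
    std::uint64_t m_diskStart;
    std::uint32_t m_filterBit;
    std::uint32_t m_pathIndex;
    std::uint32_t m_nameLen;
    std::uint8_t  _padd[4];
};

// A raw byte copy is only meaningful for trivially copyable types.
static_assert( std::is_trivially_copyable< FilePAKDataSketch >::value,
               "record must be safe to copy byte by byte" );

// Fails to compile if the compiler inserts padding we did not plan for.
static_assert( sizeof( FilePAKDataSketch ) == 40, "unexpected record size" );

// Copies the record into a raw byte buffer and back again, one block each
// way, which is what a single bulk write followed by a bulk read amounts to.
FilePAKDataSketch CopyThroughBytes( const FilePAKDataSketch &source )
{
    unsigned char bytes[ sizeof( FilePAKDataSketch ) ];
    std::memcpy( bytes, &source, sizeof( source ) );

    FilePAKDataSketch result;
    std::memcpy( &result, bytes, sizeof( result ) );
    return result;
}
```

Note that this kind of raw dump also bakes the machine's byte order into the file, so a PAK written this way is only portable between platforms with the same endianness.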

Here is a diagram showing how a PAK file is composed of these important data structures:
[Figure: PAKFormat. Internal structure of a PAK file]

  • DataFilter, ZLibTest : DataFilter is a base class for derived (or concrete) filter classes. As an example, we have a test filter called ZLibTest. This filter uses the well-known deflate and inflate algorithms to compress and decompress data when needed.
  • IReader, IWriter : These interface classes serve as an abstraction layer for reading from and writing to sources. This makes our reads and writes easier because we don’t have to care whether we’re reading from a file or memory, or writing to a file or memory. FileReader, MemoryReader, FileWriter and MemoryWriter are derived classes designed to handle reads and writes from/to either memory or a file. Below are the interfaces for the IReader and IWriter classes.
    class FileSystem::IReader
    {
    public:
        virtual ~IReader( void ) { }
        virtual SIZE_T Read( BYTE *pBuffer, SIZE_T size ) = 0;
        virtual BOOL IsDone( void ) = 0;
        virtual SIZE_T64 BytesLeft( void ) = 0;
    };

    class FileSystem::IWriter
    {
    public:
        virtual ~IWriter( void ) { }
        virtual SIZE_T Write( BYTE *pBuffer, SIZE_T size ) = 0;
    };
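
To show what the memory-backed variants might look like, here is a rough sketch built on std::vector. All the *Sketch class names, WriteVersion and ReadVersion are hypothetical, the engine's BYTE, SIZE_T and BOOL are replaced with standard types, and Write takes a const pointer here, which is a small deviation from the interface above. The real MemoryReader and MemoryWriter may well be implemented differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

class IReaderSketch
{
public:
    virtual ~IReaderSketch() { }
    virtual std::size_t Read( std::uint8_t *pBuffer, std::size_t size ) = 0;
    virtual bool IsDone() = 0;
    virtual std::uint64_t BytesLeft() = 0;
};

class IWriterSketch
{
public:
    virtual ~IWriterSketch() { }
    virtual std::size_t Write( const std::uint8_t *pBuffer, std::size_t size ) = 0;
};

// Appends everything it is given to a growable byte buffer.
class MemoryWriterSketch : public IWriterSketch
{
public:
    std::size_t Write( const std::uint8_t *pBuffer, std::size_t size ) override
    {
        m_bytes.insert( m_bytes.end(), pBuffer, pBuffer + size );
        return size;
    }

    const std::vector< std::uint8_t > & Bytes() const { return m_bytes; }

private:
    std::vector< std::uint8_t > m_bytes;
};

// Hands out bytes from a copy of a buffer, tracking the read position.
class MemoryReaderSketch : public IReaderSketch
{
public:
    explicit MemoryReaderSketch( const std::vector< std::uint8_t > &bytes )
        : m_bytes( bytes ), m_pos( 0 ) { }

    std::size_t Read( std::uint8_t *pBuffer, std::size_t size ) override
    {
        // Never read past the end; report how many bytes were delivered.
        const std::size_t count = std::min( size, m_bytes.size() - m_pos );
        if ( count > 0 )
            std::memcpy( pBuffer, &m_bytes[ m_pos ], count );
        m_pos += count;
        return count;
    }

    bool IsDone() override { return m_pos == m_bytes.size(); }
    std::uint64_t BytesLeft() override { return m_bytes.size() - m_pos; }

private:
    std::vector< std::uint8_t > m_bytes;
    std::size_t                 m_pos;
};

// These helpers only know the interfaces, not where the bytes end up.
void WriteVersion( IWriterSketch &writer, std::uint32_t version )
{
    writer.Write( reinterpret_cast< const std::uint8_t * >( &version ), sizeof( version ) );
}

std::uint32_t ReadVersion( IReaderSketch &reader )
{
    std::uint32_t version = 0;
    reader.Read( reinterpret_cast< std::uint8_t * >( &version ), sizeof( version ) );
    return version;
}
```

Because WriteVersion and ReadVersion depend only on the interfaces, a file-backed reader or writer could be swapped in without touching them, which is the point of the abstraction.
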
  • A bunch of typedefs : After the class declarations, billions of typedefs follow. I really apologize for the confusion. While I was still developing this system, I was experimenting with which helper classes and containers to use, and the typedefs made it easy for me to quickly change from one data type to another with minimal changes. One thing you may notice, aside from the normal container classes, is the SmartPointer<> class. SmartPointer<> acts like Boost’s shared_ptr for C++. Basically, it keeps track of everything that uses the object it holds and automatically deletes that object when no one is using it anymore. It does this with a reference count. We’ll be using it to store our data structures so that we don’t have to worry about manually deleting the allocated resources held in our containers. Here is a quick example:
    int main( void )
    {
        // Store a dynamically allocated int in a SmartPointer.
        SmartPointer< int > pInt1( new int );  // Internal ref is set to 1.

        {
            SmartPointer< int > pInt2( pInt1 ); // Internal ref is now set to 2.
            // When pInt2 gets destroyed, ref is automatically decremented by 1.
        }

        // pInt1 gets destroyed when it falls out of scope here. Ref drops to 0 and the int
        // is automatically deleted w/o requiring us to do anything… 🙂
    }

You can learn more about shared_ptr and smart pointers at this link: http://www.boost.org/doc/libs/1_41_0/libs/smart_ptr/shared_ptr.htm
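
If you want to watch the reference count change, the same idea can be tried with std::shared_ptr from the standard library, which exposes the count through use_count(). This is a sketch using std::shared_ptr as a stand-in; SmartPointer<> itself may not offer an equivalent accessor. RefCounts and ObserveRefCounts are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Reference counts observed at three points in the object's lifetime.
struct RefCounts
{
    long before;
    long during;
    long after;
};

RefCounts ObserveRefCounts()
{
    RefCounts counts = { 0, 0, 0 };

    std::shared_ptr< int > pInt1( new int( 42 ) );
    counts.before = pInt1.use_count();      // Only pInt1 owns the int.

    {
        std::shared_ptr< int > pInt2( pInt1 );
        counts.during = pInt1.use_count();  // pInt1 and pInt2 share it.
    }   // pInt2 is destroyed here and gives up its share.

    counts.after = pInt1.use_count();       // Back to a single owner.
    return counts;

    // When pInt1 goes out of scope the count reaches 0 and the int is deleted.
}
```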

  • VFS Data Members : Now that we know what data types we’ll be using, here are the data members our VFS will store. The class declarations and typedefs are included again for reference.
FileSystem Interface
    class FileSystem
    {
    public:

        /* public interface */

    private:

        // Internal file system data types.
        struct FileHeader;
        struct DirEntry;
        struct DirFileEntry;
        class DataFilter;
        class ZLibTest;    // Test filter using zlib
        class IReader;
        class IWriter;
        class FileReader;
        class MemoryReader;
        class FileWriter;
        class MemoryWriter;

        // Internal bookkeeping typedefs for generating a PAK file.
        typedef SmartPointer< File >                            FilePointer;
        typedef SmartPointer< DataFilter >                      FilterPointer;
        typedef String                                          FilePath;
        typedef String                                          DirPath;
        typedef String                                          Directory;
        typedef SmartPointer< FilePath >                        FilePathPointer;
        typedef SmartPointer< DirPath >                         DirPathPointer;
        typedef SmartPointer< Directory >                       DirPointer;
        typedef List< FilePathPointer >                         FilePathList;
        typedef List< DirPathPointer >                          DirPathList;
        typedef List< DirPointer >                              DirList;
        typedef SmartPointer< DirFileEntry >                    FileEntryPointer;
        typedef PSX_Pair< FileEntryPointer, DirPathPointer >    FileEntryPair;
        typedef SmartPointer< DirEntry >                        DirEntryPointer;
        typedef List< DirEntryPointer >                         PAKGenDirList;
        typedef List< FileEntryPair >                           PAKGenFilePairList;
        typedef List< FileEntryPointer >                        PAKGenFileList;
        typedef PSX_Pair< FILTER_TYPE, FilterPointer >          FilterPair;
        typedef Map< FILTER_TYPE, FilterPointer >               FilterMap;
        typedef PSX_Pair< String, FileEntryPointer >            FileMapPair;
        typedef Map< String, FileEntryPointer >                 FileEntryMap;

        /* Internal functions used to manage Pulse file data */

    private:

        FileHeader              *m_pHeader;
        PAKGenDirList           *m_pGenDirs;
        PAKGenFilePairList      *m_pGenFiles;       // Used in creating Pulse File
        PAKGenFileList          *m_pGenFileList;    // Used in opening Pulse File
        FileEntryMap            *m_pFileMap;        // Used in loading Pulse File
        FileIO                  *m_pPulseFile;      // Used in reading loaded Pulse File
        BOOL                    m_bLoaded;
        FilterMap               m_filters;
        OnProcessFileCallback   m_pOnProcessFile;   // Callback for selecting a filter when a file is about to be processed

    };

Before we move on to the internal utility methods, I suggest taking the time to get familiar with the insane number of typedefs so you won’t get confused when we get to the actual implementations. 🙂 Most of them are just simple data types stored in SmartPointers and then contained in a map or list.
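
As a way to get comfortable with that pattern, here is a small sketch of two of the typedefs rebuilt on standard containers. It assumes SmartPointer behaves like std::shared_ptr, List like std::list and Map like std::map; FileEntrySketch and IndexSketch are hypothetical and only loosely modelled on DirFileEntry and the m_pGenFileList / m_pFileMap members.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <memory>
#include <string>

// Hypothetical, trimmed-down file entry.
struct FileEntrySketch
{
    std::string        m_name;
    unsigned long long m_size;
};

// Same typedef names as in the article, mapped onto standard types.
typedef std::shared_ptr< FileEntrySketch >           FileEntryPointer;
typedef std::list< FileEntryPointer >                PAKGenFileList;
typedef std::map< std::string, FileEntryPointer >    FileEntryMap;

struct IndexSketch
{
    PAKGenFileList m_fileList;  // Entries in insertion order.
    FileEntryMap   m_fileMap;   // Same entries, looked up by path.

    void Add( const std::string &path, unsigned long long size )
    {
        FileEntryPointer pEntry( new FileEntrySketch() );
        pEntry->m_name = path;
        pEntry->m_size = size;

        // Both containers share one entry; neither has to delete it.
        m_fileList.push_back( pEntry );
        m_fileMap[ path ] = pEntry;
    }

    // Returns an empty pointer when the path is unknown.
    FileEntryPointer Find( const std::string &path ) const
    {
        FileEntryMap::const_iterator it = m_fileMap.find( path );
        return it != m_fileMap.end() ? it->second : FileEntryPointer();
    }
};
```

The entry is freed automatically once the list, the map and any pointers handed out by Find have all let go of it.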

For the continuation of this article, see Virtual File System – Part 3
