Shared Memory Allocator Tests

As part of the effort to optimize LHC physics software applications for multi-core CPUs, this page presents test examples aimed at creating a C++ STL-compliant shared memory allocator. Reproducing the examples requires Boost 1.35 (for the Boost.Interprocess library), gcc 3.4 or higher and, optionally, gdb 6.8.

The design of the examples should try to focus on:

  • Simplification of client code
  • Error handling through exceptions
  • Total avoidance of segmentation faults
  • Hiding synchronization mechanisms for the creation and destruction of shared resources (work in progress)

Prerequisites to understand the code:

  • Read and understand the shared memory test examples.
  • Read Stroustrup, Chapter 19.4, and the Dr. Dobb's article to clarify how STL allocators work.

Download the sources from the topic attachment allocator20080702.zip.

Test 1: Shared vector of integers

Main: allocator1/master.cc, allocator1/slave.cc

Description: Two processes running different programs use a shared vector of integers. A Factory of vectors is created for this purpose. The same example has also been reproduced using forked processes (allocator1/forked.cc).

Restrictions and Comments: The following comments and restrictions apply to the code example:

  1. The current Factory implementation can only be used for vectors of ints. The idea is easy to generalize to vectors, maps, sets, etc. of multiple types via templates.
  2. The Factory does not solve the synchronization problem between processes (e.g. the "slave" should not read until the "master" has created the container). Additional study of the Boost.Interprocess synchronization mechanisms is needed.
  3. The created container is not a std::vector but a Factory::Vector (not a template). This restriction is general: STL container implementations typically use T* rather than Allocator::pointer (for more information read the Boost.Interprocess documentation). As a consequence, client code needs to replace "std::vector" with "Factory::Vector", and so on.
  4. If the process that creates and destroys the shared memory segment is killed, the shared memory is left "alive". The current workaround is to destroy the segment and create it again if it is already present. This is not good coding practice: an exception should be thrown in that case. However, an adequate mechanism equivalent to the RAII (Resource Acquisition Is Initialization) idiom should be found for shared memory.
  5. Apart from the case in the previous comment, all error handling is done using exceptions and asserts.
  6. The heavy usage of templates makes the code difficult to develop. The compiler messages are difficult to understand. For example, passing a std::string where a constructor expects a const char* produces:
In file included from master.cc:3:
Factory.h: In member function `void Factory::createSegment()':
Factory.h:79: error: no matching function for call to `boost::interprocess::basic_managed_shared_memory<char, boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, boost::interprocess::iset_index>::basic_managed_shared_memory(const boost::interprocess::create_only_t&, std::string&, unsigned int&)'
/usr/local/include/boost-1_35/boost/interprocess/interprocess_fwd.hpp:214: note: candidates are: boost::interprocess::basic_managed_shared_memory<char, boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, boost::interprocess::iset_index>::basic_managed_shared_memory(const boost::interprocess::basic_managed_shared_memory<char, boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, boost::interprocess::iset_index>&)
/usr/local/include/boost-1_35/boost/interprocess/managed_shared_memory.hpp:115: note:                 boost::interprocess::basic_managed_shared_memory<CharType, MemoryAlgorithm, IndexType>::basic_managed_shared_memory(boost::interprocess::detail::moved_object<boost::interprocess::basic_managed_shared_memory<CharType, MemoryAlgorithm, IndexType> >&) [with CharType = char, AllocationAlgorithm = boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, IndexType = boost::interprocess::iset_index]
/usr/local/include/boost-1_35/boost/interprocess/managed_shared_memory.hpp:104: note:                 boost::interprocess::basic_managed_shared_memory<CharType, MemoryAlgorithm, IndexType>::basic_managed_shared_memory(boost::interprocess::open_only_t, const char*, const void*) [with CharType = char, AllocationAlgorithm = boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, IndexType = boost::interprocess::iset_index]
/usr/local/include/boost-1_35/boost/interprocess/managed_shared_memory.hpp:94: note:                 boost::interprocess::basic_managed_shared_memory<CharType, MemoryAlgorithm, IndexType>::basic_managed_shared_memory(boost::interprocess::open_or_create_t, const char*, size_t, const void*) [with CharType = char, AllocationAlgorithm = boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, IndexType = boost::interprocess::iset_index]
/usr/local/include/boost-1_35/boost/interprocess/managed_shared_memory.hpp:82: note:                 boost::interprocess::basic_managed_shared_memory<CharType, MemoryAlgorithm, IndexType>::basic_managed_shared_memory(boost::interprocess::create_only_t, const char*, size_t, const void*) [with CharType = char, AllocationAlgorithm = boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, IndexType = boost::interprocess::iset_index]
/usr/local/include/boost-1_35/boost/interprocess/managed_shared_memory.hpp:61: note:                 boost::interprocess::basic_managed_shared_memory<CharType, MemoryAlgorithm, IndexType>::basic_managed_shared_memory() [with CharType = char, AllocationAlgorithm = boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void>, 0u>, IndexType = boost::interprocess::iset_index]
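
The pointer restriction in comment 3 can be illustrated without Boost. A raw T* stored inside a shared segment is only meaningful in the process that wrote it, because each process may map the segment at a different base address. Boost.Interprocess addresses this with offset_ptr, which stores a distance relative to the pointer's own location. The sketch below is only an illustration under assumptions: OffsetPtr is a hypothetical, heavily simplified stand-in (not the Boost class), Block is an invented layout, and the "second mapping" is simulated by copying bytes to another address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, simplified stand-in for boost::interprocess::offset_ptr:
// it stores the distance from its own address to the target.
template <typename T>
class OffsetPtr {
public:
  void set(T* target) {
    assert(target != 0);
    offset_ = reinterpret_cast<char*>(target) - reinterpret_cast<char*>(this);
  }
  T* get() {
    return reinterpret_cast<T*>(reinterpret_cast<char*>(this) + offset_);
  }
private:
  std::ptrdiff_t offset_;
};

// Invented layout standing in for the contents of a shared segment.
struct Block {
  int value;
  int* raw;            // absolute address: tied to one mapping
  OffsetPtr<int> rel;  // relative address: valid in any mapping
};

// Simulates a second process seeing the same bytes at another address.
// Returns true if the relative pointer resolves inside the copy while
// the raw pointer still refers to the original location.
inline bool offsetPointerSurvivesRemap() {
  Block first;
  first.value = 42;
  first.raw = &first.value;
  first.rel.set(&first.value);

  Block second;
  std::memcpy(static_cast<void*>(&second), &first, sizeof(Block));

  return second.rel.get() == &second.value
      && *second.rel.get() == 42
      && second.raw == &first.value;
}
```

Calling offsetPointerSurvivesRemap() is expected to return true: the copied raw pointer still targets the first block, which is exactly the situation a container built on T* would run into across processes.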

Test 2: Shared vector of POD

Main: allocator2/forked.cc

Description: An example of shared memory in forked processes that uses shared STL containers holding POD (Plain Old Data, http://en.wikipedia.org/wiki/Plain_Old_Data_Structures) objects.
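
As a reminder of what qualifies as a POD here, the following sketch checks the property at compile time. It rests on assumptions: the fields of PODClass are invented for illustration (the real class is in the attached sources), NotPOD is a hypothetical counter-example, and the <type_traits> header requires a C++11 compiler, which is newer than the gcc 3.4 mentioned on this page.

```cpp
#include <string>
#include <type_traits>

// Hypothetical layout; the real PODClass lives in the attached sources.
struct PODClass {
  int id;
  double value;
  char tag[16];
};

// Counter-example: std::string owns heap memory private to one process.
struct NotPOD {
  int id;
  std::string name;
};

// A type is treated as POD when it is both trivial and standard-layout.
template <typename T>
constexpr bool isPOD() {
  return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(isPOD<PODClass>(), "PODClass must stay a POD");
static_assert(!isPOD<NotPOD>(), "NotPOD is expected to fail the check");
```

A check like this next to the class definition would turn an accidental non-POD member into a compile error rather than a crash in another process.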

Comments: The implemented Factory class cannot evolve into a multi-container Factory. A redesign of the class involving "typedef templates" is needed. We do not want client code like:

Factory::PODVector& podv = f.createPODVector("podv");
where Factory::PODVector is declared as:
typedef ip::allocator<PODClass, ip::managed_shared_memory::segment_manager> PODAllocator;
typedef ip::vector<PODClass, PODAllocator> PODVector;

Instead, client code like the following is desirable:

Vector<PODClass>& v = f.create("podv");
where the Vector container is declared as:
template <typename T> typedef ip::vector< T, ip::allocator<T, ip::managed_shared_memory::segment_manager> > Vector;
Actually, this is not possible in the C++ standard supported by gcc 3.4 (C++98/03).
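
Later revisions of the language lifted this restriction: C++11 introduced alias templates, which give nearly the declaration wished for above. The sketch below is a minimal illustration, assuming a C++11 compiler and using std::vector and std::allocator as stand-ins for ip::vector and ip::allocator so that it compiles without Boost. sumOfFirstThree is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// Stand-in for ip::allocator<T, segment_manager> (assumption of this sketch).
template <typename T>
using Allocator = std::allocator<T>;

// C++11 alias template: plays the role of the "template typedef".
template <typename T>
using Vector = std::vector<T, Allocator<T>>;

// The alias names the very same type as the spelled-out container.
static_assert(
    std::is_same<Vector<int>, std::vector<int, std::allocator<int>>>::value,
    "an alias template must not create a new type");

// Client code no longer needs a trailing ::Type.
inline int sumOfFirstThree() {
  Vector<int> v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  assert(v.size() == 3);
  return v[0] + v[1] + v[2];
}
```

Because the alias does not create a new type, code written against the spelled-out container type keeps working unchanged.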

Test 3: Templatized Factory of a Shared vector of POD

Main: allocator3/forked.cc

Description: An example of shared memory in forked processes that uses shared STL containers holding POD (Plain Old Data, http://en.wikipedia.org/wiki/Plain_Old_Data_Structures) objects. The Factory class uses a kind of "template typedef" to be simpler and more readable. The Vector container is now declared as:

template <typename T> struct Allocator {
  typedef ip::allocator<T, ip::managed_shared_memory::segment_manager> Type;
};
template <typename T> struct Vector {
  typedef typename ip::vector<T, typename Allocator<T>::Type> Type;
};

Comments: At this stage the client code looks like:

Allocator<int>::Type a_int= f.getAllocator<int>();
Vector<int>::Type* v= f.getSegment().construct<Vector<int>::Type>("v")(a_int);
This is still too complex, and the trailing Type identifier is quite ugly. The Factory also does not hide the creation and destruction of shared objects.

We have observed the following error:

Factory.h:44: error: expected primary-expression before '>' token 
The error comes from inside a template method:
 
res = this->getSegment().find< T >(id.c_str()); 

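After some searching we found that, inside a template, the compiler cannot tell that the dependent name "find" is a member template, so it parses the following "<" as a less-than operator. The line should be modified to:

res = this->getSegment().template find< T >(id.c_str()); 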
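
The rule can be reproduced in isolation. In the sketch below, MockSegment is a hypothetical stand-in for a segment class with a member template find<T>(), and lookup and lookupDemo are invented helpers. Because the segment type is a template parameter of lookup, the name find is dependent and needs the template keyword.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a segment type with a member template find<T>().
class MockSegment {
public:
  void add(const std::string& name, void* object) { objects_[name] = object; }

  template <typename T>
  T* find(const char* name) {
    std::map<std::string, void*>::iterator it = objects_.find(name);
    return it == objects_.end() ? 0 : static_cast<T*>(it->second);
  }

private:
  std::map<std::string, void*> objects_;
};

// Segment is a template parameter, so "segment.find" is a dependent name.
template <typename Segment, typename T>
T* lookup(Segment& segment, const std::string& id) {
  // Without "template", "find < T" would be parsed as a comparison.
  return segment.template find<T>(id.c_str());
}

inline int lookupDemo() {
  MockSegment segment;
  int answer = 42;
  segment.add("answer", &answer);
  int* found = lookup<MockSegment, int>(segment, "answer");
  assert(found != 0);
  return *found;
}
```

Removing the template keyword from lookup is expected to produce a parse error of the same kind as the one quoted above.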

Test 4: Templatized Factory of a Shared vector of POD (part 2)

Main: allocator4/forked.cc

Description: Instead of using the "template typedef" trick, we create a subclass of the corresponding container. The Vector is now defined as:
template <typename T> class Allocator2: public ip::allocator<T, ip::managed_shared_memory::segment_manager>  {
public:
  Allocator2(ip::managed_shared_memory::segment_manager* _seg):ip::allocator<T, ip::managed_shared_memory::segment_manager>(_seg) {;}
};
template <typename T> class Vector2: public ip::vector<T, Allocator2<T> > {
public:
  Vector2(Allocator2<T>& _all): ip::vector<T, Allocator2<T> >(_all) {;}
};
This means that the client code can drop the Type identifier:
Allocator2<int> a_int= f.getAllocator<int>();
Vector2<int>* v= f.getSegment().construct<Vector2<int> >("v")(a_int);
This is a better syntax.

Comments: It remains to be checked whether the subclass "trick" has a performance penalty compared to the previous approach.
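
One quick way to start answering this question is to check that the wrapper adds no per-object storage. A derived class with no extra data members and no virtual functions normally has the same size as its base, and a forwarding constructor is easy for the compiler to inline, so no penalty is expected, although only a measurement can confirm it. The sketch below assumes std::vector and std::allocator as stand-ins for the Boost.Interprocess types. wrapperAddsNoStorage and wrapperDemo are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for the Vector2 of this test, built on std::vector instead of
// ip::vector (assumption of this sketch).
template <typename T>
class Vector2 : public std::vector<T> {
public:
  explicit Vector2(const std::allocator<T>& alloc = std::allocator<T>())
      : std::vector<T>(alloc) {}
};

// True when the wrapper adds no per-object storage on this compiler.
inline bool wrapperAddsNoStorage() {
  return sizeof(Vector2<int>) == sizeof(std::vector<int>);
}

// The wrapper is used exactly like the container it derives from.
inline std::size_t wrapperDemo() {
  Vector2<int> v;
  v.push_back(1);
  v.push_back(2);
  assert(v[1] == 2);
  return v.size();
}
```

One caveat of the subclass approach: standard containers have no virtual destructor, so a Vector2 should not be destroyed through a pointer to its base class.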

Test 5: Shared Vector of Shared Vectors

Main: allocator5/forked.cc

Description: A forked process creates a shared vector of objects that contain vectors of PODs. We want to embed shared vectors in objects that are themselves contained in shared vectors (i.e. shared vectors within shared vectors). A "process-wide Resource Acquisition Is Initialization" (RAII) semantics has been chosen for the creation and destruction of the shared objects. This semantics simplifies the client code: the destruction of the shared objects and of the shared memory segments is now implicit. The client code looks like:

f.createContainer<Vector2<PODWithVector> >("v");
Vector2<PODWithVector>& v = f.get<Vector2<PODWithVector> >("v");

f.create<PODClass>("pod");
PODClass& pod = f.get<PODClass>("pod");

Comments:

  1. A concrete semantics for creating and destroying the shared objects is needed. In the current implementation, the process and code block that creates an object also owns it. Therefore, the shared objects are locally scoped. This simplifies the memory management and allows a further reduction of the client code.
  2. The creation and destruction of the shared memory segment follows the same semantics: the creator of the memory segment always also destroys it.
  3. The class PODWithVector uses a "trick" to create the shared memory identifiers automatically: the identifiers are built from 4 random integers. Strings or char* cannot be used because these objects would in turn need identifiers of their own.
  4. Understanding the code requires specialized knowledge of templates.
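
The ownership rule in comments 1 and 2 (the scope that creates a resource also destroys it) can be sketched as a small RAII guard. This is an illustration under assumptions, not the Factory code: NamedRegistry is a hypothetical in-process stand-in for the system-wide namespace of shared segments, and ScopedSegment is an invented name. Real code would call the Boost.Interprocess creation and removal functions where the registry is used here.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the system-wide namespace of shared segments.
class NamedRegistry {
public:
  void create(const std::string& name) {
    if (!names_.insert(name).second)
      throw std::runtime_error("segment already exists: " + name);
  }
  void remove(const std::string& name) { names_.erase(name); }
  bool exists(const std::string& name) const { return names_.count(name) != 0; }

private:
  std::set<std::string> names_;
};

// RAII owner: the scope that creates the segment also destroys it.
class ScopedSegment {
public:
  ScopedSegment(NamedRegistry& registry, const std::string& name)
      : registry_(registry), name_(name) {
    registry_.create(name_);  // throws if the name is already taken
  }
  ~ScopedSegment() { registry_.remove(name_); }

private:
  ScopedSegment(const ScopedSegment&);             // non-copyable
  ScopedSegment& operator=(const ScopedSegment&);  // (C++03 style)
  NamedRegistry& registry_;
  std::string name_;
};

inline bool segmentIsReleasedAtScopeExit() {
  NamedRegistry registry;
  {
    ScopedSegment segment(registry, "demo");
    assert(registry.exists("demo"));
  }  // destructor runs here and removes "demo"
  return !registry.exists("demo");
}

inline bool duplicateCreationThrows() {
  NamedRegistry registry;
  ScopedSegment first(registry, "demo");
  try {
    ScopedSegment second(registry, "demo");
  } catch (const std::runtime_error&) {
    return registry.exists("demo");  // the first owner is unaffected
  }
  return false;
}
```

A guard of this kind only helps on normal scope exit or when an exception unwinds the stack. If the owning process is killed, no destructor runs, so the leftover segment described in Test 1 still has to be handled separately.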

Test 6: Shared Vectors with Virtual Class Objects (work in progress)

Description: The objective is to find a design of the Factory class that allows the creation of multiple

Test 7: Shared Vectors that Contain Objects with Pointers (work in progress)

Description: The objective is to find a design of the Factory class that allows the creation of multiple

Conclusion (work in progress)

  • Using shared memory implies intrusion into the client code. This is unavoidable (see the shared memory test examples).
  • The shared memory segment cannot grow, so the required size has to be estimated in advance.
  • Production code is based on templates. This makes the code difficult to design and implement, and compile-time errors in the client code are difficult to understand (as is also the case with the STL).
  • The solution based on "interprocess Resource Acquisition Is Initialization" seems simple to understand and avoids errors. Whether this semantics copes with all the use cases remains an open question.

ToDo List (work in progress)

  • Hide the interprocess synchronization mechanism used to create and destroy shared objects
  • Code review
  • Ensure exception and thread safety
  • Performance measurements: STL versus shared STL
  • Create a test suite to check for fault-free operation and correct fault handling

References

  • Stroustrup, B., The C++ Programming Language, Chapter 19.4, Allocators.
  • Austern, M., "The Standard Librarian: What Are Allocators Good For?", Dr. Dobb's.
  • Meyers, S., Effective C++, Memory Management, Items 5-10.
  • Meyers, S., More Effective C++, Item 8: Understand the different meanings of new and delete.
  • Allocators, Containers and Memory Allocation Algorithms, http://www.boost.org/doc/libs/1_35_0/doc/html/interprocess/allocators_containers.html
  • Failed Sourceforge "alloctor" project.
  • Sutter, H., "The New C++: Typedef Templates", Dr. Dobb's, Dec 2002.
  • Vandevoorde, D. and Josuttis, N. M., C++ Templates: The Complete Guide, Addison-Wesley Professional, 2002.

-- MarcMagransDeAbril - 04 Jul 2008

Topic attachments
  • allocator20080702.zip (r1, 2706.5 K, 2008-07-02 17:51, MarcMagransDeAbril): STL allocator examples and smaps scripts. Requires Boost 1.35.
Topic revision: r8 - 2010-11-11 - MatthiasStein
 