Ganga coding sprint July 2015

On the 7th of July 2015 we will meet at Imperial College for a Ganga coding sprint. The main topics of the day are:

  • Refactoring the 'prepare job' logic to make it less tangled with the rest of the code
  • Look at how Alex's rewrites of parts of the system can inform a change in design
  • Reduce and simplify the dependencies between modules
All this work is leading towards a simpler, more modern codebase that we can start making compatible with Python 3. A simplified testing framework, in which we can write proper unit tests without having to import the whole of Ganga, will help too.

Location: CMS center in Blackett Building (544 on the 5th floor left as you come out of the lift and on the left again).

Prepare job

Some queries about the prepared state repository structure/implementation

A lot of very expensive operations are performed on shutdown when the user likely just wants to close a session and go away. Can we consider moving prepared state repository sanity checks to a background service on the queues system?
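As a rough sketch of what moving the checks off the shutdown path could look like, here is a plain worker thread standing in for the queues system. Everything in it (check_prepared_entry, BackgroundChecker, the dict layout of an entry) is made up for illustration and is not existing Ganga code.

```python
import queue
import threading


def check_prepared_entry(entry):
    """Hypothetical sanity check: an entry is sane if it names a shared directory."""
    return bool(entry.get("sharedir"))


class BackgroundChecker:
    """Runs checks on a worker thread so that shutdown does not wait for them."""

    def __init__(self):
        self._tasks = queue.Queue()
        self.bad_entries = []
        # A daemon thread does not block interpreter exit.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, entry):
        self._tasks.put(entry)

    def _run(self):
        while True:
            entry = self._tasks.get()
            try:
                if not check_prepared_entry(entry):
                    self.bad_entries.append(entry)
            finally:
                self._tasks.task_done()

    def wait(self):
        """Block until every submitted entry has been checked."""
        self._tasks.join()


checker = BackgroundChecker()
checker.submit({"id": 1, "sharedir": "conf-abc"})
checker.submit({"id": 2, "sharedir": ""})
checker.wait()
```

With something like this, shutdown would only call wait() if the user explicitly asks for a full check.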

The prepared state repository appears to store all of its data in element 0 (or, in some cases of a corrupt repo, in a few entries). This entry is locked while it is being modified, but it may contain data from multiple jobs. This is less than ideal from the perspective of queues.

The unit test for this is currently not fit for purpose. It's difficult to decode and fails very early on, at assert 3 of about 50 asserts that are not split into functions. This unit test will have to be completely reworked for the Jenkins build system and will require some effort to work out the expected behaviour of the prepared state repo. This is either ideally suited for an expert or a good training exercise for someone wanting to learn how this works (any PhD students looking for work @Imperial/Birmingham?)
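To illustrate the shape a reworked test could take, here is a sketch with one small, named test per behaviour instead of a long run of bare asserts. FakePreparedRepo and its increase/decrease/count methods are hypothetical stand-ins; the real expected behaviour of the prepared state repo still has to be worked out.

```python
import io
import unittest


class FakePreparedRepo:
    """Hypothetical stand-in: counts references to shared directories."""

    def __init__(self):
        self._counts = {}

    def increase(self, name):
        self._counts[name] = self._counts.get(name, 0) + 1

    def decrease(self, name):
        if self._counts.get(name, 0) == 0:
            raise KeyError(name)
        self._counts[name] -= 1

    def count(self, name):
        return self._counts.get(name, 0)


class TestReferenceCounting(unittest.TestCase):
    def setUp(self):
        # A fresh repository per test, so one failure cannot hide the others.
        self.repo = FakePreparedRepo()

    def test_unknown_directory_has_no_references(self):
        self.assertEqual(self.repo.count("conf-a"), 0)

    def test_increase_then_decrease_returns_to_zero(self):
        self.repo.increase("conf-a")
        self.repo.decrease("conf-a")
        self.assertEqual(self.repo.count("conf-a"), 0)

    def test_decrease_without_reference_raises(self):
        with self.assertRaises(KeyError):
            self.repo.decrease("conf-a")


def run_sketch_tests():
    """Run the tests quietly and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestReferenceCounting)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Each failure then names the behaviour that broke, and the standard unittest runner reports the tests individually.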

Places the code is used

Can all of this code simply be broken out into a supporting library (or 2, or 3?) which allows for a clearer understanding of what actions are being performed, rather than having all of the logic embedded across the core?

Schema.py

Item has a preparable attribute which is available on all schema items.

IPrepareApp.py

This is where most of the logic should be moved to I think.

It assumes that all its subclasses have an is_prepared schema property. The property is supposed to only be None or boolean but there are lots of places which set it to a ShareDir. This at least needs sorting out!
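One possible way of sorting this out, sketched under the assumption that the flag and the shared directory are kept as two separate attributes. PreparedState and share_dir are hypothetical names, and the ShareDir class below is only a stand-in for the real one.

```python
class ShareDir:
    """Stand-in for Ganga's ShareDir; only the name matters here."""

    def __init__(self, name):
        self.name = name


class PreparedState:
    """Hypothetical holder keeping the flag and the directory apart."""

    def __init__(self):
        self._is_prepared = None
        self.share_dir = None  # a ShareDir would live here, not in is_prepared

    @property
    def is_prepared(self):
        return self._is_prepared

    @is_prepared.setter
    def is_prepared(self, value):
        # Reject anything that is not None, True or False.
        if value is not None and not isinstance(value, bool):
            raise TypeError(
                "is_prepared must be None or a bool, got %s" % type(value).__name__
            )
        self._is_prepared = value
```

Assigning a ShareDir to is_prepared then fails immediately at the point of the mistake, which would make the existing offenders easy to find.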

There should also be a hash property on all subclasses but that seems like it should be only kept on the base class?

Objects.py

copyFrom a Node updates the application's shared registry counter. This will only work if it's an IPrepareApp

printPrepTree should move to IPrepareApp

Descriptor.__set__ passes on the preparable property if it's setting a preparable sequence

Proxy.py

Loads the Preparable config section and <gangadir>/shared/<user> location

ProxyDataDescriptor.__set__ has lots of special logic, both for when the item is preparable and for when the is_prepared item is being set. It should all be moved out if possible, but a rewrite of the schema system would help with that.

GPIProxyClassFactory._copy has lots more special logic for handling prepared and preparable objects

IRuntimeHandler.py

The runtime handler is the interface between the application and the backend. It provides prepare and master_prepare methods which pack the application information into a standard form that the backend can understand.

master_prepare runs once per job but prepare will run on each subjob

The prepare methods can access the is_prepared property to decide whether something different needs doing.
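The call pattern described above can be sketched as follows. The method signatures are deliberately simplified and are not the real IRuntimeHandler ones; SketchHandler, prepare_all and the config dict keys are invented for the example.

```python
class SketchHandler:
    """Records the order of calls; not a real IRuntimeHandler."""

    def __init__(self):
        self.calls = []

    def master_prepare(self, app):
        self.calls.append("master_prepare")
        # Job-wide work: assume a prepared application already has its
        # inputs in a shared directory, so nothing extra needs shipping.
        return {"ship_inputs": not app["is_prepared"]}

    def prepare(self, app, subjob_id, master_config):
        self.calls.append("prepare:%d" % subjob_id)
        config = dict(master_config)
        config["subjob"] = subjob_id
        return config


def prepare_all(handler, app, n_subjobs):
    """Call master_prepare once, then prepare once per subjob."""
    master_config = handler.master_prepare(app)
    return [handler.prepare(app, i, master_config) for i in range(n_subjobs)]


handler = SketchHandler()
configs = prepare_all(handler, {"is_prepared": True}, n_subjobs=3)
```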

Executable.py

As an example of an IPrepareApp and IRuntimeHandler...

To quote the docstring for Executable.prepare():

A method to place the Executable application into a prepared state.

The application wil have a Shared Directory object created for it. 
If the application's 'exe' attribute references a File() object or
is a string equivalent to the absolute path of a file, the file 
will be copied into the Shared Directory.

Otherwise, it is assumed that the 'exe' attribute is referencing a 
file available in the user's path (as per the default "echo Hello World"
example). In this case, a wrapper script which calls this same command 
is created and placed into the Shared Directory.

When the application is submitted for execution, it is the contents of the
Shared Directory that are shipped to the execution backend. 

The Shared Directory contents can be queried with 
shareref.ls('directory_name')

See help(shareref) for further information.

So most of the logic for actually preparing the shared directory is in this method.
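The two branches the docstring describes can be sketched roughly like this. It is not the code in Executable.prepare(): prepare_exe and the wrapper.sh file name are made up, only string values of 'exe' are handled (no File() objects), and the shared directory is just a path passed in by the caller.

```python
import os
import shutil
import stat


def prepare_exe(exe, share_dir):
    """Place 'exe' into share_dir and return the path of the placed file."""
    os.makedirs(share_dir, exist_ok=True)
    if os.path.isabs(exe) and os.path.isfile(exe):
        # Absolute path to a real file: copy it into the shared directory.
        target = os.path.join(share_dir, os.path.basename(exe))
        shutil.copy(exe, target)
    else:
        # Otherwise assume a command on the user's path and wrap it.
        target = os.path.join(share_dir, "wrapper.sh")
        with open(target, "w") as handle:
            handle.write('#!/bin/sh\nexec %s "$@"\n' % exe)
        os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)
    return target
```

Written as a free function like this, the logic has no dependency on Job or the schema, which is the kind of separation the refactoring is after.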

IBackend.py

master_prepare seems to mostly deal with a different meaning of the term 'prepare' but it does have special logic for dealing with prepared applications.

Job.py

There's lots of stuff in Job for handling preparable applications. I feel that since it's really only an application's business whether it's using a shared directory for its input sandbox, Job shouldn't care. Trying to find a way to make IPrepareApp automatically do the right thing at the right time would be much more beneficial and would avoid too much spaghetti.

GangaList.py

GangaList has some special logic for deciding whether it's readonly depending on whether its parent is an is_prepared application.
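A toy version of that rule, to make the coupling visible. SketchApp, SketchList and ReadOnlyError are invented names, and the sketch assumes that 'prepared' means is_prepared is neither None nor False, which may not match what GangaList really checks.

```python
class SketchApp:
    """Minimal parent object carrying only the is_prepared flag."""

    def __init__(self):
        self.is_prepared = None


class ReadOnlyError(Exception):
    pass


class SketchList:
    """List wrapper that refuses changes while its parent is prepared."""

    def __init__(self, parent, items=()):
        self._parent = parent
        self._items = list(items)

    def is_readonly(self):
        # Assumption: prepared means is_prepared is neither None nor False.
        state = getattr(self._parent, "is_prepared", None)
        return state is not None and state is not False

    def append(self, item):
        if self.is_readonly():
            raise ReadOnlyError("parent application is prepared")
        self._items.append(item)

    def __len__(self):
        return len(self._items)
```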

Rewritten modules

Alex mentioned some of his plans/code in a meeting and there were two main things that stood out:

  • A different repository system using a database like SQLite for improved concurrency etc.
  • Rewritten Schema (and Proxy?) stuff to make it more Pythonic.
The latter will be particularly worth bearing in mind while making these changes.

For any more information you will have to ask Alex.
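To give a feel for the SQLite idea only, here is a minimal sketch using the standard library sqlite3 module. It is not Alex's design: SqliteRepo, the single 'jobs' table and storing each job as a JSON blob are all assumptions made for the example.

```python
import json
import sqlite3


class SqliteRepo:
    """Hypothetical repository storing one JSON document per job id."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(id INTEGER PRIMARY KEY, data TEXT NOT NULL)"
        )

    def save(self, job_id, data):
        # The connection as a context manager commits on success and
        # rolls back if the statement raises.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO jobs (id, data) VALUES (?, ?)",
                (job_id, json.dumps(data)),
            )

    def load(self, job_id):
        row = self.conn.execute(
            "SELECT data FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return None if row is None else json.loads(row[0])
```

Each job gets its own row here, in contrast to keeping everything in a single element.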

Repository Locking

Ganga is currently leaking a lot of repo locks. I spotted this after doing some work with the AFS 'lock-file': multiple job submission would no longer work because lock files were bleeding over from the prepared state.
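One common way to stop locks leaking is to tie the lock file's lifetime to a context manager, so it is removed even when the code holding it raises. The sketch below is generic Python and not Ganga's locking code; repo_lock is a made-up name, and it does not address whether exclusive file creation behaves reliably on AFS.

```python
import os
from contextlib import contextmanager


@contextmanager
def repo_lock(lock_path):
    """Hold an exclusive lock file for the duration of a with-block."""
    # O_EXCL makes creation fail with FileExistsError if the lock is held.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())
        yield lock_path
    finally:
        # Runs on normal exit and on exceptions, so the lock cannot leak.
        os.close(fd)
        os.remove(lock_path)
```

A hard kill of the process would still leave the file behind, so some stale-lock cleanup would be needed on top of this.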

Keeping track in JIRA

There's a meta-issue I have opened at GANGA-1989. Any issues we open which are related should be marked as 'blocking' GANGA-1989 (via 'More'→'Link' on an issue page).

Should we close existing issues beyond a certain age? - rcurrie

I did a trawl through the old open issues to close what I could but there were too many that I had no knowledge or understanding of. It would need more experienced people to have a look. -- Matt W

New Testing System

What do we want and what do we need?

What can Jenkins deliver?

What do we need to do to work with Jenkins?

Are we losing any functionality for the gains we get from this?

Stability

Now that prep.metadata is no longer plaguing our startup/shutdown/unittests is there anything to be learned from this?

Shutdown: potentially fragile but working. Does this need to be further formalised, or are we happy with our working solution?

Startup: this is a loaded issue, but are there issues which lead to instability during startup?

MetaPlugin

Is it possible to make a simple meta-plugin or meta-application which given a recipe will construct a templated Application and Job structure? (Maybe even a task system?)

This would allow us to significantly reduce the barrier of entry for future projects to join/use Ganga.
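Purely as a thought experiment, a recipe could be a small dict that gets expanded into plain job descriptions. Nothing below exists in Ganga: build_jobs, the recipe keys and the output layout are all invented, and the result is ordinary dicts rather than real Job objects.

```python
from string import Template


def build_jobs(recipe):
    """Expand a recipe dict into one plain job description per value."""
    jobs = []
    for value in recipe["values"]:
        # Fill the $value placeholder in each templated argument.
        args = [Template(arg).substitute(value=value) for arg in recipe["args"]]
        jobs.append({
            "application": {
                "type": recipe["application"],
                "exe": recipe["exe"],
                "args": args,
            },
            "backend": {"type": recipe.get("backend", "Local")},
        })
    return jobs


recipe = {
    "application": "Executable",
    "exe": "echo",
    "args": ["run", "$value"],
    "values": [1, 2],
}
jobs = build_jobs(recipe)
```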

Code Quality

Shall we move to trying to keep trunk stable, and do we want to introduce any policies for improving code readability and possibly performance?

Performance and Profiling

Some feedback on Profiling Ganga (rcurrie). The good, the bad and the ugly. Some strange effects can be seen from 'trivial' changes.
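For anyone wanting to reproduce this kind of measurement, a minimal helper around the standard library cProfile and pstats modules might look like the following. It is not the setup used for the feedback above; profile_call and slow_sum are placeholder names.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Profile a single call and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


def slow_sum(n):
    """Deliberately naive loop to have something to profile."""
    total = 0
    for i in range(n):
        total += i
    return total
```

Comparing reports before and after a 'trivial' change is one way to pin down the strange effects mentioned.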

Externals

We're looking toward Python 2.7/3.x and we're still shipping with IPython 0.16 (if memory serves). Is this a good combination?

I think it's version 0.6.13 (at least that's the LHCb version), which is exactly 10 years old. It's probably worth updating as it certainly won't support Python 3, and probably none of the external dependencies will either. In general I would like to investigate replacing the whole install/dependency system with something more modern using pip and maybe virtualenv. --Matt W

Just a warning: having dealt with installing/debugging externals from some other projects which require compiling, I think we should never share source code as the default install method for users; instead, we compile it against SLC6 and push the binaries out. - rcurrie

Luckily we have no binary dependencies, only pure Python (as far as I can tell). The only exception in /external/ is pycrypto (which isn't even used from what I can see). Even if we did need binaries, we could build wheels (PEP 427). You're right about being careful though. --Matt W

Topic revision: r7 - 2015-07-06 - RobCurrie
 