Ganga coding sprint July 2015
On the 7th of July 2015 we will meet at Imperial College for a Ganga coding sprint. The main topics of the day are:
- Refactoring the 'prepare job' logic to make it less tangled with the rest of the code
- Look at how Alex's rewrites of parts of the system can inform a change in design
- Reduce and simplify the dependencies between modules
All this work is leading towards a simpler, more modern codebase which we can start to make compatible with Python 3. A simplified testing framework, in which we can write proper unit tests without having to import the whole Ganga process, will help too.
Location: CMS center in Blackett Building (544 on the 5th floor: turn left as you come out of the lift, then left again).
Prepare job
Some queries about the prepared state repository structure/implementation
A lot of very expensive operations are performed on shutdown when the user likely just wants to close a session and go away.
Can we consider moving prepared state repository sanity checks to a background service on the queues system?
It currently appears as a repository which stores all its data in element 0 or, in some cases of a corrupt repo, in a few entries.
This entry is locked when being modified but contains data possibly from multiple jobs. This is less than ideal from the perspective of queues.
The unittest for this is currently not fit for purpose. It's difficult to decode and fails very early on, at assert 3 of about 50, with none of the asserts grouped into functions.
This unittest will have to be completely reworked for the Jenkins build system and will require some effort to work out the expected behaviour of the prepared state repo.
This is either ideally suited for an expert or a good training exercise for someone wanting to learn how this works (any PhD students looking for work @Imperial/Birmingham?)
Places the code is used
Can all of this code simply be broken out into a supporting library (or 2, or 3?) which allows for a clearer understanding of what actions are being performed, rather than having all of the logic embedded across the core?
Schema.py
Item has a preparable attribute which is available on all schema items.
This is where most of the logic should be moved to I think.
It assumes that all its subclasses have an is_prepared schema property. The property is supposed to only be None or boolean but there are lots of places which set it to a ShareDir. This at least needs sorting out!
There should also be a hash property on all subclasses but that seems like it should be only kept on the base class?
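One way to sort out the is_prepared type confusion would be to validate the value at assignment time. The following is a minimal sketch under assumptions, not Ganga's actual code: the class name PreparableSketch and the separate shared_dir attribute are hypothetical, and the real schema machinery is not modelled.

```python
class PreparableSketch(object):
    """Hypothetical stand-in for a preparable application (not a Ganga class)."""

    def __init__(self):
        self._is_prepared = None  # None means "never prepared"
        self.shared_dir = None    # assumed separate home for a ShareDir reference

    @property
    def is_prepared(self):
        return self._is_prepared

    @is_prepared.setter
    def is_prepared(self, value):
        # Accept only None or a real bool; anything else (e.g. a ShareDir
        # object or a path string) is rejected instead of silently stored.
        if value is not None and not isinstance(value, bool):
            raise TypeError(
                "is_prepared must be None or bool, got %s" % type(value).__name__
            )
        self._is_prepared = value
```

With a check like this, the places which currently assign a ShareDir would fail loudly and could be found one by one.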
Objects.py
copyFrom a Node updates the application's shared registry counter. This will only work if it's an IPrepareApp
printPrepTree should move to IPrepareApp
Descriptor.__set__ passes on the preparable property if it's setting a preparable sequence
Proxy.py
Loads the Preparable config section and <gangadir>/shared/<user> location
ProxyDataDescriptor.__set__ has lots of special logic for both if the item is preparable and if setting the is_prepared item. Should all be moved out if possible but would help to have a rewrite of the schema system.
GPIProxyClassFactory._copy has lots more special logic for handling prepared and preparable objects
The runtime handler is the interface between the application and the backend. It provides prepare and master_prepare methods which pack the application information into a standard form that the backend can understand.
master_prepare runs once per job but prepare will run on each subjob.
The prepare methods can access the is_prepared property to decide whether something different needs doing.
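The once-per-job versus once-per-subjob call pattern can be sketched as follows. This is an illustration only: RuntimeHandlerSketch, build_subjob_configs, the dictionary standing in for the application and the "shared_dir" key are all hypothetical and do not reflect the real IRuntimeHandler signatures.

```python
class RuntimeHandlerSketch(object):
    """Hypothetical handler; only the call pattern mirrors the description."""

    def master_prepare(self, app):
        # Runs once per job. If the application is already prepared, ship
        # its shared directory; otherwise ship the executable itself.
        if app.get("is_prepared"):
            shared = [app["shared_dir"]]
        else:
            shared = [app["exe"]]
        return {"shared_inputs": shared}

    def prepare(self, app, subjob_args, master_config):
        # Runs once per subjob: combine the shared config with per-subjob data.
        return {
            "exe": app["exe"],
            "args": subjob_args,
            "inputs": master_config["shared_inputs"],
        }


def build_subjob_configs(handler, app, subjob_arg_lists):
    master_config = handler.master_prepare(app)  # once per job
    return [
        handler.prepare(app, args, master_config)  # once per subjob
        for args in subjob_arg_lists
    ]
```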
Executable.py
As an example of an IPrepareApp and IRuntimeHandler...
To quote the docstring for Executable.prepare():
A method to place the Executable application into a prepared state.
The application wil have a Shared Directory object created for it.
If the application's 'exe' attribute references a File() object or
is a string equivalent to the absolute path of a file, the file
will be copied into the Shared Directory.
Otherwise, it is assumed that the 'exe' attribute is referencing a
file available in the user's path (as per the default "echo Hello World"
example). In this case, a wrapper script which calls this same command
is created and placed into the Shared Directory.
When the application is submitted for execution, it is the contents of the
Shared Directory that are shipped to the execution backend.
The Shared Directory contents can be queried with
shareref.ls('directory_name')
See help(shareref) for further information.
So most of the logic for actually preparing the shared directory is in this method.
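The two branches described in the docstring can be sketched with the standard library alone. This is a rough illustration under assumptions, not the code in Executable.py: prepare_exe_sketch, the "shared_" directory prefix and the wrapper.sh file name are hypothetical, and the File() case and the shared directory registry are left out.

```python
import os
import shutil
import stat
import tempfile


def prepare_exe_sketch(exe, shared_root=None):
    """Create a fresh shared directory holding exe, or a wrapper that calls it."""
    shared_dir = tempfile.mkdtemp(prefix="shared_", dir=shared_root)
    if os.path.isabs(exe) and os.path.isfile(exe):
        # exe is the absolute path of a file: copy it into the shared directory.
        shutil.copy(exe, shared_dir)
    else:
        # exe is assumed to be a command on the user's path: write a small
        # wrapper script which calls that same command.
        wrapper = os.path.join(shared_dir, "wrapper.sh")
        with open(wrapper, "w") as handle:
            handle.write('#!/bin/sh\nexec %s "$@"\n' % exe)
        os.chmod(wrapper, os.stat(wrapper).st_mode | stat.S_IXUSR)
    return shared_dir
```

Returning the directory path keeps the function free of side effects on the application object, which is the kind of separation the notes above are asking for.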
IBackend.py
master_prepare seems to be mostly dealing with a different meaning of the term 'prepare' but it does have special logic for dealing with prepared applications.
Job.py
There's lots of stuff in Job for handling preparable applications. I feel that since it's really only an application's business whether it's using a shared directory for its input sandbox then Job shouldn't care. Trying to find a way to make IPrepareApp automatically do the right thing at the right time would be much more beneficial and would avoid too much spaghetti.
GangaList has some special logic for deciding whether it's readonly depending on whether its parent is an is_prepared application.
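The read-only behaviour can be illustrated with a small sketch. All names here (AppSketch, ListSketch, ReadOnlyError) are hypothetical, and treating any truthy is_prepared value as "prepared" is an assumption rather than the rule GangaList actually applies.

```python
class ReadOnlyError(Exception):
    """Raised when a list owned by a prepared application is modified."""


class AppSketch(object):
    """Hypothetical application with the tri-state is_prepared flag."""

    def __init__(self):
        self.is_prepared = None


class ListSketch(object):
    """Hypothetical list which asks its parent whether it may be changed."""

    def __init__(self, parent, items=()):
        self._parent = parent
        self._items = list(items)

    def _readonly(self):
        # Assumption: None and False both mean "not prepared".
        return bool(getattr(self._parent, "is_prepared", None))

    def append(self, item):
        if self._readonly():
            raise ReadOnlyError("parent application is prepared")
        self._items.append(item)
```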
Rewritten modules
Alex mentioned some of his plans/code in a meeting and there were two main things that stood out:
- A different repository system using a database like SQLite for improved concurrency etc.
- Rewritten Schema (and Proxy?) stuff to make it more Pythonic.
The latter will be particularly worth bearing in mind while making these changes.
For any more information you will have to ask Alex
Repository Locking
Ganga is currently leaking a lot of repo locks. I spotted this after doing some work with the afs 'lock-file': multiple job submission would no longer work because lock files were bleeding over from the prepared state.
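One common way to stop lock files leaking is to tie their lifetime to a context manager so the release always runs. The sketch below is a generic illustration on a local filesystem, not Ganga's locking code: repo_lock_sketch and the repo.lock name are hypothetical, and no claim is made about how exclusive creation behaves on AFS.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def repo_lock_sketch(lock_path):
    # O_CREAT | O_EXCL makes the call fail if the lock file already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield lock_path
    finally:
        # Always release, even if the body raised, so no lock file is left behind.
        os.close(fd)
        os.remove(lock_path)


# Usage: the lock exists only for the duration of the with block.
lock = os.path.join(tempfile.mkdtemp(), "repo.lock")
with repo_lock_sketch(lock):
    pass  # work on the repository here
```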
Keeping track in JIRA
There's a meta-issue I have opened at GANGA-1989. Any issues we open which are related should be marked as 'blocking' GANGA-1989 (via 'More'→'Link' on an issue page).
Should we close existing issues beyond a certain age? - rcurrie
I did a trawl through the old open issues to close what I could but there were too many I had no knowledge or understanding of. It would need more experienced people to have a look. -- Matt W
New Testing System
What do we want and what do we need?
What can Jenkins deliver?
What do we need to do to work with Jenkins?
Are we losing any functionality for the gains we get from this?
Stability
Now that prep.metadata is no longer plaguing our startup/shutdown/unittests is there anything to be learned from this?
Shutdown is potentially fragile but working. Does this need to be further formalised, or are we happy with our working solution?
Startup is a loaded issue, but are there issues which lead to instability during startup?
Is it possible to make a simple meta-plugin or meta-application which given a recipe will construct a templated Application and Job structure? (Maybe even a task system?)
This would allow us to significantly reduce the barrier of entry for future projects to join/use Ganga.
Code Quality
Shall we move to trying to keep trunk stable, and do we want to introduce any policies for improving the code readability and possibly performance?
Performance and Profiling
Some feedback on Profiling Ganga (rcurrie). The good, the bad and the ugly.
Some strange effects can be seen from 'trivial' changes.
Externals
We're looking toward Python 2.7/3.x and we're still shipping with IPython 0.16 (if memory serves). Is this a good combination?
I think it's version 0.6.13 (at least that's the LHCb version) which is exactly 10 years old. It's probably worth updating it as it certainly won't support Python 3. Neither will any of the external dependencies probably.
In general I would like to investigate replacing the whole install/dependency system with something more modern using pip and maybe virtualenv. --Matt W
Just a warning: having dealt with installing/debugging externals from some other projects which require compiling, I think we should never share source code as the default install method for users; we compile it against slc6 and push the binaries out. - rcurrie
Luckily we have no binary dependencies, only pure Python (as far as I can tell). The only exception in /external/ is pycrypto (which isn't even used from what I can see). Even if we did need binaries, we could build wheels (PEP 427). You're right about being careful though. --Matt W