How to proceed:

  • Controversial: ValidDuration and Identity
    • collect use cases until Jan 26
    • make a decision
  • Deadline draft: Jan 31

Comments

ValidDuration controversial

We discuss whether to include or exclude the property ValidDuration:
  • ValidDuration - duration indicating how long the measurement remains valid, counted from its measurement time

assumptions

  • StAR record is a representation of the storage resource consumption; within each record are described:
    • a storage resource description
    • the identity of consumers
    • the amount of resource consumed.
  • StAR records are created by the Storage Provider intermittently.
  • To create a StAR, the Storage Provider performs a measurement; the time of that measurement is recorded as MeasureTime.
    • MeasureTime - timestamp indicating when measurement of the resource consumption was made

some notes

  • ValidDuration is a property that is not related to the measurement itself; rather, it is metadata used only when the StAR is processed.
  • A key comment comes from John Kennedy: "Who would define this?".

Use case 1: ValidDuration is determined by the Storage Provider

The ValidDuration value is determined in some way on the side of the service making the measurement (storage provider). It is in the interest of the storage provider to make ValidDuration as long as possible so that, in case of any failure (*), the accounting will continue. There are some possibilities:

Use case 2: ValidDuration is determined by the Accounting Service

ValidDuration is determined in some way on the side of the service that draws up consumption statements for each VO and Storage Resource (accounting service).
  • Q: If ValidDuration is determined by the Accounting Service, why does the Storage Provider have to include it in each StAR?

Use case 3: ValidDuration is agreed between the various VOs and Storage Providers.

The value of ValidDuration is part of the SLA between the various VOs and Storage Providers.

Use case 4: Records can be created frequently or infrequently

Records can be taken frequently, with a period 't', or infrequently, e.g., based on data access or changes in the amount of data. If the records are taken frequently, the storage provider can choose to have a ValidDuration '2*t'. In this case, ValidDuration can be helpful to monitor that the recording process is working. If records are taken infrequently, there is no clear way to define the size of ValidDuration, and it will not help to discover if the recording process is working.
  • ValidDuration does not seem to help for monitoring the accounting process.
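
A minimal sketch of the frequent-recording case, assuming a recording period 't' of one hour and ValidDuration = 2*t. The function name is_record_stale and the timestamps are hypothetical and only illustrate how a consumer of the records could notice that the recording process has stopped:

```python
from datetime import datetime, timedelta

def is_record_stale(measure_time, valid_duration, now):
    """Return True if the record's validity window has already ended."""
    return now > measure_time + valid_duration

# Assumed recording period t = 1 hour, so ValidDuration = 2*t.
t = timedelta(hours=1)
valid_duration = 2 * t
measure_time = datetime(2011, 1, 20, 12, 0, 0)

# One missed recording cycle is tolerated, two are flagged.
on_time = is_record_stale(measure_time, valid_duration, datetime(2011, 1, 20, 13, 30, 0))
overdue = is_record_stale(measure_time, valid_duration, datetime(2011, 1, 20, 14, 30, 0))
```

With infrequent, event-driven records there is no period 't' to derive the duration from, which is the weakness noted above.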

Identity controversial

  • Currently:
User identity:
 LocalUserName
 UserIdentity
Project/VO/VRC identity:
 LocalUserGroup 
 Group
 GroupPartition
 GroupRole
 GroupAuthority
  • Paul's proposal
  <ns:Id scope="local" type="name">fred</ns:Id>
  <ns:Id scope="local" type="uid">51010</ns:Id>
  <ns:Id scope="/C=DE/O=UtopiaGrid/CN=Friendly-CA"
         type="dn">/C=DE/O=UtopiaGrid/OU=Example/CN=Fred Bloggs</ns:Id>

  <ns:Group type='gid' scope='local'>1000</ns:Group>
  <ns:Group type='name' scope='example.org'>atlas</ns:Group>
  <ns:Group type='fqan' scope='lcg-voms.cern.ch'>/atlas</ns:Group>

  <ns:Record>
    <ns:Identity>
      <ns:Group scope="lcg-voms.cern.ch"
                type="fqan">/atlas/higgs/Role=production</ns:Group>
      <ns:PartOf>
        <ns:Group scope="lcg-voms.cern.ch" type="fqan">/atlas/higgs</ns:Group>
        <ns:Group scope="lcg-voms.cern.ch" type="fqan">/atlas</ns:Group>
      </ns:PartOf>
    </ns:Identity>
  </ns:Record>
  • with general attributes
<GroupAttributes>
   <Attribute>
     <AttributeType>role</AttributeType>
     <AttributeValue>production</AttributeValue>
   </Attribute>
   .. more attributes ..
</GroupAttributes>
  • Henrik
   <UserIdentity scope="ndgf.org">htj</UserIdentity>
   <UserGroup scope="ndgf.org">staff</UserGroup>
    • I don't see the value of "type". Either it identifies the user in the scope or not. Also note that the scope is not local or global, but can instead be a system. This could be combined with my suggestion for group attributes, into something fairly simple and useful. The bad news is that we do not have a lot of time, and this should probably be discussed further, so I cannot really suggest it at the current time.

Feedback

  • 1) UniqueID: A unique ID of the resource used (we use a hash of StorageSystem:StorageType:StorageShare). Useful for identifying the unique logical partitions.
    • discussion:
      • agreed that this is redundant; it can also be done by the accounting service independently of StAR - No
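
A sketch of how such an ID could be derived, whether by the storage provider or by the accounting service. The feedback only says "a hash of StorageSystem:StorageType:StorageShare"; the choice of SHA-1, the function name unique_id, and the example values are assumptions for illustration:

```python
import hashlib

def unique_id(storage_system, storage_type, storage_share):
    """Hash the colon-joined triple into a stable hexadecimal ID.

    SHA-1 is an assumption here; the notes do not name a hash function.
    """
    key = ":".join([storage_system, storage_type, storage_share])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()

# Hypothetical example values.
id_a = unique_id("srm.example.org", "disk", "atlas-pool")
id_b = unique_id("srm.example.org", "tape", "atlas-pool")
```

Because the ID depends only on fields that are already in the record, any party holding the record can recompute it, which is the redundancy argument made in the discussion.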

  • 2) Types: You enumerate storage type to only be disk or tape. I think it's conflating two ideas: the class or quality of the storage, and the type of the share. For example, the class can be disk, tape, tape-backed disk, SSD, while the type might be a disk pool, a quota, a space reservation, a namespace directory, etc. - In our system, we had a measurement type (disk/tape/logical) and storage type (directory/quota/SE/pool group/pool/space reservation). So, each record only contained one measurement - if you wanted to keep the raw and logical, it was two separate records.
    • discussion
      • Jon: To me, this looks like the description of StorageMedia and StorageClass. Not sure if the problem is naming of properties or description of the properties.

  • 3)
    • Storage share "size": You do not record the size of the storage share. I realize you explicitly do not want to track available disk space, but you should at least record the size of the share. There is precedent here in the JobUsageRecord - we record the size of the batch slot in terms of # of CPUs (so, you record the size of the resource you consumed, not the size of the entire system). "Size" here can refer to either number of files or number of bytes.
    • - Also, without the totals, storage accounting is not-so-useful; I think this is just a fundamental difference from CPU accounting.
    • - We should note "Available = Total - Used" whenever possible.
    • discussion:
      • Ralph: Might be interesting for the user to see how much space there is available, but I don't see the point from the accounting perspective
      • Jon: Agree
  • 4) Multiple ownership: Unlike in job accounting, it is reasonable for a share to be set aside for multiple VOs. For us, example would be CMS sites which have 95% owned by CMS and set aside 5% for "everyone else". It may not be desirable or possible to measure each VO's use inside of a multi-VO share. I'd suggest some sort of wildcard syntax for user/group to handle these situations. Another example would be a dCache unit group - it can have multiple VOs, and it might be difficult to determine who is using what. You might want to do one measurement for the whole area.
    • discussion: see identity above

  • 5) Overlapping areas: Unlike job accounting, usage shares can have overlapping parent-child relationships. An example here would be dCache unit groups and space reservations - each space reservation will be a subset of a unit group. One needs to understand the hierarchy in order to present a coherent picture of the usage. You can use the UniqueID I suggested in (1) and also record a ParentID field.
    • discussion:
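
A sketch of why the hierarchy matters when totals are presented. The dictionary keys (unique_id, parent_id, used) and the share names are hypothetical, and the sketch assumes that a parent share's usage already includes the usage of its children:

```python
def top_level_usage(records):
    """Sum usage over shares without a parent, so nested shares are not counted twice."""
    return sum(r["used"] for r in records if r["parent_id"] is None)

def children_of(records, parent_id):
    """Return the records whose ParentID points at the given share."""
    return [r for r in records if r["parent_id"] == parent_id]

# Hypothetical records: one unit group containing two space reservations.
records = [
    {"unique_id": "unit-group-A", "parent_id": None, "used": 500},
    {"unique_id": "reservation-1", "parent_id": "unit-group-A", "used": 200},
    {"unique_id": "reservation-2", "parent_id": "unit-group-A", "used": 100},
]

total = top_level_usage(records)
naive_total = sum(r["used"] for r in records)
```

Summing all records without looking at ParentID gives naive_total, which counts the reservations twice.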

  • 6) It is difficult for me to read the document and discern which fields are mandatory and which are optional. Could you summarize this in a table in the document?
    • discussion:
      • Ralph will do it - DONE pushed to git

  • - 1.1 "... there is no fixed start and end time..." I think it's more "... there is no fixed end time... ". Indeed, if there were no record but the file was already present in the storage, the start time is unknown, but by convention it is taken to be the time of measurement.
    • discussion:
      • Jon: This depends on the next paragraph, which needs to be finished.

  • - 1.3 "... Certain fields are aggregates..." maybe, if the definition of the UR allows for a fine level of detail, up to the file, the same field could be used for the space required by the single file. In this case this sentence should be: "... Certain fields might be aggregates..."
    • discussion
      • Jon: I think the fields can be considered aggregates regardless of the number of files the aggregate consists of. To me, the sentence is fine.

  • - 4.1 "... and be used duplicate detection..." should read "... and be used for duplicate detection". RecordIdentity maybe could be a
    • Jon will do it - DONE pushed to git

  • - 4.4 I believe that the example should be written with StorageMedia
    • Jon will do it - DONE pushed to git

  • - 4.13 I was wondering whether it could also be useful to have something like a GroupSubPartition. There are already VOs like CMS that have some distinction between production and analysis. (The problem is: where should we stop? There can be a SubSubPartition and so on.)
    • See identity outcome

  • - 4.14 In this case I think that the role would be the role of the user in terms of privileges. For example it could be: user or administrator.

  • - 4.17 "... Note that a record can be “nullified” if a newer record is manifested, i.e., the most recent information should be used..." I think it should not be completely nullified; it is still valid for the time that doesn't overlap with the newer record. But the problem I see here is: which record should be nullified by the new record? Should the new record be identical to the old one in all attributes except for the disk usage (GroupRole, GroupPartition, Group, UserIdentity, LocalUserGroup, SubjectIdentity, DirectoryPath, FileCount, StorageClass, StorageMedia, ...)? If for any reason it is different, then the same accounting record might be counted twice in the overlapping time.
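
A sketch of the partial nullification suggested in this comment: the older record stays valid up to the MeasureTime of the newer one instead of being dropped entirely. The function name effective_validity and the timestamps are hypothetical, and the sketch assumes both records have already been matched as describing the same consumption:

```python
from datetime import datetime, timedelta

def effective_validity(measure_time, valid_duration, newer_measure_time=None):
    """Return (start, end) of the interval in which a record still counts.

    Without a newer record the interval ends at MeasureTime + ValidDuration.
    With a newer record for the same consumption, the interval is cut at the
    newer record's MeasureTime so that the two never overlap.
    """
    end = measure_time + valid_duration
    if newer_measure_time is not None:
        end = min(end, newer_measure_time)
    return measure_time, end

old_measure = datetime(2011, 1, 20, 12, 0)
old_valid = timedelta(hours=2)
new_measure = datetime(2011, 1, 20, 13, 0)

start, end = effective_validity(old_measure, old_valid, new_measure)
```

How two records are recognised as describing the same consumption is the open question raised above and is not addressed by this sketch.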

  • - 5
    • Site: what do you mean by saying that it is already defined elsewhere? The FQAN might be used for StorageSystem, but it is not even mandatory. If you have a repository that collects records from different sites, how can they be distinguished?
    • - FileNames: I agree that not using per-file accounting at all makes this unnecessary.
    • - Transfer Information: instead of calling it like that it could be AccessInformation. This could be useful. It could create a record (or an aggregate) that shows when a file has been accessed. This can be used to decide which files are necessary, which can be deleted, moved to another type of storage depending on the usage.

  • - 6 This explains my comments on 4.17; in that section it wasn't clear to me. But the question still remains: which record should be nullified by the new record? What happens when you make an aggregation of records? A new record could be created that represents a certain disk consumption, with a validity running from the MeasureTime of the first record to the end of the validity of the last record.
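
A sketch of the aggregation window described in this comment, assuming each record is reduced to a (MeasureTime, ValidDuration) pair. The function name aggregate_window and the sample values are hypothetical:

```python
from datetime import datetime, timedelta

def aggregate_window(records):
    """Return (start, end) for an aggregate built from several records.

    start is the MeasureTime of the earliest record; end is the end of
    validity (MeasureTime + ValidDuration) of the latest record.
    """
    ordered = sorted(records, key=lambda r: r[0])
    first_measure = ordered[0][0]
    last_measure, last_valid = ordered[-1]
    return first_measure, last_measure + last_valid

# Hypothetical hourly records, each valid for two hours.
records = [
    (datetime(2011, 1, 20, 13, 0), timedelta(hours=2)),
    (datetime(2011, 1, 20, 12, 0), timedelta(hours=2)),
    (datetime(2011, 1, 20, 14, 0), timedelta(hours=2)),
]

window_start, window_end = aggregate_window(records)
```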

  • 1) Regarding the difference between compute and storage accounting (section 1.1): I feel the arguments here need a little work/clarification. I think we all feel there's a difference, but the question is how to define it. The arguments you use don't seem to me to show a real difference. Jobs have exclusive access to CPU cores - I would say files have exclusive access to bytes on storage. Storage is shared among several parties, each using part of the resources; I feel this is the same on a batch farm. The farm has many cores which are shared between users. The same goes for the point regarding use: different participants use different CPU/storage at different times. And again with time, is there not a set start and finish time for each use of a unit of storage (file create and file delete)? I think I may be missing the point of some of your arguments, and a bit of clarification would help me. The difference to me seems to lie in the point-in-time measurement which you are defining for storage versus the summed measurement used for CPU. I guess it also depends on what you wish to measure/know. Maybe with storage you just want the snapshot so you can see how much of a set quota of storage people are using. With batch systems etc. the quotas tend to be set regarding an aggregated usage.
    • Ralph - DONE, proposal pushed to git

  • 2) Measurement time vs. CreateTime: Could you say why they could be different? Do you mean the measurement could be made and then, say, n days/hours later the record could be created? Would this make sense?
    • Ralph - DONE, proposal pushed to git, added a sentence to the RecordIdentity section

  • 3) ValidDuration: Who would define this? If it is the people consuming the record, is it actually needed?

  • 4) Regarding one record invalidating another: The two would have to be exactly the same in what they measure. I believe you are aiming at this when you talk about records describing the same consumption?

  • 5) ResourceCapacityUsed and LogicalCapacityUsed: I think one of these MUST be present in the record for it to contain usable info? (You say SHOULD - I didn't read the RFC, so maybe this point is covered.)
    • Ralph: This is right. Should be changed to "MUST"

  • 6) StorageShare: Is this aimed at covering concepts such as space tokens?

  • 7) You mention the fact that it currently isn't clear if the StAR format will be new or re-use the UR. I would say that aiming for a situation where both records could be used together to form a single record for accounting for storage and CPU would be very beneficial. If this is not the case, then we'd run into a situation where people would need to do lots of manual work to account for usage of a project's resources. If it's possible to allow this, I think it would make life much easier for people.

  • 8) In the minimal example, you say "this would probably". I think you could be stronger, since you are defining the format, and say "this SHOULD be interpreted as".
    • Ralph - DONE, pushed to git

TODOs

  • Add identity block: Henrik
  • Decide on ValidDuration
  • Decide on ID block
  • Fix typos
  • Finish related work (can we just drop Amazon S3 accounting?)
  • Processing model section - should it be moved out to separate document? or included as appendix?
  • ...

-- RalphMuellerPfefferkorn - 20-Jan-2011

Topic revision: r9 - 2011-02-01 - RalphGerhardMuellerPfefferkornExCern
 