Hi,

Here are my responses to your comments - I've deleted some of your text here for brevity but I'll upload your full comments to the document page as well as this reply. I'm updating the document with change tracking - hopefully there should be a new version in a week or so.

> For many (most?) attributes the allowed number of occurances is only
> mentioned in passing, as part of the description of attributions (Mandatory,
> Recommended, etc). Some attribute descriptions contain an oblique
> reference (e.g., AdminDomain.OtherInfo:ICON Validation).

In general the schema itself specifies whether attributes are single- or multivalued, so I don't think there's any general need to repeat that - the profile is a supplement to the schema definition. However there may be cases where the profile should impose a restriction - and as you suggest that may apply particularly to OtherInfo attributes which are in a sense an auxiliary schema. I'll look at that when I go through the document in detail.

> There's also the possibility of an interaction between the
> Mandatory/Recommended/etc of an attribute and the cardinality; for
> example, saying that a Mandatory attribute has one or more values is
> ambiguous because it isn't clear how important is the second value. A
> (contrived) example is an attribute where "one or more" values is
> Mandatory, "two or more" is Recommended, "three or more" is Desirable,
> "four or more" is Optional.

That's possible but I'm not sure it arises in practice - again I'll have to go through the whole document to see if we have any such cases.

> The profile makes little mention of the links between different objects. One
> exception is that a Share needn't have a MappingProfile when it is purely for
> debugging purposes.

Yes, several people have made similar comments.
The problem is that the schema allows many-to-many relations almost everywhere so it's rather hard to define what constraints should exist - and it may also depend on the properties of a particular object. I think the best I can do is add a section for each class describing what relations may exist and highlighting any restrictions.

> There's several places where the profile says DNs should be represented in
> OpenSSL format (a comma-separated list).

The openssl format is slash-separated, i.e. the usual things like /C=UK/O=eScience/OU=CLRC/L=RAL/CN=stephen burke.

> In general, this is bad idea as
> OpenSSL format isn't documented, has changed over the lifetime of OpenSSL
> and has ambiguities.

All that's true, but the openssl format is so deeply embedded in all our software, user experience etc that I don't think it's practical to avoid it here.

> GLUE v2.0 says that RFC 4514 format is to be used (page 71, DN_t).
> Suggest the document is updated to reflect this.

Hmm ... this seems to be somewhat inconsistent. The schema doc says "Distinguished Name as defined by RFC 4514 (http://www.rfc-editor.org/rfc/rfc4514.txt). X.509 uses a X.500 namespace, represented as several Relative Domain-Names (RDNs) concatenated by forward-slashes." However RFC 4514 has a comma, not a slash, as the separator, and also that RFC is for LDAP DNs whereas the schema use of that type is for CA DNs which have nothing to do with LDAP! (We do of course use the RFC 4514 style for LDAP DNs in the BDII but that's a separate question.) Also I'm not sure if we have any tools to return or manipulate DNs in that format. I think this one needs to be referred to the GLUE WG as an anomaly.

> There are many places where attributes are provided for human readable
> text. The profile has a somewhat inconsistent approach to describing these
> attributes: for some the document says that the explanation should be in
> English (e.g., AdminDomain.Description) but not others (e.g.,
> Endpoint.HealthStateInfo).
> I imagine this is a simple oversight.

It isn't exactly an oversight; the situation is somewhat different for those attributes. The AdminDomainDescription is a high-level description of the site, and I think it's reasonable for EGI to say that it should be in English as that's the common working language. HealthStateInfo is low-level diagnostic output, typically from "/sbin/service xxx status", and while it's probably more useful for that to be in English I don't think it's vital to insist that sites shouldn't use localised strings. However it's probably worth saying that English is preferable in general.

> The issue of multiple lines is not really addressed: which, if any, attributes
> may have multiple lines? If they have multiple lines, which encoding is to be
> used for newlines: CR+LF or LF?
>
> Another issue is the acceptable length of natural text attributes.
> For example, I would imagine that Endpoint.HealthStateInfo should be fairly
> succinct. Perhaps other attributes could be longer.

For both of these I'll add a general comment, that multiline text should not be used and that length should be limited to 255 characters - I doubt there is any need for more than that.

> Suggest making a clear statement (using RFC 2117 language) about
> Undesirable attributes; e.g., Information marked Undesirable SHOULD NOT
> be published.

OK.

> I was in two minds about how place-holder values should be handled:
> the current level (INFO) suggests that the value might be correct.
> For place-holder values we know that something is wrong. On the other
> hand, such published values are valid (in a technical sense).
>
> Suggest using WARNING instead of INFO, but this isn't a strong suggestion ---
> as I said, I was in two minds about it.

I included the comment "should be treated as INFO unless the information for a particular attribute specifies otherwise" to imply that it may be different - I'll add an explicit comment that it may be varied to either WARNING or ignored.

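
For what it's worth, the general comment on free-text attributes mentioned above (no multiline text, at most 255 characters) is simple for a validator to check. The following is only a rough sketch to show what I have in mind: the function name and the wording of the problem strings are invented for illustration and are not part of the profile or of any existing validator.

```python
from typing import List

# Length limit proposed in this reply; treated here as an assumption
# until it appears in the profile text.
MAX_TEXT_LENGTH = 255


def check_text_attribute(value: str, max_length: int = MAX_TEXT_LENGTH) -> List[str]:
    """Return a list of problems found in a free-text attribute value.

    Hypothetical helper: an empty list means the value passes both checks.
    """
    problems = []
    # Reject both LF and CR so that either newline convention is caught.
    if "\n" in value or "\r" in value:
        problems.append("multiline text")
    if len(value) > max_length:
        problems.append("longer than %d characters" % max_length)
    return problems
```

How the returned problems map to INFO, WARNING or ERROR would still be up to the profile text for each attribute.
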
> The encoding of profile support with two attributes (ProfileName=EGI,
> ProfileVersion=) is clumsy.

It's aligned with how we're publishing other things.

> This allows info-provider to publishing conformance to many profiles
> concurrently without confusion over which applies to which
> Profile.

I'm not convinced that it makes sense to conform to multiple profiles - in general they will conflict. For now what I'm proposing is to publish conformance to (at most) one version of one profile per object - if we find we need something more the profile can be updated!

> It isn't clear in which language the place should be described; the allowed
> length is also not specified (although unlikely to be a problem). Unspecified
> whether multiple lines are allowed.

As above, I'll cover most of that with general comments.

> The example is unfortunate as it doesn't indicate the order of the city and
> state. Suggest using a different example.

Good point, although I doubt Americans would find it ambiguous! Changed to "Atlanta, Georgia" (the first one that came to mind).

> The encoding isn't specified. Is the published value a code (based on ISO
> 3166-1 or top-level domain names) or the full name of the country?

It's the full name - I'll add France, New Zealand and UK as examples.

> Assuming the country name is published: in which language should it be
> specified (an official language for that country, or always English)?

Unspecified - much too political!

> The requirement to use mailto: seems wrong: it isn't strong enough for email
> addresses while excluding other means of contact.

It says "SHOULD" rather than "MUST", but in general the expectation is that they are email addresses, and having it as a URL means that if the value is displayed in a web page you can just click on it. I'll expand the text a bit.

> The phrase "unless the site BDII is down" is unfortunate as it "forces" the
> validator to somehow discover the site BDII and verify whether it is up or not.
> There may be other situations when it is reasonable to suppress the warning
> (e.g., if the validator knew that the site is in down-time).
>
> Suggest replacing it with something more vague, about the expectation that
> published values may be incomplete. The site BDII may be kept as an
> example:
>
> [..] ERROR if the object is missing unless there is an expectation
> that published values may be incomplete (e.g., the site BDII is
> down).

Yes, rewritten.

> ICON=
>
> The format of the image isn't specified: SVG, GIF, PNG, JPG, ... ? Is one
> format preferred? In other words, what formats are info-consumers
> expected to support?

At a quick survey we have .gif, .jpg, .ico and .png with the last being the favourite. I'll put those four as SHOULD.

> WLCG_NAME= (etc)

I'm not defining the WLCG attributes, just mentioning them - it's up to WLCG how they get defined and used. These are existing attributes in GLUE 1 so I just copied them as they are.

> The requirement that all published services exist in GOCDB seems overly
> prescriptive. The converse (all GOCDB entries must exist as a Service object)
> seems fine.

In fact I think it's the opposite: there are things in the GOC DB (e.g. the GOC DB itself!) which are unlikely to be published, but we would like to have every service registered in the GOC DB.

> I see a GLUE2 Service as a more light-weight concept than GOCDB entry:
> GOCDB services imply a level of service along with monitoring, etc.
> In contrast, a GLUE2 Service could be some highly experimental service that
> is still being developed.

Even non-production services are registered in the GOC DB. I'll add a comment that validation running outside the EGI Grid completely, e.g. on testbeds, is excluded.

> There is a suggestion that the Type of a Service be the same as the
> Endpoint.InterfaceName if the Service has only one Endpoint. This sounds
> reasonable for many cases; however, I think for storage this isn't such a good
> recommendation.
> A storage system is a storage system, even when it
> happens to have only one protocol-specific endpoint. Therefore, I would
> recommend rephrasing this to make is more a suggestion than a
> recommendation -- i.e., in the absence of other, more compelling reasons.

OK, I'll change RECOMMENDED to recommended and explicitly allow exceptions.

> The first line of the description contains "... MUST be published except in ...".
> At a purely English language level, suggest rephrasing the sentence to avoid
> saying that something that MUST be is conditional (just to avoid potential
> confusion).

Reworded.

> Suggest mentioning that StorageAccessProtocol.Type is an enumeration and
> cite the URL where the list of "registered" types is kept.

In general I'm not repeating information which is part of the main schema definition. I will add a general comment about where the lists of enumerated types are kept.

> I'm not sure how helpful this is, but I mention it for completeness.

! I've tried to make the text fairly general - the key prescription is that whatever an implementation does, it should be documented.

> This section should cite the SRM 2.2 specification.

Added at the first mention of SRM.

> The phrase "dynamic values" isn't defined. Moreover, it presupposes how
> an info-provider works (that some values are static, others are dynamic).
> This assumption appears elsewhere in the document; e.g., the
> StorageService attributes are described as "static" (IIRC).

You have a point, but the contrast being made here is explained fairly explicitly, i.e. between installed values which include all the servers you've got and dynamic values which change if some servers go down. I'll think about whether I can define dynamic and static or use different words.

> "The Version should be an overall version number ..." On a distributed
> system, there may be no one "overall version number".

Well, that's up to you as an implementor - for example if you have three main components you could just concatenate the versions.

> Suggest updating the description to say that some meaningful number
> should be used; for example, in a distributed system the version of some
> specific, critical component.

The current text says "The Version should be an overall version number for the product suitable for tracking at least major releases" which defines the purpose of publication but leaves the detail to you. I'll add a couple of examples as above.

Stephen