Comments are against EGI profile v0.5

General comments:

MULTIPLICITY

For many (most?) attributes, the allowed number of occurrences is mentioned only in passing, as part of the description of the requirement levels (Mandatory, Recommended, etc). Some attribute descriptions contain an oblique reference (e.g., AdminDomain.OtherInfo:ICON Validation).

In general, attributes may have differing allowed cardinality. The common options are: exactly one value is to be published, one-or-more, zero-or-more, or zero-or-one.

For many attributes, publishing multiple values isn't a problem and has a natural interpretation (e.g., a site may have multiple blogs); however, I suggest that, for each attribute, the profile document includes an explicit statement of the allowed cardinality. This should prevent problems such as an info-consumer assuming an attribute appears only once while an info-provider publishes it multiple times.

The cardinality information could appear in a structured fashion as part of the table, similar to how the requirements are shown (Mandatory, Recommended, ...).

There's also the possibility of an interaction between the Mandatory/Recommended/etc level of an attribute and its cardinality; for example, saying that a Mandatory attribute has one or more values is ambiguous because it isn't clear how important the second value is. A (contrived) example is an attribute where "one or more" values is Mandatory, "two or more" is Recommended, "three or more" is Desirable, and "four or more" is Optional.

ASSOCIATIONS / LINKS

The profile makes little mention of the links between different objects. One exception is that a Share needn't have a MappingProfile when it is purely for debugging purposes. You should consider taking a more systematic approach and describing the desirability of each link's cardinality. In some cases, links between objects are optional, yet people may assume that the link is present.
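To make the multiplicity suggestion concrete, here is a minimal Python sketch of how a validator might use an explicit cardinality column if the profile had one. Everything in it is an assumption for illustration: the table layout, the function name check_cardinality, and the cardinalities assigned to each attribute are invented here and are not taken from the profile.

```python
from collections import Counter

# Hypothetical cardinality table: attribute -> (minimum, maximum).
# A maximum of None means "unbounded". These values are illustrative
# only; the profile does not currently state them.
CARDINALITY = {
    "AdminDomain.Description": (0, 1),   # zero-or-one
    "AdminDomain.WWW": (0, None),        # zero-or-more
    "Location.Country": (1, 1),          # exactly one
    "Contact.Detail": (1, None),         # one-or-more
}


def check_cardinality(published):
    """Return a list of violations for a list of (attribute, value) pairs."""
    counts = Counter(name for name, _ in published)
    problems = []
    for name, (lo, hi) in CARDINALITY.items():
        n = counts.get(name, 0)
        if n < lo:
            problems.append(f"{name}: expected at least {lo}, found {n}")
        elif hi is not None and n > hi:
            problems.append(f"{name}: expected at most {hi}, found {n}")
    return problems


# Example: Description published twice, Country not published at all.
published = [
    ("AdminDomain.Description", "First description"),
    ("AdminDomain.Description", "Second description"),
    ("Contact.Detail", "mailto:admin@example.org"),
]
problems = check_cardinality(published)
for line in problems:
    print(line)
```

With the cardinality stated explicitly, both the info-provider and the info-consumer can check against the same table rather than inferring the allowed count from prose.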

REPRESENTATION OF DNs

There are several places where the profile says DNs should be represented in OpenSSL format (a comma-separated list). In general, this is a bad idea, as the OpenSSL format isn't documented, has changed over the lifetime of OpenSSL and has ambiguities. GLUE v2.0 says that RFC 4514 format is to be used (page 71, DN_t). Suggest the document is updated to reflect this.

NATURAL TEXT

There are many places where attributes are provided for human-readable text. The profile has a somewhat inconsistent approach to describing these attributes: for some, the document says that the explanation should be in English (e.g., AdminDomain.Description) but not for others (e.g., Endpoint.HealthStateInfo). I imagine this is a simple oversight.

The issue of multiple lines is not really addressed: which, if any, attributes may have multiple lines? If they have multiple lines, which encoding is to be used for newlines: CR+LF or LF?

Another issue is the acceptable length of natural text attributes. For example, I would imagine that Endpoint.HealthStateInfo should be fairly succinct. Perhaps other attributes could be longer.

SPECIFIC COMMENTS

Page 6.

Suggest making a clear statement (using RFC 2119 language) about Undesirable attributes; e.g., Information marked Undesirable SHOULD NOT be published.

Page 7.

I was in two minds about how place-holder values should be handled: the current level (INFO) suggests that the value might be correct. For place-holder values we know that something is wrong. On the other hand, such published values are valid (in a technical sense). Suggest using WARNING instead of INFO, but this isn't a strong suggestion; as I said, I was in two minds about it.

Page 10.

The encoding of profile support with two attributes (ProfileName=EGI, ProfileVersion=) is clumsy.
Suggest combining the two values as a comma-separated list published as a single attribute "Profile=EGI,"; for example:

    Profile=EGI,0.5

This allows an info-provider to publish conformance to many profiles concurrently without confusion over which version applies to which profile. The following is ambiguous (is OSG v1.2 supported?):

    ProfileName=EGI
    ProfileName=OSG
    ProfileVersion=1.1
    ProfileVersion=1.2

The following is clear:

    Profile=EGI,1.1
    Profile=EGI,1.2
    Profile=OSG,1.1

Consumers of information can then easily select objects based on whichever profile version they understand.

Page 12.

2.3.2 Place

It isn't clear in which language the place should be described; the allowed length is also not specified (although this is unlikely to be a problem). It is unspecified whether multiple lines are allowed. The example is unfortunate as it doesn't indicate the order of the city and state. Suggest using a different example.

Page 13.

2.3.3. Country

The encoding isn't specified. Is the published value a code (based on ISO 3166-1 or top-level domain names) or the full name of the country? Assuming the country name is published: in which language should it be specified (an official language of that country, or always English)? The allowed length of the published value isn't specified, and the number of lines also isn't specified (although this is unlikely to be a problem).

Page 14.

2.4.1 Detail

The requirement to use mailto: seems wrong: it isn't strong enough for email addresses, while it excludes other means of contact. Suggest something like: the Detail value MUST have URI scheme 'mailto' for email addresses, 'xmpp' for Jabber contact, 'http' for insecure HTTP and 'https' for HTTP secured with SSL/TLS. Other methods of communication are allowed, but SHOULD NOT use the above schemes.

2.6 AdminDomain

The phrase "unless the site BDII is down" is unfortunate as it "forces" the validator to somehow discover the site BDII and verify whether it is up or not.
There may be other situations when it is reasonable to suppress the warning (e.g., if the validator knew that the site is in down-time). Suggest replacing it with something more vague, about the expectation that published values may be incomplete. The site BDII may be kept as an example:

    [..] ERROR if the object is missing unless there is an expectation that published values may be incomplete (e.g., the site BDII is down).

2.6.1 Description

The profile provides no guidance on the acceptable length of this attribute, nor on whether newlines are permitted.

Page 15.

2.6.4 OtherInfo

ICON=

The format of the image isn't specified: SVG, GIF, PNG, JPG, ... ? Is one format preferred? In other words, what formats are info-consumers expected to support?

WLCG_NAME=

How should a site's name be derived if the WLCG_NAME isn't published?

WLCG_TIER

A site's tier is more something that a VO bestows on a site than something intrinsic to the site. Consider somehow identifying the VO for which the site is a Tier-n. One way (there are others) of doing this would be to concatenate the tier number with the VO name in the published attribute value; for example:

    WLCG_TIER=1,atlas
    WLCG_TIER=1,alice
    WLCG_TIER=1,cms

The WLCG_* OtherInfo attributes are not documented in the summary table.

Page 17

2.8 Service

The requirement that all published services exist in GOCDB seems overly prescriptive. The converse (all GOCDB entries must exist as a Service object) seems fine. I see a GLUE2 Service as a more light-weight concept than a GOCDB entry: GOCDB services imply a level of service along with monitoring, etc. In contrast, a GLUE2 Service could be some highly experimental service that is still being developed. Therefore, suggest keeping GOCDB entries that are missing in GLUE as a WARNING but making GLUE2 services missing in GOCDB an INFO.

Page 18

2.8.2 Type

There is a suggestion that the Type of a Service be the same as the Endpoint.InterfaceName if the Service has only one Endpoint.
This sounds reasonable for many cases; however, I think for storage this isn't such a good recommendation. A storage system is a storage system, even when it happens to have only one protocol-specific endpoint. Therefore, I would recommend rephrasing this to make it more a suggestion than a recommendation (i.e., something to follow in the absence of other, more compelling reasons).

2.9 Endpoint

The first line of the description contains "... MUST be published except in ...". At a purely English-language level, suggest rephrasing the sentence to avoid saying that something that MUST be is conditional (just to avoid potential confusion).

Page 20

2.9.15 HealthStateInfo

Missing description of which language the text should be in; missing limitation on the value's length.

4.4 StorageAccessProtocol

Suggest mentioning that StorageAccessProtocol.Type is an enumeration and citing the URL where the list of "registered" types is kept.

Page 46

4.6 StorageShare

For dCache's info-provider, there are two kinds of disk-based StorageShare: those that represent a set of disk systems (with the same access profile) and those that represent SRM reservations (with the same description and same ownership). Obviously, the latter have StorageShare.Tag attributes (the SRM "description") and the former don't.

Recall that, in dCache (unlike other systems), SRM space reservations are not bound to any particular disk or set of disks. The system honours the reservations, but a reservation introduces no guarantee on which disk incoming data will be stored. So, to some extent, these different StorageShare types correspond to physical and logical partitioning, respectively.

To represent this, the StorageShare objects that correspond to physical storage (i.e., a set of RAID systems) contain a link to the corresponding DataStore object. The StorageShare objects that describe the logical storage (i.e., SRM reservations) do not have a link to a DataStore object.
I'm not sure how helpful this is, but I mention it for completeness.

Page 47

4.6.1 Description

Which language is to be used? What are the length constraints? Are newlines allowed; if so, how are they encoded?

4.6.5 AL, RP, EM

This section should cite the SRM 2.2 specification.

Page 49

4.7.1 Type

The phrase "dynamic values" isn't defined. Moreover, it presupposes how an info-provider works (that some values are static, others are dynamic). This assumption appears elsewhere in the document; e.g., the StorageService attributes are described as "static" (IIRC). This isn't true for all info-providers: for dCache's info-provider, all values are generated dynamically. Suggest removing terms like "static" and "dynamic" since they don't help and may cause confusion.

4.8 StorageManager

"The Version should be an overall version number ..."

On a distributed system, there may be no one "overall version number". Suggest updating the description to say that some meaningful number should be used; for example, in a distributed system, the version of some specific, critical component.
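
As a footnote to the Page 10 comment on profile encoding, here is a minimal Python sketch of why separate name and version attributes are ambiguous for a consumer, while a combined attribute is not. The function names (profiles_from_pairs, profiles_from_combined) are hypothetical, and the "name,version" value syntax is only the one suggested in this review, not something the profile defines.

```python
def profiles_from_pairs(names, versions):
    """Separate attributes: the consumer cannot tell which version belongs
    to which profile, so every name/version combination is possible."""
    return {(n, v) for n in names for v in versions}


def profiles_from_combined(values):
    """Combined attribute (assumed 'name,version' syntax): each published
    value names exactly one supported profile version."""
    result = set()
    for value in values:
        name, _, version = value.partition(",")
        result.add((name, version))
    return result


# ProfileName=EGI, ProfileName=OSG, ProfileVersion=1.1, ProfileVersion=1.2
separate = profiles_from_pairs(["EGI", "OSG"], ["1.1", "1.2"])

# Profile=EGI,1.1  Profile=EGI,1.2  Profile=OSG,1.1
combined = profiles_from_combined(["EGI,1.1", "EGI,1.2", "OSG,1.1"])

# With separate attributes OSG v1.2 cannot be ruled out; with the
# combined form it is clearly not advertised.
print(("OSG", "1.2") in separate)
print(("OSG", "1.2") in combined)
```

A consumer that understands only one profile version can then select objects with a simple membership test on the combined values.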