Accessing the contents of a record via a web API

I am looking at one of CMS’s open data records on the CERN Open Data Portal, and it has an associated file list.

Given the record number, 1507, is there a web API/REST way for me to get the list of attached files, or the list of attached file lists?

Many thanks!

Gordon, if you are talking about listing the index files (which list the ROOT files), the cernopendata-client tool might be of use. For instance, if I understand correctly what you want, either of these two lines could work (as would the equivalent commands if you are working with a local install rather than Docker):

docker run -it --rm cernopendata/cernopendata-client get-metadata --recid 1507 | grep "root://eospublic"

docker run -it --rm cernopendata/cernopendata-client list-directory "/eos/opendata/cms/MonteCarlo2011/Summer11LegDR/SMHiggsToZZTo4L_M-125_7TeV-powheg15-JHUgenV3-pythia6/AODSIM/PU_S13_START53_LV6-v1/file-indexes"

Thanks! This is very close to what I want. It gets the listing of txt files (or the JSON equivalents, which I really like). What I’m missing, however, is how to get from the EOS path to the HTTP address, at least for the file list. I didn’t see that included. I can find this on the web: for example, this page gives me a link to this list of files.

Hi Gordon, you can use the get-file-locations command for that. For example:

$ cernopendata-client get-file-locations --recid 1507

will return paths to all the ROOT data files of this data set. (If you want only the paths to the index files, you can use the --no-expand option.)

You can combine the above with another useful option --protocol xrootd which will return XRootD paths, rather than HTTP paths, to the data files.

Please see the get-file-locations documentation for more details.
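
If you already have an XRootD URI or EOS path in hand (for example from the get-metadata output above), the HTTP form appears to differ only in its prefix. Here is a minimal sketch of that rewrite. Note that xrootd_to_http is a hypothetical helper, not part of cernopendata-client, and the mapping of root://eospublic.cern.ch/ onto http://opendata.cern.ch/ is an assumption based on how file links on the portal look, so please check one resulting URL before relying on it:

```shell
# Hypothetical helper (not part of cernopendata-client):
# rewrite an XRootD URI into the HTTP URL assumed to serve the same file.
# Assumption: EOS paths are exposed under http://opendata.cern.ch/eos/...
xrootd_to_http() {
    # Replace the XRootD host prefix (and any extra slashes after it)
    # with the HTTP host followed by a single slash.
    printf '%s\n' "$1" |
        sed -e 's|^root://eospublic\.cern\.ch/*|http://opendata.cern.ch/|'
}

# Example with the file-indexes directory quoted earlier in this thread.
xrootd_to_http "root://eospublic.cern.ch//eos/opendata/cms/MonteCarlo2011/Summer11LegDR/SMHiggsToZZTo4L_M-125_7TeV-powheg15-JHUgenV3-pythia6/AODSIM/PU_S13_START53_LV6-v1/file-indexes"
```

Since get-file-locations returns HTTP paths unless --protocol xrootd is given, a rewrite like this is only needed when starting from the root:// form.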

Ahhh… that is perfect, other than the fact that I can’t spell protocol correctly. Thank you so much for your help!