Open Evtgen: incompatibility between hepmc version number (2) given in metadata and the version number written in the hepmc file

Dear open data team,

In some evtgen samples (e.g. dataset 410472) the metadata states that the files are in hepmc2 format, but the header at the top of (at least some of) the hepmc files actually says “HepMC::Version 3.03.00”. Which is correct?

Cheers,

Rakhi

Not sure whether the following problem is related to the hepmc version ambiguity above, but when we run checkmate on some evtgen datasets, e.g. #513097 or #501719 (also classified as hepmc2 in the metadata but hepmc3.03 in the hepmc file), we find nonsensical cutflows for a small number of individual samples: the number of events increases after subsequent cuts, or the normalized number of events goes negative. Summed over the whole dataset the results are sensible, but we’re not sure what to make of the nonsensical cutflows. Checkmate author Krzysztof seems to think the two issues may be related (checkmate hasn’t yet been validated on hepmc3).

Hi @rakhi ,

This is a feature of the HepMC library: the header tells you the version of the library that wrote the file, rather than the version of the standard (i.e. the format) that the file was written in. We used the HepMC3 library to write HepMC2-format files, so the top of the file says HepMC3.
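
If it helps, here is a minimal Python sketch for telling the two apart from the header of an ASCII file. It assumes the usual header layout, in which the `HepMC::Version` line is followed by a start-of-listing line naming the format (`IO_GenEvent` for HepMC2-format files, `Asciiv3` for HepMC3-format files). The function name `detect_hepmc_format` and the in-memory example header are made up for illustration; for a real file you would pass an open file handle instead.

```python
import io


def detect_hepmc_format(stream, max_lines=20):
    """Return (library_version, format_name) read from an ASCII HepMC header.

    library_version: version of the library that wrote the file.
    format_name: "HepMC2" or "HepMC3", taken from the start-of-listing line,
                 or None if no such line is found in the first max_lines lines.
    """
    library_version = None
    format_name = None
    for _ in range(max_lines):
        line = stream.readline()
        if not line:  # end of file
            break
        line = line.strip()
        if line.startswith("HepMC::Version"):
            # e.g. "HepMC::Version 3.03.00" -> "3.03.00"
            library_version = line.split()[-1]
        elif line.endswith("START_EVENT_LISTING"):
            if "IO_GenEvent" in line:
                format_name = "HepMC2"
            elif "Asciiv3" in line:
                format_name = "HepMC3"
            break
    return library_version, format_name


# Illustrative header only (assumed layout, not copied from a dataset file).
header = "HepMC::Version 3.03.00\nHepMC::IO_GenEvent-START_EVENT_LISTING\n"
print(detect_hepmc_format(io.StringIO(header)))
```

For a header like the one above, the library version comes out as 3.03.00 while the format is reported as HepMC2, which is the situation described here.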

That second part is quite interesting, and I don’t have a good explanation, unless there are events with very large negative weights that are affecting you?
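
To illustrate what I mean, here is a toy Python sketch (made-up weights and cut decisions, not taken from any of the datasets, and not how checkmate computes its cutflows) showing how a single large negative weight can make a weighted cutflow go negative and then increase after a later cut:

```python
# Toy events: (weight, passes_cut1, passes_cut2). All values are hypothetical.
events = [
    (1.0, True, True),
    (1.0, True, True),
    (-5.0, True, False),  # one large negative-weight event
    (1.0, True, False),
    (1.0, False, False),
]


def weighted_cutflow(events):
    """Return the sum of weights before cuts, after cut 1, and after cuts 1+2."""
    total = sum(w for w, _, _ in events)
    after_cut1 = sum(w for w, c1, _ in events if c1)
    after_cut2 = sum(w for w, c1, c2 in events if c1 and c2)
    return [total, after_cut1, after_cut2]


print(weighted_cutflow(events))  # [-1.0, -2.0, 2.0]
```

The yield is negative after the first cut and goes up after the second one, simply because the second cut removes the negative-weight event. In a small sample one such event can dominate, while in the sum over a full dataset it tends to wash out.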

Best,
Zach

Ah thanks for the clarification. We’re checking the weights now.