Question about Sh_2214_Wmunu_maxHTpTV2_CFilterBVeto

rakhi · 8 December 2025 14:10

The metadata for evtgen dataset 700781 reads as follows:
{'dataset_number': '700781',
 'physics_short': 'Sh_2214_Wmunu_maxHTpTV2_CFilterBVeto',
 'e_tag': None,
 'cross_section_pb': 22935.0,
 'genFiltEff': 0.1484349,
 'kFactor': 1.0,
 'nEvents': 49999391,
 'sumOfWeights': None,
 'sumOfWeightsSquared': None,
 'process': None,
 'generator': 'Sherpa',
 'keywords': ['Baseline', 'jets', 'muon', 'nlo', 'sm', 'w'],
 'description': 'Sherpa W+/W- → mu nu + 0,1,2j@NLO + 3,4,5j@LO with c-jet filter taking input from existing unfiltered input file.',
 'job_path': 'Files · master · atlas-physics / pmg / MC Job Options · GitLab',
 'CoMEnergy': 13600.0,
 'GenEvents': 373518577,
 'GenTune': 'NULL',
 'PDF': 'NULL',
 'Release': 'AthGeneration_23.6.3',
 'Filters': 'HeavyFlavorHadronFilter',
 'cross_section_uncertainty': 0.0,
 'hepmc_version': 2,
 'release': {'name': '2025r-evgen-13p6tev'}}
I’m still a little confused as to what this dataset contains because I’m not sure about the abbreviations in the physics_short description (and the job_path python file listed doesn’t help). Is this just W+c, with the c (mesons) decaying muonically, and the genFiltEfficiency taking into account both the c-jet filter efficiency and branching fraction for the c-meson muonic decay? Or am I interpreting this incorrectly?
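Whatever the filter turns out to select, the numeric fields above combine in the same way. The snippet below is a minimal sketch that assumes the usual convention of multiplying cross_section_pb by genFiltEff and kFactor to get the cross section of the filtered sample; the helper name effective_cross_section_pb is hypothetical and not part of any ATLAS tool.

```python
def effective_cross_section_pb(meta):
    """Cross section after the generator filter, in pb.

    Assumption: the filtered cross section is
    cross_section_pb * genFiltEff * kFactor.
    """
    return meta["cross_section_pb"] * meta["genFiltEff"] * meta["kFactor"]


# Values copied from the metadata of dataset 700781 quoted above.
sample_700781 = {
    "cross_section_pb": 22935.0,
    "genFiltEff": 0.1484349,
    "kFactor": 1.0,
}

print(f"{effective_cross_section_pb(sample_700781):.1f} pb")
```

Under that assumption, about 15% of the inclusive W → mu nu cross section survives the filter for this dataset.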

zmarshal · 9 December 2025 06:09

Hi @rakhi ,

I’ve got some more documentation of the physics shorts in review now, so I hope it’ll be a little better soon. Breaking this one down:

Sh: Sherpa
2214: Sherpa version 2.2.14
Wmunu: W bosons decaying to mu neutrino (it’s the W decay specifically that we’re talking about here)
maxHTpTV2: the sample is biased based on the maximum of the HT and the pT of the vector boson. Biasing means that the unweighted spectrum of events has more high-energy events, so that we have better statistical uncertainties at high energies (high HT / pTV here). That allows us to generate a single sample that has good statistics across the entire spectrum, rather than needing separate samples for measurements interested in the bulk of the cross section and for searches that are interested in tails.
CFilterBVeto: the matrix elements are required to have at least one c-quark and no b-quark. Similarly, BFilter means at least one b-quark in the matrix element, and CVetoBVeto means no b-quark or c-quark in the matrix element.
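
The breakdown above can be turned into a small lookup. This is only a sketch: the table holds just the tokens explained in this thread, the function name describe_physics_short is hypothetical, and the rule for turning "2214" into "2.2.14" is an assumption inferred from this single example.

```python
import re

# Token meanings taken from the explanation above; anything else is "unknown".
TOKEN_MEANINGS = {
    "Sh": "Sherpa",
    "Wmunu": "W boson decaying to muon + neutrino",
    "maxHTpTV2": "biased in the maximum of HT and the vector-boson pT",
    "CFilterBVeto": "at least one c-quark and no b-quark in the matrix element",
    "BFilter": "at least one b-quark in the matrix element",
    "CVetoBVeto": "no b-quark or c-quark in the matrix element",
}


def describe_physics_short(physics_short):
    """Return a list of (token, meaning) pairs for an underscore-separated name."""
    described = []
    for token in physics_short.split("_"):
        if token in TOKEN_MEANINGS:
            meaning = TOKEN_MEANINGS[token]
        elif re.fullmatch(r"\d{4}", token):
            # Assumed pattern: "2214" -> "2.2.14" (major, minor, patch).
            meaning = f"generator version {token[0]}.{token[1]}.{token[2:]}"
        else:
            meaning = "unknown"
        described.append((token, meaning))
    return described


for token, meaning in describe_physics_short("Sh_2214_Wmunu_maxHTpTV2_CFilterBVeto"):
    print(f"{token}: {meaning}")
```

Tokens from other samples simply come back as "unknown" until they are added to the table.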

I hope that helps!

Best,
Zach

rakhi · 9 December 2025 08:44

Thanks Zach, that’s very helpful. (I assume that in the final sentence you meant either “CVetoBVeto” or “at least one b-quark and at least one c-quark in the matrix element”?)

Any chance we can lay our hands on an alternate set where the c-meson decays muonically too? Some collaborators and I are trying to estimate the FNP background for a dimuon search using the fake factor method, and this would likely be our leading background.

zmarshal · 10 December 2025 07:16

Hi @rakhi ,

Sorry, yes — I’ve edited the post above to be correct. The new documentation is live, btw:

I’m not sure if we have an inclusive sample like the one you’re describing. We’ve got this that’s included in the 13 TeV open data:

(‘300005’, ‘Pythia8B_AU2_CTEQ6L1_JpsimumuWmunu’)

I’m not sure I see something even internally that’s what you’re after, but I’ll ask around in case I’ve missed it.

Best,
Zach

zmarshal · 10 December 2025 18:29

And now the second part of the answer: yes! I just had to learn about the sample.

{'dataset_number': '501719',
 'physics_short': 'MGPy8EG_Wmunu_FxFx_3jets_HT2bias_SMT',
 'e_tag': None,
 'cross_section_pb': 22165.0,
 'genFiltEff': 0.003464519,
 'kFactor': 1.0,
 'nEvents': 10000000,
 'sumOfWeights': None,
 'sumOfWeightsSquared': None,
 'process': None,
 'generator': 'aMcAtNlo+Pythia8(v.245p3.lhcb7)+EvtGen(v.1.7.0)',
 'keywords': ['Specialised', 'jets', 'muon', 'nlo', 'sm', 'w'],
 'description': 'aMcAtNlo Wmunu+0,1,2,3j NLO FxFx HT2-biased SMT Filter',
 'job_path': 'https://gitlab.cern.ch/atlas-physics/pmg/mcjoboptions/-/blob/master/501xxx/501719/mc.MGPy8EG_Wmunu_FxFx_3jets_HT2bias_SMT.py',
 'CoMEnergy': 13000.0,
 'GenEvents': 181210500,
 'GenTune': 'A14 NNPDF23LO',
 'PDF': 'NULL',
 'Release': 'AthGeneration_21.6.80',
 'Filters': 'MuonFilter, MultiMuonFilter',
 'cross_section_uncertainty': 0.0,
 'hepmc_version': 2,
 'release': {'name': '2025r-evgen-13tev'}}

The MultiMuonFilter was the hint. SMT (in the name of the sample) is for “soft muon tagger”, developed for this paper. It requires one muon above 15 GeV (which will be the one from the W most of the time) and a second muon above 3.25 GeV (which will be from a b or c decay most of the time).
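
As a rough illustration of the two-muon requirement just described, here is a minimal sketch that applies only the two pT thresholds quoted above to a list of truth-muon pT values. It is not the actual MuonFilter or MultiMuonFilter code, it ignores any other requirements those filters may apply, and the function name is hypothetical.

```python
def passes_two_muon_selection(muon_pts_gev, lead_min=15.0, sublead_min=3.25):
    """True if one muon has pT above lead_min and another has pT above sublead_min.

    muon_pts_gev: iterable of muon transverse momenta in GeV.
    Thresholds default to the values quoted in the post above.
    """
    pts = sorted(muon_pts_gev, reverse=True)
    if len(pts) < 2:
        return False
    # The hardest muon must clear the higher threshold,
    # the second-hardest muon the lower one.
    return pts[0] > lead_min and pts[1] > sublead_min


print(passes_two_muon_selection([27.4, 4.1]))  # hard muon plus a soft muon
print(passes_two_muon_selection([27.4]))       # only one muon in the event
```

Applying a check like this to an unfiltered sample would show how rarely events contain the second, softer muon.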

We’ve only got that sample at 13 TeV at the moment.

Best,
Zach

rakhi · 11 December 2025 10:28

fabulous!!! Thank you Zach!

sc1n23 · 12 March 2026 17:37

Hi Zach,

Thank you for providing this sample. I’m currently using it for a detector-level analysis with Rakhi, but it seems I can only access 10M events (nEvents) of this sample (MGPy8EG_Wmunu_FxFx_3jets_HT2bias_SMT) through the publicly provided links. If I understand correctly, around 181M events were generated for this sample, as given by GenEvents. May I ask how I could get access to more events (ideally 23M or more)? Many thanks for your help and time.

Best,

Shu

zmarshal · 12 March 2026 23:14

Hi @sc1n23 ,

Thanks for asking (and glad to hear you’re working with our data!). The nEvents you see is what we’ve released publicly. To release more, we need a motivated request. Basically, we need to see what it is that you’re doing at a level of detail sufficient to convince us that you need another 2.5x in statistics. If for any reason you’d prefer not to post that here (getting scooped is a thing, and I imagine it’s more comfortable discussing these things away from public fora) you are welcome to send an email to atlas-outreach-opendata-support@cern.ch

Best,
Zach

sc1n23 · 13 March 2026 16:40

Hi @zmarshal ,

Thanks for your reply and advice. I have just sent our motivation for more events by email. Thank you again for your help.

Best,
Shu