I have another question from the Open Data analysis team here at the University of Notre Dame’s QuarkNet Center.
We’re currently working on an analysis in Python of one of the NanoAOD-like versions of the 2010 data. We’re hoping to keep the analysis entirely in Python, but are open to using other tools if it’s the quickest way.
We’re selecting events with exactly two muons, requiring that at least one of them is a Global muon and that the two have opposite sign. Because the Muon_isGlobal and Muon_isTracker variables are booleans, we had to make those cuts in loops. That’s not our primary issue, though I’m open to finding another way to make those cuts (I’m currently exploring the “Getting Started” guide for uproot and saw some things that look promising).
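To make the selection concrete, here is a minimal sketch of how we think those cuts might be written without loops, using only NumPy. It assumes the per-muon branches have already been read as flat arrays (one entry per muon, events back to back) together with a per-event muon count; that layout, the nMuon and Muon_charge inputs, and the function name select_dimuon_events are our assumptions for illustration, not something taken from the data set documentation.

```python
import numpy as np

def select_dimuon_events(n_muon, muon_charge, muon_is_global):
    """Return (event_index, first, second) for events passing the cuts.

    n_muon         : muons per event (hypothetical stand-in for nMuon)
    muon_charge    : flat per-muon charges (assumed layout)
    muon_is_global : flat per-muon booleans (Muon_isGlobal)
    """
    n_muon = np.asarray(n_muon)
    muon_charge = np.asarray(muon_charge)
    muon_is_global = np.asarray(muon_is_global, dtype=bool)

    # Position of each event's first muon in the flat per-muon arrays.
    starts = np.cumsum(n_muon) - n_muon

    # Cut 1: exactly two muons in the event.
    two_muons = n_muon == 2
    first = starts[two_muons]
    second = first + 1

    # Cut 2: opposite sign. Cut 3: at least one Global muon.
    opposite_sign = muon_charge[first] != muon_charge[second]
    any_global = muon_is_global[first] | muon_is_global[second]
    keep = opposite_sign & any_global

    event_index = np.flatnonzero(two_muons)[keep]
    return event_index, first[keep], second[keep]
```

The returned first and second arrays index into any other flat per-muon array, so the same positions could be used to pick out the two muons' other variables for the surviving events.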
Our primary problem is that we’ve been putting the collected data that passes our requirements into arrays. So far, we’ve needed to keep those arrays small compared to the data set so that they fit in our computers’ memory.
We saw a tutorial from the HEP Software Foundation that uses a fill command from Hist, but installing Hist in the Python container updated NumPy to a version that (I think, from the error messages) no longer works with uproot. We were hoping for some other command that works like the fill command in Hist or in ROOT, so that we could fill the values from the loop into a histogram (or a group of histograms, one for each variable) rather than an array, in the hope that this would fit better in our memory.
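As an illustration of the behavior we are looking for, here is a minimal sketch built only on numpy.histogram, which we already have installed. It keeps fixed bin edges and a running array of counts, so only the counts stay in memory between batches. The class name RunningHistogram and the binning in the usage lines are hypothetical, and this is not meant as a replacement for Hist, only a sketch of the idea.

```python
import numpy as np

class RunningHistogram:
    """Fixed-bin histogram that accumulates counts batch by batch."""

    def __init__(self, n_bins, low, high):
        # n_bins equal-width bins between low and high.
        self.edges = np.linspace(low, high, n_bins + 1)
        self.counts = np.zeros(n_bins, dtype=np.int64)

    def fill(self, values):
        # Bin this batch and add it to the running total.
        # Values outside [low, high] are dropped by numpy.histogram.
        batch_counts, _ = np.histogram(values, bins=self.edges)
        self.counts += batch_counts

# Usage with made-up numbers: each fill() call could be one batch of
# selected values from the loop, discarded after it has been binned.
hist = RunningHistogram(n_bins=4, low=0.0, high=4.0)
hist.fill([0.5, 1.5, 1.7])
hist.fill(np.array([3.9, 4.0, 5.0, -1.0]))
```

One such object per variable would give the group of histograms described above, and the memory use depends on the number of bins rather than on the number of events.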
Again, we hope to work just in Python. We’re grateful for any assistance anyone can give. Thanks in advance!
~Jill Ziegler