Overwatch Processing Package (overwatch.processing)
General information from the README is available immediately below. Module-specific documentation follows in the package reference, which lists each module.
Package reference
overwatch.processing.mergeFiles module
This module contains functions used to merge histograms and ROOT files.
For further information on functionality and options, see the function docstrings.
-
overwatch.processing.mergeFiles.
merge
(currentDir, run, subsystem, cumulativeMode=True, timeSlice=None)[source]¶ For a given run and subsystem, handles merging of files into a “combined file” which is suitable for processing.
For a standard subsystem, the “combined file” is the file which stores the most recent data for a particular run and subsystem. The merge is only performed if we have received new files in the specified run. The details of the merge are determined by the cumulativeMode setting. This setting, which is determined by the options in the data request sent to the HLT by the ZMQ receiver, denotes whether the data that we receive is reset each time it is sent. If the data is being reset, then the files are not cumulative, and therefore cumulativeMode should be set to False. Note that although this function supports both modes, we tend to operate in cumulative mode because resetting the objects would interfere with other subscribers to the HLT data. For example, if both Overwatch and another subscriber were set to request data with resets every minute and were offset by 30 seconds, they would both only receive approximately half the data! Thus, it’s preferred to operate in cumulative mode.
For cumulative mode, the combined objects are created in two different ways: 1) For a standard combined file, by simply copying the most recent file (because it contains data covering the entire run); 2) For time slices, by subtracting the objects in two corresponding ROOT files. For reset mode, TFileMerger is used to merge all files within the available timestamps together.
This function also handles merging files for time slices. The relevant parameters should be specified in a timeSliceContainer. The min and max requested times are extracted, and this function only merges files within the fixed time range corresponding to those values. The output format of this file is identical to any other combined file.
Note
As a side effect of this function, if it executes successfully for a combined file, the combined filename will be updated in subsystemContainer.
Parameters:
- dirPrefix (str) – Path to the root directory where the data is stored.
- runDir (str) – Run directory of the current run, of the form Run######.
- subsystem (str) – The current subsystem by three letter, all capital name (ex. EMC).
- cumulativeMode (bool) – Specifies whether the histograms we receive are cumulative or if they have been reset between each acquired ROOT file, i.e. whether we merge in “subscribe mode” or “request/reset mode”. Default: True.
- timeSlice (processingClasses.timeSliceContainer) – Stores the properties of the requested time slice. If not specified, it will be ignored and it will create a standard “combined file”. Default: None
Returns: On success, None is returned. Otherwise, an exception is raised.
Return type: None
Raises: ValueError – If the number of input files doesn’t match the number of files in the merger. Perhaps if a file is inaccessible.
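The difference between the two modes can be illustrated with a minimal sketch. It is not the Overwatch implementation: plain dicts mapping a histogram name to an entry count stand in for ROOT files, and the function name combine and the histogram name hEnergy are hypothetical.

```python
from collections import Counter


def combine(files, cumulative_mode=True):
    """Build a stand-in "combined file" from time-ordered snapshots.

    `files` is a list of dicts mapping histogram name -> entry count, ordered
    from oldest to newest. These dicts are hypothetical stand-ins for ROOT files.
    """
    if not files:
        raise ValueError("No input files to combine.")
    if cumulative_mode:
        # The newest snapshot already covers the entire run, so copy it.
        return dict(files[-1])
    # Reset mode: each snapshot only holds new data, so add them all together.
    total = Counter()
    for snapshot in files:
        total.update(snapshot)
    return dict(total)


# The same run seen in cumulative mode and in reset mode.
cumulative = [{"hEnergy": 10}, {"hEnergy": 25}, {"hEnergy": 40}]
reset = [{"hEnergy": 10}, {"hEnergy": 15}, {"hEnergy": 15}]
print(combine(cumulative, cumulative_mode=True))
print(combine(reset, cumulative_mode=False))
```

Both calls describe the same underlying data, so they should yield the same combined content; only the way it is assembled differs.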
-
overwatch.processing.mergeFiles.
mergeRootFiles
(runs, dirPrefix, forceNewMerge=False, cumulativeMode=True)[source]¶ Driver function for creating combined files for each subsystem within a given set of runs.
For a given list of runs, this function will iterate over all available subsystems, merging or moving files as appropriate. To speed up this function, operations will only be performed if a new file is available for a particular subsystem (i.e. subsystemContainer.newFile == True). This function will result in a combined file per subsystem per run. For further information on the format of this file, see merge().
Parameters:
- runs (dict) – Dict of runContainers to perform the merge over. The keys are the runDirs, in the form of Run######.
- dirPrefix (str) – Path to the root directory where the data is stored.
- forceNewMerge (bool) – Flag to force merging for all runs, regardless of whether they would otherwise be merged. Default: False.
- cumulativeMode (bool) – Specifies whether the histograms we receive are cumulative or if they have been reset between each acquired ROOT file, i.e. whether we merge in “subscribe mode” or “request/reset mode”. See merge() for further information on this mode. Default: True.
Returns: None
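The selection logic of this driver can be sketched roughly as follows. This is an illustration only: nested dicts with a newFile key stand in for runContainer and subsystemContainer objects, and merge_all is a hypothetical name.

```python
def merge_all(runs, force_new_merge=False):
    """Return the (runDir, subsystem) pairs that would be merged.

    `runs` maps runDir -> {subsystem name -> {"newFile": bool}}. These plain
    dicts are hypothetical stand-ins for the real container classes.
    """
    merged = []
    for run_dir, subsystems in sorted(runs.items()):
        for name, subsystem in sorted(subsystems.items()):
            # Skip subsystems without new data unless a merge is forced.
            if subsystem["newFile"] or force_new_merge:
                merged.append((run_dir, name))
    return merged


runs = {
    "Run123456": {"EMC": {"newFile": True}, "HLT": {"newFile": False}},
    "Run123457": {"EMC": {"newFile": False}},
}
print(merge_all(runs))
print(merge_all(runs, force_new_merge=True))
```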
-
overwatch.processing.mergeFiles.
subtractFiles
(minFile, maxFile, outfile)[source]¶ Subtract histograms in one file from matching histograms in another.
This function is used for creating time slices in cumulative mode. Since each file is cumulative, the earlier time stamped file needs to be subtracted from the later time stamped file. The remaining data corresponds to the data stored during the time window early-late.
This function is not used for creating a standard combined file because the cumulative information is already stored in the most recent file.
Note
The names of the histograms in each file must match exactly for them to be subtracted.
Note
The output file is opened with “RECREATE”, so it will always overwrite an existing file with the given filename!
Parameters: - minFile (str) – Filename of the ROOT file containing data to be subtracted.
- maxFile (str) – Filename of the ROOT file containing data to be subtracted from.
- outfile (str) – Filename of the output file which will contain the subtracted histograms.
Returns: None.
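As a rough illustration of the subtraction, the sketch below uses dicts mapping a histogram name to a list of bin contents in place of ROOT files. The function name, the histogram names, and the choice to skip unmatched histograms are assumptions of this sketch rather than documented behavior.

```python
def subtract_histograms(min_hists, max_hists):
    """Subtract the earlier snapshot (min) from the later one (max), bin by bin.

    Both arguments map histogram name -> list of bin contents. Only names that
    appear in both snapshots with the same binning are subtracted.
    """
    result = {}
    for name, late_bins in max_hists.items():
        early_bins = min_hists.get(name)
        if early_bins is None or len(early_bins) != len(late_bins):
            # No exact match; skipping is an assumption of this sketch.
            continue
        result[name] = [late - early for early, late in zip(early_bins, late_bins)]
    return result


early = {"hEnergy": [1, 2, 3], "hOnlyEarly": [1]}
late = {"hEnergy": [4, 6, 3], "hOnlyLate": [5]}
print(subtract_histograms(early, late))
```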
overwatch.processing.processRuns module
Steers and executes Overwatch histogram processing.
Takes files received from the HLT, organizes the information within a directory structure, and processes the histograms within. It provides plugin opportunities throughout all processing and trending steps.
-
overwatch.processing.processRuns.
compareProcessingOptionsDicts
(inputProcessingOptions, processingOptions, errors)[source]¶ Compare the input and existing processing options dictionaries.
Compare the dictionary values stored in the input (inputProcessingOptions) to those in the reference (processingOptions). Both the keys and values are compared. For both dictionaries, keys are names of options, while values are the corresponding option values.
Note
The existing processing options can have more entries than the input. Only the keys and values in the input are checked.
Note
For the error format in errors, see the web app README.
Parameters:
- inputProcessingOptions (dict) – Processing options specified in the time slice.
- processingOptions (dict) – Processing options used during standard processing which serve as the reference options. These usually should be the subsystem processing options.
Returns: (processingOptionsAreTheSame, errors) where processingOptionsAreTheSame (bool) is True if all input options have the same values as in the existing options and errors (dict) is an error dictionary in the proper format.
Return type: tuple
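A minimal sketch of such a one-directional comparison is shown below. The error dictionary layout (a key mapping to a list of messages) and the key name generalError are assumptions made for illustration; the actual format is defined in the web app README.

```python
def compare_options(input_options, reference_options, errors):
    """Check that every input option matches the reference options.

    Extra entries in `reference_options` are ignored. The layout of `errors`
    (key -> list of messages) is an assumption of this sketch.
    """
    same = True
    for key, value in input_options.items():
        if key not in reference_options:
            message = "Option '{}' is not a known processing option.".format(key)
            errors.setdefault("generalError", []).append(message)
            same = False
        elif reference_options[key] != value:
            same = False
    return same, errors


reference = {"scaleHists": True, "hotChannelThreshold": 0}
print(compare_options({"scaleHists": True}, reference, {}))
print(compare_options({"scaleHists": False}, reference, {}))
```

The option names scaleHists and hotChannelThreshold are placeholders chosen for the example.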
-
overwatch.processing.processRuns.
createNewSubsystemFromMovedFilesInformation
(runs, subsystem, runDict, runDir)[source]¶ Creates a new subsystem based on the information from the moved files.
This function determines the fileLocationSubsystem and then creates a new subsystem based on the given information, including adding the files to the subsystem. It also ensures that the subsystem will be processed by enabling the newFile flag in the subsystem.
Note
In the case of a subsystem which doesn’t have its own files in a run where the HLT is not available (for example, an EMC standalone run), the ValueError exception will be raised. If the run has just been created, this is just fine - the subsystem just won’t be created (as it shouldn’t be). In this case, it’s advisable to catch and log the exception and continue with standard execution. However, in other cases (such as adding a file later in the run), this shouldn’t be possible, so we want the exception to be raised and it needs to be handled carefully (in such a case, it likely indicates that something is broken).
Parameters:
- runs (BTree) – Dict-like object which stores all run, subsystem, and hist information. Keys are in the runDir format (“Run123456”), while the values are runContainer objects.
- subsystem (str) – The current subsystem by three letter, all capital name (ex. EMC). Default: None.
- runDict (dict) – Nested dict which contains the new filenames and the HLT mode. For the precise structure, see base.utilities.moveFiles().
- runDir (str) – String containing the requested run number. For an example run 123456, it should be formatted as Run123456.
Returns: None. However, the run container is modified to store the newly created subsystem.
Raises: ValueError – If the requested subsystem doesn’t have its own receiver files and files from the HLT receiver are also not available.
-
overwatch.processing.processRuns.
processAllRuns
(dbRoot=None, connection=None)[source]¶ Driver function for processing all available data, storing the results in a database and on disk.
This function is responsible for driving all processing functionality in Overwatch. This spans from initial preprocessing of the received ROOT files to trending information extracted from histograms. In particular, it directs the following steps:
- Retrieve the run information or recreate it if it doesn’t exist. If recreated, it will be populated with existing information already stored in the data directory.
- Retrieve the trending object or recreate it if it doesn’t exist. As of August 2018, the trending objects will be empty when recreated.
- Move new files into the Overwatch file structure and create runs and/or subsystems from those new files. If the corresponding objects already exist, then they are updated.
- Perform the actual processing, which includes executing the subsystem (detector) plug-in functionality. The processing will only be performed if necessary (i.e. if there are new files which need processing). This can also be overridden by specifically requesting reprocessing.
- Perform the trending. It also has subsystem (detector) plug-in functionality.
- Transfer the processed data if requested.
For further information on the technical details of how all of this is accomplished, see the processing README, as well as the package documentation. For further information on the subsystem (detector) plug-in functionality, see the detector subsystem and trending README.
Note
Configuration for this processing is provided by the Overwatch configuration system. For further information, see the Overwatch base module README.
Parameters:
- dbRoot – Database root. Default: None. If either argument is None, this function will retrieve the database information itself (and close the connection at the end).
- connection – Database connection. Default: None. If either argument is None, this function will retrieve the database information itself (and close the connection at the end).
Returns: None. However, it has extensive side effects. It changes values in the database related to runs, subsystems, etc, as well as writing image and json files to disk.
-
overwatch.processing.processRuns.
processHist
(subsystem, hist, canvas, outputFormatting, processingOptions, subsystemName=None, trendingManager=None)[source]¶ Main histogram processing function.
This function is responsible for taking a given histogramContainer, processing the underlying histogram via processing functions, filling trending objects (if applicable), and then storing the result in images and json for display in the web app. Here, we execute the plug-in functionality assigned earlier and perform the actual drawing of the hist onto a canvas.
In more detail, the processing steps performed here are:
- Setup the canvas.
- Apply the projection functions (if applicable) to get the proper histogram.
- Draw the histogram.
- Apply the processing functions (if applicable).
- Write the output to image and json.
- Cleanup the hist and canvas by removing references to them.
Note
The hist is drawn before calling the processing function to allow the plug-ins to draw on top of the histogram.
Note
The json that is written is produced by TBufferJSON for display via jsRoot. While it stores the information, it requires jsRoot to be displayed meaningfully.
Note
This function is built in such a way that it works for processing both histograms and trending objects. For this to work, both the subsystemContainer and the trendingContainer must support the following methods: imgDir(), which is the image storage directory, and jsonDir(), which is the json storage directory. Both should expect to be formatted with % {“subsystem”: subsystemName}.
Parameters:
- subsystem (subsystemContainer or trendingContainer) – Subsystem or trending container which contains the histogram being processed. It only uses a subset of either class’s methods. See the note for information about the requirements of this object.
- hist (histogramContainer) – Histogram to be processed.
- canvas (TCanvas) – Canvas on which the histogram should be plotted. It will be stored in the histogramContainer for the purposes of processing the hist.
- outputFormatting (str) – Specially formatted string which contains a generic path to be used when printing histograms. It must contain base, name, and ext, where base is the base path, name is the filename and ext is the extension. Ex: {base}/{name}.{ext}.
- processingOptions (dict) – Implemented by the subsystem to note options used during standard processing. Keys are names of options, while values are the corresponding option values. Default: None. Note: In this case, it will use the default subsystem or trending processing options.
- subsystemName (str) – The current subsystem by three letter, all capital name (ex. EMC). Default: None. In the case of None, the subsystem name is retrieved from subsystem.subsystem. This argument is used for processing the trending objects where we don’t have access to their corresponding subsystemContainer. The subsystem name of the trendingContainer (TDG) does not necessarily correspond to the subsystem of the object being processed, so we have to pass it here.
- trendingManager (TrendingManager) – Will be notified when a histogram is processed to allow the use of the histogram values in trending.
Returns: None. However, the subsystem, histogram, etc are modified and their representations in images and json are written to disk.
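The two formatting steps described above (the % formatting of the storage directory followed by the {base}/{name}.{ext} pattern) might combine as in the following sketch. The directory template, the histogram name, and the helper output_path are hypothetical; only the two formatting conventions come from the description.

```python
# Pattern taken from the outputFormatting example in the parameter description.
output_formatting = "{base}/{name}.{ext}"

# Hypothetical value of the kind imgDir() might return before formatting.
img_dir_template = "Run123456/%(subsystem)s/img"


def output_path(dir_template, subsystem_name, hist_name, ext):
    """Resolve the storage directory, then build the final output path."""
    base = dir_template % {"subsystem": subsystem_name}
    return output_formatting.format(base=base, name=hist_name, ext=ext)


print(output_path(img_dir_template, "EMC", "hPatchAmp", "png"))
```

Keeping the subsystem name as a late-bound placeholder is what lets a trending container write its output under the subsystem of the object being processed rather than its own.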
-
overwatch.processing.processRuns.
processMovedFilesIntoRuns
(runs, runDict)[source]¶ Convert the list of moved files into run and subsystem containers stored in the database.
In the case that the run has not been created, a new run container is created and an attempt is made to create all subsystems that were requested in the configuration. If the subsystem already exists, the moved files are added to the existing objects. It also includes the capability to add new subsystems part of the way through a run in the unlikely event that we pick up new data during the run.
Parameters:
- runs (BTree) – Dict-like object which stores all run, subsystem, and hist information. Keys are in the runDir format (“Run123456”), while the values are runContainer objects.
- runDict (dict) – Nested dict which contains the new filenames and the HLT mode. For the precise structure, see base.utilities.moveFiles().
Returns: None. Subsystems are created inside of the runContainer objects for which there are entries in the runDict.
-
overwatch.processing.processRuns.
processRootFile
(filename, outputFormatting, subsystem, processingOptions=None, forceRecreateSubsystem=False, trendingManager=None)[source]¶ Given a root file, process all histograms for a given subsystem.
Processing includes assigning the contained histograms to a subsystem, allowing for customization via the plugin system. For a new subsystem, the processing proceeds in the following order:
- Create histogram containers for histograms in the file.
- Create new histograms (in addition to those already in the file).
- Create histogram stacks.
- Specify histogram options.
- Create histogram groups.
- Sort histograms into histogram groups.
- For each sorted histogram:
  - Determine which processing functions to apply to which histograms.
  - Determine which trending functions require which histograms.
Processing then proceeds to apply those functions to all sorted histograms. The final histograms are then stored as images and as json. In the case that the subsystem already exists, we can skip all of those steps and simply apply the processing functions. If a histogram was not sorted then it belongs to another subsystem and could be processed by it later (depending on the configured subsystems).
Note
Trending objects are filled (in processHist()) when the relevant hists are processed in this function.
Parameters:
- filename (str) – The full path to the file to be processed.
- outputFormatting (str) – Specially formatted string which contains a generic path to be used when printing histograms. It must contain base, name, and ext, where base is the base path, name is the filename and ext is the extension. Ex: {base}/{name}.{ext}.
- subsystem (subsystemContainer) – Contains information about the current subsystem.
- processingOptions (dict) – Implemented by the subsystem to note options used during standard processing. Keys are names of options, while values are the corresponding option values. Default: None. Note: In this case, it will use the default subsystem processing options.
- forceRecreateSubsystem (bool) – True if subsystems will be recreated, even if they already exist.
- trendingManager (TrendingManager) – Manages the trending subsystem.
Returns: None. However, the underlying subsystems, histograms, etc, are modified.
-
overwatch.processing.processRuns.
processTimeSlices
(runs, runDir, minTimeRequested, maxTimeRequested, subsystemName, inputProcessingOptions)[source]¶ Creates a time slice or performs user directed reprocessing.
Time slices are created by processing a given run using only data in a given time range (and potentially modifying the processing options). User directed reprocessing uses the same infrastructure by varying the processing arguments and selecting the full time range available for a given run. While the external interface is different, these capabilities are provided using the same underlying infrastructure as in the standard processing.
This function is usually invoked via the web app on a particular run page.
Note
For the format of the errors that are returned, see the web app README.
Parameters:
- runs (BTree) – Dict-like object which stores all run, subsystem, and hist information. Keys are in the runDir format (“Run123456”), while the values are runContainer objects.
- runDir (str) – String containing the requested run number. For an example run 123456, it should be formatted as Run123456.
- minTimeRequested (int) – The requested start time of the merge in minutes.
- maxTimeRequested (int) – The requested end time of the merge in minutes.
- subsystemName (str) – The subsystem of the time slice request by three letter, all capital name (ex. EMC).
- inputProcessingOptions (dict) – Processing options requested for the time slice. Keys are the names of the options, while values are the actual values of the processing options.
Returns: If successful, we return the time slice key (str) under which the requested time slice is stored in the subsystemContainer.timeSlices dictionary. If an error was encountered, we return an error dictionary in the proper format.
Return type: str or dict
-
overwatch.processing.processRuns.
validateAndCreateNewTimeSlice
(run, subsystem, minTimeMinutes, maxTimeMinutes, inputProcessingOptions)[source]¶ Validate and create a timeSliceContainer based on the given inputs.
Validate the given time slice options, check the options to determine if we’ve already created the time slice, and then return the proper timeSliceContainer (either an existing container or a new one based on the result of the checks). By comparing the requested options and times with those that we have already processed, we can avoid having to reprocess existing data when nothing has changed. This effectively allows us to cache the processing results.
The resulting timeSliceContainer is stored under a UUID generated string to ensure that time slices never overwrite each other.
Note
For the error format in errors, see the web app README.
Parameters:
- run (runContainer) – Run for which the time slice was requested.
- subsystem (subsystemContainer) – Subsystem for which the time slice was requested.
- minTimeMinutes (int) – Minimum time for the time slice in minutes.
- maxTimeMinutes (int) – Maximum time for the time slice in minutes.
- inputProcessingOptions (dict) – Processing options requested for the time slice.
Returns: (timeSliceKey, newlyCreated, errors) where timeSliceKey (str) is the key under which the relevant timeSliceContainer is stored in the subsystemContainer.timeSlices dict, newlyCreated (bool) is True if the timeSliceContainer was newly created (as opposed to already existing), and errors (dict) is an error dictionary in the proper format.
Return type: tuple
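The caching behavior described above could look roughly like the sketch below. It is a simplification under stated assumptions: plain dicts stand in for timeSliceContainer objects, the validation rule and the error dictionary layout are invented for illustration, and find_or_create_time_slice is a hypothetical name.

```python
import uuid


def find_or_create_time_slice(time_slices, min_minutes, max_minutes, options):
    """Return (key, newly_created, errors) for the requested time slice.

    `time_slices` maps key -> {"min", "max", "options"}; these dicts are
    hypothetical stand-ins for timeSliceContainer objects.
    """
    errors = {}
    # Assumed validation: a non-negative start and a strictly later end.
    if min_minutes < 0 or max_minutes <= min_minutes:
        errors.setdefault("request", []).append("Invalid time range.")
        return None, False, errors
    # Reuse an existing slice if the times and options are unchanged.
    for key, existing in time_slices.items():
        if (existing["min"], existing["max"], existing["options"]) == (
            min_minutes, max_minutes, options
        ):
            return key, False, errors
    # Otherwise store a new slice under a UUID based key.
    key = str(uuid.uuid4())
    time_slices[key] = {"min": min_minutes, "max": max_minutes, "options": dict(options)}
    return key, True, errors


slices = {}
first = find_or_create_time_slice(slices, 0, 5, {"scaleHists": True})
second = find_or_create_time_slice(slices, 0, 5, {"scaleHists": True})
print(first[1], second[1], first[0] == second[0])
```

A repeated request with identical times and options returns the existing key, so the caller can skip reprocessing.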
overwatch.processing.processingClasses module
Classes that define the processing structure of Overwatch.
Classes that define the structure of processing. This information can be created and processed, or read from file.
Note
The __repr__ and __str__ methods defined here can throw KeyError for class attributes if these methods rely on __dict__ and the objects have just been loaded from ZODB. Presumably, __dict__ doesn’t cause ZODB to fully load the object. To work around this issue, any methods using __dict__ first access some attribute (ideally, something simple) to ensure that the object is fully loaded. The result of that access is ignored.
-
class
overwatch.processing.processingClasses.
fileContainer
(filename, startOfRun=None)[source]¶ Bases:
persistent.Persistent
File information container.
This object wraps a ROOT filename, providing convenient access to relevant properties, such as the type of file (combined, timeSlice, standard), and the time stamp. This information is often stored in the filename itself, but extraction procedures vary for each file type. Note that it does not open the file itself - this is still the responsibility of the user.
Parameters:
- filename (str) – Filename of the corresponding file. This is expected to be the full path from the dirPrefix to the file.
- startOfRun (int) – Start of the run in unix time. Default: None. The default will lead to timeIntoRun being set to -1. The default is most commonly used for time slices, where the start of run isn’t so meaningful.
-
filename
¶ Filename of the corresponding file. This is expected to be the full path from the dirPrefix to the file.
Type: str
-
combinedFile
¶ True if this file corresponds to a combined file. It is set to True if “combined” is in the filename.
Type: bool
-
timeSlice
¶ True if this file corresponds to a time slice. It is set to True if “timeSlice” is in the filename.
Type: bool
-
fileTime
¶ Unix time stamp of the file, extracted from the filename.
Type: int
-
timeIntoRun
¶ Time in seconds from the start of the run to the file time. Depends on startOfRun being a valid time when the object was created.
Type: int
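The attributes above can be illustrated with a simplified stand-in class. Because the filename format is not given here, this sketch takes the file time as an explicit argument instead of extracting it from the filename; the class name FileInfo and the example paths are hypothetical.

```python
class FileInfo:
    """Hypothetical, simplified stand-in for fileContainer."""

    def __init__(self, filename, file_time, start_of_run=None):
        self.filename = filename
        # The file type flags are derived from substrings of the filename.
        self.combinedFile = "combined" in filename
        self.timeSlice = "timeSlice" in filename
        # Unix time stamp, passed in directly in this sketch.
        self.fileTime = file_time
        # Without a valid start of run, the time into the run is set to -1.
        if start_of_run is None:
            self.timeIntoRun = -1
        else:
            self.timeIntoRun = file_time - start_of_run


combined = FileInfo("Run123456/EMC/hists.combined.root", 1500000600, start_of_run=1500000000)
time_slice = FileInfo("Run123456/EMC/timeSlice.example.root", 1500000600)
print(combined.combinedFile, combined.timeIntoRun)
print(time_slice.timeSlice, time_slice.timeIntoRun)
```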
-
class
overwatch.processing.processingClasses.
histogramContainer
(histName, histList=None, prettyName=None)[source]¶ Bases:
persistent.Persistent
Histogram information container.
Organizes information about a particular histogram (or set of histograms). Manages functions that process and otherwise modify the histogram, which are specified through the plugin system. The container also manages plotting details.
Note
The histogram container doesn’t always have access to the underlying histogram. When constructing the container, it is useful to have the histogram available to provide some information, but then the histogram should not be needed until final processing is performed and the hist is plotted. When this final step is reached, the histogram can be retrieved by the retrieveHistogram() helper function.
Parameters:
- histName (str) – Name of the histogram. Doesn’t necessarily need to be the same as TH1.GetName().
- histList (list) – List of histogram names that should contribute to this container. Used for stacking multiple histograms onto one canvas. Default: None
- prettyName (str) – Name of the histogram that is appropriate for display. Default: None, which will lead to it being set to histName.
-
histName
¶ Name of the histogram. Doesn’t necessarily need to be the same as TH1.GetName().
Type: str
-
prettyName
¶ Name of the histogram that is appropriate for display.
Type: str
-
histList
¶ List of histogram names that should contribute to this container. Used for stacking multiple histograms onto one canvas. Default: None. See retrieveHistogram() for more information on how this functionality is utilized.
Type: list
-
information
¶ Information that is extracted from the histogram that should be stored persistently and displayed. This information will be displayed with the web app, with the key shown as a clickable button, and the value information stored behind it.
Type: PersistentMapping
-
hist
¶ The histogram which this container wraps.
Type: ROOT.TH1
-
histType
¶ Class of the histogram. For example, ROOT.TH1F. Can be used for functions that only apply to 2D hists, etc. It is stored separately from the histogram to allow for it to be available even when the underlying histogram is not (as occurs while setting up but not yet processing a histogram).
Type: ROOT.TClass
-
drawOptions
¶ Draw options to be passed to TH1.Draw() when drawing the histogram.
Type: str
-
canvas
¶ Canvas onto which the histogram will be plotted. Available after the histogram has been classified (ie in processing functions).
Type: ROOT.TCanvas
-
projectionFunctionsToApply
¶ List-like object of functions that perform projections to the histogram that is represented by this container. See the detector subsystem README for more information.
Type: PersistentList
-
functionsToApply
¶ List-like object of functions that are applied to the histogram during the processing step. See the detector subsystem README for more information.
Type: PersistentList
-
trendingObjects
¶ List-like object of trending objects which operate on this histogram. See the detector subsystem and trending README for more information.
Type: PersistentList
-
retrieveHistogram
(ROOT, fIn=None, trending=None)[source]¶ Retrieve the histogram from the given file or trending container.
This function can retrieve a single histogram from a file, multiple hists from a file to create a stack (based on the hist names in histList), or a single trending histogram stored in the collection of trending objects.
Parameters:
- ROOT (ROOT) – ROOT module. Passed into this object so this module doesn’t need to directly depend on importing ROOT.
- fIn (ROOT.TFile) – File in which the histogram(s) is stored. Default: None.
- trending (trendingContainer) – Contains the trending objects, including the trending histogram which is represented in this histogram container. It is the source of the histogram, and therefore similar to the input ROOT file. Default: None.
Returns: True if the histogram was successfully retrieved.
Return type: bool
-
class
overwatch.processing.processingClasses.
histogramGroupContainer
(prettyName, groupSelectionPattern, plotInGridSelectionPattern='DO NOT PLOT IN GRID')[source]¶ Bases:
persistent.Persistent
Organizes similar histograms into groups for processing and display.
Histogram groups are created by providing name substrings of histograms which should be included. The name substring is referred to as a groupSelectionPattern. For example, if the pattern was “hello”, all histograms containing “hello” would be selected. Additional properties related to groups, such as display information, are also stored.
Parameters:
- prettyName (str) – Readable name of the group.
- groupSelectionPattern (str) – Pattern of the histogram names that will be selected. For example, if we wanted to select histograms related to EMCal patch amplitude, we would make the pattern something like “PatchAmp”. The pattern depends on the name of the histograms sent from the HLT.
- plotInGridSelectionPattern (str) – Pattern which denotes whether the histograms should be plotted in a grid. plotInGrid is set based on whether this value is in groupSelectionPattern. For example, in the EMCal, the plotInGridSelectionPattern is _SM, since “SM” denotes a supermodule.
-
prettyName
¶ Readable name of the group. Set via the groupName in the constructor.
Type: str
-
selectionPattern
¶ Pattern of the histogram names that will be selected.
Type: str
-
plotInGridSelectionPattern
¶ Pattern (substring) which denotes whether the histograms should be plotted in a grid.
Type: str
-
plotInGrid
¶ True when the histograms should be plotted in a grid.
Type: bool
-
histList
¶ List of histogram names that should be filled when the selectionPattern is matched.
Type: PersistentList
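The substring based selection can be sketched as follows. The function sort_into_group and the histogram names are hypothetical; the two rules it applies (a histogram is selected when the pattern is contained in its name, and plotInGrid is set when the grid pattern is contained in the group selection pattern) follow the description above.

```python
def sort_into_group(hist_names, group_selection_pattern,
                    plot_in_grid_selection_pattern="DO NOT PLOT IN GRID"):
    """Return (histList, plotInGrid) for a group selection pattern."""
    # A histogram belongs to the group if its name contains the pattern.
    hist_list = [name for name in hist_names if group_selection_pattern in name]
    # The grid flag depends on the group pattern, not on individual names.
    plot_in_grid = plot_in_grid_selection_pattern in group_selection_pattern
    return hist_list, plot_in_grid


names = ["EMCPatchAmp_SM0", "EMCPatchAmp_SM1", "EMCPatchEnergy"]
print(sort_into_group(names, "PatchAmp_SM", "_SM"))
print(sort_into_group(names, "PatchEnergy"))
```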
-
class
overwatch.processing.processingClasses.
runContainer
(runDir, fileMode, hltMode=None)[source]¶ Bases:
persistent.Persistent
Object to represent a particular run.
It stores run level information, as well as the subsystems which contain the corresponding event information (histogram groups, histograms, etc).
Note that files are not considered event level information because the files correspond to individual subsystems. Furthermore, in rare cases, there may be different numbers of files for different subsystems that are included in an individual run. Consequently, it is cleaner for each subsystem to track its own files.
To allow the object to be reconstructed from scratch, the HLT mode is stored by writing a YAML file in the corresponding run directory. This file is referred to as the “run info” file. Additional properties could also be written to this file to avoid the loss of transient information.
Note
The run info file is read and written on object construction. It will only be checked if the HLT mode is not set.
Parameters:
- runDir (str) – String containing the run number. For an example run 123456, it should be formatted as Run123456.
- fileMode (bool) – If true, the run data was collected in cumulative mode. See the processing README for further information.
- hltMode (str) – String containing the HLT mode used for the run.
-
runDir
¶ String containing the run number. For an example run 123456, it should be formatted as Run123456.
Type: str
-
runNumber
¶ Run number extracted from the runDir.
Type: int
-
prettyName
¶ Reformatting of the runDir for improved readability.
Type: str
-
fileMode
¶ If true, the run data was collected in cumulative mode. See the processing README for further information. Set via fileMode.
Type: bool
-
subsystems
¶ Dict-like object which will contain all of the subsystem containers in an event. The key is the corresponding subsystem three letter name.
Type: BTree
-
hltMode
¶ Mode the HLT operated in for this run. Valid HLT modes are “B”, “C”, “E”, and “U”. Further information on the various modes is in the processing README. Default: None (which will be converted to “U”, for “unknown”).
Type: str
-
isRunOngoing
()[source]¶ Checks if a run is ongoing.
The ongoing run check is performed by checking for a new file in any of the subsystems. If any of them has just received a new file, then the run is ongoing.
Note
If
subsystem.newFile
is false, this is not a sufficient condition to say that the run has ended. This is because newFile
will be set to false if the subsystem didn’t have a file in the most recent processing run, even if the run is still ongoing. This can happen for many reasons, for example if the processing is executed more frequently than the data transfer rate or receiver request rate. However, if newFile
is true, then it is sufficient to know that the run is ongoing. Parameters: None – Returns: True if the run is ongoing. Return type: bool
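The check described above can be sketched as a single pass over the subsystems. This is a minimal sketch, assuming each container exposes a boolean newFile attribute; a plain dict and SimpleNamespace objects stand in for the BTree and the real subsystemContainer instances.

```python
from types import SimpleNamespace


def is_run_ongoing(subsystems):
    """Return True if any subsystem reports a new file.

    ``subsystems`` is any dict-like mapping of subsystem name -> container
    with a boolean ``newFile`` attribute (a plain dict stands in for the BTree).
    """
    return any(subsystem.newFile for subsystem in subsystems.values())


# Stand-in containers; the real objects are subsystemContainer instances.
subsystems = {
    "EMC": SimpleNamespace(newFile=False),
    "HLT": SimpleNamespace(newFile=True),
}
print(is_run_ongoing(subsystems))
```

A False result only means that no subsystem received a file in the latest processing pass, which matches the caveat in the note.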
-
minutesSinceLastTimestamp
()[source]¶ Determine the time since the last file timestamp in minutes.
Parameters: None. – Returns: Minutes since the timestamp of the most recent file. Default: -1. Return type: float
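One way such a calculation could look is sketched below. It assumes the file timestamps are available as unix times (for example, the keys of a subsystem's files mapping); the function name and the now argument are hypothetical and exist only to make the example testable.

```python
import time


def minutes_since_last_timestamp(file_timestamps, now=None):
    """Minutes between ``now`` and the newest unix timestamp, or -1 if there are none."""
    if not file_timestamps:
        # Mirrors the documented default when no timestamp is available.
        return -1
    if now is None:
        now = time.time()
    return (now - max(file_timestamps)) / 60.0


print(minutes_since_last_timestamp([100, 400], now=1000))
```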
-
startOfRunTimeStamp
()[source]¶ Provides the start of the run time stamp in a format suitable for display.
This timestamp is determined by looking at the timestamp of the last subsystem (arbitrarily selected) that is available in the run. No time zone conversion is performed, so it simply displays the time zone where the data was stored (CERN time in production systems).
Parameters: None – Returns: Start of run time stamp formatted in an appropriate manner for display. Return type: str
-
class
overwatch.processing.processingClasses.
subsystemContainer
(subsystem, runDir, startOfRun, endOfRun, showRootFiles=False, fileLocationSubsystem=None)[source]¶ Bases:
persistent.Persistent
Object to represent a particular subsystem (detector).
It stores subsystem level information, including the histograms, groups, and file information. It is the main container for much of the information that is relevant for processing.
Information on the file storage layout implemented through this class is available in the processing README.
Note
This object checks for and creates a number of directories on initialization.
Parameters: - subsystem (str) – The current subsystem in the form of a three letter, all capital name (ex.
EMC
). - runDir (str) – String containing the run number. For an example run 123456, it should be
formatted as
Run123456
- startOfRun (int) – Start of the run in unix time.
- endOfRun (int) – End of the run in unix time.
- showRootFiles (bool) – True if the ROOT files should be made accessible through the run list.
Default:
False
. - fileLocationSubsystem (str) – Subsystem name of where the files are actually located. If a subsystem
has specific data files then this is just equal to the subsystem. However, if it relies on
files inside of another subsystem (such as those from the HLT subsystem receiver), then this
variable is equal to that subsystem name. Default:
None
, which corresponds to the subsystem storing its own data.
-
subsystem
¶ The current subsystem in the form of a three letter, all capital name (ex.
EMC
). Type: str
-
showRootFiles
¶ True if the ROOT files should be made accessible through the run list.
Type: bool
-
fileLocationSubsystem
¶ Subsystem name of where the files are actually located. If a subsystem has specific data files then this is just equal to the subsystem. However, if it relies on files inside of another subsystem, then this variable is equal to that subsystem name.
Type: str
-
files
¶ Dict-like object which describes subsystem ROOT files. Unix time of a given file is the key and a file container for that file is the value.
Type: BTree
-
timeSlices
¶ Dict-like object which describes subsystem time slices. A UUID is the dict key (so they can be uniquely identified), while a timeSliceContainer with the corresponding time slice properties is the value.
Type: BTree
-
combinedFile
¶ File container corresponding to the combined file.
Type: fileContainer
-
baseDir
¶ Path to the base storage directory for the subsystem. Of the form
Run123456/SYS
. Type: str
-
imgDir
¶ Path to the image storage directory for the subsystem. Of the form
Run123456/SYS/img
. Type: str
-
jsonDir
¶ Path to the json storage directory for the subsystem. Of the form
Run123456/SYS/json
. Type: str
-
startOfRun
¶ Start of the run in unix time.
Type: int
-
endOfRun
¶ End of the run in unix time.
Type: int
-
runLength
¶ Length of the run in minutes.
Type: int
-
histGroups
¶ List-like object of histogram groups, which are used to classify similar histograms.
Type: PersistentList
-
histsInFile
¶ Dict-like object of all histograms that are in a particular file. Keys are the histogram name, while the values are
histogramContainer
objects which contain the histogram. Hists should usually be accessed through the hist groups, but this provides direct access when necessary early in processing. Type: BTree
-
histsAvailable
¶ Dict-like object containing all histograms that are available, including those in a particular file and those that are created during processing. Newly created hists should be stored in this dict. Keys are histogram names, while values are
histogramContainer
objects which contain the histogram. Type: BTree
-
hists
¶ Dict-like object which contains all histograms that should be processed for the subsystem. After initial creation, this should be the definitive source of histograms for processing and display. Keys are histogram names, while values are
histogramContainer
objects which contain the histogram. Type: BTree
-
newFile
¶ True if we received a new file, which will trigger reprocessing. This flag should only be changed when the next processing iteration begins. To be explicit, if a subsystem just received a new file and it was processed, this flag should only be changed to
False
after the next processing iteration begins. This allows the status of the run (determined through the subsystem) to be displayed in the web app. Default: True because if the subsystem is being created, we likely need reprocessing. Type: bool
-
nEvents
¶ Number of events in the subsystem. Processing will look for a histogram that contains
events
in the name and attempt to extract the number of events based on the number of entries. Should not be used unless the subsystem explicitly includes a histogram with the number of events. Default: 1. Type: int
-
processingOptions
¶ Implemented by the subsystem to note options used during standard processing. The subsystem processing options can vary when processing a time slice, so storing the options allows us to return to the standard options when performing a full processing. Keys are the option names as strings, while values are their corresponding values.
Type: PersistentMapping
-
calculateRunLength
(startOfRun=None, endOfRun=None)[source]¶ Helper function to update the run length.
Note
The run length is defined in minutes.
Parameters: - startOfRun (int) – Start of the run in unix time. Default:
None
. If not specified, the startOfRun
stored in the subsystem will be used. - endOfRun (int) – End of the run in unix time. Default:
None
. If not specified, the endOfRun
stored in the subsystem will be used.
Returns: The calculated run length in minutes.
Return type: int
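For orientation, the conversion from two unix times to a run length in minutes might look like the sketch below. Rounding down to whole minutes via integer division is an assumption based on the documented int return type; the function name is hypothetical.

```python
def calculate_run_length(start_of_run, end_of_run):
    """Run length in whole minutes from two unix times (given in seconds)."""
    # Integer division discards any partial minute (an assumption of this sketch).
    return (end_of_run - start_of_run) // 60


print(calculate_run_length(1541530000, 1541533600))
```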
-
static
prettyPrintUnixTime
(unixTime)[source]¶ Converts the given time stamp into a format appropriate (“pretty”) for display.
The time is returned in the format: “Tuesday, 6 Nov 2018 20:55:10”. This function is mainly needed in Jinja templates, where arbitrary functions are not allowed.
Note
We display this in the CERN time zone, so we convert it here to that timezone.
Parameters: unixTime (int) – Unix time to be converted. Returns: The time stamp converted into an appropriate manner for display. Return type: str
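The documented output format can be reproduced with the standard library, as in the sketch below. Unlike the function described above, this hypothetical helper takes the time zone as an explicit argument, and the example uses a fixed UTC+1 offset (CERN winter time) rather than a named time zone so that it has no external dependencies.

```python
from datetime import datetime, timedelta, timezone


def pretty_print_unix_time(unix_time, tz):
    """Format a unix time like "Tuesday, 6 Nov 2018 20:55:10" in the given time zone."""
    d = datetime.fromtimestamp(unix_time, tz=tz)
    # Build the day by hand so it is not zero padded ("6", not "06").
    return "{}, {} {}".format(d.strftime("%A"), d.day, d.strftime("%b %Y %H:%M:%S"))


# CERN is at UTC+1 in winter; a fixed offset keeps the example dependency free.
cern_winter = timezone(timedelta(hours=1))
print(pretty_print_unix_time(1541534110, cern_winter))
```

A production version would more likely use a named time zone so that daylight saving time is handled automatically.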
-
resetContainer
()[source]¶ Clear the stored hist information so we can recreate (reprocess) the subsystem.
Without resetting the container, reprocessing doesn’t fully test the processing functions, which are skipped if these list- and dict-like hist objects have entries.
Parameters: None – Returns: None
-
setupDirectories
(runDir)[source]¶ Helper function to setup the subsystem directories.
Defines the base, img, and JSON directories, and creates them if necessary.
Parameters: runDir (str) – String containing the run number. For an example run 123456, it should be formatted as Run123456
Returns: None. However, it sets the baseDir, imgDir, and jsonDir properties of the subsystemContainer.
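The directory layout described above (Run123456/SYS, Run123456/SYS/img, Run123456/SYS/json) can be sketched as follows. The storage_root argument and the function name are assumptions for this example; the real method stores the paths on the container rather than returning them.

```python
import os
import tempfile


def setup_directories(storage_root, run_dir, subsystem):
    """Create Run123456/SYS, .../img and .../json under ``storage_root`` if missing."""
    base_dir = os.path.join(run_dir, subsystem)
    img_dir = os.path.join(base_dir, "img")
    json_dir = os.path.join(base_dir, "json")
    for directory in (base_dir, img_dir, json_dir):
        # exist_ok makes repeated calls harmless.
        os.makedirs(os.path.join(storage_root, directory), exist_ok=True)
    return base_dir, img_dir, json_dir


storage_root = tempfile.mkdtemp()  # throwaway location for the example
base_dir, img_dir, json_dir = setup_directories(storage_root, "Run123456", "EMC")
print(base_dir, img_dir, json_dir)
```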
-
class
overwatch.processing.processingClasses.
timeSliceContainer
(minUnixTimeRequested, maxUnixTimeRequested, minUnixTimeAvailable, maxUnixTimeAvailable, startOfRun, filesToMerge, optionsHash)[source]¶ Bases:
persistent.Persistent
Time slice information container.
Contains information about a time slice request, including the time ranges and the files involved. These values are required to uniquely describe a time slice.
Parameters: - minUnixTimeRequested (int) – Minimum requested unix time. This is the first time stamp to be included in the time slice.
- maxUnixTimeRequested (int) – Maximum requested unix time. This is the last time stamp to be included in the time slice.
- minUnixTimeAvailable (int) – Minimum unix time of the run.
- maxUnixTimeAvailable (int) – Maximum unix time of the run.
- startOfRun (int) – Unix time of the start of the run.
- filesToMerge (list) – List of fileContainer objects which need to be merged to create the time slice.
- optionsHash (str) – SHA1 hash of the processing options used to construct the time slice.
-
minUnixTimeRequested
¶ Minimum requested unix time. This is the first time stamp to be included in the time slice.
Type: int
-
maxUnixTimeRequested
¶ Maximum requested unix time. This is the last time stamp to be included in the time slice.
Type: int
-
minUnixTimeAvailable
¶ Minimum unix time of the run.
Type: int
-
maxUnixTimeAvailable
¶ Maximum unix time of the run.
Type: int
-
startOfRun
¶ Unix time of the start of the run.
Type: int
-
filesToMerge
¶ List of fileContainer objects which need to be merged to create the time slice.
Type: list
-
optionsHash
¶ SHA1 hash of the processing options used to construct the time slice. This hash is used for caching by comparing the processing options for a new time slice request with those already processed. If the hashes are the same, we can directly return the already processed result.
Type: str
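To illustrate the caching idea behind optionsHash, the sketch below hashes a dict of processing options with SHA1. Serializing through JSON with sorted keys is an assumption of this example (chosen so that key order does not change the hash); the source only states that a SHA1 hash of the options is used.

```python
import hashlib
import json


def options_hash(processing_options):
    """SHA1 hex digest of the options, independent of key order."""
    serialized = json.dumps(processing_options, sort_keys=True)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


# Hypothetical option names, used only for illustration.
first = options_hash({"scaleHists": True, "hotChannelThreshold": 50})
second = options_hash({"hotChannelThreshold": 50, "scaleHists": True})
print(first == second)
```

Two requests with equal hashes can then share the already processed result.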
-
filenamePrefix
¶ Filename for the timeSlice file, based on the given start and end times.
Type: str
-
filename
¶ File container for the timeSlice file.
Type: fileContainer
-
processingOptions
¶ Implemented by the time slice container to note options used during standard processing. The time slice processing options can vary when compared to standard subsystem processing, so storing the options allows us to apply the custom time slice options.
Type: PersistentMapping
-
timeInMinutes
(inputTime)[source]¶ Return the time from the input unix time to the start of the run in minutes.
Parameters: inputTime (int) – Unix time to be compared to the start of run time. Returns: Minutes from the start of run to the given time. Return type: int
-
timeInMinutesRounded
(inputTime)[source]¶ Return the time from the input unix time to start of the run in minutes, rounded to the nearest minute.
Note
I believe this was created due to some float vs int issues in the Jinja templating system. Although the purpose of this function isn’t entirely clear, it is kept for compatibility purposes.
Parameters: inputTime (int) – Unix time to be compared to the start of run time. Returns: Minutes from the start of run to the given time. Return type: int
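The relationship between the two helpers above can be sketched as follows. Here the start of run is passed as an argument instead of being read from the container, and the unrounded variant returns a float; both choices are assumptions made to keep the example self-contained.

```python
def time_in_minutes(input_time, start_of_run):
    """Minutes (possibly fractional) from the start of run to ``input_time``."""
    return (input_time - start_of_run) / 60.0


def time_in_minutes_rounded(input_time, start_of_run):
    """Same as above, rounded to the nearest whole minute and returned as an int."""
    return int(round(time_in_minutes(input_time, start_of_run)))


print(time_in_minutes(1090, 1000))
print(time_in_minutes_rounded(1100, 1000))
```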
overwatch.processing.pluginManager module¶
Contains all of the machinery for the plugin system.
This module manages the plugin functions defined by each detector. This is achieved by dynamically loading each subsystem module on import of this module. A pointer to each function is added to the plugin manager, allowing for any detector subsystem function to be called through this module. The processing functions use this functionality to allow subsystems to plug into all stages of the processing and trending.
Note that only the main routing plugin functions defined below (for example, as
defined in findFunctionsForHist
) are actually called through this module.
All other functions (for example, functions that will actually perform processing
on a hist) will be called directly through their own subsystem modules. However,
they are also loaded by the plugin manager for convenience.
The subsystems to actually load are specified in the configuration file.
-
overwatch.processing.pluginManager.
createAdditionalHistograms
(subsystem)[source]¶ Properly routes additional histogram creation functions for each subsystem.
Additional histograms can be created for a particular subsystem via these plugins. Function names should be of the form
createAdditional(SYS)Histograms(subsystem, **kwargs)
, where (SYS)
is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use. Parameters: subsystem (subsystemContainer) – Current subsystem container. Returns: None.
-
overwatch.processing.pluginManager.
createHistGroups
(subsystem)[source]¶ Properly route histogram group function for each subsystem.
Histogram groups are groups of histograms which should be displayed together for visualization. Function names should be of the form
create(SYS)HistogramGroups(subsystem, **kwargs)
, where (SYS)
is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use. Parameters: subsystem (subsystemContainer) – Current subsystem container. Returns: True if the function was called. Return type: bool
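The routing pattern used by these plugin functions can be sketched as a name lookup followed by a call. Everything below is a simplified stand-in: the plugin_functions registry, the dict used in place of a subsystemContainer, and the example EMC plugin are assumptions for illustration, not the actual plugin manager implementation.

```python
# Hypothetical registry: function name -> callable, filled in when subsystem modules are loaded.
plugin_functions = {}


def create_emc_histogram_groups(subsystem, **kwargs):
    """Example plugin for a subsystem named EMC."""
    subsystem["histGroups"].append("EMC group")


plugin_functions["createEMCHistogramGroups"] = create_emc_histogram_groups


def create_hist_groups(subsystem):
    """Call create(SYS)HistogramGroups if the subsystem defines it; report whether it was called."""
    function_name = "create{}HistogramGroups".format(subsystem["subsystem"])
    plugin = plugin_functions.get(function_name)
    if plugin is None:
        return False
    plugin(subsystem)
    return True


emc = {"subsystem": "EMC", "histGroups": []}
print(create_hist_groups(emc), emc["histGroups"])
```

Returning False for a subsystem without a plugin lets the caller fall back to default behavior.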
-
overwatch.processing.pluginManager.
createHistogramStacks
(subsystem)[source]¶ Properly routes histogram stack function for each subsystem.
Histogram stacks are collections of histograms which should be plotted together. For example, one may want to plot similar spectra, such as those in the EMCal and DCal, on the same plot. These are treated similarly to a histogramContainer. Functions should be of the form
create(SYS)HistogramStacks(subsystem, **kwargs)
, where (SYS)
is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use. Parameters: subsystem (subsystemContainer) – Current subsystem container. Returns: None.
-
overwatch.processing.pluginManager.
defineTrendingObjects
(subsystem)[source]¶ Defines trending objects and the histograms from which they should be extracted.
Defines trending objects related to a subsystem. These objects implement the trending function, as well as specifying the histograms that provide the values for the trending. The plugin function for each subsystem should be of the form
define(SYS)TrendingObjects(trending, **kwargs)
, where (SYS)
is the subsystem three letter name, trending is a dict where the new trending objects should be stored, and the other args are reserved for future use. Parameters: subsystem (str) – The current subsystem in the form of a three letter, all capital name (ex. EMC
). Returns: Keys are the name of the trending objects, while values are the trending objects themselves. Return type: dict
-
overwatch.processing.pluginManager.
findFunctionsForHist
(subsystem, hist)[source]¶ Determines which functions should be applied to a histogram.
Histogram functions apply additional processing, from extracting values to changing ranges to drawing on top of the histogram. These functions are executed when the histogram is processed. The functions should be stored as function pointers so the lookup doesn’t need to occur every time the histogram container is processed. The plugin functions for each subsystem should be of the form
findFunctionsFor(SYS)Histogram(subsystem, hist, **kwargs)
, where (SYS)
is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem, hist (histogramContainer) is the current histogram being processed, and the other args are reserved for future use.
Note
This function must handle all possible histograms for a subsystem, so it is strongly recommended to select them via hist name or another property.
Parameters: - subsystem (subsystemContainer) – Current subsystem container.
- hist (histogramContainer) – Current histogram to be processed.
Returns: None.
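A subsystem plugin following the naming scheme above might look like the sketch below. The histName and functionsToApply attributes, the "PatchAmp" name fragment, and the placeholder processing function are all assumptions made for illustration; SimpleNamespace stands in for a histogramContainer.

```python
from types import SimpleNamespace


def draw_threshold_line(subsystem, hist, **kwargs):
    """Placeholder processing function; a real one would act on the ROOT histogram."""
    return "threshold drawn on {}".format(hist.histName)


def findFunctionsForEMCHistogram(subsystem, hist, **kwargs):
    """Attach processing functions to ``hist`` based on its name."""
    # Selecting by name keeps the function safe for every histogram in the subsystem.
    if "PatchAmp" in hist.histName:
        hist.functionsToApply.append(draw_threshold_line)


hist = SimpleNamespace(histName="EMCPatchAmpExample", functionsToApply=[])
findFunctionsForEMCHistogram(None, hist)
print([function.__name__ for function in hist.functionsToApply])
```

Storing the function object itself, rather than its name, means no lookup is needed when the histogram is processed later.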
-
overwatch.processing.pluginManager.
setHistogramOptions
(subsystem)[source]¶ Properly routes histogram options function for each subsystem.
Histogram options include options such as renaming histograms, setting draw options, setting histogram scaling, and/or thresholds, etc. These options must be specific to the histogram object. Canvas options are set elsewhere when actually drawing on the canvas. They cannot be set now because the canvas doesn’t yet exist and we would need to call functions on that object (we prefer not to use function pointers here). Functions should be of the form
set(SYS)HistogramOptions(subsystem, **kwargs)
, where (SYS)
is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use. Parameters: subsystem (subsystemContainer) – Current subsystem container. Returns: None.
-
overwatch.processing.pluginManager.
subsystemNamespace
(functionName, subsystemName)[source]¶ Prepend the subsystem name to a function to act as a namespace.
This avoids the possibility of different subsystems with the same function names overwriting each other. Returned function names are of the form
SYS_functionName
.
Note
Since . indicates an attribute or element of a module, we use an _ instead. Although a . might be nice visually, and is suggestive of the relationship between these functions and the subsystem modules, it causes problems when generating the docs since the generation treats the . as if it were legitimate Python (which it isn’t, since we don’t have the full path).
Parameters: - functionName (str) – Name of the function.
- subsystemName (str) – The current subsystem in the form of a three letter, all capital name (ex.
EMC
).
Returns: Properly formatted function name with the subsystem prepended as a namespace.
Return type: str
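The documented SYS_functionName form amounts to a simple string join, as in this minimal sketch (the snake_case function name is a stand-in for the real one):

```python
def subsystem_namespace(function_name, subsystem_name):
    """Prefix ``function_name`` with the subsystem name, joined by an underscore."""
    return "{}_{}".format(subsystem_name, function_name)


print(subsystem_namespace("createHistogramGroups", "EMC"))
```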
overwatch.processing.run module¶
Minimal executable to launch processing.
__main__
is implemented to allow for this function to be executed directly,
while run()
is defined to allow for execution via entry_points
defined
in the python package setup.
-
overwatch.processing.run.
run
()[source]¶ Main entry point for starting
processAllRuns()
. This function will run on an interval determined by the value of
processingTimeToSleep
(specified in seconds). If the value is 0 or less, the processing will only run once.
Note
The sleep time is defined as the time between when
processAllRuns()
finishes and when it is started again. Parameters: None. – Returns: None.
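The loop behavior described above (run once for a sleep time of 0 or less, otherwise sleep after each pass finishes) can be sketched as follows. The run_loop name and the max_iterations and sleep arguments are hypothetical hooks added so the loop can be stopped and tested; the process callable stands in for processAllRuns().

```python
import time


def run_loop(process, time_to_sleep, max_iterations=None, sleep=time.sleep):
    """Call ``process`` repeatedly, sleeping ``time_to_sleep`` seconds after each call finishes.

    A sleep time of 0 or less means "run once". ``max_iterations`` and ``sleep``
    are hypothetical hooks so the loop can be stopped and tested.
    """
    iterations = 0
    while True:
        process()
        iterations += 1
        if time_to_sleep <= 0:
            break
        if max_iterations is not None and iterations >= max_iterations:
            break
        # Sleep starts only after processing has finished.
        sleep(time_to_sleep)
    return iterations


print(run_loop(lambda: None, 0))
```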