Overwatch Processing Package (overwatch.processing)

General information from the README is available immediately below. Module-specific documentation follows in the package reference, with the modules listed below.

Package reference

overwatch.processing.mergeFiles module

This module contains functions used to merge histograms and ROOT files.

For further information on functionality and options, see the function docstrings.

overwatch.processing.mergeFiles.merge(currentDir, run, subsystem, cumulativeMode=True, timeSlice=None)[source]

For a given run and subsystem, handles merging of files into a “combined file” which is suitable for processing.

For a standard subsystem, the “combined file” is the file which stores the most recent data for a particular run and subsystem. The merge is only performed if we have received new files in the specified run. The details of the merge are determined by the cumulativeMode setting. This setting, which is determined by the options in the data request sent to the HLT by the ZMQ receiver, denotes whether the data that we receive is reset each time it is sent. If the data is being reset, then the files are not cumulative, and therefore cumulativeMode should be set to False. Note that although this function supports both modes, we tend to operate in cumulative mode because resetting the objects would interfere with other subscribers to the HLT data. For example, if both Overwatch and another subscriber were set to request data with resets every minute and were offset by 30 seconds, they would both only receive approximately half the data! Thus, it’s preferred to operate in cumulative mode.

For cumulative mode, the combined objects are created in two different ways: 1) For a standard combined file, by simply copying the most recent file (because it contains data covering the entire run); 2) For time slices, by subtracting the objects in two corresponding ROOT files. For reset mode, TFileMerger is used to merge all files within the available timestamps together.
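The difference between the two modes can be illustrated with plain Python lists standing in for histogram bin contents. This is only a minimal sketch: the helper name combine and the list representation are assumptions made for illustration, whereas the actual function operates on ROOT files and uses TFileMerger in reset mode.

```python
def combine(snapshots, cumulative_mode=True):
    """Combine per-file bin contents into one "combined" set of bins.

    snapshots is a list of lists of bin contents ordered by time stamp,
    a hypothetical stand-in for the histograms stored in each ROOT file.
    """
    if not snapshots:
        raise ValueError("No snapshots to combine")
    if cumulative_mode:
        # Each file already covers the entire run, so the most recent one suffices.
        return list(snapshots[-1])
    # Reset mode: each file only holds data since the last reset, so sum them all.
    return [sum(bins) for bins in zip(*snapshots)]


# The same underlying data expressed in both modes.
cumulative_files = [[1, 0], [3, 2], [4, 5]]
reset_files = [[1, 0], [2, 2], [1, 3]]
print(combine(cumulative_files, cumulative_mode=True))
print(combine(reset_files, cumulative_mode=False))
```

Both calls yield the same combined bins, which is why the output format of the combined file does not depend on the mode.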

This function also handles merging files for time slices. The relevant parameters should be specified in a timeSliceContainer. The min and max requested times are extracted, and this function only merges files within the fixed time range corresponding to those values. The output format of this file is identical to any other combined file.

Note

As a side effect of this function, if it executes successfully for a combined file, the combined filename will be updated in subsystemContainer.

Parameters:
  • dirPrefix (str) – Path to the root directory where the data is stored.
  • runDir (str) – Run directory of current run. Of the form Run######.
  • subsystem (str) – The current subsystem by three letter, all capital name (ex. EMC).
  • cumulativeMode (bool) – Specifies whether the histograms we receive are cumulative or if they have been reset between each acquired ROOT file, i.e. whether we merge in “subscribe mode” or “request/reset mode”. Default: True.
  • timeSlice (processingClasses.timeSliceContainer) – Stores the properties of the requested time slice. If not specified, it will be ignored and it will create a standard “combined file”. Default: None
Returns:

On success, None is returned. Otherwise, an exception is raised.

Return type:

None

Raises:

ValueError – If the number of input files doesn’t match the number of files in the merger. This may occur if a file is inaccessible.

overwatch.processing.mergeFiles.mergeRootFiles(runs, dirPrefix, forceNewMerge=False, cumulativeMode=True)[source]

Driver function for creating combined files for each subsystem within a given set of runs.

For a given list of runs, this function will iterate over all available subsystems, merging or moving files as appropriate. To speed up this function, operations will only be performed if a new file is available for a particular subsystem (ie. subsystemContainer.newFile == True). This function will result in a combined file per subsystem per run. For further information on the format of this file, see merge().
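The selection logic described above can be sketched as follows. The function name subsystems_to_merge and the use of SimpleNamespace objects in place of runContainer and subsystemContainer are assumptions for illustration; only the newFile flag and the forceNewMerge option come from the description.

```python
from types import SimpleNamespace


def subsystems_to_merge(runs, force_new_merge=False):
    """Return the (runDir, subsystem name) pairs that would be merged."""
    selected = []
    for run_dir, run in runs.items():
        for name, subsystem in run.subsystems.items():
            # Skip subsystems without new files unless a merge is forced.
            if subsystem.newFile or force_new_merge:
                selected.append((run_dir, name))
    return selected


# Hypothetical stand-ins for the run and subsystem containers.
runs = {
    "Run123456": SimpleNamespace(subsystems={
        "EMC": SimpleNamespace(newFile=True),
        "HLT": SimpleNamespace(newFile=False),
    }),
}
print(subsystems_to_merge(runs))
print(subsystems_to_merge(runs, force_new_merge=True))
```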

Parameters:
  • runs (dict) – Dict of runContainers to perform the merge over. The keys are the runDirs, in the form of Run######.
  • dirPrefix (str) – Path to the root directory where the data is stored.
  • forceNewMerge (bool) – Flag to force merging for all runs, regardless of whether it is supposed to be merged. Default: False.
  • cumulativeMode (bool) – Specifies whether the histograms we receive are cumulative or if they have been reset between each acquired ROOT file, i.e. whether we merge in “subscribe mode” or “request/reset mode”. See merge() for further information on this mode. Default: True.
Returns:

None

overwatch.processing.mergeFiles.subtractFiles(minFile, maxFile, outfile)[source]

Subtract histograms in one file from matching histograms in another.

This function is used for creating time slices in cumulative mode. Since each file is cumulative, the earlier time stamped file needs to be subtracted from the later time stamped file. The remaining data corresponds to the data stored during the time window between the earlier and later time stamps.

This function is not used for creating a standard combined file because the cumulative information is already stored in the most recent file.

Note

The names of the histograms in each file must match exactly for them to be subtracted.

Note

The output file is opened with “RECREATE”, so it will always overwrite an existing file with the given filename!
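As an illustration of the subtraction, the sketch below uses dicts that map histogram names to lists of bin contents, with the contents of the minFile stand-in subtracted from those of the maxFile stand-in. The function name subtract_histograms and the dict representation are assumptions, and skipping histograms without an exact name match is one plausible reading of the first note rather than confirmed behavior.

```python
def subtract_histograms(min_hists, max_hists):
    """Subtract the earlier (min) bin contents from the later (max) ones.

    Both arguments map histogram names to lists of bin contents, a
    hypothetical stand-in for the histograms stored in each ROOT file.
    """
    result = {}
    for name, later in max_hists.items():
        if name not in min_hists:
            # Assumption: histograms without an exact name match are skipped.
            continue
        earlier = min_hists[name]
        result[name] = [late - early for late, early in zip(later, earlier)]
    return result


early = {"hPatchAmp": [1, 2, 3]}
late = {"hPatchAmp": [4, 6, 3], "hOther": [1, 1, 1]}
print(subtract_histograms(early, late))
```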

Parameters:
  • minFile (str) – Filename of the ROOT file containing data to be subtracted.
  • maxFile (str) – Filename of the ROOT file containing data to be subtracted from.
  • outfile (str) – Filename of the output file which will contain the subtracted histograms.
Returns:

None.

overwatch.processing.processRuns module

Steers and executes Overwatch histogram processing.

Takes files received from the HLT, organizes the information within a directory structure, and processes the histograms within. It provides plugin opportunities throughout all processing and trending steps.

overwatch.processing.processRuns.compareProcessingOptionsDicts(inputProcessingOptions, processingOptions, errors)[source]

Compare input and existing processing options dictionaries.

Compare the dictionary values stored in the input (inputProcessingOptions) to those in the reference (processingOptions). Both the keys and values are compared. For both dictionaries, keys are names of options, while values are the corresponding option values.

Note

The existing processing options can have more entries than the input. Only the keys and values in the input are checked.

Note

For the error format in errors, see the web app README.
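A one-directional comparison of this kind might look like the sketch below. The function name compare_options is hypothetical, and the error dictionary layout used here (a list of messages under a "processingOptions" key, filled only when an input key is missing from the reference) is purely an assumption, since the real format is defined in the web app README.

```python
def compare_options(input_options, reference_options, errors):
    """Check that every input option matches the reference options.

    Only keys present in input_options are checked, so the reference
    may contain additional entries.
    """
    all_same = True
    for key, value in input_options.items():
        if key not in reference_options:
            all_same = False
            # Hypothetical error format, used for illustration only.
            errors.setdefault("processingOptions", []).append(
                "Key '{}' is missing from the reference options".format(key))
        elif reference_options[key] != value:
            all_same = False
    return all_same, errors


print(compare_options({"scaleHists": True}, {"scaleHists": True, "hotChannelThreshold": 0}, {}))
print(compare_options({"scaleHists": False}, {"scaleHists": True}, {}))
```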

Parameters:
  • inputProcessingOptions (dict) – Processing options specified in the time slice.
  • processingOptions (dict) – Processing options used during standard processing which serve as the reference options. These usually should be the subsystem processing options.
Returns:

(processingOptionsAreTheSame, errors) where processingOptionsAreTheSame (bool) is True if all input options are the same values as in the existing options and errors (dict) is an error dictionary in the proper format.

Return type:

tuple

overwatch.processing.processRuns.createNewSubsystemFromMovedFilesInformation(runs, subsystem, runDict, runDir)[source]

Creates a new subsystem based on the information from the moved files.

This function determines the fileLocationSubsystem and then creates a new subsystem based on the given information, including adding the files to the subsystem. It also ensures that the subsystem will be processed by enabling the newFile flag in the subsystem.

Note

In the case of a subsystem which doesn’t have its own files in a run where the HLT is not available (for example, an EMC standalone run), the ValueError exception will be raised. If the run has just been created, this is just fine - the subsystem just won’t be created (as it shouldn’t be). In this case, it’s advisable to catch and log the exception and continue with standard execution. However, in other cases (such as adding a file later in the run), this shouldn’t be possible, so we want the exception to be raised and it needs to be handled carefully (in such a case, it likely indicates that something is broken).

Parameters:
  • runs (BTree) – Dict-like object which stores all run, subsystem, and hist information. Keys are in the runDir format (“Run123456”), while the values are runContainer objects.
  • subsystem (str) – The current subsystem by three letter, all capital name (ex. EMC). Default: None.
  • runDict (dict) – Nested dict which contains the new filenames and the HLT mode. For the precise structure, see base.utilities.moveFiles().
  • runDir (str) – String containing the requested run number. For an example run 123456, it should be formatted as Run123456.
Returns:

None. However, the run container is modified to store the newly created subsystem.

Raises:

ValueError – If the requested subsystem doesn’t have its own receiver files and files from the HLT receiver are also not available.

overwatch.processing.processRuns.processAllRuns(dbRoot=None, connection=None)[source]

Driver function for processing all available data, storing the results in a database and on disk.

This function is responsible for driving all processing functionality in Overwatch. This spans from initial preprocessing of the received ROOT files to trending information extracted from histograms. In particular, it directs:

  • Retrieve the run information or recreate it if it doesn’t exist. If recreated, it will be populated with existing information already stored in the data directory.
  • Retrieve the trending object or recreate it if it doesn’t exist. As of August 2018, the trending objects will be empty when recreated.
  • Move new files into the Overwatch file structure and create runs and/or subsystems from those new files. If the corresponding objects already exist, then they are updated.
  • Perform the actual processing, which includes executing the subsystem (detector) plug-in functionality. The processing will only be performed if necessary (ie if there are new files which need processing). This can also be overridden by specifically requesting reprocessing.
  • Perform the trending. It also has subsystem (detector) plug-in functionality.
  • Transfer the processed data if requested.

For further information on the technical details of how all of this is accomplished, see the processing README, as well as the package documentation. For further information on the subsystem (detector) plug-in functionality, see the detector subsystem and trending README.

Note

Configuration for this processing is provided by the Overwatch configuration system. For further information, see the Overwatch base module README.

Parameters:
  • dbRoot – Database root. Default: None. If either argument is None, this function will retrieve the database information itself (and close the connection at the end).
  • connection – Database connection. Default: None. If either argument is None, this function will retrieve the database information itself (and close the connection at the end).
Returns:

None. However, it has extensive side effects. It changes values in the database related to runs, subsystems, etc, as well as writing image and json files to disk.

overwatch.processing.processRuns.processHist(subsystem, hist, canvas, outputFormatting, processingOptions, subsystemName=None, trendingManager=None)[source]

Main histogram processing function.

This function is responsible for taking a given histogramContainer, processing the underlying histogram via processing functions, filling trending objects (if applicable), and then storing the result in images and json for display in the web app. Here, we execute the plug-in functionality assigned earlier and perform the actual drawing of the hist onto a canvas.

In more detail, the processing steps performed here are:

  • Setup the canvas.
  • Apply the projection functions (if applicable) to get the proper histogram.
  • Draw the histogram.
  • Apply the processing functions (if applicable).
  • Write the output to image and json.
  • Clean up the hist and canvas by removing references to them.

Note

The hist is drawn before calling the processing function to allow the plug-ins to draw on top of the histogram.

Note

The json is written by TBufferJSON for display via jsRoot. While it stores the information, it requires jsRoot to be displayed meaningfully.

Note

This function is built in such a way that it works for processing both histograms and trending objects. For this to work, both the subsystemContainer and the trendingContainer must support the following methods: imgDir(), which is the image storage directory, and jsonDir(), which is the json storage directory. Both should expect to be formatted with % {“subsystem”: subsystemName}.

Parameters:
  • subsystem (subsystemContainer or trendingContainer) – Subsystem or trending container which contains the histogram being processed. It only uses a subset of either class’s methods. See the note for information about the requirements of this object.
  • hist (histogramContainer) – Histogram to be processed.
  • canvas (TCanvas) – Canvas on which the histogram should be plotted. It will be stored in the histogramContainer for the purposes of processing the hist.
  • outputFormatting (str) – Specially formatted string which contains a generic path to be used when printing histograms. It must contain base, name, and ext, where base is the base path, name is the filename and ext is the extension. Ex: {base}/{name}.{ext}.
  • processingOptions (dict) – Implemented by the subsystem to note options used during standard processing. Keys are names of options, while values are the corresponding option values. Default: None. Note: In this case, it will use the default subsystem or trending processing options.
  • subsystemName (str) – The current subsystem by three letter, all capital name (ex. EMC). Default: None. In the case of None, the subsystem name is retrieved from subsystem.subsystem. This argument is used for processing the trending objects where we don’t have access to their corresponding subsystemContainer. The subsystem name of the trendingContainer (TDG) does not necessarily correspond to the subsystem of the object being processed, so we have to pass it here.
  • trendingManager (TrendingManager) – Will be notified when a histogram is processed to allow the use of the histogram values in trending.
Returns:

None. However, the subsystem, histogram, etc are modified and their representations in images and json are written to disk.
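The two string formatting conventions mentioned for this function can be combined as in the short sketch below. The directory template and the histogram name are hypothetical values chosen for illustration, and how processHist() actually joins the pieces is an assumption; only the {base}/{name}.{ext} pattern and the % {"subsystem": subsystemName} convention come from the documentation above.

```python
# Generic output path pattern, as described for outputFormatting.
output_formatting = "{base}/{name}.{ext}"

# Hypothetical directory template of the kind imgDir() might return.
img_dir_template = "Run123456/%(subsystem)s/img"

# First fill in the subsystem name, then build the full output path.
base = img_dir_template % {"subsystem": "EMC"}
image_path = output_formatting.format(base=base, name="hPatchAmp", ext="png")
print(image_path)
```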

overwatch.processing.processRuns.processMovedFilesIntoRuns(runs, runDict)[source]

Convert the list of moved files into run and subsystem containers stored in the database.

In the case that the run has not been created, a new run container is created and an attempt is made to create all subsystems that were requested in the configuration. If the subsystem already exists, the moved files are added to the existing objects. It also includes the capability to add new subsystems part of the way through a run in the unlikely event that we pick up new data during the run.

Parameters:
  • runs (BTree) – Dict-like object which stores all run, subsystem, and hist information. Keys are in the runDir format (“Run123456”), while the values are runContainer objects.
  • runDict (dict) – Nested dict which contains the new filenames and the HLT mode. For the precise structure, see base.utilities.moveFiles().
Returns:

None. Subsystems are created inside of the runContainer objects for which there are entries in the runDict.

overwatch.processing.processRuns.processRootFile(filename, outputFormatting, subsystem, processingOptions=None, forceRecreateSubsystem=False, trendingManager=None)[source]

Given a root file, process all histograms for a given subsystem.

Processing includes assigning the contained histograms to a subsystem, allowing for customization via the plugin system. For a new subsystem, the processing proceeds in the following order:

  • Create histogram containers for histograms in the file.
  • Create new histograms (in addition to those already in the file).
  • Create histogram stacks.
  • Specify histogram options.
  • Create histogram groups.
  • Sort histograms into histogram groups.
  • For each sorted histogram:
    • Determine which processing functions to apply to which histograms.
    • Determine which trending functions require which histograms.

Processing then proceeds to apply those functions to all sorted histograms. The final histograms are then stored as images and as json. In the case that the subsystem already exists, we can skip all of those steps and simply apply the processing functions. If a histogram was not sorted then it belongs to another subsystem and could be processed by it later (depending on the configured subsystems).

Note

Trending objects are filled (in processHist()) when the relevant hists are processed in this function.

Parameters:
  • filename (str) – The full path to the file to be processed.
  • outputFormatting (str) – Specially formatted string which contains a generic path to be used when printing histograms. It must contain base, name, and ext, where base is the base path, name is the filename and ext is the extension. Ex: {base}/{name}.{ext}.
  • subsystem (subsystemContainer) – Contains information about the current subsystem.
  • processingOptions (dict) – Implemented by the subsystem to note options used during standard processing. Keys are names of options, while values are the corresponding option values. Default: None. Note: In this case, it will use the default subsystem processing options.
  • forceRecreateSubsystem (bool) – True if subsystems will be recreated, even if they already exist.
  • trendingManager (TrendingManager) – Manages the trending subsystem.
Returns:

None. However, the underlying subsystems, histograms, etc, are modified.

overwatch.processing.processRuns.processTimeSlices(runs, runDir, minTimeRequested, maxTimeRequested, subsystemName, inputProcessingOptions)[source]

Creates a time slice or performs user directed reprocessing.

Time slices are created by processing a given run using only data in a given time range (and potentially modifying the processing options). User directed reprocessing uses the same infrastructure by varying the processing arguments and selecting the full time range available for a given run. While the external interface is different, these capabilities are performed using the same underlying infrastructure as in the standard processing.

This function is usually invoked via the web app on a particular run page.

Note

For the format of the errors that are returned, see the web app README.

Parameters:
  • runs (BTree) – Dict-like object which stores all run, subsystem, and hist information. Keys are in the runDir format (“Run123456”), while the values are runContainer objects.
  • runDir (str) – String containing the requested run number. For an example run 123456, it should be formatted as Run123456.
  • minTimeRequested (int) – The requested start time of the merge in minutes.
  • maxTimeRequested (int) – The requested end time of the merge in minutes.
  • subsystemName (str) – The subsystem of the time slice request by three letter, all capital name (ex. EMC).
  • inputProcessingOptions (dict) – Processing options requested for the time slice. Keys are the names of the options, while values are the actual values of the processing options.
Returns:

If successful, we return the time slice key (str) under which the requested time slice is stored in the subsystemContainer.timeSlices dictionary. If an error was encountered, we return an error dictionary in the proper format.

Return type:

str or dict

overwatch.processing.processRuns.validateAndCreateNewTimeSlice(run, subsystem, minTimeMinutes, maxTimeMinutes, inputProcessingOptions)[source]

Validate and create a timeSliceContainer based on the given inputs.

Validate the given time slice options, check the options to determine if we’ve already created the time slice, and then return the proper timeSliceContainer (either an existing container or a new one based on the result of the checks). By comparing the requested options and times with those that we have already processed, we can avoid having to reprocess existing data when nothing has changed. This effectively allows us to cache the processing results.

Each resulting timeSliceContainer is stored under a UUID generated string to ensure that time slices never overwrite each other.
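The caching behavior can be sketched with a plain dict in place of the time slices dictionary. The function name find_or_create_time_slice, the dict layout of each entry, and the option name scaleHists are assumptions for illustration; the validation step and the error dictionary are omitted.

```python
import uuid


def find_or_create_time_slice(time_slices, min_time, max_time, options):
    """Return (key, newly_created) for the requested time slice."""
    requested = {"minTime": min_time, "maxTime": max_time, "options": dict(options)}
    for key, existing in time_slices.items():
        if existing == requested:
            # Same times and options: reuse the previously stored result.
            return key, False
    # A UUID based key keeps new time slices from overwriting existing ones.
    key = str(uuid.uuid4())
    time_slices[key] = requested
    return key, True


time_slices = {}
first_key, first_new = find_or_create_time_slice(time_slices, 0, 5, {"scaleHists": True})
second_key, second_new = find_or_create_time_slice(time_slices, 0, 5, {"scaleHists": True})
third_key, third_new = find_or_create_time_slice(time_slices, 0, 10, {"scaleHists": True})
print(first_new, second_new, third_new)
```

Repeating an identical request returns the existing key, while changing the time range or the options creates a new entry.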

Note

For the error format in errors, see the web app README.

Parameters:
  • run (runContainer) – Run for which the time slice was requested.
  • subsystem (subsystemContainer) – Subsystem for which the time slice was requested.
  • minTimeMinutes (int) – Minimum time for the time slice in minutes.
  • maxTimeMinutes (int) – Maximum time for the time slice in minutes.
  • inputProcessingOptions (dict) – Processing options requested for the time slice.
Returns:

(timeSliceKey, newlyCreated, errors) where timeSliceKey (str) is the key under which the relevant timeSliceContainer is stored in the subsystemContainer.timeSlices dict, newlyCreated (bool) is True if the timeSliceContainer was newly created (as opposed to already existing), and errors (dict) is an error dictionary in the proper format.

Return type:

tuple

overwatch.processing.processingClasses module

Classes that define the processing structure of Overwatch.

Classes that define the structure of processing. This information can be created and processed, or read from file.

Note

The __repr__ and __str__ methods defined here can throw KeyError for class attributes if these methods rely on __dict__ and the objects have just been loaded from ZODB. Presumably, __dict__ doesn’t cause ZODB to fully load the object. To work around this issue, any methods using __dict__ first call some attribute (ideally, something simple) to ensure that the object is fully loaded. The result of that call is ignored.

class overwatch.processing.processingClasses.fileContainer(filename, startOfRun=None)[source]

Bases: persistent.Persistent

File information container.

This object wraps a ROOT filename, providing convenient access to relevant properties, such as the type of file (combined, timeSlice, standard), and the time stamp. This information is often stored in the filename itself, but extraction procedures vary for each file type. Note that it does not open the file itself - this is still the responsibility of the user.
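The kind of information derived from a filename can be sketched as follows. The function name describe_file and, in particular, the naming scheme with a trailing unix time stamp are assumptions for illustration, since the actual extraction procedures vary by file type. The checks for “combined” and “timeSlice” in the filename and the -1 value when startOfRun is not given follow the attribute descriptions of this class.

```python
import re


def describe_file(filename, start_of_run=None):
    """Extract basic properties from a filename without opening the file."""
    # Hypothetical naming scheme: "<prefix>.<unix time stamp>.root".
    match = re.search(r"\.(\d+)\.root$", filename)
    file_time = int(match.group(1)) if match else -1
    if start_of_run is None or file_time < 0:
        # Without a valid start of run, the time into the run is undefined.
        time_into_run = -1
    else:
        time_into_run = file_time - start_of_run
    return {
        "combinedFile": "combined" in filename,
        "timeSlice": "timeSlice" in filename,
        "fileTime": file_time,
        "timeIntoRun": time_into_run,
    }


info = describe_file("Run123456/EMC/hists.combined.1500000120.root", start_of_run=1500000000)
print(info)
```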

Parameters:
  • filename (str) – Filename of the corresponding file. This is expected to be the full path from the dirPrefix to the file.
  • startOfRun (int) – Start of the run in unix time. Default: None. The default will lead to timeIntoRun being set to -1. The default is most commonly used for time slices, where the start of run isn’t so meaningful.
filename

Filename of the corresponding file. This is expected to be the full path from the dirPrefix to the file.

Type:str
combinedFile

True if this file corresponds to a combined file. It is set to True if “combined” is in the filename.

Type:bool
timeSlice

True if this file corresponds to a time slice. It is set to True if “timeSlice” is in the filename.

Type:bool
fileTime

Unix time stamp of the file, extracted from the filename.

Type:int
timeIntoRun

Time in seconds from the start of the run to the file time. Depends on startOfRun being a valid time when the object was created.

Type:int
class overwatch.processing.processingClasses.histogramContainer(histName, histList=None, prettyName=None)[source]

Bases: persistent.Persistent

Histogram information container.

Organizes information about a particular histogram (or set of histograms). Manages functions that process and otherwise modify the histogram, which are specified through the plugin system. The container also manages plotting details.

Note

The histogram container doesn’t always have access to the underlying histogram. When constructing the container, it is useful to have the histogram available to provide some information, but then the histogram should not be needed until final processing is performed and the hist is plotted. When this final step is reached, the histogram can be retrieved via the retrieveHistogram() helper function.

Parameters:
  • histName (str) – Name of the histogram. Doesn’t necessarily need to be the same as TH1.GetName().
  • histList (list) – List of histogram names that should contribute to this container. Used for stacking multiple histograms onto one canvas. Default: None
  • prettyName (str) – Name of the histogram that is appropriate for display. Default: None, which will lead to it being set to histName.
histName

Name of the histogram. Doesn’t necessarily need to be the same as TH1.GetName().

Type:str
prettyName

Name of the histogram that is appropriate for display.

Type:str
histList

List of histogram names that should contribute to this container. Used for stacking multiple histograms onto one canvas. Default: None. See retrieveHistogram() for more information on how this functionality is utilized.

Type:list
information

Information that is extracted from the histogram that should be stored persistently and displayed. This information will be displayed with the web app, with the key shown as a clickable button, and the value information stored behind it.

Type:PersistentMapping
hist

The histogram which this container wraps.

Type:ROOT.TH1
histType

Class of the histogram. For example, ROOT.TH1F. Can be used for functions that only apply to 2D hists, etc. It is stored separately from the histogram to allow for it to be available even when the underlying histogram is not (as occurs while setting up but not yet processing a histogram).

Type:ROOT.TClass
drawOptions

Draw options to be passed to TH1.Draw() when drawing the histogram.

Type:str
canvas

Canvas onto which the histogram will be plotted. Available after the histogram has been classified (ie in processing functions).

Type:ROOT.TCanvas
projectionFunctionsToApply

List-like object of functions that perform projections on the histogram that is represented by this container. See the detector subsystem README for more information.

Type:PersistentList
functionsToApply

List-like object of functions that are applied to the histogram during the processing step. See the detector subsystem README for more information.

Type:PersistentList
trendingObjects

List-like object of trending objects which operate on this histogram. See the detector subsystem and trending README for more information.

Type:PersistentList
retrieveHistogram(ROOT, fIn=None, trending=None)[source]

Retrieve the histogram from the given file or trending container.

This function can retrieve a single histogram from a file, multiple hists from a file to create a stack (based on the hist names in histList), or a single trending histogram stored in the collection of trending objects.

Parameters:
  • ROOT (ROOT) – ROOT module. Passed into this object so this module doesn’t need to directly depend on importing ROOT.
  • fIn (ROOT.TFile) – File in which the histogram(s) is stored. Default: None.
  • trending (trendingContainer) – Contains the trending objects, including the trending histogram which is represented in this histogram container. It is the source of the histogram, and therefore similar to the input ROOT file. Default: None.
Returns:

True if the histogram was successfully retrieved.

Return type:

bool

class overwatch.processing.processingClasses.histogramGroupContainer(prettyName, groupSelectionPattern, plotInGridSelectionPattern='DO NOT PLOT IN GRID')[source]

Bases: persistent.Persistent

Organizes similar histograms into groups for processing and display.

Histogram groups are created by providing name substrings of the histograms which should be included. The name substring is referred to as a groupSelectionPattern. For example, if the pattern was “hello”, all histograms containing “hello” would be selected. Additional properties related to groups, such as display information, are also stored.
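Substring based selection of this kind can be sketched as below. The function name sort_into_group and the example histogram names are hypothetical. The plot-in-grid check follows the parameter description for this class, which tests whether plotInGridSelectionPattern is contained in groupSelectionPattern, and the default value is taken from the class signature.

```python
def sort_into_group(hist_names, group_selection_pattern,
                    plot_in_grid_selection_pattern="DO NOT PLOT IN GRID"):
    """Select histograms by substring and decide on grid plotting."""
    # A histogram belongs to the group if the pattern occurs in its name.
    selected = [name for name in hist_names if group_selection_pattern in name]
    # Grid plotting is enabled when the grid pattern is part of the group pattern.
    plot_in_grid = plot_in_grid_selection_pattern in group_selection_pattern
    return selected, plot_in_grid


# Hypothetical histogram names for illustration.
names = ["EMCPatchAmp_SM0", "EMCPatchAmp_SM1", "EMCTime"]
print(sort_into_group(names, "PatchAmp_SM", "_SM"))
print(sort_into_group(names, "Time"))
```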

Parameters:
  • prettyName (str) – Readable name of the group.
  • groupSelectionPattern (str) – Pattern of the histogram names that will be selected. For example, if we wanted to select histograms related to EMCal patch amplitude, we would make the pattern something like “PatchAmp”. The pattern depends on the name of the histograms sent from the HLT.
  • plotInGridSelectionPattern (str) – Pattern which denotes whether the histograms should be plotted in a grid. plotInGrid is set based on whether this value is in groupSelectionPattern. For example, in the EMCal, the plotInGridSelectionPattern is _SM, since “SM” denotes a supermodule.
prettyName

Readable name of the group. Set via the groupName in the constructor.

Type:str
selectionPattern

Pattern of the histogram names that will be selected.

Type:str
plotInGridSelectionPattern

Pattern (substring) which denotes whether the histograms should be plotted in a grid.

Type:str
plotInGrid

True when the histograms should be plotted in a grid.

Type:bool
histList

List of histogram names that should be filled when the selectionPattern is matched.

Type:PersistentList
class overwatch.processing.processingClasses.runContainer(runDir, fileMode, hltMode=None)[source]

Bases: persistent.Persistent

Object to represent a particular run.

It stores run level information, as well as the subsystems, which then contain the corresponding event information (histogram groups, histograms, etc).

Note that files are not considered event level information because the files correspond to individual subsystems. Furthermore, in rare cases, there may be different numbers of files for different subsystems that are included in an individual run. Consequently, it is cleaner for each subsystem to track its own files.

To allow the object to be reconstructed from scratch, the HLT mode is stored by writing a YAML file in the corresponding run directory. This file is referred to as the “run info” file. Additional properties could also be written to this file to avoid the loss of transient information.

Note

The run info file is read and written on object construction. It will only be checked if the HLT mode is not set.

Parameters:
  • runDir (str) – String containing the run number. For an example run 123456, it should be formatted as Run123456.
  • fileMode (bool) – If true, the run data was collected in cumulative mode. See the processing README for further information.
  • hltMode (str) – String containing the HLT mode used for the run.
runDir

String containing the run number. For an example run 123456, it should be formatted as Run123456

Type:str
runNumber

Run number extracted from the runDir.

Type:int
prettyName

Reformatting of the runDir for improved readability.

Type:str
fileMode

If true, the run data was collected in cumulative mode. See the processing README for further information. Set via fileMode.

Type:bool
subsystems

Dict-like object which will contain all of the subsystem containers in an event. The key is the corresponding subsystem three letter name.

Type:BTree
hltMode

Mode the HLT operated in for this run. Valid HLT modes are “B”, “C”, “E”, and “U”. Further information on the various modes is in the processing README. Default: None (which will be converted to “U”, for “unknown”).

Type:str
isRunOngoing()[source]

Checks if a run is ongoing.

The ongoing run check is performed by checking for a new file in any of the subsystems. If they have just received a new file, then the run is ongoing.

Note

If subsystem.newFile is false, this is not a sufficient condition to say that the run has ended. This is because newFile will be set to false if the subsystem didn’t have a file in the most recent processing run, even if the run is still ongoing. This can happen for many reasons, including if the processing is executed more frequently than the data transfer rate or receiver request rate, for example. However, if newFile is true, then it is sufficient to know that the run is ongoing.
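The check itself reduces to a logical OR over the subsystems, as in the sketch below. The standalone function is_run_ongoing and the SimpleNamespace stand-ins for subsystem containers are assumptions for illustration; only the newFile flag comes from the description.

```python
from types import SimpleNamespace


def is_run_ongoing(subsystems):
    """Return True if any subsystem has just received a new file."""
    return any(subsystem.newFile for subsystem in subsystems.values())


# Hypothetical stand-ins for the subsystem containers of two runs.
active = {"EMC": SimpleNamespace(newFile=False), "HLT": SimpleNamespace(newFile=True)}
quiet = {"EMC": SimpleNamespace(newFile=False), "HLT": SimpleNamespace(newFile=False)}
print(is_run_ongoing(active))
print(is_run_ongoing(quiet))
```

As the note above explains, a False result only means that no new files arrived in the most recent processing pass, not that the run has ended.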

Parameters:None
Returns:True if the run is ongoing.
Return type:bool
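
As a rough illustration of the check described above, a minimal sketch follows. The FakeSubsystem class and the standalone is_run_ongoing function are hypothetical stand-ins, not the actual Overwatch implementation; only the newFile attribute is taken from this documentation.

```python
class FakeSubsystem:
    """Hypothetical stand-in for a subsystemContainer (only newFile is modeled)."""

    def __init__(self, newFile):
        self.newFile = newFile


def is_run_ongoing(subsystems):
    """Return True if any subsystem has just received a new file."""
    return any(subsystem.newFile for subsystem in subsystems.values())


# Keys are the three letter subsystem names, as in the subsystems attribute.
subsystems = {"EMC": FakeSubsystem(False), "HLT": FakeSubsystem(True)}
print(is_run_ongoing(subsystems))
```

A single subsystem with a new file is enough to mark the run as ongoing, while a False result on its own does not prove that the run has ended.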
minutesSinceLastTimestamp()[source]

Determine the time since the last file timestamp in minutes.

Parameters:None.
Returns:Minutes since the timestamp of the most recent file. Default: -1.
Return type:float
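
The calculation could look roughly like the sketch below. The function name, the list of timestamps, and the now argument are hypothetical, and returning -1 when no files are available is an assumption about what the documented default refers to.

```python
import time


def minutes_since_last_timestamp(fileTimestamps, now=None):
    """Minutes between `now` and the most recent unix timestamp.

    Assumption: -1 is returned when there are no timestamps to compare against.
    """
    if not fileTimestamps:
        return -1
    if now is None:
        now = time.time()
    return (now - max(fileTimestamps)) / 60.0


# Most recent file is at t=120 s, so at t=420 s five minutes have passed.
print(minutes_since_last_timestamp([0, 60, 120], now=420))
```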
startOfRunTimeStamp()[source]

Provides the start of the run time stamp in a format suitable for display.

This timestamp is determined by looking at the timestamp of the last subsystem (arbitrarily selected) that is available in the run. No time zone conversion is performed, so the time is simply displayed in the time zone where the data was stored (CERN time in production systems).

Parameters:None
Returns:Start of run time stamp formatted in an appropriate manner for display.
Return type:str
class overwatch.processing.processingClasses.subsystemContainer(subsystem, runDir, startOfRun, endOfRun, showRootFiles=False, fileLocationSubsystem=None)[source]

Bases: persistent.Persistent

Object to represent a particular subsystem (detector).

It stores subsystem level information, including the histograms, groups, and file information. It is the main container for much of the information that is relevant for processing.

Information on the file storage layout implemented through this class is available in the processing README.

Note

This object checks for and creates a number of directories on initialization.

Parameters:
  • subsystem (str) – The current subsystem in the form of a three letter, all capital name (ex. EMC).
  • runDir (str) – String containing the run number. For an example run 123456, it should be formatted as Run123456
  • startOfRun (int) – Start of the run in unix time.
  • endOfRun (int) – End of the run in unix time.
  • showRootFiles (bool) – True if the ROOT files should be made accessible through the run list. Default: False.
  • fileLocationSubsystem (str) – Subsystem name of where the files are actually located. If a subsystem has specific data files then this is just equal to the subsystem. However, if it relies on files inside of another subsystem (such as those from the HLT subsystem receiver), then this variable is equal to that subsystem name. Default: None, which corresponds to the subsystem storing its own data.
subsystem

The current subsystem in the form of a three letter, all capital name (ex. EMC).

Type:str
showRootFiles

True if the ROOT files should be made accessible through the run list.

Type:bool
fileLocationSubsystem

Subsystem name of where the files are actually located. If a subsystem has specific data files then this is just equal to the subsystem. However, if it relies on files inside of another subsystem, then this variable is equal to that subsystem name.

Type:str
files

Dict-like object which describes subsystem ROOT files. Unix time of a given file is the key and a file container for that file is the value.

Type:BTree
timeSlices

Dict-like object which describes subsystem time slices. A UUID is the dict key (so they can be uniquely identified), while a timeSliceContainer with the corresponding time slice properties is the value.

Type:BTree
combinedFile

File container corresponding to the combined file.

Type:fileContainer
baseDir

Path to the base storage directory for the subsystem. Of the form Run123456/SYS.

Type:str
imgDir

Path to the image storage directory for the subsystem. Of the form Run123456/SYS/img.

Type:str
jsonDir

Path to the json storage directory for the subsystem. Of the form Run123456/SYS/json.

Type:str
startOfRun

Start of the run in unix time.

Type:int
endOfRun

End of the run in unix time.

Type:int
runLength

Length of the run in minutes.

Type:int
histGroups

List-like object of histogram groups, which are used to classify similar histograms.

Type:PersistentList
histsInFile

Dict-like object of all histograms that are in a particular file. Keys are the histogram names, while the values are histogramContainer objects which contain the histogram. Hists should usually be accessed through the hist groups, but this provides direct access when necessary early in processing.

Type:BTree
histsAvailable

Dict-like object containing all histograms that are available, including those in a particular file and those that are created during processing. Newly created hists should be stored in this dict. Keys are histogram names, while values are histogramContainer objects which contain the histogram.

Type:BTree
hists

Dict-like object which contains all histograms that should be processed for the subsystem. After initial creation, this should be the definitive source of histograms for processing and display. Keys are histogram names, while values are histogramContainer objects which contain the histogram.

Type:BTree
newFile

True if we received a new file, which will trigger reprocessing. This flag should only be changed when beginning processing the next time. To be explicit, if a subsystem just received a new file and it was processed, this flag should only be changed to False after the next processing iteration begins. This allows the status of the run (determined through the subsystem) to be displayed in the web app. Default: True because if the subsystem is being created, we likely need reprocessing.

Type:bool
nEvents

Number of events in the subsystem. Processing will look for a histogram that contains events in the name and attempt to extract the number of events based on the number of entries. Should not be used unless the subsystem explicitly includes a histogram with the number of events. Default: 1.

Type:int
processingOptions

Implemented by the subsystem to note options used during standard processing. The subsystem processing options can vary when processing a time slice, so storing the options allows us to return to the standard options when performing a full processing. Keys are the option names as strings, while values are their corresponding values.

Type:PersistentMapping
calculateRunLength(startOfRun=None, endOfRun=None)[source]

Helper function to update the run length.

Note

The run length is defined in minutes.

Parameters:
  • startOfRun (int) – Start of the run in unix time. Default: None. If not specified, the startOfRun stored in the subsystem will be used.
  • endOfRun (int) – End of the run in unix time. Default: None. If not specified, the endOfRun stored in the subsystem will be used.
Returns:

The calculated run length in minutes.

Return type:

int
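
A minimal sketch of this helper is shown below. The RunTimes class is a hypothetical stand-in for the subsystem container, and the use of integer division to obtain whole minutes is an assumption based on the documented int return type.

```python
class RunTimes:
    """Hypothetical stand-in holding the stored start and end of run times."""

    def __init__(self, startOfRun, endOfRun):
        self.startOfRun = startOfRun
        self.endOfRun = endOfRun
        self.runLength = self.calculateRunLength()

    def calculateRunLength(self, startOfRun=None, endOfRun=None):
        # Fall back to the stored values when an argument is not given.
        if startOfRun is None:
            startOfRun = self.startOfRun
        if endOfRun is None:
            endOfRun = self.endOfRun
        # Unix times are in seconds, while the run length is defined in minutes.
        self.runLength = (endOfRun - startOfRun) // 60
        return self.runLength


run = RunTimes(startOfRun=0, endOfRun=600)
print(run.runLength)
```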

static prettyPrintUnixTime(unixTime)[source]

Converts the given time stamp into a format (“pretty”) appropriate for display.

The time is returned in the format: “Tuesday, 6 Nov 2018 20:55:10”. This function is mainly needed in Jinja templates where arbitrary functions are not allowed.

Note

We display this in the CERN time zone, so we convert it here to that timezone.

Parameters:unixTime (int) – Unix time to be converted.
Returns:The time stamp converted into a format appropriate for display.
Return type:str
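
The formatting can be sketched as follows. This is not the Overwatch implementation: the function name and the tz argument are hypothetical, and a fixed UTC+1 offset is assumed as a stand-in for CERN winter time. A real conversion would need a zone that handles daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+1 offset approximates the CERN time zone in winter.
# A DST-aware zone (for example "Europe/Zurich") would be needed in general.
CET = timezone(timedelta(hours=1))


def pretty_print_unix_time(unixTime, tz=CET):
    """Format a unix time as, for example, 'Tuesday, 6 Nov 2018 20:55:10'."""
    d = datetime.fromtimestamp(unixTime, tz=tz)
    # The day is inserted separately to avoid zero padding in a portable way.
    return "{}, {} {}".format(d.strftime("%A"), d.day, d.strftime("%b %Y %H:%M:%S"))


print(pretty_print_unix_time(1541534110))  # Tuesday, 6 Nov 2018 20:55:10
```

The day and month names come from strftime, so they depend on the active locale; the output above assumes the default English names.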
resetContainer()[source]

Clear the stored hist information so we can recreate (reprocess) the subsystem.

Without resetting the container, reprocessing doesn’t fully test the processing functions, which are skipped if these list- and dict-like hist objects have entries.

Parameters:None
Returns:None
setupDirectories(runDir)[source]

Helper function to setup the subsystem directories.

Defines the base, img, and JSON directories, as well as creating them if necessary.

Parameters:runDir (str) – String containing the run number. For an example run 123456, it should be formatted as Run123456
Returns:None. However, it sets the baseDir, imgDir, and jsonDir properties of the subsystemContainer.
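
The directory layout documented above (Run123456/SYS, Run123456/SYS/img, and Run123456/SYS/json) can be sketched as below. The standalone function and its dataRoot argument are hypothetical; the actual method stores the paths on the subsystemContainer instead of returning them.

```python
import os
import tempfile


def setup_directories(dataRoot, runDir, subsystem):
    """Build the base, img, and json paths and create them under dataRoot."""
    baseDir = os.path.join(runDir, subsystem)
    imgDir = os.path.join(baseDir, "img")
    jsonDir = os.path.join(baseDir, "json")
    for directory in (imgDir, jsonDir):
        # exist_ok avoids an error when the directory is already present.
        os.makedirs(os.path.join(dataRoot, directory), exist_ok=True)
    return baseDir, imgDir, jsonDir


# A temporary directory is used so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as dataRoot:
    baseDir, imgDir, jsonDir = setup_directories(dataRoot, "Run123456", "EMC")
    created = os.path.isdir(os.path.join(dataRoot, imgDir))
print(baseDir, imgDir, jsonDir)
```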
class overwatch.processing.processingClasses.timeSliceContainer(minUnixTimeRequested, maxUnixTimeRequested, minUnixTimeAvailable, maxUnixTimeAvailable, startOfRun, filesToMerge, optionsHash)[source]

Bases: persistent.Persistent

Time slice information container.

Contains information about a time slice request, including the time ranges and the files involved. These values are required to uniquely describe a time slice.

Parameters:
  • minUnixTimeRequested (int) – Minimum requested unix time. This is the first time stamp to be included in the time slice.
  • maxUnixTimeRequested (int) – Maximum requested unix time. This is the last time stamp to be included in the time slice.
  • minUnixTimeAvailable (int) – Minimum unix time of the run.
  • maxUnixTimeAvailable (int) – Maximum unix time of the run.
  • startOfRun (int) – Unix time of the start of the run.
  • filesToMerge (list) – List of fileContainer objects which need to be merged to create the time slice.
  • optionsHash (str) – SHA1 hash of the processing options used to construct the time slice.
minUnixTimeRequested

Minimum requested unix time. This is the first time stamp to be included in the time slice.

Type:int
maxUnixTimeRequested

Maximum requested unix time. This is the last time stamp to be included in the time slice.

Type:int
minUnixTimeAvailable

Minimum unix time of the run.

Type:int
maxUnixTimeAvailable

Maximum unix time of the run.

Type:int
startOfRun

Unix time of the start of the run.

Type:int
filesToMerge

List of fileContainer objects which need to be merged to create the time slice.

Type:list
optionsHash

SHA1 hash of the processing options used to construct the time slice. This hash is used for caching by comparing the processing options for a new time slice request with those already processed. If the hashes are the same, we can directly return the already processed result.

Type:str
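
One way such a hash could be built is sketched below. The serialization through sorted JSON and the option names are assumptions chosen for illustration; this documentation only states that a SHA1 hash of the processing options is compared.

```python
import hashlib
import json


def options_hash(processingOptions):
    """SHA1 hex digest of the options, independent of key order."""
    # Sorting the keys makes equal options serialize to the same string.
    serialized = json.dumps(processingOptions, sort_keys=True)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


# Hypothetical option names; the same options in a different order match.
first = options_hash({"scaleHists": True, "hotChannelThreshold": 10})
second = options_hash({"hotChannelThreshold": 10, "scaleHists": True})
print(first == second)
```

If the hash of a new request matches the hash of an existing time slice with the same time range, the stored result can be returned without reprocessing.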
filenamePrefix

Filename for the timeSlice file, based on the given start and end times.

Type:str
filename

File container for the timeSlice file.

Type:fileContainer
processingOptions

Implemented by the time slice container to note options used during standard processing. The time slice processing options can vary when compared to standard subsystem processing, so storing the options allows us to apply the custom time slice options.

Type:PersistentMapping
timeInMinutes(inputTime)[source]

Return the time from the input unix time to the start of the run in minutes.

Parameters:inputTime (int) – Unix time to be compared to the start of run time.
Returns:Minutes from the start of run to the given time.
Return type:int
timeInMinutesRounded(inputTime)[source]

Return the time from the input unix time to the start of the run in minutes, rounded to the nearest minute.

Note

I believe this was created due to some float vs int issues in the Jinja templating system. Although the purpose of this function isn’t entirely clear, it is kept for compatibility purposes.

Parameters:inputTime (int) – Unix time to be compared to the start of run time.
Returns:Minutes from the start of run to the given time, rounded to the nearest minute.
Return type:int
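
The difference between the two helpers can be sketched as follows. These standalone functions are hypothetical and take startOfRun as an explicit argument; truncating to whole minutes in the first function is an assumption based on the documented int return type.

```python
def time_in_minutes(inputTime, startOfRun):
    """Whole minutes from the start of the run to inputTime (truncated)."""
    return (inputTime - startOfRun) // 60


def time_in_minutes_rounded(inputTime, startOfRun):
    """Minutes from the start of the run to inputTime, rounded to the nearest minute."""
    # Note: Python 3 rounds exact halves to the nearest even integer.
    return int(round((inputTime - startOfRun) / 60.0))


startOfRun = 1541534110
inputTime = startOfRun + 170  # 2 minutes and 50 seconds into the run
print(time_in_minutes(inputTime, startOfRun), time_in_minutes_rounded(inputTime, startOfRun))
```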

overwatch.processing.pluginManager module

Contains all of the machinery for the plugin system.

This module manages the plugin functions defined by each detector. This is achieved by dynamically loading each subsystem module on import of this module. A pointer to each function is added to the plugin manager, allowing for any detector subsystem function to be called through this module. The processing functions use this functionality to allow subsystems to plug into all stages of the processing and trending.

Note that only the main routing plugin functions defined below (for example, as defined in findFunctionsForHist) are actually called through this module. All other functions (for example, functions that will actually perform processing on a hist) will be called directly through their own subsystem modules. However, they are also loaded by the plugin manager for convenience.

The subsystems to actually load are specified in the configuration file.

overwatch.processing.pluginManager.createAdditionalHistograms(subsystem)[source]

Properly routes additional histogram creation functions for each subsystem.

Additional histograms can be created for a particular subsystem via these plugins. Function names should be of the form createAdditional(SYS)Histograms(subsystem, **kwargs), where (SYS) is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use.

Parameters:subsystem (subsystemContainer) – Current subsystem container
Returns:None.
overwatch.processing.pluginManager.createHistGroups(subsystem)[source]

Properly routes histogram group function for each subsystem.

Histogram groups are groups of histograms which should be displayed together for visualization. Function names should be of the form create(SYS)HistogramGroups(subsystem, **kwargs), where (SYS) is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use.

Parameters:subsystem (subsystemContainer) – Current subsystem container.
Returns:True if the function was called
Return type:bool
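
The loading and routing pattern can be sketched roughly as below. This is an illustration under stated assumptions, not the plugin manager itself: the registry dict, the fakeModules mapping (standing in for dynamically imported subsystem modules), and the dict used as a subsystem are all hypothetical. Only the create(SYS)HistogramGroups naming form and the SYS_functionName namespacing come from this documentation.

```python
from types import SimpleNamespace


def createEMCHistogramGroups(subsystem, **kwargs):
    """Hypothetical EMC plugin function following the documented naming form."""
    subsystem["histGroups"] = ["EMCal trigger"]


# Hypothetical stand-in for the dynamically imported subsystem modules.
fakeModules = {"EMC": SimpleNamespace(createEMCHistogramGroups=createEMCHistogramGroups)}

# Store a reference to each plugin function under a namespaced name.
registry = {}
for subsystemName, module in fakeModules.items():
    for attributeName, attribute in vars(module).items():
        if callable(attribute):
            registry["{}_{}".format(subsystemName, attributeName)] = attribute


def create_hist_groups(subsystemName, subsystem):
    """Route to the subsystem plugin function; return True if it was called."""
    functionName = "{0}_create{0}HistogramGroups".format(subsystemName)
    function = registry.get(functionName)
    if function is None:
        return False
    function(subsystem)
    return True


emc = {}
print(create_hist_groups("EMC", emc), create_hist_groups("TPC", {}))
```

Looking the function up by name means a subsystem without a plugin function is skipped rather than raising an error.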
overwatch.processing.pluginManager.createHistogramStacks(subsystem)[source]

Properly routes histogram stack function for each subsystem.

Histogram stacks are collections of histograms which should be plotted together. For example, one may want to plot similar spectra, such as those in the EMCal and DCal, on the same plot. These are treated similarly to a histogramContainer. Functions should be of the form create(SYS)HistogramStacks(subsystem, **kwargs), where (SYS) is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use.

Parameters:subsystem (subsystemContainer) – Current subsystem container
Returns:None.
overwatch.processing.pluginManager.defineTrendingObjects(subsystem)[source]

Defines trending objects and the histograms from which they should be extracted.

Defines trending objects related to a subsystem. These objects implement the trending function, as well as specifying the histograms that provide the values for the trending. The plugin function for each subsystem should be of the form define(SYS)TrendingObjects(trending, **kwargs), where (SYS) is the subsystem three letter name, trending is a dict where the new trending objects should be stored, and the other args are reserved for future use.

Parameters:subsystem (str) – The current subsystem in the form of a three letter, all capital name (ex. EMC).
Returns:Keys are the name of the trending objects, while values are the trending objects themselves.
Return type:dict
overwatch.processing.pluginManager.findFunctionsForHist(subsystem, hist)[source]

Determines which functions should be applied to a histogram.

Histogram functions apply additional processing, from extracting values to changing ranges to drawing on top of the histogram. These functions are executed when the histogram is processed. The functions should be stored as function pointers so the lookup doesn’t need to occur every time the histogram container is processed. The plugin functions for each subsystem should be of the form findFunctionsFor(SYS)Histogram(subsystem, hist, **kwargs), where (SYS) is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem, hist (histogramContainer) is the current histogram being processed, and the other args are reserved for future use.

Note

This function must handle all possible histograms for a subsystem, so it is strongly recommended to select them via hist name or another property.

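Parameters:
  • subsystem (subsystemContainer) – Current subsystem container.
  • hist (histogramContainer) – Current histogram being processed.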
Returns:

None.
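
A subsystem plugin function of this form might look like the sketch below. The FakeHist class, its histName and functionsToApply attributes, the labelSupermodules function, and the histogram names are all hypothetical and serve only to show selection by name and storage of function references.

```python
class FakeHist:
    """Hypothetical stand-in for a histogramContainer."""

    def __init__(self, histName):
        self.histName = histName
        self.functionsToApply = []


def labelSupermodules(subsystem, hist, **kwargs):
    """Hypothetical processing function that would draw on the histogram."""


def findFunctionsForEMCHistogram(subsystem, hist, **kwargs):
    # Select by name so that unrelated histograms are left untouched.
    if "Patch" in hist.histName:
        # Store the function object itself; it is called later during processing.
        hist.functionsToApply.append(labelSupermodules)


patchHist = FakeHist("EMCPatchAmplitude")
otherHist = FakeHist("EMCEventCounter")
for hist in (patchHist, otherHist):
    findFunctionsForEMCHistogram(None, hist)
print(len(patchHist.functionsToApply), len(otherHist.functionsToApply))
```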

overwatch.processing.pluginManager.setHistogramOptions(subsystem)[source]

Properly routes histogram options function for each subsystem.

Histogram options include options such as renaming histograms, setting draw options, setting histogram scaling, and/or thresholds, etc. These options must be specific to the histogram object. Canvas options are set elsewhere when actually drawing on the canvas. They cannot be set now because the canvas doesn’t yet exist and we would need to call functions on that object (we prefer not to use function pointers here). Functions should be of the form set(SYS)HistogramOptions(subsystem, **kwargs), where (SYS) is the subsystem three letter name, subsystem (subsystemContainer) is the current subsystem container, and the other args are reserved for future use.

Parameters:subsystem (subsystemContainer) – Current subsystem container
Returns:None.
overwatch.processing.pluginManager.subsystemNamespace(functionName, subsystemName)[source]

Prepend the subsystem name to a function to act as a namespace.

This avoids the possibility of different subsystems with the same function names overwriting each other. Returned function names are of the form SYS_functionName.

Note

Since . indicates an attribute or element of a module, we use an _ instead. Although a . might be nice visually, and is suggestive of the relationship between these functions and the subsystem modules, it causes problems when generating the docs, since the generation treats the . as if it were legitimate Python (which it isn’t, since we don’t have the full path).

Parameters:
  • functionName (str) – Name of the function.
  • subsystemName (str) – The current subsystem in the form of a three letter, all capital name (ex. EMC).
Returns:

Properly formatted function name with the subsystem prepended as a namespace.

Return type:

str
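
The naming scheme amounts to a simple string join, sketched below with a hypothetical standalone function that mirrors the documented signature and the SYS_functionName form.

```python
def subsystem_namespace(functionName, subsystemName):
    """Prepend the subsystem name, separated by an underscore."""
    return "{}_{}".format(subsystemName, functionName)


print(subsystem_namespace("createEMCHistogramGroups", "EMC"))
```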

overwatch.processing.run module

Minimal executable to launch processing.

__main__ is implemented to allow for this function to be executed directly, while run() is defined to allow for execution via entry_points defined in the python package setup.

overwatch.processing.run.run()[source]

Main entry point for starting processAllRuns().

This function will run on an interval determined by the value of processingTimeToSleep (specified in seconds). If the value is 0 or less, the processing will only run once.

Note

The sleep time is defined as the time between when processAllRuns() finishes and when it is started again.

Parameters:None.
Returns:None.
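
The interval behavior described above can be sketched as follows. Passing the processing function and the sleep time as arguments is an assumption made to keep the example self-contained; in the package the sleep time is a configuration value.

```python
import time


def run(processAllRuns, processingTimeToSleep):
    """Call processAllRuns repeatedly, sleeping between the end of one call and the next."""
    while True:
        processAllRuns()
        # A value of 0 or less means the processing only runs once.
        if processingTimeToSleep <= 0:
            break
        # The sleep starts after processing finishes, not on a fixed schedule.
        time.sleep(processingTimeToSleep)


calls = []
run(lambda: calls.append("processed"), processingTimeToSleep=0)
print(len(calls))
```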