# Motivation

![](https://imgs.xkcd.com/comics/data_pipeline.png)

The following sections discuss the **Limitations** of Lima1 and the **Goals** for Lima2.

## Analysis and goals

### Distributed DAQ and processing

##### Analysis

High-performance detectors generate more data than a single backend computer can handle, in terms of both data acquisition and processing. Lima1 being a monolithic (single-process) solution, it does not scale well.

##### Goal

Support high-performance detectors with a horizontally scalable solution. A distributed system, e.g. a cluster of acquisition and processing computers, is considered.

Most high-performance detectors are built by aggregating multiple modules. That means two frame dispatching strategies are worth considering:

* Full frame: each acquisition/processing node receives a full detector frame (including all modules)
* Partial frame: each acquisition/processing node receives data from a single module

The distributed system should be flexible enough to allow a variety of acquisition and processing node topologies, such as:

* Single acquisition computer, multiple processing computers
* Multiple computers performing both acquisition and processing
* Multiple acquisition computers and multiple processing computers

The trivial case of a single computer for both acquisition and processing should also be supported.

### Decoupled DAQ and processing

##### Analysis

With Lima1, acquisition and processing are coupled to the point that user applications need to wait for the processing of the frames of the current acquisition to finish before starting a new acquisition.

![](assets/decoupled_daq_proc1.png)

##### Goal

Allow starting a new acquisition as soon as the camera is ready, without waiting for all the images to go through the full processing pipeline. Eventually, a batch of acquisitions could be started and their associated processing deferred to a later time, e.g. once all the data are in memory.
![](assets/decoupled_daq_proc2.png)

### Coordinate transformation tracking

##### Analysis

With Lima1, the coordinate transformations induced by the processing tasks are not formalized, and it can be tricky to get back to the detector coordinate system when visualizing an image that has been flipped, rotated and cropped, e.g. to set a ROI.

##### Goal

Each processing task provides a matrix representing its [affine transformation](https://en.wikipedia.org/wiki/Affine_transformation#Image_transformation), and those transformations can be combined to compute the coordinates of pixels and ROIs in the detector coordinate system.

### Memory management

##### Analysis

In Lima1, there is no global memory management, in particular for data generated in the processing pipeline (e.g. compressed images before saving). There is only one buffer allocation engine, the hardware buffers; this is inefficient because a lot of memory must be systematically allocated in order to guarantee the latencies of the processing tasks. The "in-place" task functionality is not efficient when the resulting image is smaller than the original one. Hardware buffers might be "expensive" and/or "limited", so they should only serve the first processing stage. Finally, the "asynchronous" data retrieval is not systematically used, so the buffers implementing this functionality should be optional (see "Multiple observation / testing tasks" below).

##### Goal

Buffer sizes at different stages should be dynamically adjusted. The usage of allocators should be formalized and generalized. It would be desirable to formalize the management of different kinds of memory, such as GPU and FPGA memory. Page locking, NUMA (CPU affinity) and TLS might also be useful in high-performance applications.

### Multiband image support

##### Analysis

With Lima1, only grayscale images are fully supported. Color images can be acquired with the so-called Video mode.
##### Goal

Color (video) cameras and cameras with multiple energy-discriminating thresholds (e.g. Dectris EIGER2) should be supported.

### Resource monitoring

##### Analysis

With Lima1, there is no "global" resource (memory or CPU) monitoring.

##### Goal

Keep track of the buffer allocations and report them to the application (control system). For example, this could help prevent a buffer overrun if the scan can be paused.

### Multiple parameter sets

##### Analysis

With Lima1, if parameters are modified while an acquisition is running, the configuration becomes inconsistent with the current acquisition:

```cpp
ct.setBinning(2, 2);
ct.prepareAcq();
ct.startAcq();
ct.setBinning(1, 1);
assert(ct.getBinning() == (2, 2)); // Oops: the running acquisition still uses (2, 2)
```

##### Goal

Provide a data structure for the configuration that has multiple instances: user (before preparation), effective hardware (after preparation) and effective software (after pipeline creation).

### Integration of third-party processing

##### Analysis

With Lima1, adding a new processing task (external operation) is not documented:

```
extOpt = ctControl.externalOperation()
self.__maskTask = extOpt.addOp(Core.MASK, self.MASK_TASK_NAME, self._runLevel)
```

##### Goal

Provide a C++ and Python API to extend the pipeline with custom processes. Libraries like PyFAI (which are implemented on GPU) should be supported. Processes that have a distributed (MPI) implementation should be supported as well.

### Sparse data

##### Analysis

With Lima1, only dense storage is supported for images.

##### Goal

Support sparse storage for grayscale images.

### Incomplete data sets

##### Analysis

With Lima1, the lack of a frame in a sequence is not supported.

##### Goal

Cope with incomplete, sparse data sequences.
The absence of a frame can originate from:

* A fault or overrun condition (detector, processing)
  * Multiple fault tolerance policies should be offered
* Veto mechanisms in data reduction
  * Discard useless data

The cause of the absence of data should be included in the metadata.

### Multiple saving / streaming tasks

##### Analysis

With Lima1, saving supports only images (2D data) and is performed at a fixed stage in the processing pipeline. Lima does not support foreign data structures (1D data or tables) generated by tasks.

##### Goal

Multiple saving / streaming tasks can be inserted at different stages of the image processing pipeline. They can save raw or partially processed images, as well as byproducts of non-2D type like statistics or reduced 1D/0D data. The saving tasks may write to the same I/O (file) or to different ones.

The saving infrastructure should also include metadata, which can be:

* At different moments:
  * Static (detector type, instrument name)
  * Per acquisition (Lima configuration, start timestamp)
  * Per frame (detector timestamp)
  * Asynchronous (monitoring, temperature, processing statistics)
* From different sources:
  * Hardware generated (frame timestamps, temperature, dead-time)
  * Software generated (configuration, processing statistics)

### Multiple observation / testing tasks

A new kind of (sink) task can serve for monitoring and retrieving data asynchronously at defined stages of the processing pipeline. Each observation point acts as a cache of that stage, with configurable history and sampling parameters.

## Implementation improvements

### Explicit state machine

> No more `if (status == READY) && (previousStatus == RUNNING)` soup!

Acquisition and detector have statuses, which is an obvious sign of a state machine, if not multiple concurrent state machines. Using an explicit state machine means that the software is driven by a state transition table, and actions are executed when entering and leaving a state.
### Prefer events to callbacks

Callbacks are executed in the context of the thread that calls them, potentially the acquisition thread. A better approach is to provide a synchronization primitive that applications use to handle events in their own threads.

### Externalize the acquisition thread

The acquisition thread is currently part of the plugin implementation. In other words, it is the responsibility of the plugin to feed the ring buffer. It would be more flexible to have common acquisition thread(s) and leave the specifics of "getting an image" to the plugin.

### Hide vendor SDKs from users

Use the pointer-to-implementation (pimpl) idiom. Check out the [documentation](https://en.cppreference.com/w/cpp/language/pimpl) for a thorough explanation of this pattern.

### Simplify usage

```cpp
Camera cam(parameters);
cam.setBitDepth(FIVE_BIT_AND_A_HALF);
HwInterface hw(cam);
Control ct(hw);
```

is now

```cpp
simulator::Camera cam(parameters);
cam.set_bit_depth(FIVE_BIT_AND_A_HALF);
```

Note that you still have access to camera-specific functionalities.

### Simplify Python bindings

With introspection, most of the bindings can be generated programmatically.

### Reduce boilerplate in the plugin code

Reduce repetitive code using generative programming techniques.

### Use established third-party libraries

Benefit from libraries developed, used and maintained by large communities. Examples are:

* Boost
* Adobe Stlab.Concurrency
* MPI