#!/usr/bin/env python
# coding: utf-8

# # Single Exposure Processing
#
# This notebook walks you through the single-exposure processing pipeline on JupyterLab. It builds on the first two hands-on tutorials in the LSST ["Getting started" tutorial series](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorial) and is intended for anyone getting started with the LSST Science Pipelines for data processing.
#
# The goal of this tutorial is to set up a Butler for a simulated LSST data set and to run the `processCcd.py` pipeline task to produce reduced images.

# ## Setting up the data repository
#
# Sample data for this tutorial comes from the `twinkles` LSST simulation and is available in a shared directory on JupyterLab. We will make a copy of the input data in our current directory:

# In[ ]:

get_ipython().system('if [ ! -d DATA ]; then cp -r /project/shared/data/Twinkles_subset/input_data_v2 DATA; fi')

# Inside the data directory you'll see a directory structure that looks like this:

# In[ ]:

get_ipython().system('ls -lh DATA/')

# The Butler uses a mapper to find and organize data in a format specific to each camera. Here we're using the `lsst.obs.lsstSim.LsstSimMapper` mapper for the Twinkles simulated data:

# In[ ]:

get_ipython().system('cat DATA/_mapper')

# All of the relevant images and calibrations have already been ingested into the Butler for this data set.

# ## Reviewing what data will be processed
#
# We'll now process individual raw LSST simulated images in the Butler `DATA` repository into calibrated exposures. We'll use the `processCcd.py` command-line task to remove instrumental signatures with dark, bias and flat field calibration images. `processCcd.py` will also use the reference catalog to establish a preliminary WCS and photometric zeropoint solution.
#
# First we'll examine the set of exposures available in the Twinkles data set using the Butler.

# Now we'll do a similar thing using `ProcessEimageTask` from the LSST pipeline.
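# The mapper named in the `_mapper` file above is what translates a data ID into camera-specific file locations inside the repository. The toy sketch below illustrates the idea with a hypothetical path template; the real `lsst.obs.lsstSim.LsstSimMapper` uses its own policy-defined layout.

```python
# Toy illustration of what a camera mapper does: render a repository-relative
# path from a data ID. EIMAGE_TEMPLATE is hypothetical, purely for illustration.
EIMAGE_TEMPLATE = 'eimage/v{visit}-f{filter}/R{raft}/S{sensor}.fits.gz'

def map_eimage(data_id):
    """Render a repository-relative path for an e-image data ID."""
    return EIMAGE_TEMPLATE.format(**data_id)

# Example data ID values (hypothetical):
print(map_eimage({'visit': 840, 'filter': 'r', 'raft': '2,2', 'sensor': '1,1'}))
# eimage/v840-fr/R2,2/S1,1.fits.gz
```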
# **There is a bit of ugliness here: the `processEimage.py` command-line script is only Python 2 compatible, so we need to pass the arguments through the Python API instead. This has the nasty habit of calling `sys.exit` after parsing the arguments.**

# In[ ]:

from lsst.obs.lsstSim.processEimage import ProcessEimageTask

# In[ ]:

args = 'DATA --rerun process-eimage --id filter=r --show data'
ProcessEimageTask.parseAndRun(args=args.split())

# BUG: the command above exits early due to a namespace problem:
#
#     /opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/15.0/python/lsst/pipe/base/argumentParser.py in parse_args(self, config, args, log, override)
#         628
#         629         if namespace.show and "run" not in namespace.show:
#     --> 630             sys.exit(0)

# The important arguments here are `--id` and `--show data`.
#
# The `--id` argument allows you to select datasets to process by their data IDs. Data IDs describe individual datasets in the Butler repository. Datasets also have types, and each command-line task will only process data of certain types. In this case, `processEimage.py` processes raw simulated e-images **(need more description of e-images)**.
#
# In the above command, the `--id filter=r` argument selects data from the r filter. Specifying `--id` without any arguments acts as a wildcard that selects all raw-type data in the repository.
#
# The `--show data` argument puts `processEimage.py` into a dry-run mode that prints to standard output the list of data IDs that would be processed according to the `--id` argument, rather than actually processing the data.
#
# Notice the keys that describe each data ID, such as visit (the exposure identifier), raft (a specific LSST camera raft), sensor (an individual CCD on a raft) and filter, among others. With these keys you can select exactly what data you want to process.
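# The `--id key=value` selector syntax described above can be illustrated with a toy parser. This is a pure-Python sketch for intuition only; the real parsing (including ranges and multiple values) is done by `lsst.pipe.base.argumentParser`.

```python
def parse_id_args(tokens):
    """Toy parser: turn ['filter=r', 'visit=230'] into a data ID dict.
    An empty token list acts as a wildcard matching every dataset."""
    data_id = {}
    for token in tokens:
        key, _, value = token.partition('=')
        data_id[key] = value
    return data_id

print(parse_id_args(['filter=r']))               # {'filter': 'r'}
print(parse_id_args(['filter=r', 'visit=230']))  # {'filter': 'r', 'visit': '230'}
print(parse_id_args([]))                         # {} -> wildcard: select everything
```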
# Next we perform the same query directly with the Butler:

# In[ ]:

import lsst.daf.persistence as dafPersist

# In[ ]:

butler = dafPersist.Butler(inputs='DATA')
butler.queryMetadata('eimage', ['visit', 'raft', 'sensor', 'filter'], dataId={'filter': 'r'})

# ## Processing data
#
# Now we'll move on to actually processing some of the Twinkles data. To do this, we remove the `--show data` argument:

# In[ ]:

args = 'DATA --rerun process-eimage --id filter=r'
ProcessEimageTask.parseAndRun(args=args.split())
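# The selection logic behind the `queryMetadata` call above can be mimicked with plain dictionaries: keep every dataset whose entries match a partial data ID, and return the requested keys. This is an illustrative stand-in, not the Butler API, and the data ID values below are made up.

```python
def query_metadata(datasets, keys, data_id=None):
    """Toy stand-in for Butler.queryMetadata: return tuples of the requested
    keys for every dataset matching the partial data ID."""
    data_id = data_id or {}
    return [tuple(d[k] for k in keys)
            for d in datasets
            if all(d.get(k) == v for k, v in data_id.items())]

# Hypothetical entries shaped like the Twinkles data IDs:
datasets = [
    {'visit': 230, 'raft': '2,2', 'sensor': '1,1', 'filter': 'r'},
    {'visit': 231, 'raft': '2,2', 'sensor': '1,1', 'filter': 'g'},
]
print(query_metadata(datasets, ['visit', 'filter'], {'filter': 'r'}))
# [(230, 'r')]
```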