Version 25 (modified by asm, 6 years ago)

Generating calibration files for Hyperspectral data

Radiometric, wavelength and miscellaneous other calibration of the Fenix hyperspectral instrument is undertaken in the calibration room at NERC-ARF Operations (now at BAS) and has been carried out at least annually since 2014. This page describes how to use the calibration software written by NERC-ARF-DAN to create the necessary calibration files from the raw calibration data. All progress should be recorded on a trac ticket (https://nerc-arf-dan.pml.ac.uk/trac/newticket) and added to the list below.

2015: https://nerc-arf-dan.pml.ac.uk/trac/ticket/594
2016: https://nerc-arf-dan.pml.ac.uk/trac/ticket/613
2017 loan Fenix sensor (ID 350006R): https://nerc-arf-dan.pml.ac.uk/trac/ticket/615
2018 old Fenix sensor; new detector array (so a new sensor in practice): https://nerc-arf-dan.pml.ac.uk/trac/ticket/627

Setup

Copy all the data, together with the spreadsheet describing it, to the appropriate location. Open a new ticket and make a few preliminary notes. Check that all the data is correct and that all files have darkframes recorded, and note on the ticket if anything went wrong, any data was not recorded, or anything does not make sense. Later on you will need to specify the data directory structure in the config file; the default is:

data
   |-d1 (Number of the day)
   |    |-fenix
   |    |     |-raw data of each lampfiles
   |    |     |-radiometric raw files   
   |    |-owl
   |   (...)
   |-d2
(...)

You can change the structure, but you will then need to specify it in the config file. For the output (you will need to create the directories yourself), the suggested structure is:

processed
   |-d1 (Number of the day)
   |    |-fenix/
   |    |     |-anchors/
   |    |     |-average/
   |    |     |-dc/
   |    |     |-output/
   |    |     |-smile/
   |    |     |-split/
   |    |-owl/
   |   (...)
   |-d2
(...)
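The output directories above must exist before the scripts are run. A minimal sketch for creating the Fenix tree for one day (the base path and day number are assumptions; adjust to your layout):

```python
import os

# Hypothetical base path for one day's Fenix output; adjust as needed.
base = os.path.join("processed", "d1", "fenix")

# Subdirectories suggested above for the Fenix calibration outputs.
for sub in ("anchors", "average", "dc", "output", "smile", "split"):
    # exist_ok=True makes the script safe to re-run
    os.makedirs(os.path.join(base, sub), exist_ok=True)
```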

Relevant scripts are held in the internal 'libarsfcal' git repository

Note that the various calibration scripts written in Python expect to import a package called "libarsfcal". This means the directory above the libarsfcal directory must be in your PYTHONPATH environment variable for the scripts to run. This is set by default, but if you are making changes to the scripts you need to prepend the development version to your PYTHONPATH:

  export GITCHECKOUT=~/scratch_network/git/
  export PYTHONPATH=$GITCHECKOUT:$PYTHONPATH

You can check the location of the library you are using with:

python -c "import libarsfcal;print(libarsfcal.__file__)"

Config file

The calibration scripts require a config file. An example config file can be found under libarsfcal/supplementary. Keys must be in the format key=value, one per line. The file must contain a [DEFAULT] heading holding default values, plus [eagle], [hawk] and [fenix] headings holding keys for the Eagle, Hawk and Fenix calibrations respectively.

Keys required under [DEFAULT]:

  • day=<name of the directory containing the data> (usually the day number plus the gain level, if any; for example: day2)
  • aplcal=<path to aplcal> (usually just: aplcal)
  • fast_bil_median=<path to fast_bil_median> (%(scriptdir)s/fast_bil_median)
  • fityk=<path to cfityk> (usually just: cfityk)
  • linefile=<path to CSV file containing the known wavelengths of the spectral peaks produced by each lamp> (can be edited for new lamps. Use the Mylar filter only to check that the SWIR spectrum is correct, comparing against a scale generated without it; the Mylar filter is only accurate to 1nm and inflates the FWHM beyond its real value, so do not use it for the final version)
  • lamp_index_file=<path to CSV file matching data file names to spectral lamps, usually named lamp_lookup.csv>
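Putting the keys above together, a [DEFAULT] section might look like the following (all paths and the day name are placeholders, not values from a real calibration):

```ini
[DEFAULT]
day=day2
aplcal=aplcal
fast_bil_median=%(scriptdir)s/fast_bil_median
fityk=cfityk
linefile=/path/to/spectral_lines.csv
lamp_index_file=/path/to/lamp_lookup.csv
```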

An example of the lamp_index_file is:

N16F3210,O
N16F3204,He 

The lab usually tries to use the same naming format for the data, but you should check that everything is correct and edit the file if needed. Example files for linefile, lamp_index_file and lampfile are all under libarsfcal/supplementary.

Keys required under [eagle], [hawk] and [fenix] (you only need to edit the section you are going to use):

  • number_of_pixels=<number of spectral pixels on sensor>
  • rawdir=<Directory containing raw data files>
  • splitdir=<Directory in which to place split light/dark files>
  • darkdir=<Directory in which to place dark corrected files>
  • averagedir=<Directory in which to place averaged files>
  • outdir=<Directory in which to place output files> (you will need to create all of these directories beforehand)

Extra keys required for [fenix]:

  • sample_hdr_file=<path to header file containing the old wavelength scale>
  • lampfile=<path to CSV file containing the integrating sphere response for this sensor> (a CSV file recalibrated by the physics lab every two years; keep it up to date)
  • lampfile_with_filter=<path to CSV file containing the integrating sphere response through the blue filter for this sensor> (if available this improves the SWIR lines; if not used, specify the same file as lampfile)
  • raw_radcal_file=<path to raw file to start from when creating the radiometric calibration file> (uniform)
  • filter_radcal_file=<path to blue-filtered raw file to start from when creating the radiometric calibration file> (if not used, make a copy of the unfiltered file under a different name and point to it here)
  • spectrum_break_band=<last pixel in VNIR>
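And a corresponding [fenix] section, using the suggested directory structure from the setup section (every value here is a placeholder; check the example config under libarsfcal/supplementary for real settings):

```ini
[fenix]
number_of_pixels=<number of spectral pixels>
rawdir=data/d1/fenix
splitdir=processed/d1/fenix/split
darkdir=processed/d1/fenix/dc
averagedir=processed/d1/fenix/average
outdir=processed/d1/fenix/output
sample_hdr_file=/path/to/sample.hdr
lampfile=/path/to/sphere_response.csv
lampfile_with_filter=/path/to/sphere_response_filtered.csv
raw_radcal_file=/path/to/uniform.raw
filter_radcal_file=/path/to/uniform_filtered.raw
spectrum_break_band=<last pixel in VNIR>
```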

Wavelength calibration

This must be run first, since the radiometric calibration needs a wavelength calibration to tie the lamp data into. Before doing anything, make sure your data is OK and has darkframes recorded (do not use files without darkframes). To run the wavelength calibration you will use "cal_spectral.py". Note that for Eagle and Fenix you may need to re-run gen_bandsets.py (see "Other scripts", below) with different settings in order to obtain bandset files for different spectral binning settings.

Master script: cal_spectral.py

Basic usage requires only the name of the config file as an argument. For the Fenix, in the typical case you will need:

cal_spectral.py config_file -s fenix --hwhm

This specifies that the sensor is the Fenix and that the widths recorded in the header file are actually HWHM rather than FWHM values.

You may also want to use the following arguments:

  • -v: Verbose. Prints much more output, which makes it easier to see where things are going wrong.
  • -s: Specify sensor. Default is Eagle.
  • -x: Comma-separated regex list. Files matching these regexes will not be processed. This may be useful if there are a lot of data files in the specified raw directory (or later stages) that are not relevant to the wavelength calibration.
  • -t: Specify a calibration stage to start from. Useful to skip the long-running early stages if you don't need to (for example) split the raw files again when re-running.
  • -p: Pauses calibration between stages so output can be easily checked.
  • --hwhm: Generates fityk scripts using hwhm rather than fwhm.
  • --offset2: Specifies a starting offset for Fenix's SWIR. Useful if the wavelength offset is large (>1nm).
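As an illustration, a verbose Fenix run that pauses between stages and excludes some irrelevant raw files might look like the following (the config file name and the exclusion regex are hypothetical; adapt them to your data):

```shell
cal_spectral.py config_file -s fenix --hwhm -v -p -x "N16F32(08|09).*"
```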

Other scripts

The master script calls the following scripts. If the master script is failing you may need to bug-fix one of these, or they can be run individually - this gives more control and flexibility at several stages, but is more complicated.

  • common.py: Contains functions that are common to more than one calibration stage
  • data_handler.py: Contains functions for handling BIL files
  • split_autodark.py: Splits Eagle or Hawk data files containing autodark lines into a light file and a dark file.
  • spatial_average.py: Calculates the average of each line for each band
  • findbaseline.py: Finds the baseline data value to remove prior to peak fitting
  • gen_fityk_script.py: Generates fityk scripts to do peak fitting
  • lookup_lamp.py: Looks up which type of spectral lamp was used for a given data file
  • checkpeaks.py: Checks spectral peaks found by fityk against known spectral lines and works out corrected pixel numbers
  • combine_anchors.py: Combines multiple spectral anchor files into one
  • minimise_error.py: Minimises the RMS error of a list of calibration anchor points against known spectral lines
  • wavelength_scale.py: Generates a wavelength scale for a given spectral dataset with wavelength anchor points.
  • gen_bandsets.py: Generates new bandset (.prn, .bnd and .wls) files from the results of wavelength calibration

Wavelength scale checking

The generated wavelength scale may be checked against the lamp lines to determine the error in each matching band and its FWHM. The output of this test is recorded in the data analysis report. Any unrealistic FWHMs and any incorrectly matched peaks should be removed from the fityk output peak files and the wavelength scale regenerated.

Master script: calc_peak_diffs.py

Basic usage of this requires the following arguments:

  • -a: Final anchor file produced during wavelength calibration.
  • -l: Spectral lines file.

When you have created the final wavelength scale that passes the lamp line test, record the parameters on the ticket: in particular the data set (day), the sample_hdr_file and linefile from the config file, and any optional parameters used with cal_spectral.py. Also record the final anchors, and check that the VNIR and SWIR starting wavelengths are realistic. Both the new wavelength scale and the offsets output by the script are generated from the raw files. Because the previous calibration wavelengths are not currently set in the raw data files, you may need to set the SWIR offset to the previous offset to get good results.

Radiometric calibration

This can either be run after the wavelength calibration in order to obtain a cal using an up-to-date wavelength calibration, or it can be run using an old wavelength calibration.

Master script: cal_radiometric.py

Basic usage of this requires only the name of the config file as an argument. You may also want to use the following arguments:

  • -v: Verbose. Prints much more output, which makes it easier to see where things are going wrong.
  • -s: Specify whether Eagle or Hawk. Required to run Hawk calibration.
  • -t: Specify a calibration stage to start from. Useful to skip the long-running early stages if you don't need to (for example) split the raw files again when re-running.
  • -p: Pauses calibration between stages so output can be easily checked.
  • -w: Specifies a wavelength scale file to use (i.e. the output from the wavelength calibration). You should not need this if you have just run the wavelength cal, but you might if there are multiple wavelength cal files in the output directory or if you are running from an old cal.
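For example, a verbose Fenix radiometric calibration run against a specific wavelength scale from an earlier wavelength calibration might look like this (the config and wavelength scale file names are hypothetical):

```shell
cal_radiometric.py config_file -s fenix -v -w processed/d1/fenix/output/wavelength_scale.wls
```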

Other scripts

The master script calls the following scripts. If the master script is failing you may need to bug-fix one of these, or they can be run individually - this gives more control and flexibility at several stages, but is more complicated.

  • common.py: Contains functions that are common to more than one calibration stage
  • data_handler.py: Contains functions for handling BIL files
  • split_autodark.py: Splits Eagle or Hawk data files containing autodark lines into a light file and a dark file.
  • spatial_average.py: Calculates the average of each line for each band
  • replace_wavscale.py: Replaces wavelength scale in a given Eagle/Hawk header file with a new wavelength scale
  • gen_radcal.py: Generates calibration multipliers given an appropriate BIL input file.

Analysis

There are scripts to generate analysis graphs and files under libarsfcal/analysis. These should be run on the calibration output; please add to them if you think of more tests.

Additional needed steps

You will also need to create the files for the different binnings using gen_bandsets.py, as well as the bad pixel file: https://nerc-arf-dan.pml.ac.uk/trac/wiki/ProcessingAtDAN/hyper_cal/bad_pixels

Data quality report

The data quality report must be updated for each calibration using the outputs produced by calc_peak_diffs.py and comparison.py.

If not already saved, generate a text file with the spectral offsets, using the output option to save to file:

calc_peak_diffs.py -a <anchor file> -l <spectral line file> -o results/fenix_spectral_offsets_sept.txt

Important note: calc_peak_diffs.py calculates the mean over both the SWIR and VNIR detectors together; this should be done separately for each detector. The error mean is also wrong: it is calculated as the arithmetic mean, but it should be the mean of the absolute values of the errors. Use the 2018 hyperspectral data quality report, which has these issues corrected, as a guide. If the code has not been updated and fixed, you can do the extra calculations in a spreadsheet.

Then create the comparison plots, e.g.:

comparison.py processed_sept_dac/d3-1/fenix/output/fenix_201509.cal ~/calibration/2014/fenix/fenix_201405.cal -s fenix --notitle

Finally, update the text and figures in arsfhome/documents/data_quality_reports, submit to code review and generate a new report.