Version 59 (modified by knpa, 15 years ago)
--

NEODC Data Delivery

This page documents the procedure to follow when delivering data to NEODC.

First, we need some data to send: these should be datasets that are 'completed'.
All sensors need to be delivered and rsynced. Run rsync with --dry-run to check that everything is up to date. Check with whoever processed each sensor if need be.
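
As a rough sketch of the dry-run check, the function below reports whether rsync would still transfer anything between two directories. The function name `delivery_in_sync` is hypothetical, and the source and destination are placeholders, since this page does not say which paths are rsynced.

```shell
# Hypothetical helper: report whether DEST already matches SRC.
# --dry-run makes rsync list what it would do without copying anything;
# --itemize-changes prints one line per item that would change.
delivery_in_sync() {
    src=$1
    dest=$2
    changes=$(rsync --archive --dry-run --itemize-changes "$src"/ "$dest"/) || return 2
    if [ -n "$changes" ]; then
        # Show the pending changes so they can be chased up.
        printf '%s\n' "$changes"
        return 1
    fi
    return 0
}
```

Called as `delivery_in_sync <source dir> <destination dir>`, it returns 0 when nothing is pending, 1 when rsync lists changes, and 2 if rsync itself fails.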

Move workspace project into workspace/being_archived
Prepare the repository version for archiving:
1. Make sure everything is present and where it should be! (see Processing/FilenameConventions for the required layout and name formats)
  Things to look out for: Delivery folders, applanix/rinex data, las files, DEMs.
  Use proj_tidy.sh to highlight any problems:
```
proj_tidy.sh -p <project directory> [-ed]

Options:
-e Extended check (may be verbose)
-d Delivery mode (Checks deliveries only)
```
2. Add a copy of the relevant trac ticket(s); run:
```
mkdir -p admin/trac_ticket
pushd admin/trac_ticket
wget --recursive --level 1 --convert-links --html-extension http://arsf-dan.nerc.ac.uk/trac/ticket/TICKETNUMBER
popd
```
3. Delete the contents of the lev1/ subdirectory, where these are duplicates of the delivery directory.
4. Scan the filesystem for any 'bad' things and fix them:
  1. Delete any unnecessary files: backups of DEMs that weren't used, temp files created by gedit (~ at end of filename), hidden files, etc.
  2. Find all files/dirs with unusual characters (space, brackets, etc):
```
find -regex '.*[^-0-9a-zA-Z/._].*' | ~arsf/usr/bin/fix_naughty_chars.py
```
    This will print suggested commands; check them before running any.
5. Set permissions:
  1. Remove executable bit on all files (except the point cloud filter and the run[aceh] scripts):
```
find -type f -not -wholename '*pt_cloud_filter*' -and -not -regex '.*/run[aceh]/.*sh' -and -perm /a=x -exec chmod a-x {} \;
```
  2. Give everyone read permissions (and execute if it has user execute) for the current directory and below:
```
chmod -R a+rX .
```
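
The checks in steps 4 and 5 can also be previewed read-only, either before making changes or afterwards to confirm the result. The following is a minimal sketch, assuming GNU find. The function name `preflight_scan` and the output tags are hypothetical and not part of the ARSF tooling. Unlike the command in step 4, it only tests the last path component for unusual characters, so a project stored under an oddly named parent is not flagged wholesale.

```shell
# Hypothetical read-only preview of steps 4 and 5; assumes GNU find.
# Prints one tagged line per problem and changes nothing on disk.
preflight_scan() {
    dir=${1:-.}
    # Editor backups (trailing ~) and hidden files.
    find "$dir" -type f \( -name '*~' -o -name '.*' \) -printf 'junk\t%p\n'
    # Names containing anything other than letters, digits, '-', '.', '_'.
    find "$dir" -mindepth 1 -regex '.*/[^/]*[^-0-9a-zA-Z/._][^/]*' -printf 'badname\t%p\n'
    # Files that still carry an executable bit, with the same exemptions as step 5.
    find "$dir" -type f -not -wholename '*pt_cloud_filter*' \
        -not -regex '.*/run[aceh]/.*sh' -perm /a=x -printf 'exec\t%p\n'
    # Anything that is not readable by "other".
    find "$dir" -not -perm -o=r -printf 'unreadable\t%p\n'
}
```

Run as `preflight_scan <project directory>`. Empty output means none of these particular problems were found. It is not a substitute for proj_tidy.sh.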
Create an archive tarball for NEODC to download:

*NOTE*
If AIMMS or GRIMM data are present, you will first need to separate them out and put them into their own tarballs.
Use: tar czf <TARBALL NAME> <DIRECTORY TO TARBALL>
The tarball name should be in the format: GB09_05-2009_278b_Leighton_moss-AIMMS.tar.gz
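
For the AIMMS/GRIMM case, a small wrapper can create the tarball and immediately read it back as a sanity check. This is only a sketch. `make_sensor_tarball` is a hypothetical name, and the use of `-C` (so the archive stores just the directory name rather than its full path) is an assumption about what is wanted, not something this page specifies.

```shell
# Hypothetical wrapper around the tar command in the note above.
# $1 = tarball name (e.g. GB09_05-2009_278b_Leighton_moss-AIMMS.tar.gz)
# $2 = directory to archive
make_sensor_tarball() {
    tarball=$1
    srcdir=$2
    # -C keeps only the directory's own name in the archive, not its full path.
    tar czf "$tarball" -C "$(dirname "$srcdir")" "$(basename "$srcdir")" || return 1
    # Listing the archive confirms it can be read back before it is handed over.
    tar tzf "$tarball" > /dev/null
}
```

Called as `make_sensor_tarball <TARBALL NAME> <DIRECTORY TO TARBALL>`, it returns non-zero if the tarball cannot be created or listed.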
```
su - arsf
cd ~/arsf_data/archived/
./qsub_archiver.sh <path to project in repository>
# (e.g. ~arsf/arsf_data/2008/flight_data/uk/CEH08_01)
# Note: you need to specify dirs at the project level.
# To run the archiving locally rather than via the grid engine, use:
./archive_helper-justdoit.sh ~arsf/arsf_data/2008/flight_data/uk/CEH08_01
```
When complete, this will have dumped the data into ~arsf/arsf_data/archived/neodc_transfer_area/staging/. Check it looks OK then move it up one level so NEODC can rsync it.
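
The "move it up one level" step could be sketched as below. `promote_from_staging` is a hypothetical helper, and the name of the item the archiver leaves in staging/ is not documented here, so it is passed in as an argument. The function refuses to overwrite anything already in the transfer area.

```shell
# Hypothetical helper: move one checked item out of staging/ into the
# directory above it, refusing to overwrite anything already there.
promote_from_staging() {
    staging=$1      # e.g. ~arsf/arsf_data/archived/neodc_transfer_area/staging
    name=$2         # name of the item inside staging/
    if [ ! -e "$staging/$name" ]; then
        echo "not found in staging: $name" >&2
        return 1
    fi
    if [ -e "$staging/../$name" ]; then
        echo "already present in transfer area: $name" >&2
        return 1
    fi
    mv "$staging/$name" "$staging/../$name"
}
```

Inspect the staged data by hand first; the function only performs the move.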
Notify NEODC they can download the data.
When NEODC have the data:
1. Remove it from the transfer area
2. Note on the ticket that it's been sent to NEODC (with the date).
When NEODC confirm they have backed up the data:
1. Move the repository project to non-backed-up space at: ~arsf/arsf_data/archived/<original path from ~arsf/arsf_data/>
  e.g. mv ~arsf/arsf_data/2008/flight_data/uk/CEH08_01/ ~arsf/arsf_data/archived/2008/flight_data/uk/CEH08_01
  You may need to create parent directories if they don't yet exist.
2. Create a symlink to the project in its original location. Point the symlink through ~arsf/arsf_data/archived rather than directly to larsen.
  e.g. ln -s ~arsf/arsf_data/archived/2008/flight_data/uk/CEH08_01 ~arsf/arsf_data/2008/flight_data/uk/CEH08_01
3. Note in ticket that it has been backed up by NEODC and give new data location.
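
Steps 1 and 2 above can be combined into one guarded operation. The sketch below uses a hypothetical function name, `relocate_and_link`, and takes the data root as an argument so it can be tried on a scratch directory first. Pass an absolute root so that the symlink target resolves correctly.

```shell
# Hypothetical helper combining steps 1 and 2.
# $1 = absolute data root (~arsf/arsf_data on the real system)
# $2 = project path relative to that root, e.g. 2008/flight_data/uk/CEH08_01
relocate_and_link() {
    root=$1
    rel=$2
    src="$root/$rel"
    dst="$root/archived/$rel"
    # Refuse to run twice: the source must be a real directory, not a symlink.
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "not a real directory: $src" >&2; return 1; }
    [ ! -e "$dst" ] || { echo "already exists: $dst" >&2; return 1; }
    mkdir -p "$(dirname "$dst")"      # create missing parent directories
    mv "$src" "$dst" || return 1
    ln -s "$dst" "$src"               # link goes via archived/, as step 2 asks
}
```

For the example project this would be `relocate_and_link ~arsf/arsf_data 2008/flight_data/uk/CEH08_01`.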
When NEODC confirm that everything appears to be in order (maybe wait a week):
- Close the ticket
- Delete the workspace copy in being_archived
