Version 63 (modified by emca, 14 years ago)

NEODC Data Delivery

This page documents the procedure to follow when delivering data to NEODC.

First, we need some data to send: these should be datasets that are 'completed'.
All sensors need to be delivered and rsynced. Run rsync with --dry-run to check that everything is up to date, and check with whoever processed each sensor if need be.
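
As an illustration of the dry-run check, here is a minimal sketch. The function names and the two directory arguments are hypothetical; substitute whichever source and destination copies you are comparing. It assumes rsync is installed and copies nothing.

```shell
# Sketch only: report whether DEST is already up to date with SRC.
# SRC and DEST are placeholders, not paths taken from this page.
# Nothing is copied: --dry-run only lists what rsync would transfer.
pending_changes() {
    src=$1
    dest=$2
    # --itemize-changes prints one line per item that differs;
    # -a compares recursively and includes times and permissions.
    rsync -a --dry-run --itemize-changes "$src"/ "$dest"/
}

is_synced() {
    # Empty dry-run output means nothing is waiting to be transferred.
    [ -z "$(pending_changes "$1" "$2")" ]
}
```

For example, `is_synced <source dir> <destination dir> || pending_changes <source dir> <destination dir>` prints the outstanding items only when something is out of date.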

  1. Move workspace project into workspace/being_archived
  2. Prepare the repository version for archiving:
    1. Make sure everything is present and where it should be! (see Processing/FilenameConventions for the required layout and name formats)
      Things to look out for: Delivery folders, applanix/rinex data, las files, DEMs
      Use proj_tidy.sh to highlight any problems:
      proj_tidy.sh -p <project directory> [-ed]
      
      Options:
      -e Extended check (may be verbose)
      -d Delivery mode (Checks deliveries only)
      
    2. Add a copy of the relevant trac ticket(s); run:
      mkdir -p admin/trac_ticket
      pushd admin/trac_ticket
      wget --recursive --level 1 --convert-links --html-extension http://arsf-dan.nerc.ac.uk/trac/ticket/TICKETNUMBER
      popd
      
    3. Scan the filesystem for any 'bad' things and fix them:
      1. Delete any unnecessary files: backups of DEMs that weren't used, temp files created by gedit (~ at end of filename), hidden files, duplicates in the lev1 dir, etc.
      2. Find all files/dirs with unusual characters (space, brackets, etc), ignoring the admin/trac_ticket folder:
        find . -path './admin/trac_ticket' -prune -o -regex '.*[^-0-9a-zA-Z/._].*' -print | ~arsf/usr/bin/fix_naughty_chars.py
        
        This prints suggested commands; check them before running any of them.
    4. Set permissions:
      1. Remove executable bit on all files (except the point cloud filter and the run[aceh] scripts):
        find -type f -not -wholename '*pt_cloud_filter*' -and -not -regex '.*/run[aceh]/.*sh' -and -perm /a=x -exec chmod a-x {} \;
        
      2. Give everyone read permission (plus execute on directories and on any file that already has an execute bit set) for the current directory and below:
        chmod -R a+rX .
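
Before moving on, the clean-up and permission steps above can be double-checked with a read-only pass. This is a minimal sketch, assuming GNU find and that it is run from the project directory; the function names are hypothetical and nothing is renamed or modified.

```shell
# Sketch only: read-only checks to run from the project directory
# after the clean-up and permission steps.

# List names that still contain characters outside -0-9a-zA-Z/._
# (admin/trac_ticket is skipped, as in the clean-up step).
list_unusual_names() {
    find . -path './admin/trac_ticket' -prune -o \
        -regex '.*[^-0-9a-zA-Z/._].*' -print
}

# List anything that is not world-readable, plus directories that are
# not world-searchable; "chmod -R a+rX ." should leave this empty.
list_unreadable() {
    find . \( ! -perm -o=r -o -type d ! -perm -o=x \) -print
}
```

If both functions print nothing, the names and read permissions are consistent with the steps above. The executable-bit rule is not re-checked here.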
        
  3. Create an archive tarball for NEODC to download:

    *NOTE*
    If AIMMS or GRIMM data is present, you will first need to separate them out and put them into separate tarballs.
    Use: tar czf <TARBALL NAME> <DIRECTORY TO TARBALL>
    Tarball name should be in the format: GB09_05-2009_278b_Leighton_moss-AIMMS.tar.gz
    Also create an md5sum file for each AIMMS/GRIMM tarball.
    Use: md5sum <TARBALL NAME> > <TARBALL NAME>-MD5SUM.txt

    To create the main archive, run:
    su - arsf
    cd ~/arsf_data/archived/
    ./qsub_archiver.sh <path to project in repository>
    (e.g. ~arsf/arsf_data/2008/flight_data/uk/CEH08_01)
    # Note you need to specify dirs at the project level
    # To run the archiving locally rather than via the grid engine, use:
    ./archive_helper-justdoit.sh ~arsf/arsf_data/2008/flight_data/uk/CEH08_01
    
    When complete, this will have dumped the data into ~arsf/arsf_data/archived/neodc_transfer_area/staging/. Check it looks OK then move it up one level so NEODC can rsync it.
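
For the AIMMS/GRIMM note in this step, the tarball and checksum commands can be wrapped together. This is a minimal sketch, assuming GNU tar and md5sum; the function name is hypothetical and both arguments are placeholders.

```shell
# Sketch only: tar up one directory and write its checksum file.
# $1 is the tarball name, $2 the directory to tar (for example the
# separated AIMMS or GRIMM data). Both are placeholders.
make_tarball_with_md5() {
    tarball=$1
    dir=$2
    tar czf "$tarball" "$dir" || return 1
    # The redirect matters: without ">" md5sum would treat the .txt
    # name as a second input file instead of writing the checksum to it.
    md5sum "$tarball" > "${tarball}-MD5SUM.txt" || return 1
    # Verify straight away; -c re-reads the tarball named in the file.
    md5sum -c "${tarball}-MD5SUM.txt"
}
```

Run it from the directory that will hold the tarball, since the checksum file records the tarball name exactly as given on the command line.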

  4. Notify NEODC they can download the data.
  5. When NEODC have the data:
    1. Remove it from the transfer area
    2. Note on the ticket that it has been sent to NEODC (with the date).
  6. When NEODC confirm they have backed up the data:
    1. Move the repository project to non-backed-up space at: ~arsf/arsf_data/archived/<original path from ~arsf/arsf_data/>
      e.g. mv ~arsf/arsf_data/2008/flight_data/uk/CEH08_01/ ~arsf/arsf_data/archived/2008/flight_data/uk/CEH08_01
      You may need to create parent directories if they don't yet exist.
    2. Create a symlink to the project in its original location. Point the symlink through ~arsf/arsf_data/archived rather than directly to larsen.
      e.g. ln -s ~arsf/arsf_data/archived/2008/flight_data/uk/CEH08_01 ~arsf/arsf_data/2008/flight_data/uk/CEH08_01
    3. Note on the ticket that it has been backed up by NEODC and give the new data location.
  7. When NEODC confirm that everything appears to be in order (maybe wait a week):
    • Close the ticket
    • Delete the workspace copy in being_archived
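
The move-and-symlink part of step 6 can be sketched as a single function. This is an illustration only: the function name and its two arguments are hypothetical, and it refuses to run if the project is already a symlink or the archived copy already exists.

```shell
# Sketch only: move a project under archived/ and leave a symlink behind.
# $1 is the data root (~arsf/arsf_data in the text), $2 the project path
# relative to it (e.g. 2008/flight_data/uk/CEH08_01).
archive_and_link() {
    root=$1
    rel=${2%/}                      # drop a trailing slash, if any
    src=$root/$rel
    dest=$root/archived/$rel

    [ -d "$src" ] && [ ! -L "$src" ] || { echo "not a real directory: $src" >&2; return 1; }
    [ ! -e "$dest" ] || { echo "already exists: $dest" >&2; return 1; }

    mkdir -p "$(dirname "$dest")" || return 1   # create missing parents
    mv "$src" "$dest" || return 1
    # Link through archived/ rather than to the underlying storage.
    ln -s "$dest" "$src"
}
```

With the example from step 6 this would be called as `archive_and_link ~arsf/arsf_data 2008/flight_data/uk/CEH08_01`, and only after NEODC have confirmed their backup.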