
Archiving projects with NEODC

This page documents the procedure to follow when sending data to NEODC.

  1. Choose a project:
    • A project is ready to be archived when all sensors have been delivered.
    • Check the ticket and make sure there is nothing on it that suggests something still needs to be done with the dataset (ask the processor if in doubt).
  2. Record in ticket that you are beginning to archive this flight.
  3. If there is a workspace version, move it into workspace/being_archived (pre-2011 only)
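    A possible invocation, assuming the workspace sits under ~arsf/workspace (the path is an assumption; adjust to the actual layout):
      mv ~arsf/workspace/<project directory> ~arsf/workspace/being_archived/
      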
  4. Prepare the repository version for archiving:
    1. Run proj_tidy.sh to highlight any problems:
      proj_tidy.sh -p <project directory> -c
      
      Check the output. Delete hidden (except .htaccess and .htpasswd), temporary, and broken files. Fix incorrect file name formats and any other obvious errors. Everything in the delivery should be close to perfect, but don't worry too much about things in the main project directory.
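      One illustrative way to list hidden files for review before deleting anything (not part of proj_tidy.sh):
      find . -type f -name '.*' -not -name '.htaccess' -not -name '.htpasswd'
      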
    2. Make sure that the sensor data in the delivery and the raw data (including navigation) are present. Cast a quick eye over the rest of the deliveries.
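      As a rough sanity check, eyeballing directory sizes can reveal missing data; the exact layout varies by project, so treat this as a sketch:
      du -sh */ | sort -h
      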
    3. Remove unwanted large files. The project should have been cleaned up by the processor, but often large files remain that are not needed. In particular, there are sometimes duplicates in processing/<sensor> of data already included in the delivery. Free up as much space as possible by deleting unwanted large files, but don't delete anything in processing/kml_overview.
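      A sketch for spotting candidates (the size threshold and paths are illustrative):
      du -sh processing/* | sort -h
      find . -type f -size +1G -not -path '*kml_overview*' -exec ls -lh {} +
      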
    4. Add a copy of the relevant trac ticket(s); run:
      mkdir -p admin/trac_ticket
      pushd admin/trac_ticket
      wget --recursive --level 1 --convert-links --html-extension http://arsf-dan.nerc.ac.uk/trac/ticket/TICKETNUMBER
      popd
      
    5. Set permissions:
      1. Remove the executable bit on all files (except the point cloud filter and the run[aceh] scripts):
        Note - if you are processing 2011 or later data, you will need to run the commands below as both arsf and airborne
        find -type f -user `whoami` -not -wholename '*pt_cloud_filter*' -and -not -regex '.*/run[aceh]/.*sh' -and -perm /a=x -exec chmod a-x {} \;
        
      2. Give everyone read permission (plus execute on directories and on files that are already executable) for the current directory and below:
        find -user `whoami` -exec chmod a+rX {} \;
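
        As an optional check (not part of the original commands), list anything still missing world-read; this should print nothing:
        find -user `whoami` -not -perm -o=r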
        
    6. If there are multiple deliveries for a sensor, put all but the newest version in a subdirectory called previous_deliveries. Make sure the newest version is a complete delivery; fill in any missing data with hardlinks using cp -nl if necessary (see the sketch below). Ask for help if unsure.
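      A sketch of the idea; the directory names are purely illustrative and the exact locations depend on the project layout:
      mkdir -p previous_deliveries
      mv <older delivery directory> previous_deliveries/
      cp -nlr previous_deliveries/<older delivery directory>/<missing file or directory> <newest delivery directory>/
      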
    7. Make sure the delivery directory is in the top level (otherwise use finalise_delivered_data.sh)
  5. Notify NEODC that they can download the data (current contact: wendy.garland@…; cc knpa).
  6. Record in the ticket that it has been submitted to NEODC and include the date.
  7. When NEODC confirm they have backed up the data:
    1. Note in ticket that it has been backed up by NEODC.
    2. Change the status page entry to "archived" for the relevant sensors (for pre-2011 HS, only do this once both the original and the reprocessed HS have been archived).
    3. If workspace version present, delete from being_archived.
    4. Add the flight, in the appropriate format and position, to ~arsf/archived_flights.txt.
    5. If all sensors have been archived (including reprocessing) then close the ticket. Otherwise note why the ticket is being left open.