Guidelines for Red Hen Developers

Introduction

These guidelines are for people who join or create a Red Hen project team, especially one that involves coding.

Standard Operating Guidelines

Networking your data, activity, and code

  • The overarching goal is to place the code you develop into production in a Red Hen pipeline, and to place the data you use into a networked resource list. Your work should be available to other and future Red Hens. Some specifics:
    • A complete instance of the Red Hen dataset is present on the server Gallina, which is located inside Case Western Reserve University's High-Performance Computing Cluster; this is where Red Hen builds its main information processing pipelines.
    • To contribute effectively to Red Hen, you will probably need to establish an account as a member of Mark Turner's mbt8 HPC team at Case. We expect your functional code to reside there and to be either in production or available for further development by your team or future members of the team.
    • Except for special cases involving confidentiality, proprietary restrictions, etc., your data must be either generally available, with instructions for locating it, or stored on a Red Hen server (e.g. Gallina) where other Red Hens can reach it.
To navigate in the Gallina tree, add this function to your ~/.bashrc file:

        # Move to the tv storage directory for a date, a month, or N days ago
        function day () {
         local DAY DIR TV=/mnt/rds/redhen01/redhen/tv
         DAY=${1:0:10}
         if [ -z "$1" ] ; then DAY=0                                  # no argument: today
          elif [[ "$1" =~ ^[0-9]+$ ]] ; then DAY="$1"                 # N days ago
          elif [ "${#1}" -eq 7 ] ; then cd "$TV/${1%-*}/$1" ; DAY=""  # YYYY-MM: month directory
          elif [ "$1" = "here" ] ; then DAY="${PWD##*/}"              # date of the current directory
            DAY=$(( ( $(date +%s) - $(date -ud "$DAY" +%s) ) / 86400 ))
          elif [ "$1" = "+" ] ; then DAY="${PWD##*/}"                 # $2 days later than the current directory
            DAY=$(( ( $(date +%s) - $(date -ud "$DAY" +%s) ) / 86400 - $2 ))
          elif [ "$1" = "-" ] ; then DAY="${PWD##*/}"                 # $2 days earlier than the current directory
            DAY=$(( ( $(date +%s) - $(date -ud "$DAY" +%s) ) / 86400 + $2 ))
          elif [ "${#DAY}" -eq 10 ] ; then                            # YYYY-MM-DD
            DAY=$(( ( $(date +%s) - $(date -ud "$DAY" +%s) ) / 86400 ))
          else echo "$1?" ; DAY=""                                    # unrecognized argument
         fi
         if [ -n "$DAY" ] ; then DIR="$TV/$(date -ud "-$DAY day" +%Y/%Y-%m/%F)"
           if [ -d "$DIR" ] ; then cd "$DIR" ; else echo "No $DIR" ; fi
         fi
        }

Save the file and issue "source ~/.bashrc" to activate. To go to a particular day, issue "day" with the date or the number of days ago:

    day 2018-02-04
    day 4
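The function resolves a date to a directory of the form year/year-month/full-date under the tv tree. As a minimal sketch of that mapping, the helper below only prints the path for a given date and does not change directory. The name tv_dir and its arguments are hypothetical, introduced here for illustration; they are not part of Red Hen's tooling.

```shell
# Hypothetical helper (illustration only): print the day directory for a
# date in YYYY-MM-DD form under a given root, using the
# year/year-month/date layout that the day function relies on.
tv_dir () {
  local root="$1" d="$2"
  # ${d:0:4} is the year, ${d:0:7} is the year-month prefix
  printf '%s/%s/%s/%s\n' "$root" "${d:0:4}" "${d:0:7}" "$d"
}

tv_dir /mnt/rds/redhen01/redhen/tv 2018-02-04
# prints /mnt/rds/redhen01/redhen/tv/2018/2018-02/2018-02-04
```

A path-printing helper like this can be handy in scripts that should read files from a given day without changing the working directory.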

To move forward or back a number of days relative to the day directory you are currently in, use:

    day + 5
    day - 30

You can also use this in a loop, for instance:

    module load ffmpeg
    for DAY in {08..31} ; do day 2018-01-$DAY ; for FIL in *_CN_*.txt ; do echo "$FIL" ; grep 'DUR|' "$FIL" ; ffprobe "${FIL%.*}.mp4" ; done ; done

Storage in your home directory is limited, but there is ample space on Gallina, that is, under /mnt/rds/redhen01/redhen. Please create a directory on Gallina where you can store your output, and possibly your code, and symlink to it from your home directory.
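One way to set this up is sketched below. The helper name link_workdir is hypothetical, and so is the choice of where under /mnt/rds/redhen01/redhen your directory should live; the source does not specify a location, so check with your team before creating anything there.

```shell
# Hypothetical helper (illustration only): create a work directory under a
# storage root and symlink to it from the home directory.
#   $1  storage root (assumption: a directory your team has agreed on)
#   $2  name of the work directory
#   $3  optional link path; defaults to $HOME/<name>
link_workdir () {
  local root="$1" name="$2" link="${3:-$HOME/$2}"
  mkdir -p "$root/$name" || return 1
  # -s: symbolic link; -f: replace an existing link;
  # -n: treat an existing link to a directory as a file, not a destination
  ln -sfn "$root/$name" "$link"
}

# Example call (the angle-bracket part is a placeholder, not a real path):
#   link_workdir "/mnt/rds/redhen01/redhen/<your-team-directory>" myproject
```

After this, output written to ~/myproject lands on Gallina rather than counting against your home directory.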

Documenting your work

  • Establish a page for the project on this website (http://redhenlab.org). As you progress, update that page to indicate the current state and the next steps. Once a project is wrapped up, even if only temporarily, revise the page so that it reads less like ongoing notes and updates and more like a general snapshot, with instructions, that will help newcomers to the project get oriented and use whatever has been developed. See this page as a model.