The tool below offers an alternative to searching NDA directly for legacy PsychENCODE (PEC) data that were originally submitted to Synapse and later migrated to NDA.
Requirements:
Users must have an account on NDA and permissions to download PEC data from NDA. Instructions for applying for access can be found here.
NDA API Usage (optional): Familiarity with basic command-line usage. Instructions for installing and setting up nda-tools can be found here.
NDA Download Manager (optional): Download and install NDA's Download Manager software. Software and instructions for use can be found here.
Search Tool Usage:
There are three options for finding data. By default, all of the files in PEC and CommonMind are listed in the search tool.
Alternatively, you can select from a list of common projects. These projects are described as they were listed in Synapse, with the description text from Synapse if it was available. The projects are organized within NDA as "Experiments" and reflected in the "ndaExperiments" column of the search tool table.
Finally, if you know the specific Synapse IDs of files or projects, you may search directly for those IDs. The search results will include all child files found for each project or Synapse ID entered. Multiple IDs may be entered at once, and the results will include the matches for all of them.
There are two options for downloading results:
The first option downloads a full table listing the file paths on NDA, the associated NDA experiments, and the S3 addresses.
Alternatively, you can choose to download just the S3 addresses. This results in a file that can be used directly with nda-tools to download the selected files.
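Before handing an S3-address file to nda-tools, a quick sanity check can catch a truncated or mis-saved download. The sketch below is only an illustration: it assumes the file holds one address per line and that every address begins with "s3://". The function name read_s3_list is hypothetical and is not part of the search tool or of nda-tools.

```python
from pathlib import Path


def read_s3_list(list_path):
    """Return the unique S3 addresses found in a one-per-line text file.

    Assumption: every valid entry starts with "s3://". Blank lines are
    skipped; any other line raises ValueError so problems surface early.
    """
    addresses = []
    seen = set()
    lines = Path(list_path).read_text().splitlines()
    for line_number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            continue  # tolerate blank lines
        if not entry.startswith("s3://"):
            raise ValueError(f"line {line_number}: not an S3 address: {entry!r}")
        if entry not in seen:  # drop duplicates, keep first-seen order
            seen.add(entry)
            addresses.append(entry)
    return addresses
```

Calling read_s3_list("s3FileList.txt") on a downloaded results file (the file name here is a placeholder) returns the addresses in their original order with duplicates removed.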
Suggested Usage (with nda-tools):
Log in to NDA and create a download package from the entire PEC collection:
Navigate your browser to the main PEC collection C5032.
Click "ADD TO CART" at the bottom of the page.
Wait for the filter cart to update (at the top of the page, it will change from "Filter Cart(0)" to "Filter Cart(1)").
Once the filter cart has populated, click on the filter cart and select "CREATE PACKAGE/ADD TO STUDY".
On the next page, select "CREATE DATA PACKAGE".
Enter your desired package name, and be sure to check the box next to "Include associated data files".
It will take some time for the data package to be completed. You can see the status of the package by clicking on "My Account" on the top right of the page, followed by selecting "Data Packages" from the drop-down menu.
Take note of the "ID" for your package. You will need to reference this to download files.
Use the tool below to find files of interest:
Either use one of the search options, or download the entire table and choose files by path or file type.
Once you have your list of selected files, you will need the S3 addresses listed in a text file, one per line. The tool has an option to download just the S3 addresses of the results to feed directly into the next step.
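If you downloaded the full table and want to pick files by path or file type yourself, the selection can be scripted. The following is a minimal sketch, assuming the table was saved as CSV; the column names "path" and "s3Address" are placeholders chosen for illustration, so check the header row of your own download and adjust them. The function names are likewise hypothetical.

```python
import csv


def select_s3_addresses(table_path, path_prefix="", suffix="",
                        path_column="path", s3_column="s3Address"):
    """Collect S3 addresses for rows whose file path matches the filters.

    The column names are assumptions; replace them with the headers
    that actually appear in the table you downloaded.
    """
    selected = []
    with open(table_path, newline="") as handle:
        for row in csv.DictReader(handle):
            file_path = row[path_column]
            # Empty prefix/suffix match everything, so either filter is optional.
            if file_path.startswith(path_prefix) and file_path.endswith(suffix):
                selected.append(row[s3_column])
    return selected


def write_s3_list(addresses, out_path):
    """Write one S3 address per line, the layout the download step expects."""
    with open(out_path, "w") as handle:
        for address in addresses:
            handle.write(address + "\n")
```

For example, select_s3_addresses("table.csv", path_prefix="HumanStudies/BrainGVEX/", suffix=".gz") would keep only the compressed files under that folder, and write_s3_list can then save them as the text file used in the download step.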
Download the files:
The downloadcmd tool in nda-tools can download files either to your filesystem or to a cloud location. Note that the tool may have more options than are currently listed in the GitHub repository instructions; running downloadcmd -h should list all available options.
To download only the files listed in your search results file to your filesystem, use the following command:
downloadcmd -dp <package id> -t <s3FileList.txt> -d </destination/directory>
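When the download is part of a larger script, the command above can be assembled and launched from Python. This is a sketch rather than an official nda-tools interface: it uses only the -dp, -t, and -d options shown above, assumes downloadcmd is installed and on your PATH, and the function names are hypothetical.

```python
import subprocess


def build_download_command(package_id, s3_list_file, destination):
    """Assemble the downloadcmd argument list from the command shown above.

    Only the -dp, -t and -d options named in the text are used.
    """
    return [
        "downloadcmd",
        "-dp", str(package_id),
        "-t", str(s3_list_file),
        "-d", str(destination),
    ]


def run_download(package_id, s3_list_file, destination):
    """Run downloadcmd; raises CalledProcessError if it exits non-zero."""
    command = build_download_command(package_id, s3_list_file, destination)
    # The child process inherits the terminal, so any prompts from
    # downloadcmd (for example, credentials) still reach the user.
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the destination directory or file name contains spaces.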
Suggested Usage (with NDA Download Manager):
Follow steps 1 and 2 from above (create a download package, then use the search tool to find files of interest).
Open the NDA Download Manager and log in. Your package should appear in the NDA Download Manager interface once it has finished building.
Use results from the search tool to identify files/folders of interest.
The folder structure displayed in the NDA Download Manager will match that listed by the tool, except that the manager adds a root layer corresponding to the data modality type. For example, files listed under HumanStudies/BrainGVEX/Data/RNA-seq/fastq/ would be shown in the manager as rna_seq01/HumanStudies/BrainGVEX/Data/RNA-seq/fastq/.
Individual files and folders can be selected for download from the manager interface.
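The path mapping described above, where the manager prepends a modality root to the path listed by the tool, can be sketched as a small helper for translating many paths at once. The function name manager_path is hypothetical, and the modality root for a given file (rna_seq01 in the example above) has to be read from the manager itself; this sketch does not try to infer it.

```python
def manager_path(tool_path, modality_root):
    """Prefix a search-tool path with the manager's modality root folder.

    Assumption: the manager path is simply "<modality root>/<tool path>",
    as in the rna_seq01 example in the text.
    """
    # Strip stray slashes at the join so exactly one separator remains.
    return modality_root.strip("/") + "/" + tool_path.lstrip("/")
```

With the example from the text, manager_path("HumanStudies/BrainGVEX/Data/RNA-seq/fastq/", "rna_seq01") gives rna_seq01/HumanStudies/BrainGVEX/Data/RNA-seq/fastq/.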
All newer PEC data that were never uploaded to Synapse are available through project-specific NDA Collections listed in the table on the PEC page: https://nda.nih.gov/pec.