Sandbox paths and pipeline mappings
A list of folders accessible in sandbox and their full bucket paths and mappings
Sandbox paths and their bucket locations and mappings
The table below lists the Sandbox (IVM) paths, their mappings, their external bucket paths and a description of each. When setting paths for running pipelines (e.g. file or folder locations in analysis input .json files), the Sandbox (IVM) path will not work; only the sandbox mapping or the external bucket path can be used for this purpose.
| Sandbox (IVM) path | Sandbox mapping | External bucket path | Description |
| --- | --- | --- | --- |
| /finngen/library-green/ | LIBRARY_GREEN/ | gs://finngen-production-library-green/ | FinnGen's core analysis results, summary data, anonymous data |
| /finngen/library-red/ | LIBRARY_RED/ | gs://finngen-production-library-red/ | Phenotype, Genotype data, individual-level data |
| /finngen/red/ | SANDBOX_RED/ | gs://fg-production-sandbox-Y-red/ | Researchers' own analysis results, the organization's own "red" bucket |
| /home/ivm/ | NA | NA | Home disk. Users' personal data and scripts. Cannot be accessed |
| /finngen/shared/ | LIBRARY_SHARED/ | gs://finngen-production-library-shared/ | Shared data between all Sandboxes (organizations) |
| /finngen/library-green/finngen_RX/unmodifiable_pipelines/ | CUSTOM_GWAS/ | gs://library_green/finngen_RX/unmodifiable_pipeline/ | Output location of unmodifiable pipeline results. |
Note 1. The green, red and pipelines buckets (/finngen/green/, /finngen/red/ and /finngen/pipelines/) are specific to your organisation, and data in these locations cannot be seen by other organisations. If using the external (bucket) path to access these locations, you will need to replace the letter Y with the number of your Sandbox. If you do not know your organisation's sandbox number, you can find it in the green, red and pipelines bucket paths listed in a file named buckets.txt on your sandbox desktop.
Note 2. For unmodifiable pipeline results, remember to replace the letter X in the sandbox or external path with the release number of the unmodifiable pipeline (e.g. 12).
Note 3. Neither the greendownloads nor the greenuploads bucket is accessible from the Sandbox.
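As a minimal sketch of how the table's prefixes could be applied in the terminal, the function below rewrites a Sandbox (IVM) path into its mapping form. The name ivm_to_mapping is hypothetical and is not a FinnGen tool. Because the table does not say how the release folder carries over into the CUSTOM_GWAS/ mapping, the sketch refuses unmodifiable pipeline paths rather than guessing.

```shell
# Hypothetical helper (not part of the Sandbox): translate a Sandbox (IVM)
# path into the mapping form listed in the table above.
ivm_to_mapping() {
  case "$1" in
    # Unmodifiable pipeline results use the CUSTOM_GWAS/ mapping; the exact
    # layout under it is not covered here, so report it and stop.
    /finngen/library-green/finngen_R*/unmodifiable_pipelines/*)
      printf 'Use the CUSTOM_GWAS/ mapping for: %s\n' "$1" >&2
      return 1 ;;
    /finngen/library-green/*)
      printf 'LIBRARY_GREEN/%s\n' "${1#/finngen/library-green/}" ;;
    /finngen/library-red/*)
      printf 'LIBRARY_RED/%s\n' "${1#/finngen/library-red/}" ;;
    /finngen/red/*)
      printf 'SANDBOX_RED/%s\n' "${1#/finngen/red/}" ;;
    /finngen/shared/*)
      printf 'LIBRARY_SHARED/%s\n' "${1#/finngen/shared/}" ;;
    *)
      # /home/ivm/ and any other location have no mapping in the table.
      printf 'No pipeline mapping for: %s\n' "$1" >&2
      return 1 ;;
  esac
}
```

For example, ivm_to_mapping /finngen/red/username/file_to_copy.txt prints SANDBOX_RED/username/file_to_copy.txt, while a path under /home/ivm/ returns a non-zero status.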
Finding bucket paths and mappings within sandbox
You can find your organisation's Sandbox number and its bucket paths in the buckets.txt file on the Sandbox desktop (see Navigating the Sandbox). The full path of this file is /home/ivm/Desktop/buckets.txt.
The full bucket paths are also stored in environment variables in your terminal environment. For example, to see the bucket paths of the red and pipeline buckets in the terminal, you can use the echo command:
echo $RED_BUCKET
echo $PIPELINE_BUCKET

You can then use $RED_BUCKET and $PIPELINE_BUCKET to refer to those paths in the terminal when performing file operations (e.g. copying, moving or deleting) with the gsutil command.
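Before running file operations it can help to confirm that both variables are actually set, since an empty variable would make a gsutil path point somewhere unintended. The following is a minimal sketch only; check_bucket_vars is a hypothetical helper name, and the two variable names are the ones given above.

```shell
# Hypothetical helper: print the bucket variables named in this section and
# report any that are empty. Returns non-zero if at least one is missing.
check_bucket_vars() {
  missing=0
  for name in RED_BUCKET PIPELINE_BUCKET; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      missing=1
    else
      echo "$name=$value"
    fi
  done
  return "$missing"
}
```

If check_bucket_vars succeeds, a command such as gsutil ls $RED_BUCKET/ can then be used to list the contents of the red bucket.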
Copying files to your organisation's red bucket
To make files accessible for analyses run in the cloud (e.g. any pipeline submitted using the Pipelines tool), the files first need to be copied to an externally accessible location. The best location for this is the red bucket, which is specific to each organisation's sandbox environment and can be used to store red (individual-level) or green (summary-level) data. To copy a file to the red bucket, open the terminal ("Terminal Emulator" from the Applications menu) and run the command:
gsutil cp /path/to/file_to_copy.txt $RED_BUCKET/username/
where /path/to/file_to_copy.txt is the full sandbox path of the file you want to copy and username is your sandbox username. If the folder $RED_BUCKET/username/ doesn't already exist, this command will create it.
To use this file for analyses submitted to the cloud, the path you would need for the input .json file would therefore be SANDBOX_RED/username/file_to_copy.txt, using the mapping format.
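The conversion from a red bucket path to the mapping format can be sketched as a small shell function. This is an illustration under assumptions: red_bucket_to_mapping is a hypothetical name, and the sketch assumes $RED_BUCKET holds the full gs:// path of your red bucket, as described above.

```shell
# Hypothetical helper: turn a full red bucket path into the SANDBOX_RED/
# mapping used in pipeline input .json files.
red_bucket_to_mapping() {
  if [ -z "${RED_BUCKET:-}" ]; then
    echo "RED_BUCKET is not set" >&2
    return 1
  fi
  # Normalise the bucket path so it ends in exactly one slash.
  prefix="${RED_BUCKET%/}/"
  case "$1" in
    "$prefix"*)
      printf 'SANDBOX_RED/%s\n' "${1#"$prefix"}" ;;
    *)
      echo "Not in the red bucket: $1" >&2
      return 1 ;;
  esac
}
```

Calling red_bucket_to_mapping "$RED_BUCKET/username/file_to_copy.txt" prints SANDBOX_RED/username/file_to_copy.txt, which is the value to place in the input .json file.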
It is recommended that you copy files only to your own red bucket folder (i.e. $RED_BUCKET/username/) so that you don't accidentally overwrite other users' files and can find your files again when needed. It is good practice to create subfolders within your own red bucket folder to keep your files organised, e.g. by copying the required files to specific subfolders. An example could be:
gsutil cp /path/to/myphenotypes.txt.gz $RED_BUCKET/username/Phenotype_data/R12/
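One way to keep copies inside your own folder is to wrap the command in a function that always builds the destination from your username. The sketch below is illustrative only: copy_to_red and the DRY_RUN switch are hypothetical names introduced here, and the only external command used is gsutil cp as shown above.

```shell
# Hypothetical helper: copy a file into (a subfolder of) your own red bucket
# folder. Set DRY_RUN=1 to print the gsutil command instead of running it.
copy_to_red() {
  src=$1        # full sandbox path of the file to copy
  user=$2       # your sandbox username
  subfolder=${3:-}   # optional subfolder, e.g. Phenotype_data/R12

  dest="${RED_BUCKET%/}/$user/"
  if [ -n "$subfolder" ]; then
    dest="$dest${subfolder%/}/"
  fi

  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "gsutil cp $src $dest"
  else
    gsutil cp "$src" "$dest"
  fi
}
```

Running DRY_RUN=1 copy_to_red /path/to/myphenotypes.txt.gz username Phenotype_data/R12 first shows the exact command and destination, so the target folder can be checked before any file is written.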