SimpleFileExtractor
CBRAIN offers a way to mass-extract files out of existing datasets. The tool is called SimpleFileExtractor and is installed on most supercomputers.
You use it by selecting the set of files (or rather, FileCollections) you want to extract from, then providing a list of patterns to match the files inside. The result will be a new FileCollection containing the extracted files.
For example, suppose you have many CivetOutputs and you are interested in some of the thickness and surface files. You select those CivetOutputs and launch the SimpleFileExtractor tool.
Note:
You cannot run the tool on more than about 5000 input file collections at once. If you have a larger input set, run the tool multiple times on different subsets of at most 5000 inputs.
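The splitting described in this note can be prepared ahead of time. The following is a minimal sketch in Python, not part of CBRAIN itself: the function name `split_into_batches` and the collection names are hypothetical, and the limit of 5000 is taken from the approximate figure given above.

```python
from typing import Iterator, List, Sequence

# Approximate per-run limit quoted in the note above.
MAX_INPUTS_PER_RUN = 5000


def split_into_batches(
    inputs: Sequence[str], size: int = MAX_INPUTS_PER_RUN
) -> Iterator[List[str]]:
    """Yield consecutive slices of `inputs`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(inputs), size):
        yield list(inputs[start:start + size])


# Hypothetical collection names, for illustration only.
collections = [f"civet_output_{i:05d}" for i in range(12000)]
batches = list(split_into_batches(collections))
print([len(batch) for batch in batches])  # [5000, 5000, 2000]
```

Each batch would then be selected and submitted as its own SimpleFileExtractor task.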
Note:
To avoid having to copy the entirety of the original (source) files, the tool will only run if the input data is stored locally. You need to select a version that runs on a particular server, depending on the location of your inputs. The mapping is as follows:
- Use Beluga for files stored on: Local-Beluga
- Use Cedar for files stored on: Local-Beluga
- Use Graham for files stored on: Local-Graham
- Use GrahamPlatform for files stored on: Local-GrahamPlatform
- Use Converter-1 or Converter-2 for files stored on:
  - MainStore
  - NeuroHubStore
  - SFTP-1
  - SFTP-2
  - NeuroHub-UKBB-Civet
  - CONP-VisualWorkingMemory
  - CONP-OpenPreventAD
  - CONP-OpenPreventAD-BIDS-Subjects
  - CONP-BigBrain-3DClassifiedVolumes
  - CONP-BigBrain-3DSurfaces
  - CONP-BigBrain
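If you handle many datasets, it may help to keep the mapping above as a small lookup table. The sketch below is only an illustration in Python: the names `SERVER_PROVIDERS` and `versions_for` are hypothetical, and the table is transcribed as listed on this page, so it needs updating whenever the list changes.

```python
from typing import Dict, List

# Storage locations served by the Converter versions, as listed above.
CONVERTER_PROVIDERS: List[str] = [
    "MainStore",
    "NeuroHubStore",
    "SFTP-1",
    "SFTP-2",
    "NeuroHub-UKBB-Civet",
    "CONP-VisualWorkingMemory",
    "CONP-OpenPreventAD",
    "CONP-OpenPreventAD-BIDS-Subjects",
    "CONP-BigBrain-3DClassifiedVolumes",
    "CONP-BigBrain-3DSurfaces",
    "CONP-BigBrain",
]

# Tool version -> storage locations it can read locally (transcribed as-is).
SERVER_PROVIDERS: Dict[str, List[str]] = {
    "Beluga": ["Local-Beluga"],
    "Cedar": ["Local-Beluga"],
    "Graham": ["Local-Graham"],
    "GrahamPlatform": ["Local-GrahamPlatform"],
    "Converter-1": CONVERTER_PROVIDERS,
    "Converter-2": CONVERTER_PROVIDERS,
}


def versions_for(provider: str) -> List[str]:
    """Return the tool versions listed for a given storage location."""
    return [
        server
        for server, providers in SERVER_PROVIDERS.items()
        if provider in providers
    ]


print(versions_for("MainStore"))     # ['Converter-1', 'Converter-2']
print(versions_for("Local-Graham"))  # ['Graham']
```

An empty result means the storage location does not appear in the list, in which case no version of the tool is documented here as able to read it locally.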
In the task parameters, you provide the file patterns you are interested in:
- All the files matching these patterns will be extracted (copied) into a new FileCollection.
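To preview which files a set of patterns might select, you can test the patterns against a list of relative paths before launching the task. The sketch below is an assumption-laden illustration in Python: this page does not specify the pattern syntax the tool accepts, so the example assumes shell-style glob patterns, and the file names and the `select_files` helper are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_files(paths: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the paths that match at least one glob-style pattern."""
    pattern_list = list(patterns)
    return [
        path
        for path in paths
        if any(fnmatchcase(path, pattern) for pattern in pattern_list)
    ]


# Hypothetical paths inside one input FileCollection.
paths = [
    "subject01/thickness/subject01_thickness_left.txt",
    "subject01/surfaces/subject01_mid_surface_left.obj",
    "subject01/logs/subject01.log",
]

# Assumed glob syntax; the real tool may interpret patterns differently.
patterns = ["*/thickness/*.txt", "*/surfaces/*.obj"]

selected = select_files(paths, patterns)
print(selected)
```

Here the thickness and surface files are selected and the log file is skipped. Note that in Python's `fnmatch`, `*` also matches the `/` separator, which may not hold for the tool itself.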