Globus File Transfer User Guide

Globus is a fast, reliable, high-performance service for secure data movement, provided by the Computation Institute at the University of Chicago and Argonne National Laboratory. Designed specifically for researchers, Globus offers an easy-to-use interface with background monitoring features. The service automates the management of file transfers between any two resources, whether between two WestGrid resources or between WestGrid and another machine, such as another supercomputing facility, cloud resource, campus cluster, lab server, desktop or laptop.

Globus uses GridFTP as its transfer protocol but shields the end user from the complex and time-consuming tasks related to GridFTP and other aspects of data movement.

Globus significantly improves transfer performance by automatically tuning each transfer, and it reduces the time users spend managing transfers. Users have reported 10x or even 100x improvements over other transfer methods such as scp.
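
To put those multipliers in perspective, the short Python sketch below estimates how long a transfer takes at a given sustained rate. The 1 TB data set size and the 50 Mbit/s baseline are hypothetical figures chosen only for illustration; they are not measurements from WestGrid or Globus, and real transfers rarely sustain a constant rate.

```python
def transfer_hours(size_bytes: float, rate_mbit_per_s: float) -> float:
    """Return the hours needed to move size_bytes at a sustained rate in Mbit/s."""
    bits = size_bytes * 8
    seconds = bits / (rate_mbit_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    size = 1e12        # hypothetical 1 TB data set (decimal terabyte)
    baseline = 50.0    # hypothetical sustained baseline rate in Mbit/s
    for speedup in (1, 10, 100):
        hours = transfer_hours(size, baseline * speedup)
        print(f"{speedup:>3}x baseline: {hours:.1f} hours")
```

Under these assumed numbers, the baseline transfer takes roughly 44 hours and a 10x improvement brings it down to about 4.4 hours.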

When it originally launched, Globus' website was branded Globus Online. If you see references to Globus Online, they are referring to the Globus website.

Using Globus 

Globus can be used for moving data between any two resources, whether the data is a small number of very large (even terabyte-sized) files or a very large number of small files. For many small or uncomplicated transfers, scp or WestGrid’s grid-enabled gcp will work perfectly well.

Note: A resource is represented in Globus as an "endpoint", identified by a unique name (for example, Simon Fraser University's Bugaboo is computecanada#bugaboo).
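
Endpoint names such as computecanada#bugaboo have two parts separated by "#". If you handle endpoint names in your own scripts, a small helper like the Python sketch below can split and validate them. The field names "owner" and "name" are labels chosen for this example, not official Globus terminology.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    owner: str   # part before '#', e.g. "computecanada"
    name: str    # part after '#', e.g. "bugaboo"


def parse_endpoint(text: str) -> Endpoint:
    """Split an 'owner#name' endpoint string into its two parts."""
    owner, sep, name = text.partition("#")
    if not sep or not owner or not name:
        raise ValueError(f"expected 'owner#name', got {text!r}")
    return Endpoint(owner, name)


if __name__ == "__main__":
    ep = parse_endpoint("computecanada#bugaboo")
    print(ep.owner, ep.name)
```

The helper raises ValueError for strings without a "#", which catches a bare host name passed where an endpoint is expected.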

You might choose to use Globus when:

  • You want to use a graphical user interface (GUI) to transfer data between WestGrid resources/systems.
  • You want file transfer management (“fire and forget”):
    • monitoring and auto-tuning performance
    • retrying failures
    • recovering from transfer faults automatically where possible
    • reporting status
  • You want better performance than single-stream scp between a non-WestGrid resource and a WestGrid resource, without having to configure GridFTP and Globus Toolkit tools manually. Globus transfers can be many times faster than scp for large data sets.

Specific use cases

  • Moving data between WestGrid resources: Globus can be used to move data between any two WestGrid resources. All WestGrid resources are already configured as Globus endpoints.
  • Moving data to WestGrid from a local machine (or vice versa): For cases where a user wants to transfer files between a WestGrid resource and the user's machine, Globus provides a simple mechanism to set up and manage such transfers. The phrase "user's machine" refers to any machine on which a user has an account, such as a personal laptop, a campus server, or a machine used to access data from an instrument. Globus makes it possible to transfer to and from any machine (even one behind a firewall or NAT, and without any administrative privileges) with just a few clicks and without the typical difficulties of a GridFTP install.
  • Moving data between two non-WestGrid machines: Globus can also be used to move data between a campus cluster, a non-WestGrid computing facility, or a personal machine. Visit the Globus Connect Personal site for step-by-step instructions on how to turn your machine into an endpoint.

Getting started with Globus Online

Accessing Globus

Register for a Globus account at the Globus site.

Transferring Files

  • Moving data between WestGrid resources
  • Moving data to WestGrid from local machine (or vice versa)

Moving data between WestGrid resources

  1. Create a Globus account by visiting https://www.globus.org/SignUp if you have not done so already.
  2. To transfer files go to https://www.globus.org/xfer/StartTransfer or select the link shown in Figure 1.
  3. Select the source WestGrid endpoint in the "Endpoint" field. You may browse the drop-down list or type the name into the Endpoint field to find the desired endpoint. All WestGrid resources (and other Compute Canada sites) are listed under "computecanada#".  
  4. You will be asked to click a link which will redirect you to a WestGrid-hosted web page, where you can enter your WestGrid username and password. You will then be redirected back to Globus and your endpoint will be activated.
  5. Once the endpoint is activated you may browse and select the files/directories you wish to transfer.

Figure 1: Navigating to the transfer page

Figure 2: Selecting a WestGrid endpoint   (Note: you should use computecanada#bugaboo, not the westgrid#bugaboo illustrated in the figure.)

Figure 3: Selected destination endpoint and files to transfer  (Note: you should use computecanada#bugaboo, not the westgrid#bugaboo illustrated in the figure.)

Click the highlighted arrow to begin the transfer.

Globus will monitor progress, auto-correct and retry where necessary, auto-performance tune where possible, and report status.

Note that, by default, a transfer will overwrite existing files on the destination endpoint. To change this default behavior (e.g., to transfer only files that do not exist on the destination endpoint) click the "more options" link and select the desired transfer behavior (see Figure 4).

Figure 4: Changing the default transfer behavior

Moving data to WestGrid from local machine using Globus (or vice versa)

To move data between a WestGrid endpoint and your local machine, you must enable your machine as a Globus endpoint. You can do this by installing Globus Connect Personal. You may download Globus Connect Personal (available for Mac OS, Windows, and Linux) by clicking on the Globus Connect Personal link at the bottom of the dashboard page or the "Get Globus Connect" link on the Transfer Files page.

Before installing Globus Connect Personal you must have a valid Globus account. Please follow steps 1 and 2 in the Moving data between WestGrid resources section above. Once you have a Globus account, follow the steps below to configure your local machine as a Globus endpoint:

  1. Install the downloaded Globus Connect Personal software. If you need assistance with this step, refer to the detailed instructions for your operating system: Mac OS, Linux, Windows.
  2. During installation you will be prompted for a setup key. To get this key:
    1. Open the Globus Connect Personal Installation window again, if you closed it after downloading (see Figure 5).
    2. Enter an endpoint name to identify your local machine to Globus.
    3. Click "Generate Setup Key".
    4. Copy the setup key and paste it into the Globus Connect Personal setup window to complete the process.

Figure 5: Setting up a Globus Connect Personal endpoint

Your local machine should now be available as an endpoint, listed as your_user_name#endpoint_name, where endpoint_name is the name you entered during Globus Connect setup above.

To transfer files:

  1. Go to https://www.globus.org/xfer/StartTransfer
  2. Select the source endpoint, e.g., a WestGrid endpoint such as computecanada#silo.
  3. Select the destination endpoint, e.g., your newly created endpoint on your local machine.
  4. Browse and select the files you wish to transfer.
  5. Click the highlighted arrow to begin the transfer.

Accessing Globus via a Command Line Interface

Globus provides a command line interface (CLI) that may be accessed using any standard ssh terminal client. Before running shell commands, you must upload your public SSH key to your Globus account. See the instructions on the Globus support website. An introduction to the CLI and more details are available on the Globus website.

Note: Due to the way that WestGrid endpoints are now activated, you need to log into Globus and activate any endpoints you wish to use from the command line before attempting to use them.

Once your key is uploaded, to enter a secure Globus shell type:

ssh globus_username@cli.globusonline.org

You will see the Globus command prompt:

Welcome to globusonline.org, . Type "help" for help.
$ _

Enter "help" to see the commands available. Commonly used commands include:

  • endpoint-list - shows endpoints you've recently used
  • scp - an easy way to start a transfer, using familiar scp syntax
  • status - view status of transfer tasks
  • ls - a way to view the contents of an endpoint directory, e.g., ls computecanada#silo/home/sam

A complete list of Globus commands, along with detailed descriptions, is available on the CLI Command Reference page.
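
If you want to drive the CLI from a script rather than an interactive session, one option is to pass a single command to ssh. The Python sketch below only builds and prints such a command line. It assumes that the hosted CLI accepts a command as the ssh remote command and that scp takes endpoint:/path arguments; the user name sam and the file name data.tar are placeholders, so check the CLI Command Reference before relying on it.

```python
import shlex
from typing import List

CLI_HOST = "cli.globusonline.org"  # host name taken from this guide


def globus_cli_argv(username: str, *command: str) -> List[str]:
    """Build an argv list that would run one Globus CLI command over ssh."""
    if not command:
        raise ValueError("at least one command word is required")
    return ["ssh", f"{username}@{CLI_HOST}", *command]


def scp_argv(username: str, src_endpoint: str, src_path: str,
             dst_endpoint: str, dst_path: str) -> List[str]:
    """Build an scp-style transfer command using endpoint:/path arguments."""
    return globus_cli_argv(username, "scp",
                           f"{src_endpoint}:{src_path}",
                           f"{dst_endpoint}:{dst_path}")


if __name__ == "__main__":
    # "sam" and "data.tar" are hypothetical placeholders.
    argv = scp_argv("sam", "computecanada#bugaboo", "/home/sam/data.tar",
                    "computecanada#silo", "/home/sam/")
    # Print only; nothing is executed and no connection is made.
    print(shlex.join(argv))
```

Keeping the command as a list of arguments avoids quoting problems with the "#" character if you later hand it to subprocess.run.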

Example copy from computecanada#bugaboo to computecanada#silo:

Local-iMac:.ssh sam$ ssh sam@cli.globusonline.org
$ scp computecanada#bugaboo:/home/sam/globus_connect_install_latest.exe computecanada#silo:/home/sam/
Enter username for 'myproxy.westgrid.ca' (Default: 'tanieth'):
Enter password for 'myproxy.westgrid.ca':
Credential Subject: /C=CA/O=Grid/OU=westgrid.ca/CN=Sam User pax-655/CN=17345973655/CN=12742230/CN=539343431
Credential Time Left: 23:59:59
Activating 'computecanada#bugaboo'
Activating 'computecanada#silo'
Task ID: 3f0b62ee-884d-11e2-b74b-12313906b091
Type <CTRL-C>  to cancel or bg to background [XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
1/1 532.54 mbps
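
When a transfer is started from a script, it can be useful to capture the task ID and the reported rate from output like the transcript above. The Python sketch below is a minimal parser that assumes the output contains a "Task ID: ..." line and a rate followed by "mbps", as in this example; the exact output format may differ between CLI versions.

```python
import re
from typing import Optional, Tuple

# A task ID is assumed to be a 36-character UUID-style string on its own line.
TASK_ID_RE = re.compile(r"^Task ID:\s*([0-9a-fA-F-]{36})\s*$", re.MULTILINE)
# The rate is assumed to be a decimal number followed by "mbps".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*mbps", re.IGNORECASE)


def parse_transfer_output(text: str) -> Tuple[Optional[str], Optional[float]]:
    """Return (task_id, rate) found in CLI output; None for a missing field."""
    task = TASK_ID_RE.search(text)
    rate = RATE_RE.search(text)
    return (task.group(1) if task else None,
            float(rate.group(1)) if rate else None)


if __name__ == "__main__":
    sample = (
        "Activating 'computecanada#silo'\n"
        "Task ID: 3f0b62ee-884d-11e2-b74b-12313906b091\n"
        "1/1 532.54 mbps\n"
    )
    print(parse_transfer_output(sample))
```

The task ID can then be kept for later checks with the status command listed above.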

What does Globus NOT DO yet?

Globus does not currently support renaming files on a remote server through the Globus interface. If you need to rename a file, ssh to the server where your files reside and rename it there.


Updated:
2015-02-25, 2016-03-08 - Changed westgrid# endpoint names to computecanada#.
2017-03-02 - Removed reference to Silo.