Job Configuration
Commands
The `commands` array specifies shell commands to run sequentially. Each job starts from a fresh virtual environment:
Instance Types
Specify an instance type for the job:
Output Files
Define which files to save after job completion. Supports glob patterns:
Ignored Files
Exclude files from being uploaded with your job:
Job Statuses
Jobs progress through various statuses throughout their lifecycle:

| Status | Description |
|---|---|
| Pending | Job is uploading and waiting to be assigned to a cluster. |
| Running | Job commands are being executed. |
| Completed | All job commands have returned an exit code of 0 and output files have been saved. |
| Error | User-level problem: a command has returned a non-zero exit code. Check the logs for details. |
| Failed | System-level problem: the cluster executing the job has failed (e.g., node failure, GPU error). TensorPool will investigate. |
| Canceling | Job cancellation is in progress. The job outputs are being saved and the cluster is being shut down gracefully. |
| Canceled | Job was successfully canceled. |
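
When scripting around jobs, it can help to treat the statuses in the table as a small state model. The following is a minimal Python sketch, not part of the TensorPool CLI or any official SDK: the names `JobStatus`, `TERMINAL`, `is_terminal`, and `next_step` are hypothetical, and the choice of which statuses count as final is an assumption inferred from the descriptions above.

```python
from enum import Enum


class JobStatus(Enum):
    """Status names copied from the table; the enum itself is illustrative."""
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    ERROR = "Error"
    FAILED = "Failed"
    CANCELING = "Canceling"
    CANCELED = "Canceled"


# Assumption: a job in one of these statuses will not change state again.
TERMINAL = frozenset({
    JobStatus.COMPLETED,
    JobStatus.ERROR,
    JobStatus.FAILED,
    JobStatus.CANCELED,
})


def is_terminal(status: JobStatus) -> bool:
    """Return True if the job has reached a final status."""
    return status in TERMINAL


def next_step(status: JobStatus) -> str:
    """Suggest a follow-up action based on the table's descriptions."""
    if status is JobStatus.COMPLETED:
        return "download the output files"
    if status is JobStatus.ERROR:
        return "check the logs for the failing command"
    if status is JobStatus.FAILED:
        return "system-level failure; no user fix expected"
    if status is JobStatus.CANCELED:
        return "nothing further to do"
    return "wait for the job to progress"


if __name__ == "__main__":
    for status in JobStatus:
        print(f"{status.value:<10} terminal={is_terminal(status)}  -> {next_step(status)}")
```

The split between `Error` and `Failed` mirrors the table: the first points to a problem in your own commands, the second to a problem with the cluster.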
Managing Jobs
List Jobs
View all your jobs:
Job Information
Get detailed information about a specific job:
Monitor Jobs
Stream real-time logs from a running job:
Pull Output Files
Download output files from a completed job:
Cancel Jobs
Cancel a running job:
Multiple Configurations
You can create multiple configuration files for different experiments:
Next Steps
- Learn about job commands
- Explore multi-node training for distributed workloads
- Manage SSH keys for cluster access