Run Cytosim compression simulations¶
This notebook contains steps for running Cytosim simulations in which a single actin fiber is compressed at different compression velocities.
This notebook uses Cytosim templates and scripts. Clone a copy and set the environment variable CYTOSIM=/path/to/Cytosim/.
This notebook provides an example of running a simulation series in which there are multiple conditions, each of which needs to be run for multiple replicates. For an example of running a simulation series with a single condition for multiple replicates, see run_cytosim_no_compression_batch_simulations.py.
if __name__ != "__main__":
    raise ImportError("This module is a notebook and is not meant to be imported")
import getpass
import os
import sys
from datetime import datetime
from pathlib import Path
from dotenv import load_dotenv
from subcell_pipeline.simulation.batch_simulations import (
check_and_save_job_logs,
generate_configs_from_template,
register_and_run_simulations,
)
load_dotenv()
cytosim_path: Path = Path(os.getenv("CYTOSIM", "."))
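Because the lookup above falls back to "." when CYTOSIM is unset, a missing or wrong path would otherwise only surface later as a confusing error. The following optional sanity check is a minimal sketch that warns early if the path does not look like a Cytosim checkout.
import warnings
# Optional sanity check: warn if CYTOSIM does not point at a Cytosim checkout
# (the python/run/ directory is where preconfig.py is expected to live).
if not (cytosim_path / "python" / "run").is_dir():
    warnings.warn(f"{cytosim_path} does not look like a Cytosim checkout")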
The template generator uses the Preconfig class from Cytosim. Append the path to preconfig.py from a copy of the Cytosim repository, or download a copy into this location.
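If a full checkout is not available, one way to fetch preconfig.py directly is sketched below. The URL is an assumption based on the repository layout implied above (python/run/preconfig.py in the upstream Cytosim repository); verify it before relying on it.
import urllib.request
# Hypothetical fallback: download preconfig.py into the expected location.
# The URL assumes the upstream repository layout and branch; adjust as needed.
preconfig_url = (
    "https://raw.githubusercontent.com/nedelec/cytosim/master/python/run/preconfig.py"
)
preconfig_target = cytosim_path / "python" / "run" / "preconfig.py"
if not preconfig_target.is_file():
    preconfig_target.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(preconfig_url, preconfig_target)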
sys.path.append(str(cytosim_path / "python" / "run"))
from preconfig import Preconfig # noqa: E402
Define simulation conditions¶
Defines the ACTIN_COMPRESSION_VELOCITY simulation series, which compresses a single 500 nm actin fiber at four different velocities (4.7, 15, 47, and 150 μm/s) with five replicates each (random seeds 1, 2, 3, 4, and 5).
# Name of the simulation series
series_name: str = "ACTIN_COMPRESSION_VELOCITY"
# S3 bucket for input and output files
bucket: str = "s3://cytosim-working-bucket"
# Random seeds for simulations
random_seeds: list[int] = [1, 2, 3, 4, 5]
# Path to the config template file
path_to_template: Path = cytosim_path / "templates" / "vary_compress_rate.cym.tpl"
# Current timestamp used to organize input and output files
timestamp: str = datetime.now().strftime("%Y-%m-%d")
# File keys for each velocity
velocity_keys: dict[str, str] = {
"4.73413649": "0047",
"15": "0150",
"47.4341649": "0470",
"150": "1500",
}
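For orientation, the series spans the full condition × replicate grid: four velocities × five seeds = 20 simulations. The sketch below enumerates that grid; the printed names are illustrative only, not the pipeline's actual file keys.
# Illustrative only: enumerate the condition x replicate grid (4 x 5 = 20 runs).
for velocity, key in velocity_keys.items():
    for seed in random_seeds:
        print(f"{series_name}_{key}: velocity={velocity} um/s, seed={seed}")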
Generate configs from template¶
Use the Preconfig class from Cytosim to convert the .cym.tpl template file into local .cym config files. For each file, extract the compression velocity to use as the simulation condition key. Save all config files to the S3 bucket.
preconfig = Preconfig()
config_files = preconfig.parse(path_to_template, {}, path=cytosim_path / "configs")
pattern = r"compression_velocity:([\s0-9\.]+)"
group_keys = generate_configs_from_template(
bucket,
series_name,
timestamp,
random_seeds,
config_files,
pattern,
velocity_keys,
)
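To see what the extraction pattern captures, it can be applied to a line in the style of the generated configs. The config text below is made up for illustration; the actual .cym contents come from the template.
import re
# Illustrative only: apply the extraction pattern to a config-style line.
example_line = "% compression_velocity: 4.73413649"  # hypothetical config text
match = re.search(pattern, example_line)
if match:
    condition_key = velocity_keys[match.group(1).strip()]
    print(condition_key)  # -> "0047"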
Define simulation settings¶
Defines the AWS Batch settings for the simulations. Note that these settings will need to be modified for different AWS accounts.
# AWS account number
aws_account: str = getpass.getpass()
# AWS region
aws_region: str = "us-west-2"
# Prefix for job name and image
aws_user: str = "jessicay"
# Image name and version
image: str = "cytosim:0.0.0"
# Number of vCPUs for each job
vcpus: int = 1
# Memory for each job
memory: int = 7000
# Job queue
job_queue: str = "general_on_demand"
# Job array size
job_size: int = len(random_seeds)
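As an optional cross-check, the entered account number can be compared against the active credentials before submitting anything. This is a minimal sketch that assumes boto3 is installed and AWS credentials are configured.
import boto3
# Optional: confirm the entered account number matches the active credentials.
caller = boto3.client("sts", region_name=aws_region).get_caller_identity()
assert caller["Account"] == aws_account, "AWS account does not match credentials"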
Register and run jobs¶
For each velocity, we create a new job definition that specifies the input configs and the output location. Each job definition is then registered (for job definitions with the same name, a new revision is created unless no parameters have changed). All replicates for a given velocity are submitted as a job array. Job status can be monitored via the AWS Console.
job_arns = register_and_run_simulations(
bucket,
series_name,
timestamp,
group_keys,
aws_account,
aws_region,
aws_user,
image,
vcpus,
memory,
job_queue,
job_size,
)
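Besides the AWS Console, job status can also be polled programmatically. The sketch below uses boto3 and assumes the returned ARNs follow the standard arn:aws:batch:...:job/&lt;id&gt; format, so the job ID is the last ARN segment.
import boto3
# Poll current status of each submitted job (describe_jobs takes up to 100 IDs).
batch = boto3.client("batch", region_name=aws_region)
job_ids = [arn.split("/")[-1] for arn in job_arns]
for job in batch.describe_jobs(jobs=job_ids)["jobs"]:
    print(job["jobName"], job["status"])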
Check and save job logs¶
Iterates through the list of submitted job ARNs to check job status. If a job does not have the “SUCCEEDED” status, its current status is printed. If a job does have the “SUCCEEDED” status, a copy of the CloudWatch logs is saved. The list of job ARNs can be manually adjusted to limit which jobs are checked or to avoid saving logs that have already been saved.
check_and_save_job_logs(bucket, series_name, job_arns, aws_region)
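For example, to re-check only the first submitted job without re-saving logs for the others, slice the list before calling the function (illustrative):
# Example: restrict the check to the first submitted job only.
check_and_save_job_logs(bucket, series_name, job_arns[:1], aws_region)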