Halo Finding

Halo finding and analysis are combined into a single framework called the HaloCatalog.

If you already have a halo catalog, either produced by one of the methods below or in a format described in Halo Catalog Data, and want to perform further analysis, skip to Halo Analysis.

Three halo finding methods exist within yt. These are:

  • FoF

  • HOP

  • Rockstar

Halo finding is performed through the creation of a HaloCatalog object. The dataset or datasets on which halo finding is to be performed should be loaded and given to the HaloCatalog along with the finder_method keyword to specify the method to be used.

import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("Enzo_64/RD0006/RedshiftOutput0006")
hc = HaloCatalog(data_ds=data_ds, finder_method="hop")
hc.create()

Halo Finding on Multiple Snapshots

To run halo finding on a series of snapshots, provide a DatasetSeries or SimulationTimeSeries to the HaloCatalog. See Time Series Analysis and Analyzing an Entire Simulation for more information on creating these. All three halo finders can be run this way. If you want to make merger trees with Rockstar halo catalogs, you must run Rockstar in this way.

import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

my_sim = yt.load_simulation("enzo_tiny_cosmology/32Mpc_32.enzo", "Enzo")
my_sim.get_time_series()
hc = HaloCatalog(data_ds=my_sim, finder_method="hop")
hc.create()

Halo Finder Options

The available finder_method options are “fof”, “hop”, or “rockstar”. Each of these methods has its own set of keyword arguments to control its functionality. These can be specified in the form of a dictionary using the finder_kwargs keyword.

import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("Enzo_64/RD0006/RedshiftOutput0006")
hc = HaloCatalog(
    data_ds=data_ds,
    finder_method="fof",
    finder_kwargs={"ptype": "stars", "padding": 0.02},
)
hc.create()

For a full list of options for each halo finder, see:

  • FOFHaloFinder

  • HOPHaloFinder

  • RockstarHaloFinder

FoF

This is a basic friends-of-friends algorithm. Any two particles separated by less than a linking length are considered to be in the same group. See Efstathiou et al. (1985) for more details, and FOFHaloFinder for the available options.
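As a minimal sketch of adjusting the linking length through finder_kwargs (this assumes the link keyword of FOFHaloFinder sets the linking length in units of the mean inter-particle separation; consult FOFHaloFinder for the exact definition):

import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("Enzo_64/RD0006/RedshiftOutput0006")
# "link" is assumed to be the FoF linking length in units of the mean
# inter-particle separation; see FOFHaloFinder for the exact definition.
hc = HaloCatalog(
    data_ds=data_ds,
    finder_method="fof",
    finder_kwargs={"link": 0.2},
)
hc.create()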

HOP

This is the method introduced by Eisenstein and Hut (1998). The procedure is roughly as follows.

  1. Estimate the local density at each particle using a smoothing kernel.

  2. Build chains of linked particles by ‘hopping’ from one particle to its densest neighbor. A particle which is its own densest neighbor is the end of the chain.

  3. All chains that share the same densest particle are grouped together.

  4. Groups are included, linked together, or discarded depending on the user-supplied overdensity threshold parameter; the default is 160. A sketch of changing this is shown below.
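A minimal sketch of changing the overdensity threshold, assuming it is exposed through the threshold keyword of the HOP finder:

import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("Enzo_64/RD0006/RedshiftOutput0006")
# "threshold" is assumed to be the HOP overdensity threshold (default 160).
hc = HaloCatalog(
    data_ds=data_ds,
    finder_method="hop",
    finder_kwargs={"threshold": 200.0},
)
hc.create()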

For both the FoF and HOP halo finders, the resulting halo catalogs will be written to a directory associated with the output_dir keyword provided to the HaloCatalog. The number of files for each catalog is equal to the number of processors used. The catalog files have the naming convention <dataset_name>/<dataset_name>.<processor_number>.h5, where dataset_name refers to the name of the snapshot. For more information on loading these with yt, see YTHaloCatalog.
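As a sketch, such a catalog can be reloaded with yt.load; the path below assumes the default output_dir ("halo_catalogs"), the snapshot used above, and a single-processor run:

import yt

# Path assumes output_dir="halo_catalogs" and processor number 0.
halos_ds = yt.load("halo_catalogs/RedshiftOutput0006/RedshiftOutput0006.0.h5")
ad = halos_ds.all_data()
print(ad["halos", "particle_mass"])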

Rockstar-galaxies

Rockstar uses an adaptive hierarchical refinement of friends-of-friends groups in six phase-space dimensions and one time dimension, which allows for robust (grid-independent, shape-independent, and noise-resilient) tracking of substructure. The methods are described in Behroozi et al. (2011).

The yt_astro_analysis package works with the latest version of rockstar-galaxies. See Installing with Rockstar support for information on obtaining and installing rockstar-galaxies for use with yt_astro_analysis.

To run Rockstar, your script must be run with mpirun using a minimum of three processors. Rockstar processes are divided into three groups:

  • readers: these read particle data from the snapshots. Set the number of readers with the num_readers keyword argument.

  • writers: these perform the halo finding and write the subsequent halo catalogs. Set the number of writers with the num_writers keyword argument.

  • server: this process coordinates the activity of the readers and writers. There is only one server process. The total number of processes given with mpirun must be equal to the number of readers plus writers plus one (for the server).
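For example, with one reader and one writer as in the script below, three MPI processes are required in total; assuming the script is saved as run_rockstar.py, it could be launched with mpirun -np 3 python run_rockstar.py.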

import yt

yt.enable_parallelism()
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

my_sim = yt.load_simulation("enzo_tiny_cosmology/32Mpc_32.enzo", "Enzo")
my_sim.get_time_series()
hc = HaloCatalog(
    data_ds=my_sim,
    finder_method="rockstar",
    finder_kwargs={"num_readers": 1, "num_writers": 1},
)
hc.create()

Warning

Running Rockstar from yt on multiple compute nodes connected by an Infiniband network can be problematic. It is recommended to force the use of the non-Infiniband network (e.g. Ethernet) using this flag: --mca btl ^openib. For example, to run with 24 cores, do: mpirun -n 24 --mca btl ^openib python ./run_rockstar.py.

See RockstarHaloFinder for the list of available options.

Rockstar halo catalogs are saved to the directory associated with the output_dir keyword provided to the HaloCatalog. The number of files for each catalog is equal to the number of writers. The catalog files have the naming convention halos_<catalog_number>.<processor_number>.bin, where catalog number 0 is the first halo catalog calculated. For more information on loading these with yt, see Rockstar.
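A minimal sketch of loading one of these catalogs with yt; the directory name below is illustrative, so substitute the output_dir given to the HaloCatalog:

import yt

# "rockstar_halos" is an assumed output directory; replace with your output_dir.
halos_ds = yt.load("rockstar_halos/halos_0.0.bin")
ad = halos_ds.all_data()
print(ad["halos", "particle_mass"])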

Parallelism

All three halo finders can be run in parallel using mpirun and by adding yt.enable_parallelism() to the top of the script. The computational domain will be divided evenly among all processes (among the writers in the case of Rockstar), with a small amount of padding to ensure that halos on sub-volume boundaries are not split. For FoF and HOP, the number of processors used only needs to be provided to mpirun (e.g., mpirun -np 8 to run on 8 processors).

import yt

yt.enable_parallelism()
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("Enzo_64/RD0006/RedshiftOutput0006")
hc = HaloCatalog(
    data_ds=data_ds,
    finder_method="fof",
    finder_kwargs={"ptype": "stars", "padding": 0.02},
)
hc.create()
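Assuming the script above is saved as find_halos.py, it could then be launched with, for example, mpirun -np 8 python find_halos.py.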

For more information on running yt in parallel, see Parallel Computation With yt.

Saving Halo Particles

As of version 1.1 of yt_astro_analysis, the IDs of the particles belonging to each halo can be saved to the catalog when using either the FoF or HOP methods. This is enabled by default and can be disabled by setting save_particles to False in the finder_kwargs dictionary, as described above and sketched below. Rockstar will also save halo particles to the .bin files; however, reading these is not currently supported in yt. See YTHaloCatalog for information on accessing halo particles for FoF and HOP catalogs.
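A minimal sketch of disabling this for a HOP run, using the finder_kwargs mechanism described above:

import yt
from yt.extensions.astro_analysis.halo_analysis import HaloCatalog

data_ds = yt.load("Enzo_64/RD0006/RedshiftOutput0006")
# Do not write member particle IDs to the output catalog.
hc = HaloCatalog(
    data_ds=data_ds,
    finder_method="hop",
    finder_kwargs={"save_particles": False},
)
hc.create()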