--- title: "Reading FreeSurfer neuroimaging data with freesurferformats" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Reading FreeSurfer neuroimaging data with freesurferformats} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- In this document, we show how to read brain imaging data from [FreeSurfer](https://surfer.nmr.mgh.harvard.edu) binary files. These files are created and used by the [FreeSurfer neuroimaging software suite](https://surfer.nmr.mgh.harvard.edu) to store volume data and surface morphometry data computed from MRI brain images. # Reading FreeSurfer neuroimaging data with freesurferformats ## Background Brain imaging data come in different formats. Typically, the data is acquired on a scanner that outputs a set of two-dimensional (2D) DICOM format images. The 2D images are often combined into a single file that holds a 3D or 4D stack of images for further processing. Common formats include ANALYZE, NIFTI, and the MGH format used by FreeSurfer. This *freesurferdata* R package implements functions to parse data in MGH format, as well as some related FreeSurfer formats. MGH stands for Massachusetts General Hospital, and is a binary format. The MGZ format is a compressed version of the MGH format. **Note:** To learn how to *write* neuroimaging data with this package, read the vignette *Writing FreeSurfer neuroimaging data with freesurferformats* that comes with this package. ## Reading MGH and MGZ format files Here is a first example for reading MGH data. ```{r} library("freesurferformats") mgh_file = system.file("extdata", "brain.mgz", package = "freesurferformats", mustWork = TRUE) brain = read.fs.mgh(mgh_file) cat(sprintf("Read voxel data with dimensions %s. Values: min=%d, mean=%f, max=%d.\n", paste(dim(brain), collapse = 'x'), min(brain), mean(brain), max(brain))); ``` Now, `brain` is an *n*-dimensional matrix, where *n* depends on the data in the MGZ file. A conformed FreeSurfer volume like `brain.mgz` typically has 4 dimensions and 256 x 256 x 256 x 1 = 16777216 voxels. The final dimension, which is 1 here, means it has only a single time point or *frame*. In this case, the file was compressed and in MGZ format, but the function does not care, it works for both MGH and MGZ. If you need the header data, read on. ### Accessing metadata from the MGH header To access not only the volume data, but also the header, call `read.fs.mgh` like this: ```{r} brain_with_hdr = read.fs.mgh(mgh_file, with_header = TRUE); brain = brain_with_hdr$data; # as seen before, this is what we got in the last example (the data). header = brain_with_hdr$header; # the header ``` Now you have acces to the following header data: ```{r, eval = FALSE} header$dtype # int, one of: 0=MRI_UCHAR; 1=MRI_INT; 3=MRI_FLOAT; 4=MRI_SHORT header$ras_good_flag # int, 0 or 1. Whether the file contains a valid vox2ras matrix and ras_xform (see header$vox2ras_matrix below) header$has_mr_params # int, 0 or 1. Whether the file contains mr_params (see header$mr_params below) header$voldim # integer vector or length 4. The volume (=data) dimensions. E.g., c(256, 256, 256, 1) for 3D data. ``` If your MGH/MGZ file contains valid information on the vox2ras matrix and/or acquisition parameters (`mr_params`), you can access them like this for the mr params: ```{r} if(header$has_mr_params) { mr_params = header$mr_params; cat(sprintf("MR acquisition parameters: TR [ms]=%f, filp angle [radians]=%f, TE [ms]=%f, TI [ms]=%f\n", mr_params[1], mr_params[2], mr_params[3], mr_params[4])); } ``` And like this for the `vox2ras_matrix`: ```{r} if(header$ras_good_flag) { print(header$vox2ras_matrix); } ``` And finally the `ras_xform`: ```{r} if(header$ras_good_flag) { print(header$ras_xform); } ``` ### A second MGH example The MGH/MGZ format is also used to store morphometry data mapped to standard space (fsaverage). In the following example, we read cortical thickness data in standard space, smoothed with a FWHM 25 kernel: ```{r, eval = FALSE} mgh_file = system.file("mystudy", "subject1", "surf", "lh.thickness.fwhm25.fsaverage.mgh") cortical_thickness_standard = read.fs.mgh(mgh_file) ``` Now, `cortical_thickness_standard` is a vector of *n* float values, where *n* is the number of vertices of the *fsaverage* subject's left hemisphere surface (i.e., 163842 in FreeSurfer 6). ### Hint: Further image processing and visualization for volumes If all you need is to perform statistical analysis of the data in the MGH file, you are ready to do that after loading. If you need access to more image operations, I would recommend to convert the data to a NIFTI object. E.g., if you have [oro.nifti](https://CRAN.R-project.org/package=oro.nifti ) installed, you could visualize the `brain` data we loaded earlier like this: ```{r, eval = FALSE} oro.nifti::orthographic(oro.nifti::nifti(brain)) ``` If you need advanced visualization, including the option to render morphometry data or annotation on 3D brain surface meshes, you can have a look at the [fsbrain package](https://github.com/dfsp-spirit/fsbrain). ## Reading 'curv' format files Let's read an example morphometry data file that comes with this package. It contains vertex-wise measures of cortical thickness for the left hemisphere of a single subject in native space. ```{r} library("freesurferformats") curvfile = system.file("extdata", "lh.thickness", package = "freesurferformats", mustWork = TRUE) ct = read.fs.curv(curvfile) ``` Now, `ct` is a vector of *n* float values, where *n* is the number of vertices of the surface mesh the data belongs to (usually `surf/lh.white`). The number of vertices differs between subjects, as this is native space data. We can now have a closer look at the data and maybe plot a histogram of cortical thickness for this subject: ```{r} cat(sprintf("Read data for %d vertices. Values: min=%f, mean=%f, max=%f.\n", length(ct), min(ct), mean(ct), max(ct))) hist(ct, main="lh cortical thickness", xlab="Cortical thickness [mm]", ylab="Vertex count") ``` ## Reading morphometry data, no matter the format The package provides a wrapper function to read morphometry data, no matter the format. It always returns data as a vector and automatically determines the format from the file name. Here we use the function to read the file from the last example: ```{r} morphfile1 = system.file("extdata", "lh.thickness", package = "freesurferformats", mustWork = TRUE) thickness_native = read.fs.morph(morphfile1) ``` And here is an example for an MGZ file: ```{r} morphfile2 = system.file("extdata", "lh.curv.fwhm10.fsaverage.mgz", package = "freesurferformats", mustWork = TRUE) curv_standard = read.fs.morph(morphfile2) curv_standard[curv_standard < -1] = 0; # remove extreme outliers curv_standard[curv_standard > 1] = 0; hist(curv_standard, main="lh std curvature", xlab="Mean Curvature [mm^-1], fwhm10", ylab="Vertex count") ``` ## Reading morphometry data for a subset of vertices from weight, paint or w files Sometimes, vertex-wise data is stored in so-called `weight` files, which are also known as `paint` files or simply as `w` files. They do not simply contain a list of data values, but pairs of *(vertex index, value)*. The format is useful when you have data only for a subset of the vertices of a surface. It seems to be less used these days. Use the *read.fs.weight* function to read w files. ## Reading annotation files An annotation file contains a cortical parcellation for a subject, based on a brain atlas. It contains a label for each vertex of a surface, and that label assigns this vertex to one of a set of atlas regions. The file format also contains a colortable, which assigns a color code to each atlas region. An example file would be `labels/lh.aparc.annot` for the `aparc` (Desikan) atlas. Let's read an example annotation file that comes with this package: ```{r} annotfile = system.file("extdata", "lh.aparc.annot.gz", package = "freesurferformats", mustWork = TRUE); annot = read.fs.annot(annotfile); ``` **Note:** The example file that comes with this package was gzipped to save space. While this is not typical for annot files, the *read.fs.annot* function handles it automatically if the filename ends with *.gz*. ## Working with annotation data As mentioned earlier, such a file contains various pieces of information. Let us investigate the labels and the atlas region names for some vertices first: ```{r} num_vertices_total = length(annot$vertices); for (vert_idx in c(1, 5000, 123456)) { cat(sprintf("Vertex #%d with zero-based index %d has label code '%d' which stands for atlas region '%s'\n", vert_idx, annot$vertices[vert_idx], annot$label_codes[vert_idx], annot$label_names[vert_idx])); } ``` Now, we will focus on the colortable. We will list the available regions and their color codes. ```{r} ctable = annot$colortable$table; regions = annot$colortable$struct_names; for (region_idx in seq_len(annot$colortable$num_entries)) { cat(sprintf("Region #%d called '%s' has RGBA color (%d %d %d %d) and code '%d'.\n", region_idx, regions[region_idx], ctable[region_idx,1], ctable[region_idx,2], ctable[region_idx,3], ctable[region_idx,4], ctable[region_idx,5])); } ``` Keep in mind the indices when comparing results to those from other software: in GNU R, indices start with 1 but the FreeSurfer standard indices are zero-based: ```{r} r_index = 50; # one-based index as used by R and Matlab fs_index = annot$vertices[r_index]; # zero-based index as used in C, Java, Python and many modern languages cat(sprintf("Vertex at R index %d has FreeSurfer index %d and lies in region '%s'.\n", r_index, fs_index, annot$label_names[r_index])); ``` Let us retrieve some information on a specific region. We will reuse the `thickness_native` data loaded above: ```{r} region = "bankssts" thickness_in_region = thickness_native[annot$label_names == region] cat(sprintf("Region '%s' has %d vertices and a mean cortical thickness of %f mm.\n", region, length(thickness_in_region), mean(thickness_in_region))); ``` That's all the information you can get from an annotation file. ## Reading surface files (meshes) A surface file contains a brain surface mesh, i.e., a list of vertices and a list of faces. A vertex is defined by its three x,y,z coordinates, which are doubles. A face is defined by three vertex indices. Example files are `surf/lh.white` or `surf/rh.pial`. Let's read an tiny example surface file that comes with this package: ```{r} surface_file = system.file("extdata", "lh.tinysurface", package = "freesurferformats", mustWork = TRUE); surf = read.fs.surface(surface_file); cat(sprintf("Loaded surface consisting of %d vertices and %d faces.\n", nrow(surf$vertices), nrow(surf$faces))); ``` Now we can print the coordinates of vertex 5: ```{r} vertex_index = 5; v5 = surf$vertices[vertex_index,]; cat(sprintf("Vertex %d has coordinates (%f, %f, %f).\n", vertex_index, v5[1], v5[2], v5[3])); ``` And also the 3 vertices that make up face 2: ```{r} face_index = 2; f2 = surf$faces[face_index,]; cat(sprintf("Face %d consistes of the vertices %d, %d, and %d.\n", face_index, f2[1], f2[2], f2[3])); ``` Note that the vertex indices start with 1 in GNU R. The vertex indices in the file are 0-based, but this is handled transparently by the 'read.fs.surface' and 'write.fs.surface' functions. ### Reading arbitrary triangular meshes and converting to rgl mesh3d Note that freesurferformats supports not only FreeSurfer meshes, but also a range of standard mesh file formats. You can use the `read.fs.surface()` function to read meshes in formats like Wavefront Object Format (`obj`), Stanford Triangle Format (`PLY`), Object File Format (`OFF`), and many more. Many mesh packages in R use the `mesh3d` datastructure from the `rgl` package to store or manipulate meshes, so it is useful to know how to covert our `fs.surface` mesh instance into a `mesh3d` instance. Here we show how to load a mesh and transform in to a `tmesh3d` instance (requires the `rgl` package): ```{r, eval = FALSE} pmesh = read.fs.surface("~/path/to/mesh.ply"); my_mesh3d = rgl::tmesh3d(c(t(pmesh$vertices)), c(t(pmesh$faces)), homogeneous=FALSE); ``` One can now use this mesh in `rgl` or other packages like `Rvcg` that use the mesh3d datastructure. Here we compute the curvature of the mesh using the `Rvcg` package: ```{r, eval = FALSE} curv = Rvcg::vcgCurve(my_mesh3d); ``` ## Reading labels A label defines a list of vertices (of an associated surface or morphometry file) which are part of it. All others are not. You can think of it as binary mask. An atlas or annotation can be thought of as a list of labels, each of which defines a single brain region (the annotation also contains a colormap for the regions). Labels are useful to extract or mask out certain brain regions. E.g., you may want to exclude the medial wall from your analysis, of you may only be interested in a certain brain region. The following example reads a label file: ```{r} labelfile = system.file("extdata", "lh.entorhinal_exvivo.label", package = "freesurferformats", mustWork = TRUE); label = read.fs.label(labelfile); cat(sprintf("The label consists of %d vertices, the first one is vertex %d.\n", length(label), label[1])); ``` ## Reading color lookup tables (LUT files) A colortable assigns a name and an RGBA color to a set of labels. Such a colortable is included in an annotation (brain atlas), but there also is a separate ASCII text file format for these LUTs. An example file in the format is `FREESURFER_HOME/FreeSurferColorLUT.txt`. To read such a file: ```{r} lutfile = system.file("extdata", "colorlut.txt", package = "freesurferformats", mustWork = TRUE); colortable = read.fs.colortable(lutfile, compute_colorcode=TRUE); head(colortable); ``` You can also extract such a colortable from an annotation: ```{r} annotfile = system.file("extdata", "lh.aparc.annot.gz", package = "freesurferformats", mustWork = TRUE); annot = read.fs.annot(annotfile); colortable = colortable.from.annot(annot); head(colortable); ``` ### Computing the unique color codes Some functions in FreeSurfer identify the regions by an internal code code that is computed from the RGBA values of the colors. These values are *not* included in the colortable, they need to be computed. You can pass the optional argument `compute_colorcode=TRUE` to the LUT functions if you want them to compute the color codes and add the colorcode to the returned dataframe in a column named 'code'. ## References * See [the FreeSurfer wiki](https://surfer.nmr.mgh.harvard.edu/fswiki/FsTutorial/MghFormat) for details on the MGH format ## Alternatives, similar and related Packages * The [freesurfer package by John Muschelli]( https://CRAN.R-project.org/package=freesurfer) implements an R wrapper around some FreeSurfer command line utilities. This includes functions to read MGH/MGZ files by converting them to NIFTI format using command line utilities that come with FreeSurfer, then loading the resulting NIFTI file with the [oro.nifti package](https://CRAN.R-project.org/package=oro.nifti). As such, the package requires that you have FreeSurfer installed. * Several packages exist that read files in NIFTI format, including [oro.nifti](https://CRAN.R-project.org/package=oro.nifti) and [RNifti](https://CRAN.R-project.org/package=RNifti). * My [fsbrain package](https://github.com/dfsp-spirit/fsbrain) provides an abstraction layer around fresurferformats. It is desingned to quickly access data from single subjects or groups stored in the standardized output directory structure used by FreeSurfer and recon-all. The package fsbrain also includes advanced visualization functions for surface-based data. * GIFTI files can be read with the [gitfti package by John Muschelli](https://cran.r-project.org/package=gifti).