Parse molecule names to extract lipid class and chain information.

Parse lipid names to return a data.frame containing lipid class, total chain length and unsaturation. Lipids should follow the pattern 'class xx:x/yy:y', with class referring to the abbreviated lipid class, xx:x as the composition of the first chain and yy:y as the second chain. Alternatively, lipids can be supplied following the pattern 'class zz:z', where zz:z indicates the combined chain length and unsaturation information.
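
The two name patterns described above can be illustrated with a small base R sketch. This is not the package's parser: `parse_simple_lipid()` is a hypothetical helper that only handles the plain 'class xx:x/yy:y' and 'class zz:z' forms. The output column names are borrowed from the example output on this page, and reading `total_cl` / `total_cs` as total chain length and total unsaturation is an assumption.

```r
# Hypothetical helper for illustration only; not the package implementation.
parse_simple_lipid <- function(molecules) {
  # Class is everything before the single space; chains are "length:unsat",
  # separated by "/" when more than one chain is given.
  pattern <- "^([^ ]+) ([0-9]+:[0-9]+(/[0-9]+:[0-9]+)*)$"
  matched <- grepl(pattern, molecules)
  class_stub <- ifelse(matched, sub(pattern, "\\1", molecules), NA_character_)
  chains <- ifelse(matched, sub(pattern, "\\2", molecules), NA_character_)

  # Sum chain lengths and unsaturations across all chains of each molecule.
  totals <- t(vapply(strsplit(chains, "/", fixed = TRUE), function(parts) {
    if (anyNA(parts)) return(c(NA_integer_, NA_integer_))
    nums <- do.call(rbind, lapply(strsplit(parts, ":", fixed = TRUE), as.integer))
    as.integer(colSums(nums))
  }, integer(2)))

  data.frame(
    Molecule = molecules,
    class_stub = class_stub,
    total_cl = totals[, 1],   # assumed meaning: total chain length
    total_cs = totals[, 2],   # assumed meaning: total unsaturation
    not_matched = !matched,
    stringsAsFactors = FALSE
  )
}

# "PE 16:0/18:1" and "PE 34:1" both give a total of 34 carbons and 1 double bond;
# "not a lipid" is a made-up name that does not fit either pattern.
parsed <- parse_simple_lipid(c("PE 16:0/18:1", "PE 34:1", "not a lipid"))
print(parsed)
```

Summing per-chain values is what makes the two notations comparable: a name given chain by chain and a name given as a combined composition end up with the same totals.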

annotate_lipids(molecules, no_match = c("warn", "remove", "ignore"))

Arguments

molecules: A character vector containing lipid molecule names.
no_match: How to handle lipids that cannot be parsed. One of "warn", "remove" or "ignore"; the default, "warn", gives a warning.

Value

A data.frame with lipid annotations as columns. Input lipid names are given in a column named "Molecule".

Examples

lipid_list <- c(
  "Lyso PE 18:1(d7)",
  "PE(32:0)",
  "Cer(d18:0/C22:0)",
  "TG(16:0/18:1/18:1)"
)
annotate_lipids(lipid_list)
#> # A tibble: 4 × 21
#>   Molecule      clean_name ambig not_matched istd  class_stub chain1   l_1   s_1
#>   <chr>         <chr>      <lgl> <lgl>       <lgl> <chr>      <chr>  <int> <int>
#> 1 Lyso PE 18:1… LPE 18:1(… FALSE FALSE       TRUE  LPE        18:1      18     1
#> 2 PE(32:0)      PE 32:0    FALSE FALSE       FALSE PE         32:0      32     0
#> 3 Cer(d18:0/C2… Cer 18:0/… FALSE FALSE       FALSE Cer        18:0      18     0
#> 4 TG(16:0/18:1… TG 16:0/1… FALSE FALSE       FALSE TG         16:0      16     0
#> # ℹ 12 more variables: chain2 <chr>, l_2 <int>, s_2 <int>, chain3 <chr>,
#> #   l_3 <int>, s_3 <int>, chain4 <chr>, l_4 <lgl>, s_4 <lgl>, total_cl <int>,
#> #   total_cs <int>, Class <chr>
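
The example above does not exercise `no_match`. The following is a hedged sketch rather than verified output: it assumes `annotate_lipids()` is provided by the lipidr package, and that the made-up name "not a lipid" fails to parse. No result is shown; comparing the row counts and the `not_matched` column is one way to see what each setting does.

```r
# Assumes annotate_lipids() comes from the lipidr package; skipped if not installed.
if (requireNamespace("lipidr", quietly = TRUE)) {
  # The second name is invented for illustration and is expected not to parse.
  mixed <- c("PE 32:0", "not a lipid")

  annotated_ignore <- lipidr::annotate_lipids(mixed, no_match = "ignore")
  annotated_remove <- lipidr::annotate_lipids(mixed, no_match = "remove")

  # Inspect how each setting treated the unparsed name.
  print(nrow(annotated_ignore))
  print(nrow(annotated_remove))
  print(annotated_ignore$not_matched)
}
```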