4 Aggregation
I am the wisest man alive, for I know one thing, and that is that I know nothing.1
Data aggregation stands as a fundamental aspect of data analysis, representing the process by which tidy data (along with metadata) is collected or grouped and subsequently presented in a summarized format. In the realm of bat survey data, aggregation typically occurs through two primary methods:
- Aggregation over a specified time period.
 - Spatial aggregation for locations or factors associated with a particular location (e.g., woodland).”
 
The bat survey data statics in the iBats package has 6,930 bat observations; a random selection of 5 rows is shown in Table 4.1.
Show the code
statics %>% # statics is a tidy data set from the iBats package
  sample_n(5) %>% 
  arrange(DateTime) %>% 
  # Table made with gt
  gt() %>% 
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")
      ),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  # Make bat scientific name italic
  tab_style(
    style = list(
      cell_text(style = "italic")
      ),
    locations = cells_body(
      columns = c(Species)
    )
  ) %>% 
  tab_options(data_row.padding = px(2)) %>% 
  cols_align(
    align = "left",
    columns = DateTime)| Description | DateTime | Species | Longitude | Latitude | 
|---|---|---|---|---|
| Static 4 | 2016-06-13 02:04:27 | Pipistrellus pipistrellus | -3.591858 | 50.33123 | 
| Static 4 | 2016-08-05 00:09:43 | Pipistrellus pipistrellus | -3.591748 | 50.33136 | 
| Static 2 | 2016-08-12 22:23:46 | Pipistrellus pipistrellus | -3.592583 | 50.33323 | 
| Static 4 | 2016-08-24 00:12:15 | Pipistrellus pipistrellus | -3.592228 | 50.33091 | 
| Static 2 | 2016-08-29 21:37:44 | Pipistrellus pipistrellus | -3.592583 | 50.33323 | 
4.1 Count of Bats
The simplest form of aggregation is a count of bats2; as shown in Table 4.2
Show the code
# Libraries (Packages) used
library(tidyverse)
library(iBats)
library(gt)
statics %>%
  group_by(Species) %>% 
  count() %>% 
  #arrange descending
  arrange(desc(n)) %>% 
  # rename n as count
  rename(`Bat Species` = Species, Count = n) %>% 
  # so table is produced with individual species on one row
  ungroup() %>% 
  # Table made gt()
  gt() %>% 
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")
      ),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  # Make bat scientific name italic
  tab_style(
    style = list(
      cell_text(style = "italic")
      ),
    locations = cells_body(
      columns = c(`Bat Species`)
    )
  ) %>% 
  tab_options(data_row.padding = px(2)) | Bat Species | Count | 
|---|---|
| Pipistrellus pipistrellus | 4972 | 
| Myotis spp. | 435 | 
| Barbastella barbastellus | 419 | 
| Nyctalus noctula | 315 | 
| Pipistrellus spp. | 283 | 
| Rhinolophus ferrumequinum | 160 | 
| Plecotus spp. | 136 | 
| Pipistrellus pygmaeus | 86 | 
| Rhinolophus hipposideros | 73 | 
| Pipistrellus nathusii | 32 | 
| Eptesicus serotinus | 12 | 
| Nyctalus leisleri | 3 | 
| Myotis nattereri | 2 | 
| Nyctalus spp. | 2 | 
More informative tables can be made by adding meta data such as the Month and Year of the observations. This is easy done on a tidy data conforming to the minimal data requirement with the date_time_info function in the iBats package. Table 4.3 shows the monthly count (bat passes) of Annex II species3; .
Show the code
# Libraries (Packages) used
library(tidyverse)
library(iBats)
library(flextable)
library(officer)
# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)
# Vector of month names used in the factor function
month_levels_full <- c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
# re-order months column with rev so months run top to bottom of the table
statics_plus <- statics_plus %>%
   mutate(MonthFull = factor(MonthFull, levels = rev(month_levels_full), ordered = T))
AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")
table_border <- fp_border(color = "black", width = 1) # from library(officer)
statics_plus %>%
  filter(Species %in% AnnexII) %>%
  group_by(MonthFull, Species) %>%
  count() %>%
  # arrange descending
  arrange(desc(MonthFull)) %>%
  # rename n as count
  rename(Month = MonthFull, `Bat Species` = Species, Count = n) %>%
  # Table made with flextable
  flextable(col_keys = colnames(.)) %>%
  italic(j = 2, italic = TRUE, part = "body") %>% 
  fontsize(part = "header", size = 12) %>%
  fontsize(part = "body", size = 12) %>%
  colformat_double(j = "Count", digits = 4, big.mark = ",") %>%
  width(j = 1, width = 2) %>%
  width(j = 2, width = 2.5) %>%
  width(j = 3, width = 1) %>%
  merge_v(j = 1) %>%
  border_inner_h(part = "body", border = table_border) %>%
  hline_bottom(part = "body", border = table_border) %>%
  bg(bg = "black", part = "header") %>%
  color(color = "white", part = "header")Month  | Bat Species  | Count  | 
|---|---|---|
May  | Barbastella barbastellus  | 7  | 
Rhinolophus ferrumequinum  | 9  | |
Rhinolophus hipposideros  | 10  | |
June  | Barbastella barbastellus  | 198  | 
Rhinolophus ferrumequinum  | 47  | |
Rhinolophus hipposideros  | 11  | |
July  | Barbastella barbastellus  | 49  | 
Rhinolophus ferrumequinum  | 7  | |
Rhinolophus hipposideros  | 4  | |
August  | Barbastella barbastellus  | 109  | 
Rhinolophus ferrumequinum  | 31  | |
Rhinolophus hipposideros  | 23  | |
September  | Barbastella barbastellus  | 9  | 
Rhinolophus ferrumequinum  | 64  | |
Rhinolophus hipposideros  | 17  | |
October  | Barbastella barbastellus  | 47  | 
Rhinolophus ferrumequinum  | 2  | |
Rhinolophus hipposideros  | 8  | 
Tables of numbers can be frustrating to read. To help readability the table can be annotated highlighting the count on a colour scale linked to the magnitude of the result. This is shown in Table 4.4 for the count of bats shown in Table 4.3.
Show the code
# Libraries (Packages) used
library(tidyverse)
library(flextable)
library(officer)
library(iBats)
# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)
# Vector of month names used in the factor function
month_levels_full <- c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
# re-order months column with rev so months run top to bottom of the table
statics_plus <- statics_plus %>%
   mutate(MonthFull = factor(MonthFull, levels = rev(month_levels_full), ordered = T))
AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")
table_border <- fp_border(color = "black", width = 1) # from library(officer)
table_data <- statics_plus %>%
  filter(Species %in% AnnexII) %>%
  group_by(MonthFull, Species) %>%
  count() %>%
  # arrange descending
  arrange(desc(MonthFull)) %>%
  # rename n as count
  rename(Month = MonthFull, `Bat Species` = Species, Count = n) 
# Find max and min counts
maxCount <- max(table_data$Count, na.rm = T)
minCount <- min(table_data$Count, na.rm = T)
#Make coloured palette
colourer <- scales::col_numeric(
  palette = c("transparent", "orangered2"),
  domain = c(minCount, maxCount)) 
# Table made with flextable
table_data %>% 
  flextable(col_keys = colnames(.)) %>%
  italic(j = 2, italic = TRUE, part = "body") %>% 
  fontsize(part = "header", size = 12) %>%
  fontsize(part = "body", size = 12) %>%
  colformat_double(j = "Count", digits = 4, big.mark = ",") %>%
  width(j = 1, width = 2) %>%
  width(j = 2, width = 2.5) %>%
  width(j = 3, width = 1) %>%
  merge_v(j = 1) %>%
  # Scale colour to count
  bg(bg = colourer, j = "Count", part = "body") %>%
  border_inner_h(part = "body", border = table_border) %>%
  hline_bottom(part = "body", border = table_border) %>%
  bg(bg = "black", part = "header") %>%
  color(color = "white", part = "header")Month  | Bat Species  | Count  | 
|---|---|---|
May  | Barbastella barbastellus  | 7  | 
Rhinolophus ferrumequinum  | 9  | |
Rhinolophus hipposideros  | 10  | |
June  | Barbastella barbastellus  | 198  | 
Rhinolophus ferrumequinum  | 47  | |
Rhinolophus hipposideros  | 11  | |
July  | Barbastella barbastellus  | 49  | 
Rhinolophus ferrumequinum  | 7  | |
Rhinolophus hipposideros  | 4  | |
August  | Barbastella barbastellus  | 109  | 
Rhinolophus ferrumequinum  | 31  | |
Rhinolophus hipposideros  | 23  | |
September  | Barbastella barbastellus  | 9  | 
Rhinolophus ferrumequinum  | 64  | |
Rhinolophus hipposideros  | 17  | |
October  | Barbastella barbastellus  | 47  | 
Rhinolophus ferrumequinum  | 2  | |
Rhinolophus hipposideros  | 8  | 
4.2 Summary Statistics
Summary statistics for the Common pipistrelle (Pipistrellus pipistrellus) observations at all static locations for each night is shown in Table 4.5; to make the table easier to read, the median column is highlighted with a graduated colour scale. The summary statistics are calculated with the favstats function from the mosaic package.
Show the code
# Libraries (Packages) used
library(tidyverse)
library(mosaic)
library(gt)
library(gtExtras)
library(iBats)
# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)
# Group by Description and Night and Count the Observations
grouped_data <- statics_plus %>% 
  filter(Species == "Pipistrellus pipistrellus") %>% 
  group_by(Description, Night) %>% 
  tally()
# The summary statistics are saved into a variable riven_cond_stats 
cond_stats <- favstats(n~Description, data = grouped_data)
# riven_cond_stats is made into a the table (using the code below)
cond_stats %>% 
  # Create the table with the gt package
  gt() %>% 
  # Style the header to black fill and white text
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  gt_color_rows(median, palette = "ggsci::yellow_material") %>% 
  tab_options(data_row.padding = px(2)) | Description | min | Q1 | median | Q3 | max | mean | sd | n | missing | 
|---|---|---|---|---|---|---|---|---|---|
| Static 1 | 1 | 1.0 | 2 | 3.5 | 73 | 6.391304 | 15.570190 | 23 | 0 | 
| Static 2 | 1 | 6.0 | 10 | 17.5 | 42 | 12.031746 | 8.820738 | 63 | 0 | 
| Static 3 | 1 | 3.5 | 6 | 11.0 | 40 | 9.740741 | 10.323955 | 27 | 0 | 
| Static 4 | 10 | 33.5 | 78 | 155.5 | 438 | 115.129032 | 121.519749 | 31 | 0 | 
| Static 5 | 1 | 4.0 | 7 | 9.0 | 31 | 8.103448 | 6.586484 | 29 | 0 | 
Table 4.6 shows the summary statistics for Annex II Bat Species4 observations by month; with the maximum number of nightly bat passes highlighted. The table can be grouped by Month through switching Common and MonthFull names in the code below; this alternative table is shown in Table 4.7 with the mean (average) number of nightly bat passes highlighted.
Show the code
# Libraries (Packages) used
library(tidyverse)
library(mosaic)
library(gt)
library(gtExtras)
library(iBats)
# Annex II Bats (in the static)
AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")
# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)
# List of bat common names and the scientific names
BatCommon <- c(
  "Barbastella barbastellus" = "Barbastelle",
  "Rhinolophus ferrumequinum" = "Greater horseshoe",
  "Rhinolophus hipposideros" = "Lesser horseshoe")
# From Scientific name create a Common Name Vector 
statics_plus$Common <-  unname(BatCommon[statics_plus$Species])
# Group by Description and Night and Count the Observations
grouped_data <- statics_plus %>% 
  filter(Species %in% AnnexII) %>%
  group_by(Common, MonthFull, Night) %>% 
  tally() %>% 
  summarise(Minimum = round(min(n, na.rm = T), 2),
              Q1 = round(quantile(n, c(0.25), na.rm = T), 2),
              Mean = round(mean(n, na.rm = T), 2),
              Median = round(median(n, na.rm = T), 2),
              Q3 = round(quantile(n, c(0.75), na.rm = T), 2),
              Maximum = round(max(n, na.rm = T), 2),
              SD = round(sd(n, na.rm = T), 2),
              Nr = n())
# riven_cond_stats is made into a the table (using the code below)
grouped_data %>% 
  # Create the table with the gt package
  gt(rowname_col = "MonthFull",
    groupname_col = "Common") %>% 
  # Style the header to black fill and white text
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  tab_style(
    style = list(
      cell_fill(color = "midnightblue"),
      cell_text(color = "white"),
      cell_text(weight = "normal")
      ),
    locations = cells_body(
      columns = MonthFull
    )) %>% 
  gt_color_rows(Maximum, palette = "ggsci::purple_material") %>% 
  tab_options(data_row.padding = px(2)) | Minimum | Q1 | Mean | Median | Q3 | Maximum | SD | Nr | |
|---|---|---|---|---|---|---|---|---|
| Barbastelle | ||||||||
| May | 1 | 1.00 | 1.75 | 1.5 | 2.25 | 3 | 0.96 | 4 | 
| June | 1 | 1.50 | 18.00 | 18.0 | 31.50 | 42 | 15.67 | 11 | 
| July | 4 | 5.75 | 8.17 | 8.0 | 10.25 | 13 | 3.43 | 6 | 
| August | 1 | 3.00 | 7.27 | 6.0 | 8.00 | 22 | 6.13 | 15 | 
| September | 1 | 1.00 | 1.80 | 1.0 | 2.00 | 4 | 1.30 | 5 | 
| October | 4 | 5.00 | 6.71 | 6.0 | 7.00 | 13 | 2.98 | 7 | 
| Greater horseshoe | ||||||||
| May | 2 | 2.50 | 3.00 | 3.0 | 3.50 | 4 | 1.00 | 3 | 
| June | 1 | 1.50 | 3.13 | 3.0 | 4.00 | 7 | 1.81 | 15 | 
| July | 1 | 1.00 | 1.75 | 1.5 | 2.25 | 3 | 0.96 | 4 | 
| August | 1 | 1.00 | 2.07 | 2.0 | 3.00 | 6 | 1.39 | 15 | 
| September | 3 | 3.25 | 10.67 | 4.0 | 6.25 | 43 | 15.91 | 6 | 
| October | 1 | 1.00 | 1.00 | 1.0 | 1.00 | 1 | 0.00 | 2 | 
| Lesser horseshoe | ||||||||
| May | 1 | 1.00 | 2.50 | 1.5 | 3.00 | 6 | 2.38 | 4 | 
| June | 1 | 1.00 | 1.38 | 1.0 | 2.00 | 2 | 0.52 | 8 | 
| July | 1 | 1.50 | 2.00 | 2.0 | 2.50 | 3 | 1.41 | 2 | 
| August | 1 | 1.00 | 2.30 | 1.5 | 3.00 | 6 | 1.70 | 10 | 
| September | 1 | 1.50 | 2.43 | 3.0 | 3.00 | 4 | 1.13 | 7 | 
| October | 1 | 1.00 | 2.67 | 1.0 | 3.50 | 6 | 2.89 | 3 | 
Show the code
# Libraries (Packages) used
library(tidyverse)
library(mosaic)
library(gt)
library(gtExtras)
library(iBats)
# Annex II Bats (in the static)
AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")
# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)
# List of bat common names and the scientific names
BatCommon <- c(
  "Barbastella barbastellus" = "Barbastelle",
  "Rhinolophus ferrumequinum" = "Greater horseshoe",
  "Rhinolophus hipposideros" = "Lesser horseshoe")
# From Scientific name create a Common Name Vector 
statics_plus$Common <-  unname(BatCommon[statics_plus$Species])
# Group by Description and Night and Count the Observations
grouped_data <- statics_plus %>% 
  filter(Species %in% AnnexII) %>%
  group_by(Common, MonthFull, Night) %>% 
  tally() %>% 
  summarise(Minimum = round(min(n, na.rm = T), 2),
              Q1 = round(quantile(n, c(0.25), na.rm = T), 2),
              Mean = round(mean(n, na.rm = T), 2),
              Median = round(median(n, na.rm = T), 2),
              Q3 = round(quantile(n, c(0.75), na.rm = T), 2),
              Maximum = round(max(n, na.rm = T), 2),
              SD = round(sd(n, na.rm = T), 2),
              Nr = n())
# riven_cond_stats is made into a the table (using the code below)
grouped_data %>% 
  # Create the table with the gt package
  gt(rowname_col = "Common",
    groupname_col = "MonthFull") %>% 
  # Style the header to black fill and white text
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  gt_color_rows(Mean, palette = "ggsci::green_material") %>% 
  tab_options(data_row.padding = px(2)) | Minimum | Q1 | Mean | Median | Q3 | Maximum | SD | Nr | |
|---|---|---|---|---|---|---|---|---|
| May | ||||||||
| Barbastelle | 1 | 1.00 | 1.75 | 1.5 | 2.25 | 3 | 0.96 | 4 | 
| Greater horseshoe | 2 | 2.50 | 3.00 | 3.0 | 3.50 | 4 | 1.00 | 3 | 
| Lesser horseshoe | 1 | 1.00 | 2.50 | 1.5 | 3.00 | 6 | 2.38 | 4 | 
| June | ||||||||
| Barbastelle | 1 | 1.50 | 18.00 | 18.0 | 31.50 | 42 | 15.67 | 11 | 
| Greater horseshoe | 1 | 1.50 | 3.13 | 3.0 | 4.00 | 7 | 1.81 | 15 | 
| Lesser horseshoe | 1 | 1.00 | 1.38 | 1.0 | 2.00 | 2 | 0.52 | 8 | 
| July | ||||||||
| Barbastelle | 4 | 5.75 | 8.17 | 8.0 | 10.25 | 13 | 3.43 | 6 | 
| Greater horseshoe | 1 | 1.00 | 1.75 | 1.5 | 2.25 | 3 | 0.96 | 4 | 
| Lesser horseshoe | 1 | 1.50 | 2.00 | 2.0 | 2.50 | 3 | 1.41 | 2 | 
| August | ||||||||
| Barbastelle | 1 | 3.00 | 7.27 | 6.0 | 8.00 | 22 | 6.13 | 15 | 
| Greater horseshoe | 1 | 1.00 | 2.07 | 2.0 | 3.00 | 6 | 1.39 | 15 | 
| Lesser horseshoe | 1 | 1.00 | 2.30 | 1.5 | 3.00 | 6 | 1.70 | 10 | 
| September | ||||||||
| Barbastelle | 1 | 1.00 | 1.80 | 1.0 | 2.00 | 4 | 1.30 | 5 | 
| Greater horseshoe | 3 | 3.25 | 10.67 | 4.0 | 6.25 | 43 | 15.91 | 6 | 
| Lesser horseshoe | 1 | 1.50 | 2.43 | 3.0 | 3.00 | 4 | 1.13 | 7 | 
| October | ||||||||
| Barbastelle | 4 | 5.00 | 6.71 | 6.0 | 7.00 | 13 | 2.98 | 7 | 
| Greater horseshoe | 1 | 1.00 | 1.00 | 1.0 | 1.00 | 1 | 0.00 | 2 | 
| Lesser horseshoe | 1 | 1.00 | 2.67 | 1.0 | 3.50 | 6 | 2.89 | 3 | 
Socrates in Plato’s Republic (circa 380 BC)↩︎
note:- in this case it is a count of bat passes↩︎
https://ec.europa.eu/environment/nature/conservation/species/habitats_dir_en.htm↩︎
https://ec.europa.eu/environment/nature/conservation/species/habitats_dir_en.htm↩︎