4  Aggregation

I am the wisest man alive, for I know one thing, and that is that I know nothing.1

Data aggregation stands as a fundamental aspect of data analysis, representing the process by which tidy data (along with metadata) is collected or grouped and subsequently presented in a summarized format. In the realm of bat survey data, aggregation typically occurs through two primary methods:

  1. Aggregation over a specified time period.
  2. Spatial aggregation for locations or factors associated with a particular location (e.g., woodland).”

The bat survey data statics in the iBats package has 6,930 bat observations; a random selection of 5 rows is shown in Table 4.1.

Show the code
statics %>% # statics is a tidy data set from the iBats package
  sample_n(5) %>% 
  arrange(DateTime) %>% 
  # Table made with gt
  gt() %>% 
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")
      ),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  # Make bat scientific name italic
  tab_style(
    style = list(
      cell_text(style = "italic")
      ),
    locations = cells_body(
      columns = c(Species)
    )
  ) %>% 
  tab_options(data_row.padding = px(2)) %>% 
  cols_align(
    align = "left",
    columns = DateTime)
Table 4.1: Statics Bat Survey Data (5 random selected rows)
Description DateTime Species Longitude Latitude
Static 4 2016-06-13 02:04:27 Pipistrellus pipistrellus -3.591858 50.33123
Static 4 2016-08-05 00:09:43 Pipistrellus pipistrellus -3.591748 50.33136
Static 2 2016-08-12 22:23:46 Pipistrellus pipistrellus -3.592583 50.33323
Static 4 2016-08-24 00:12:15 Pipistrellus pipistrellus -3.592228 50.33091
Static 2 2016-08-29 21:37:44 Pipistrellus pipistrellus -3.592583 50.33323

4.1 Count of Bats

The simplest form of aggregation is a count of bats2; as shown in Table 4.2

Show the code
# Libraries (Packages) used
library(tidyverse)
library(iBats)
library(gt)

statics %>%
  group_by(Species) %>% 
  count() %>% 
  #arrange descending
  arrange(desc(n)) %>% 
  # rename n as count
  rename(`Bat Species` = Species, Count = n) %>% 
  # so table is produced with individual species on one row
  ungroup() %>% 
  # Table made gt()
  gt() %>% 
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")
      ),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  # Make bat scientific name italic
  tab_style(
    style = list(
      cell_text(style = "italic")
      ),
    locations = cells_body(
      columns = c(`Bat Species`)
    )
  ) %>% 
  tab_options(data_row.padding = px(2)) 
Table 4.2: A Count by Species
Bat Species Count
Pipistrellus pipistrellus 4972
Myotis spp. 435
Barbastella barbastellus 419
Nyctalus noctula 315
Pipistrellus spp. 283
Rhinolophus ferrumequinum 160
Plecotus spp. 136
Pipistrellus pygmaeus 86
Rhinolophus hipposideros 73
Pipistrellus nathusii 32
Eptesicus serotinus 12
Nyctalus leisleri 3
Myotis nattereri 2
Nyctalus spp. 2

More informative tables can be made by adding meta data such as the Month and Year of the observations. This is easy done on a tidy data conforming to the minimal data requirement with the date_time_info function in the iBats package. Table 4.3 shows the monthly count (bat passes) of Annex II species3; .

Show the code
# Libraries (Packages) used
library(tidyverse)
library(iBats)
library(flextable)
library(officer)

# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)

# Vector of month names used in the factor function
month_levels_full <- c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")

# re-order months column with rev so months run top to bottom of the table
statics_plus <- statics_plus %>%
   mutate(MonthFull = factor(MonthFull, levels = rev(month_levels_full), ordered = T))

AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")

table_border <- fp_border(color = "black", width = 1) # from library(officer)

statics_plus %>%
  filter(Species %in% AnnexII) %>%
  group_by(MonthFull, Species) %>%
  count() %>%
  # arrange descending
  arrange(desc(MonthFull)) %>%
  # rename n as count
  rename(Month = MonthFull, `Bat Species` = Species, Count = n) %>%
  # Table made with flextable
  flextable(col_keys = colnames(.)) %>%
  italic(j = 2, italic = TRUE, part = "body") %>% 
  fontsize(part = "header", size = 12) %>%
  fontsize(part = "body", size = 12) %>%
  colformat_double(j = "Count", digits = 4, big.mark = ",") %>%
  width(j = 1, width = 2) %>%
  width(j = 2, width = 2.5) %>%
  width(j = 3, width = 1) %>%
  merge_v(j = 1) %>%
  border_inner_h(part = "body", border = table_border) %>%
  hline_bottom(part = "body", border = table_border) %>%
  bg(bg = "black", part = "header") %>%
  color(color = "white", part = "header")
Table 4.3: A Count of Annex II Bats Grouped by Month

Month

Bat Species

Count

May

Barbastella barbastellus

7

Rhinolophus ferrumequinum

9

Rhinolophus hipposideros

10

June

Barbastella barbastellus

198

Rhinolophus ferrumequinum

47

Rhinolophus hipposideros

11

July

Barbastella barbastellus

49

Rhinolophus ferrumequinum

7

Rhinolophus hipposideros

4

August

Barbastella barbastellus

109

Rhinolophus ferrumequinum

31

Rhinolophus hipposideros

23

September

Barbastella barbastellus

9

Rhinolophus ferrumequinum

64

Rhinolophus hipposideros

17

October

Barbastella barbastellus

47

Rhinolophus ferrumequinum

2

Rhinolophus hipposideros

8

Tables of numbers can be frustrating to read. To help readability the table can be annotated highlighting the count on a colour scale linked to the magnitude of the result. This is shown in Table 4.4 for the count of bats shown in Table 4.3.

Show the code
# Libraries (Packages) used
library(tidyverse)
library(flextable)
library(officer)
library(iBats)

# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)

# Vector of month names used in the factor function
month_levels_full <- c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")

# re-order months column with rev so months run top to bottom of the table
statics_plus <- statics_plus %>%
   mutate(MonthFull = factor(MonthFull, levels = rev(month_levels_full), ordered = T))

AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")

table_border <- fp_border(color = "black", width = 1) # from library(officer)

table_data <- statics_plus %>%
  filter(Species %in% AnnexII) %>%
  group_by(MonthFull, Species) %>%
  count() %>%
  # arrange descending
  arrange(desc(MonthFull)) %>%
  # rename n as count
  rename(Month = MonthFull, `Bat Species` = Species, Count = n) 

# Find max and min counts
maxCount <- max(table_data$Count, na.rm = T)
minCount <- min(table_data$Count, na.rm = T)

#Make coloured palette
colourer <- scales::col_numeric(
  palette = c("transparent", "orangered2"),
  domain = c(minCount, maxCount)) 

# Table made with flextable
table_data %>% 
  flextable(col_keys = colnames(.)) %>%
  italic(j = 2, italic = TRUE, part = "body") %>% 
  fontsize(part = "header", size = 12) %>%
  fontsize(part = "body", size = 12) %>%
  colformat_double(j = "Count", digits = 4, big.mark = ",") %>%
  width(j = 1, width = 2) %>%
  width(j = 2, width = 2.5) %>%
  width(j = 3, width = 1) %>%
  merge_v(j = 1) %>%
  # Scale colour to count
  bg(bg = colourer, j = "Count", part = "body") %>%
  border_inner_h(part = "body", border = table_border) %>%
  hline_bottom(part = "body", border = table_border) %>%
  bg(bg = "black", part = "header") %>%
  color(color = "white", part = "header")
Table 4.4: A Count (Highlighted) of Annex II Bats Grouped by Month

Month

Bat Species

Count

May

Barbastella barbastellus

7

Rhinolophus ferrumequinum

9

Rhinolophus hipposideros

10

June

Barbastella barbastellus

198

Rhinolophus ferrumequinum

47

Rhinolophus hipposideros

11

July

Barbastella barbastellus

49

Rhinolophus ferrumequinum

7

Rhinolophus hipposideros

4

August

Barbastella barbastellus

109

Rhinolophus ferrumequinum

31

Rhinolophus hipposideros

23

September

Barbastella barbastellus

9

Rhinolophus ferrumequinum

64

Rhinolophus hipposideros

17

October

Barbastella barbastellus

47

Rhinolophus ferrumequinum

2

Rhinolophus hipposideros

8

4.2 Summary Statistics

Summary statistics for the Common pipistrelle (Pipistrellus pipistrellus) observations at all static locations for each night is shown in Table 4.5; to make the table easier to read, the median column is highlighted with a graduated colour scale. The summary statistics are calculated with the favstats function from the mosaic package.

Show the code
# Libraries (Packages) used
library(tidyverse)
library(mosaic)
library(gt)
library(gtExtras)
library(iBats)

# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)

# Group by Description and Night and Count the Observations
grouped_data <- statics_plus %>% 
  filter(Species == "Pipistrellus pipistrellus") %>% 
  group_by(Description, Night) %>% 
  tally()

# The summary statistics are saved into a variable riven_cond_stats 
cond_stats <- favstats(n~Description, data = grouped_data)

# riven_cond_stats is made into a the table (using the code below)
cond_stats %>% 
  # Create the table with the gt package
  gt() %>% 
  # Style the header to black fill and white text
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  gt_color_rows(median, palette = "ggsci::yellow_material") %>% 
  tab_options(data_row.padding = px(2)) 
Table 4.5: Common Pipstrelle Observations (Passes) at the Static Locations
Description min Q1 median Q3 max mean sd n missing
Static 1 1 1.0 2 3.5 73 6.391304 15.570190 23 0
Static 2 1 6.0 10 17.5 42 12.031746 8.820738 63 0
Static 3 1 3.5 6 11.0 40 9.740741 10.323955 27 0
Static 4 10 33.5 78 155.5 438 115.129032 121.519749 31 0
Static 5 1 4.0 7 9.0 31 8.103448 6.586484 29 0

Table 4.6 shows the summary statistics for Annex II Bat Species4 observations by month; with the maximum number of nightly bat passes highlighted. The table can be grouped by Month through switching Common and MonthFull names in the code below; this alternative table is shown in Table 4.7 with the mean (average) number of nightly bat passes highlighted.

Show the code
# Libraries (Packages) used
library(tidyverse)
library(mosaic)
library(gt)
library(gtExtras)
library(iBats)

# Annex II Bats (in the static)
AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")

# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)

# List of bat common names and the scientific names
BatCommon <- c(
  "Barbastella barbastellus" = "Barbastelle",
  "Rhinolophus ferrumequinum" = "Greater horseshoe",
  "Rhinolophus hipposideros" = "Lesser horseshoe")

# From Scientific name create a Common Name Vector 
statics_plus$Common <-  unname(BatCommon[statics_plus$Species])

# Group by Description and Night and Count the Observations
grouped_data <- statics_plus %>% 
  filter(Species %in% AnnexII) %>%
  group_by(Common, MonthFull, Night) %>% 
  tally() %>% 
  summarise(Minimum = round(min(n, na.rm = T), 2),
              Q1 = round(quantile(n, c(0.25), na.rm = T), 2),
              Mean = round(mean(n, na.rm = T), 2),
              Median = round(median(n, na.rm = T), 2),
              Q3 = round(quantile(n, c(0.75), na.rm = T), 2),
              Maximum = round(max(n, na.rm = T), 2),
              SD = round(sd(n, na.rm = T), 2),
              Nr = n())

# riven_cond_stats is made into a the table (using the code below)
grouped_data %>% 
  # Create the table with the gt package
  gt(rowname_col = "MonthFull",
    groupname_col = "Common") %>% 
  # Style the header to black fill and white text
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  tab_style(
    style = list(
      cell_fill(color = "midnightblue"),
      cell_text(color = "white"),
      cell_text(weight = "normal")
      ),
    locations = cells_body(
      columns = MonthFull
    )) %>% 
  gt_color_rows(Maximum, palette = "ggsci::purple_material") %>% 
  tab_options(data_row.padding = px(2)) 
Table 4.6: Nightly Species Observations (Passes) by Month
Minimum Q1 Mean Median Q3 Maximum SD Nr
Barbastelle
May 1 1.00 1.75 1.5 2.25 3 0.96 4
June 1 1.50 18.00 18.0 31.50 42 15.67 11
July 4 5.75 8.17 8.0 10.25 13 3.43 6
August 1 3.00 7.27 6.0 8.00 22 6.13 15
September 1 1.00 1.80 1.0 2.00 4 1.30 5
October 4 5.00 6.71 6.0 7.00 13 2.98 7
Greater horseshoe
May 2 2.50 3.00 3.0 3.50 4 1.00 3
June 1 1.50 3.13 3.0 4.00 7 1.81 15
July 1 1.00 1.75 1.5 2.25 3 0.96 4
August 1 1.00 2.07 2.0 3.00 6 1.39 15
September 3 3.25 10.67 4.0 6.25 43 15.91 6
October 1 1.00 1.00 1.0 1.00 1 0.00 2
Lesser horseshoe
May 1 1.00 2.50 1.5 3.00 6 2.38 4
June 1 1.00 1.38 1.0 2.00 2 0.52 8
July 1 1.50 2.00 2.0 2.50 3 1.41 2
August 1 1.00 2.30 1.5 3.00 6 1.70 10
September 1 1.50 2.43 3.0 3.00 4 1.13 7
October 1 1.00 2.67 1.0 3.50 6 2.89 3
Show the code
# Libraries (Packages) used
library(tidyverse)
library(mosaic)
library(gt)
library(gtExtras)
library(iBats)

# Annex II Bats (in the static)
AnnexII <- c("Barbastella barbastellus", "Rhinolophus ferrumequinum", "Rhinolophus hipposideros")

# Add data and time information to the statics data using the iBats::date_time_info
statics_plus <- iBats::date_time_info(statics)

# List of bat common names and the scientific names
BatCommon <- c(
  "Barbastella barbastellus" = "Barbastelle",
  "Rhinolophus ferrumequinum" = "Greater horseshoe",
  "Rhinolophus hipposideros" = "Lesser horseshoe")

# From Scientific name create a Common Name Vector 
statics_plus$Common <-  unname(BatCommon[statics_plus$Species])

# Group by Description and Night and Count the Observations
grouped_data <- statics_plus %>% 
  filter(Species %in% AnnexII) %>%
  group_by(Common, MonthFull, Night) %>% 
  tally() %>% 
  summarise(Minimum = round(min(n, na.rm = T), 2),
              Q1 = round(quantile(n, c(0.25), na.rm = T), 2),
              Mean = round(mean(n, na.rm = T), 2),
              Median = round(median(n, na.rm = T), 2),
              Q3 = round(quantile(n, c(0.75), na.rm = T), 2),
              Maximum = round(max(n, na.rm = T), 2),
              SD = round(sd(n, na.rm = T), 2),
              Nr = n())



# riven_cond_stats is made into a the table (using the code below)
grouped_data %>% 
  # Create the table with the gt package
  gt(rowname_col = "Common",
    groupname_col = "MonthFull") %>% 
  # Style the header to black fill and white text
  tab_style(
    style = list(
      cell_fill(color = "black"),
      cell_text(color = "white", weight = "bold")),
    locations = cells_column_labels(
      columns = c(everything())
    )
  ) %>% 
  gt_color_rows(Mean, palette = "ggsci::green_material") %>% 
  tab_options(data_row.padding = px(2)) 
Table 4.7: Nightly Species Observations by Month (Alternative Grouping)
Minimum Q1 Mean Median Q3 Maximum SD Nr
May
Barbastelle 1 1.00 1.75 1.5 2.25 3 0.96 4
Greater horseshoe 2 2.50 3.00 3.0 3.50 4 1.00 3
Lesser horseshoe 1 1.00 2.50 1.5 3.00 6 2.38 4
June
Barbastelle 1 1.50 18.00 18.0 31.50 42 15.67 11
Greater horseshoe 1 1.50 3.13 3.0 4.00 7 1.81 15
Lesser horseshoe 1 1.00 1.38 1.0 2.00 2 0.52 8
July
Barbastelle 4 5.75 8.17 8.0 10.25 13 3.43 6
Greater horseshoe 1 1.00 1.75 1.5 2.25 3 0.96 4
Lesser horseshoe 1 1.50 2.00 2.0 2.50 3 1.41 2
August
Barbastelle 1 3.00 7.27 6.0 8.00 22 6.13 15
Greater horseshoe 1 1.00 2.07 2.0 3.00 6 1.39 15
Lesser horseshoe 1 1.00 2.30 1.5 3.00 6 1.70 10
September
Barbastelle 1 1.00 1.80 1.0 2.00 4 1.30 5
Greater horseshoe 3 3.25 10.67 4.0 6.25 43 15.91 6
Lesser horseshoe 1 1.50 2.43 3.0 3.00 4 1.13 7
October
Barbastelle 4 5.00 6.71 6.0 7.00 13 2.98 7
Greater horseshoe 1 1.00 1.00 1.0 1.00 1 0.00 2
Lesser horseshoe 1 1.00 2.67 1.0 3.50 6 2.89 3

  1. Socrates in Plato’s Republic (circa 380 BC)↩︎

  2. note:- in this case it is a count of bat passes↩︎

  3. https://ec.europa.eu/environment/nature/conservation/species/habitats_dir_en.htm↩︎

  4. https://ec.europa.eu/environment/nature/conservation/species/habitats_dir_en.htm↩︎