Categories
Araba Bioscan

Araba Bioscan SLAM 24 February to 3 March 2023

Categories
Araba Bioscan

Araba Bioscan SLAM 17-24 February 2023

Categories
Autonomous Moth Trap

Autonomous Moth Trap Image Pipeline

This is the seventh post in a series.

This post summarises the image/data pipeline currently in place for the image capture and processing from two Autonomous Moth Traps:

  • AMT-2: Raspberry Pi 4.0 unit with Logitech BRIO camera collecting images via motion detection, as in the Danish design:
    • Positioned on ground in fixed location for 250 nights between 22 May 2021 and 6 May 2022, collecting 967,511 images
    • Later mounted on a wall, so far for 417 nights between 7 August 2022 and 28 September 2023 (ongoing), collecting 1,124,600 images to date
  • AMT-Alpha: Raspberry Pi Zero unit with Raspberry Pi HQ camera and 6 mm lens collecting images via timelapse:
    • Positioned on ground at various locations for a total of 122 nights between 20 November 2021 and 28 September 2023, collecting 82,833 images

Videos made from the images collected by each trap can be seen at https://vimeo.com/user157042939.

Together these traps have collected 2,174,944 images. The pipeline discussed below has identified and extracted 20,209,207 features of interest (“blobs”). This post documents the current processing, but I am reviewing all steps and expect to begin applying machine learning tools shortly.

Python software and associated YAML configuration files can be accessed from the dhobern / AMT repository in GitHub. The motion detection software is from the Motion-Project / motion repository.

AMT-2 image capture

AMT-2 is triggered daily using the following crontab settings (shown for 29 September 2023):

# m h  dom mon dow   command
# Light Trap Moths
4 19 * * * python /home/pi/lighton.py
5 19 * * * motion
6 19 * * * /home/pi/setCamera.sh
5 3 * * * pkill motion
6 3 * * * python /home/pi/lightoff.py
7 3 * * * /home/pi/backup.sh
0 12 * * * python3 /home/pi/amt_crontab.py sunset+60 sunrise-120 480

These settings indicate that the trap will next execute the following series of actions:

  • 19:04 – turn on lights (high-power LEDs and ring-light)
  • 19:05 – begin motion detection
  • 19:06 – apply camera parameters once the camera is started
  • 03:05 – end motion detection (480 minutes after start)
  • 03:06 – turn off lights
  • 03:07 – copy images, configuration settings (YAML) and crontab and camera settings to staging folder for SFTP access
  • 12:00 – reset crontab times so motion detection on the next date will start an hour after sunset and end two hours before sunrise or after 480 minutes, whichever is earlier
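The rule applied by the 12:00 job can be sketched as follows. This is a minimal illustration rather than the code in amt_crontab.py: the function name capture_window and its parameters are hypothetical, and the sunset and sunrise times are passed in by the caller instead of being calculated. The example values are the sunset and sunrise times recorded in the AMT-Alpha configuration file shown later in this post.

```python
from datetime import datetime, timedelta


def capture_window(sunset, sunrise, start_offset=60, end_offset=-120, max_minutes=480):
    """Return (start, end) for one night of motion detection.

    Hypothetical helper: start is sunset + start_offset minutes; end is the
    earlier of sunrise + end_offset minutes and start + max_minutes.
    """
    start = sunset + timedelta(minutes=start_offset)
    latest_end = sunrise + timedelta(minutes=end_offset)
    capped_end = start + timedelta(minutes=max_minutes)
    return start, min(latest_end, capped_end)


# Sunset and sunrise as recorded for the night of 28/29 September 2023
sunset = datetime(2023, 9, 28, 18, 4)
sunrise = datetime(2023, 9, 29, 5, 46)
start, end = capture_window(sunset, sunrise)
print(start.strftime("%H:%M"), end.strftime("%H:%M"))
```

With these inputs the 480-minute cap is reached before the two-hours-before-sunrise limit, so the cap decides the end time.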

The configuration settings for this trap are currently as follows:

event:
  basisofrecord: Machine observation
  coordinateUncertaintyInMeters: 2
  decimalLatitude: -35.264047
  decimalLongitude: 149.083427
  geodeticDatum: WGS84
  recordedby:
    email: dhobern@gmail.com
    name: Donald Hobern
    orcid: 0000-0001-6492-4016
provenance:
  capture:
    camera: Logitech BRIO 4K
    illumination: 10-inch ring light
    imageheight: 2160
    imagewidth: 3840
    mode: Motion
    operatingdistance: 250
    processor: Raspberry Pi 4
    unitname: AMT-2
    uvlight: High-power LED tube - 6 UV, 1 green, 1 blue, 1 white

The camera settings are the output from:

v4l2-ctl -d /dev/video0 --list-ctrls

The folder of images and metadata is automatically transferred over SFTP at 08:30 each day onto a (Windows 11) desktop machine for subsequent processing.

AMT-Alpha image capture

On startup, AMT-Alpha launches a Python script amt_modeselector.py. This awaits triggering via a push button on the outside of the unit. When this button is pushed, the script selects an action based on the position of a rotary switch which may be in one of four states:

  • Automatic – script does nothing, assuming that a cron job is scheduled to start amt_timelapse.py at a specified time. This mode is for unattended use.
  • Manual – script immediately launches amt_timelapse.py.
  • Transfer – script runs amt_transfer.py. If a USB drive has been inserted in the external USB port, this then reads configuration options from /media/usb/AMT/amt_transfer.yaml and may transfer images, configuration files and logs onto the USB drive and new configuration files or updated software onto the device.
  • Off – script triggers a soft shutdown of the device.

Regardless of whether image capture is triggered manually or via crontab, amt_transfer.py reads configuration settings specified in amt_settings.yaml (which overrides default values set for the unit in amt_unit.yaml and underlying default values for the software specified in amt_defaults.yaml). If a GPS sensor is attached, the unit inserts coordinates into the configuration metadata via a temporary YAML file amt_location.yaml. The complete final configuration is stored as an output file along with the images captured.
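The layering of configuration files can be sketched as a recursive dictionary merge in which later files override earlier ones. This is only an illustration under assumptions: merge_settings is a hypothetical name, the AMT software may merge files differently, and the dictionaries below stand in for parsed YAML files with illustrative values (the default interval of 60 is invented for the example).

```python
def merge_settings(*layers):
    """Merge nested dictionaries; values in later layers override earlier ones."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are mappings: merge them key by key
                merged[key] = merge_settings(merged[key], value)
            else:
                merged[key] = value
    return merged


# Stand-ins for amt_defaults.yaml, amt_unit.yaml, amt_settings.yaml, amt_location.yaml
defaults = {"provenance": {"capture": {"interval": 60, "quality": 50}}}
unit = {"provenance": {"capture": {"unitname": "AMT-alpha"}}}
settings = {"provenance": {"capture": {"interval": 20}}}
location = {"event": {"decimalLatitude": -35.264043, "decimalLongitude": 149.08358}}

config = merge_settings(defaults, unit, settings, location)
print(config["provenance"]["capture"])
```

The order of the arguments mirrors the order in which the files are listed under _configurationfiles in the output metadata.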

The following configuration file is from a run of AMT-Alpha on 28 September 2023. This included a 120-second delay before collecting images at 20-second intervals:

_configurationfiles:
- /home/pi/amt_defaults.yaml
- /home/pi/amt_unit.yaml
- /home/pi/amt_settings.yaml
- /home/pi/amt_location.yaml
event:
  basisofrecord: Machine observation
  coordinateTimestamp: '2023-09-28T19:03:11.621919+10:00'
  coordinateUncertaintyInMeters: 1
  decimalLatitude: -35.264043
  decimalLongitude: 149.08358
  geodeticDatum: WGS84
  lunarPhase: Full Moon
  recordedby:
    email: dhobern@gmail.com
    name: Donald Hobern
    orcid: 0000-0001-6492-4016
  sunriseTime: '2023-09-29T05:46:00+10:00'
  sunsetTime: '2023-09-28T18:04:00+10:00'
provenance:
  capture:
    awb_gains:
    - 2.8
    - 1.6
    awb_mode: 'off'
    brightness: 60
    camera: Raspberry Pi HQ + 6mm Wide Angle Lens
    contrast: 35
    envsensor: DHT22
    folder: /home/pi/AMT/
    gpioenvdata: 9
    gpioenvpower: 10
    gpiogpspower: 24
    gpiogreen: 25
    gpiolights: 26
    gpiomanualmode: 22
    gpiomodetrigger: 16
    gpiored: 7
    gpioshutdownmode: 17
    gpiotransfermode: 27
    gpssensor: BN220
    illumination: 10-inch ring light
    imageheight: 3040
    imagewidth: 4056
    initialdelay: 120
    interval: 20
    maximages: 720
    meter_mode: matrix
    mode: TimeLapse
    operatingdistance: 265
    processor: Raspberry Pi Zero W
    program: /home/pi/amt_modeselector.py
    quality: 50
    saturation: 0
    sharpness: 70
    transferimages: true
    trigger: Manual
    unitname: AMT-alpha
    uvlight: High-power LED tube - 4 UV, 1 green, 1 blue
    version: 0.9.2

Images and configuration files may be transferred for processing via a USB drive or SFTP.

Segmenting images

Images from both traps have been processed using SegmentImages.py, initially based on the published Danish code. This uses OpenCV to detect objects of interest (“blobs”) and then applies a cost calculation to determine which blobs are likely to represent the same insect in consecutive images.

The cost calculation is based on costs in five dimensions (calculated in amt_tracker.py).

  • Size – 0 if the two blobs have the same number of pixels, 1 if one blob is at least four times the size of the other, with linear interpolation for intermediate values
  • Distance – 0 if the centroids of the two blobs are within 25 pixels of one another, 0.01 if the two blobs overlap or their centroids are within 100 pixels, 0.02 if they are within 250 pixels, and in all other cases the distance divided by 4405 (as the maximum distance possible on the screen)
  • Colour – crude comparison of similarity of colours in blobs. The pixels in each blob are assigned to one of eight cells in RGB colourspace (intensity less than or greater than 128 for each of the RGB components) identified as K for “black”, R for “red”, G for “green”, B for “blue”, C for “cyan”, M for “magenta”, Y for “yellow” and W for “white”. Blobs are then assigned a colour string including the letters for all cells including at least 2% of the pixels in the blob. A cost of 1/8 is then assigned for each colour letter associated with one blob and not the other.
  • Direction – 0 if this is interpreted to be the first or second detection of an insect, otherwise 0 if the position is exactly aligned with the direction between the last two detections, 1 if the position is in exactly the reverse direction, with linear interpolation for intermediate angles.
  • Age – allowing for insects disappearing and reappearing within five consecutive images. 0 if the blobs are in consecutive images, 1 if last seen five images previously, with linear interpolation for intermediate ages.
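The five costs listed above can be sketched as small Python functions. This is an illustration and not the code in amt_tracker.py: all function names are hypothetical, the size cost is assumed to interpolate linearly on the ratio of pixel counts, the distance thresholds are assumed to be inclusive, and an intensity of exactly 128 is assumed to count as high. The direction function covers only the case where an insect already has two earlier detections.

```python
import math
from collections import Counter

# Letter for each RGB cell; index = red bit + 2 * green bit + 4 * blue bit
CELL_LETTERS = "KRGYBMCW"


def size_cost(pixels_a, pixels_b):
    """0 for equal pixel counts, rising linearly to 1 at a ratio of 4 or more."""
    ratio = max(pixels_a, pixels_b) / min(pixels_a, pixels_b)
    return min(1.0, (ratio - 1.0) / 3.0)


def distance_cost(centroid_a, centroid_b, overlap=False, max_distance=4405.0):
    """Stepped cost for nearby centroids, otherwise distance / max_distance."""
    d = math.hypot(centroid_a[0] - centroid_b[0], centroid_a[1] - centroid_b[1])
    if d <= 25:
        return 0.0
    if overlap or d <= 100:
        return 0.01
    if d <= 250:
        return 0.02
    return d / max_distance


def colour_string(pixels, min_share=0.02):
    """Letters for every RGB cell holding at least min_share of the pixels."""
    counts = Counter(
        CELL_LETTERS[(r >= 128) + 2 * (g >= 128) + 4 * (b >= 128)]
        for r, g, b in pixels
    )
    return "".join(c for c in CELL_LETTERS if counts[c] / len(pixels) >= min_share)


def colour_cost(colours_a, colours_b):
    """1/8 for each letter present in one colour string but not the other."""
    return len(set(colours_a) ^ set(colours_b)) / 8


def direction_cost(previous, last, candidate):
    """0 if candidate continues the previous -> last heading, 1 if it reverses it."""
    heading = math.atan2(last[1] - previous[1], last[0] - previous[0])
    bearing = math.atan2(candidate[1] - last[1], candidate[0] - last[0])
    turn = abs(heading - bearing)
    if turn > math.pi:
        turn = 2 * math.pi - turn
    return turn / math.pi


def age_cost(images_since_seen):
    """0 for consecutive images, 1 when last seen five images earlier."""
    return min(1.0, max(0.0, (images_since_seen - 1) / 4))
```

For example, a blob of 250 pixels compared with one of 100 pixels has a ratio of 2.5 and so a size cost of 0.5 under the assumed interpolation.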

These five costs are then assigned weights based on a subjective (slightly tested) assessment of their relative importance. The weights applied have generally been 4 for Size and for Distance, 2 for Direction and Colour and 1 for Age. This means that the weighted cost for assuming two blobs are related is a distance in a hypercube with sides measuring 4, 4, 2, 2 and 1 units, i.e. with a hypotenuse length or maximum weight of sqrt(41). These are then normalised to the range 0 to 1. Only weighted costs below 0.25 are considered plausible redetections.
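One reading of this weighting is a Euclidean distance over the weighted costs, divided by sqrt(41). The sketch below follows that reading; it is an interpretation of the description, not a copy of amt_tracker.py, and the names WEIGHTS, weighted_cost and is_plausible are hypothetical.

```python
import math

# Weights as described: 4 for size and distance, 2 for direction and colour, 1 for age
WEIGHTS = {"size": 4, "distance": 4, "direction": 2, "colour": 2, "age": 1}
MAX_WEIGHT = math.sqrt(sum(w * w for w in WEIGHTS.values()))  # sqrt(41)
THRESHOLD = 0.25


def weighted_cost(costs):
    """Normalised Euclidean length of the weighted cost vector (0 to 1)."""
    total = math.sqrt(sum((WEIGHTS[name] * costs[name]) ** 2 for name in WEIGHTS))
    return total / MAX_WEIGHT


def is_plausible(costs):
    """True if the weighted cost is below the redetection threshold."""
    return weighted_cost(costs) < THRESHOLD


example = {"size": 0.25, "distance": 0.01, "direction": 0.0, "colour": 0.125, "age": 0.0}
print(round(weighted_cost(example), 3), is_plausible(example))
```

Because Size and Distance carry the largest weights, a size cost of 0.5 on its own (weighted cost 2 / sqrt(41), about 0.31) is already enough to rule out a match under this reading.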

Blobs are then assigned to “tracks” (series of locations of the same presumed insect over multiple images) based on an effort to minimise total cost.
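A simple way to picture this assignment step is a greedy match that repeatedly takes the cheapest remaining track/blob pair. Note that this is a swapped-in simplification for illustration: greedy matching does not guarantee the minimum total cost, and the actual code may use an optimal assignment method instead. The function name assign_blobs and the data layout are hypothetical.

```python
def assign_blobs(costs, blob_ids, threshold=0.25):
    """Greedily match blobs to existing tracks, cheapest pairs first.

    costs maps (track_id, blob_id) to a normalised weighted cost.
    Returns (assignments, new_blobs): a dict of blob_id -> track_id and a
    list of blobs with no plausible track, which would start new tracks.
    """
    assignments = {}
    used_tracks = set()
    for (track_id, blob_id), cost in sorted(costs.items(), key=lambda item: item[1]):
        if cost >= threshold:
            break  # every remaining pair is implausible
        if track_id in used_tracks or blob_id in assignments:
            continue
        assignments[blob_id] = track_id
        used_tracks.add(track_id)
    new_blobs = [blob_id for blob_id in blob_ids if blob_id not in assignments]
    return assignments, new_blobs


costs = {(1, "a"): 0.05, (1, "b"): 0.10, (2, "a"): 0.08, (2, "b"): 0.30}
assignments, new_blobs = assign_blobs(costs, ["a", "b"])
print(assignments, new_blobs)
```

In this example blob "a" joins track 1, while blob "b" has no free track below the threshold and would start a new track.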

The Python code creates a data subfolder containing:

  • amt_image.csv – CSV list of all images captured by the unit, including date and time and associated temperature and humidity if these were collected.
  • amt_blob.csv – CSV list of all blobs, including source image, bounding box, size, cost calculations and other variables. Each record also includes a track identifier and a changed flag indicating whether the blob was new or altered compared to earlier images. A sample is included as the image at the top of this post.
  • blobs – a folder containing segmented JPEG images for all blobs with the changed flag set to True.

This process successfully links many blobs into tracks but is also prone to merge or confuse tracks when insects are very active. The weightings are arbitrary, and tuning the weights might improve the process. Tracks are only a convenience to simplify later stages in the process.

Editing tracks

Another Python program, TrackEditor.py, is used to edit the tracks and associate them with species (or higher taxon identifications). This is a crude Tkinter application that loads data from amt_blob.csv along with the associated blob images and presents these for review and identification. Results are written into amt_track.csv. This lists the tracks and associates them with the name for the associated taxon. The editor allows tracks to be split and merged, so it also rewrites amt_blob.csv with revised track identifiers for the blobs.

The following image shows the TrackEditor window for some of the insects recorded by AMT-2 on the night of 28/29 September 2023. Clearly, several insects have been combined into a single track with id 187 (the 103 in parentheses gives the mean length of the sides of the associated images). Similarly, tracks 220 and 225 are for the same insect and can be joined.

The available operations are:

  • Clicking on the first image in a track joins the track to the previous track.
  • Clicking on any other image splits the track into two tracks, with the second track beginning with the clicked image.
  • Clicking the link icons (to the right of the track identifiers) on any two tracks merges them into a single track.
  • A scientific name can be entered into the text field for each track – a taxon dictionary supports autocompletion.
  • The three-letter codes assign common higher taxon names to the track (Insecta, Coleoptera, Diptera, Hymenoptera, Lepidoptera, Trichoptera, Hemiptera, Tortricidae, Oecophoridae, Formicidae and Araneae).
  • The first of the three icons opens a larger image view for the first image in the track with buttons to step through the track.
  • The second icon deletes the track.
  • The final icon opens a dialog allowing one or more images from the track to be selected and submitted as a new observation via the iNaturalist API.
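The split and merge operations amount to simple list manipulation on the blobs in each track. The sketch below shows the idea with a hypothetical in-memory structure (a dictionary from track identifier to an ordered list of blob identifiers); it is not taken from TrackEditor.py, and the blob identifiers are invented.

```python
def split_track(tracks, track_id, index, new_id):
    """Move the blobs from position index onward into a new track."""
    blobs = tracks[track_id]
    tracks[track_id] = blobs[:index]
    tracks[new_id] = blobs[index:]


def merge_tracks(tracks, keep_id, other_id):
    """Append the blobs of other_id to keep_id and drop other_id."""
    tracks[keep_id] = tracks[keep_id] + tracks.pop(other_id)


# Hypothetical blob identifiers for three tracks
tracks = {187: ["b4", "b5", "b6"], 220: ["b1", "b2"], 225: ["b3"]}
merge_tracks(tracks, 220, 225)       # join track 225 onto track 220
split_track(tracks, 187, 1, 226)     # new track starts at the second image
print(tracks)
```

After either operation, the revised track identifier for each blob is what would need to be written back to amt_blob.csv.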

The following image shows the result of clicking on the first image of track 225 to merge it with track 220 and the larger image view for track 194.

The following image shows the result of splitting and organising track 187 and of adding two species identifications.

To date, this editor has been used to label 15,719 tracks containing 369,319 segmented images for approximately 350 taxa. Many images are series with very little inter-frame variation. Many taxa are larger groupings such as Diptera or Larentiinae.

Next steps

Labeling tracks (and hence blobs) with identifications is time-consuming but should allow a rich training set to be prepared with images representing a large proportion of the local fauna.

Sufficient images may already have been tagged to support at least training a model to group insects into broad categories and discard images that do not clearly represent individual insects. The outputs from such a process could then speed preparation of species level training sets.

Categories
Araba Bioscan

Araba Bioscan SLAM 10-17 February 2023

Categories
Biodiversity Informatics Lepidoptera Species Lists

Updating Global Lepidoptera Index for Psychidae

This is a small update to the recent post on updating Global Lepidoptera Index (GLI) for Elachistinae species. I have subsequently reworked GLI for Psychidae, based primarily on:

  • Sobczyk, T. (2013) World Catalogue of Insects Volume 10, Psychidae (Lepidoptera). 1–467 pp.
  • Arnscheid, W.R. & Weidlich, M. (2017) Microlepidoptera of Europe Volume 8, Psychidae. 1–356 pp.
  • Papers known to Google Scholar relating to Psychidae and published since 2012 (many from Zootaxa, smaller numbers from Entomofauna, SHILAP, DEZ, etc.).

Names for Australian Psychidae in GLI were already largely up to date owing to earlier efforts to align with Nielsen, E.S., Edwards, E.D. & Rangsi, T.V. (1996) Checklist of the Lepidoptera of Australia (Monographs on Australian Lepidoptera Volume 4).

However, the coverage for the rest of the family reflected the original digitisation of the NHM card index. The card index itself seems to have been maintained less thoroughly than for many other families. Names for Psychidae in LepIndex reflect very dated concepts for genera and species synonymy.

The recent sources for this family vary to some degree in assignment of genera to subfamilies and tribes and in use of subgenera. GLI now follows Sobczyk 2013 in these respects, but with overrides for European species from Arnscheid & Weidlich 2017.

Following all updates, the number of species known within the family has risen from 1,118 to 1,454. However, the total number of species names (including both accepted names and all synonyms) has more than doubled relative to LepIndex. Much of this is because of changes in generic placement and synonymy, although significant numbers of species and names even from as early as the 1970s were missing from the card index.

Overlap in species names within the family Psychidae between LepIndex and GLI. Names are considered to be a full match if spelling and authorship are identical (including parentheses) and if the two datasets give the same accepted name for the associated species.

Of the 2,938 species names now included in GLI, only 418 exactly match a name in LepIndex and also map to the same accepted species name in both datasets. The vast majority of accepted psychid names in LepIndex are no longer considered correct.

Even with many historical names now synonymised, updating Psychidae in GLI resulted in a 30% growth in the number of accepted species recorded for the family. This is in line with the estimates in the earlier Elachistinae post that between 27% and 41% of all accepted Lepidoptera species are missing from LepIndex and that around 40,000 more species still need to be added to the dataset.

Categories
Biodiversity Informatics Lepidoptera Species Lists

Updating Global Lepidoptera Index for Elachistinae

Background

Until 2022, Catalogue of Life (COL) and GBIF still relied on the NHM LepIndex dataset for names for almost all Lepidoptera (butterflies and moths). This is now superseded by a revised version of LepIndex maintained in TaxonWorks as the Global Lepidoptera Index (GLI). See this earlier post for more detail.

Methods

The concept used in LepIndex for the gelechioid family Elachistidae corresponded to what we now treat as a subfamily Elachistinae. At the time of its last import into COL, LepIndex had 491 scientific names associated with this (sub-)family, organised as follows:

  • Family – 1 accepted
  • Genus – 35 accepted
  • Species – 410 accepted, 1 provisionally accepted, 40 synonyms, 2 ambiguous synonyms
  • Subspecies – 2 accepted

In 2019, Lauri Kaila published An annotated catalogue of Elachistinae of the World (Lepidoptera: Gelechioidea: Elachistidae) in Zootaxa. I had already brought GLI up to date for the Australian Elachistinae treated in his 2011 Monographs of Australian Lepidoptera volume, so I decided to take the time also to update the remainder of this subfamily and to include all post-2019 species I could find. This is now completed, and GLI now includes 1284 names for the group. This total comprises names in Kaila 2019, those from newer papers, fossil names from LepIndex and a few nomina dubia that were not in the catalogue but seem plausibly to refer to elachistine moths. I was not rigorous about adding every historical combination for epithets that have passed through multiple genera, but original combinations and current combinations should all be present, as should original combinations for all synonyms. I did not update the micro-references that were already in place for older names, but the newer names link to structured citations.

Totals are now as follows:

  • Subfamily – 1 accepted
  • Genus – 14 accepted, 50 synonyms
  • Species – 819 accepted, 392 synonyms
  • Infraspecific taxa – 1 accepted, 7 synonyms

About five genera and around a dozen other species that were under Elachistidae in LepIndex previously have been moved to other families in the Lepidoptera. Many of these cases are discussed by Kaila, although a few represent highly outdated placements in the NHM catalogue that were apparently not even considered worth discussing. Many small genera have been synonymised into Elachista, Perittia or Stephensia. Four fossil genera are not treated by Kaila but are retained from LepIndex.

I fixed multiple misspellings that occurred in LepIndex either because information on the index cards was incorrect or during transcription into digital format. Despite the scale of the publication, I found no obvious misspellings in Kaila 2019.

Results

Based on these raw numbers, it is clear that LepIndex lacked around 50% of the currently expected number of accepted species for the family and that many synonyms were also missing. The actual situation was even more serious than this appears, because many names that were accepted by LepIndex are now considered synonyms, and vice versa.

Here is a summary of results from the largest genus, Elachista. LepIndex had 355 names associated with 327 accepted species in this genus, whereas GLI has 1,046 names for 716 accepted species.

Overlap in species names within the genus Elachista between LepIndex and GLI. Names are considered to be a full match if spelling and authorship are identical (including parentheses) and if the two datasets give the same accepted name for the associated species.

Just 183 (56% of 327) accepted names in LepIndex exactly matched the spelling, authorship and status, and only 9 (32% of 28) synonyms exactly matched the spelling, authorship, status and accepted name offered by GLI. If variation in authorship (mostly missing years and/or parentheses) is ignored, these totals rise to 200 accepted names and 12 synonyms that match the expected species.
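The matching rule used in this comparison can be sketched as a small classifier. The record structure and function names are hypothetical, the example names are placeholders rather than real taxa, and the relaxed comparison is assumed to ignore only years, commas and parentheses in the authorship string; the actual comparison may have been more or less tolerant.

```python
import re
from collections import namedtuple

# Hypothetical record: scientific name, authorship string, accepted name it maps to
NameRecord = namedtuple("NameRecord", "name authorship accepted")


def strip_authorship(authorship):
    """Drop parentheses, commas and digits, leaving only the author names."""
    return re.sub(r"[(),\d]", "", authorship).strip()


def match_level(a, b):
    """Classify two records as 'full', 'authorship-variant' or 'none'."""
    if a.name != b.name or a.accepted != b.accepted:
        return "none"
    if a.authorship == b.authorship:
        return "full"
    if strip_authorship(a.authorship) == strip_authorship(b.authorship):
        return "authorship-variant"
    return "none"


# Placeholder names for illustration only
old = NameRecord("Aus bus", "Smith", "Aus bus")
new = NameRecord("Aus bus", "(Smith, 1900)", "Aus bus")
print(match_level(old, new))
```

A record pair classified as "authorship-variant" here would count towards the relaxed totals but not towards the exact matches.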

81 (25%) of the names accepted for Elachista species by LepIndex are now considered synonyms for other species in the genus. 36 accepted names (11%) now refer to species outside this genus.

6 (21%) of the LepIndex synonyms in this genus are now treated as synonyms for different species.

In other words, of the 365 names that LepIndex associated with species in the genus Elachista, even ignoring issues with authorship strings, just 212 (58%) directed users to the currently accepted name for a species.

Reviewing this now from the perspective of what the taxonomic community knows and what names are actually in circulation for species in the genus Elachista (again ignoring issues with authorship):

  • Nearly 70% (507 of 716) of the currently accepted species names in Elachista were unknown to LepIndex/COL/GBIF a year ago
  • 78% (815 of 1,045) of the names now in TaxonWorks for Elachista species were unknown or incorrectly handled a year ago

Discussion

Elachistinae forms perhaps 0.3-0.4% of the total described Lepidoptera fauna, so these corrections are only a small step towards delivering a comprehensive and reliable catalogue for world Lepidoptera. This subfamily now joins Nepticuloidea, Gracillariidae, Gelechiidae, Lecithoceridae, Alucitidae, Pterophoridae, and Tortricidae as groups that are in good condition in the COL Checklist. Preparations are well under way to bring in some other major family-rank datasets that have been prepared over many years by dedicated groups of taxonomists. Both Geometridae and Bombycoidea are likely to be replaced in the next few months.

The rest of the Lepidoptera is covered by aging datasets. The Global Butterfly Information System dataset (GloBIS/GART) may soon be updated. This covers the Pieridae and Papilionidae. I am working on a refresh for Gaden S. Robinson’s Tineidae dataset which was last updated in 2011. Even the Nepticuloidea (last updated in 2016) is urgently awaiting a planned update. All the rest comes from LepIndex.

The following table compares accepted species counts for the same taxa in different datasets. This is a crude metric – if large numbers of names that should be treated as synonyms are included as accepted species names, this may inflate numbers. However, these numbers show clearly that effort to clean up LepIndex data always leads to significant increases in record counts.

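| Taxon | LepIndex | GLI | Revised | Year | ↑% |
|---|---|---|---|---|---|
| Nepticuloidea | NA | 985 | 1073 | 2016 | 9 |
| Gracillariidae | NA | 1746 | 2013 | 2022 | 15 |
| Elachistinae | 410 | 819 | 819 | 2023 | 100 |
| Gelechiidae | 4639 | 4766 | 5799 | 2023 | 25 |
| Lecithoceridae | 780 | 1518 | 1518 | 2023 | 95 |
| Alucitoidea | 186 | 246 | 260 | 2023 | 40 |
| Pterophoroidea | NA | 1057 | 1574 | 2023 | 49 |
| Tortricidae | 8697 | 9485 | 11360 | 2018 | 31 |
| Geometridae | 21260 | 22497 | 23969 | 2022 | 13 |
| Bombycoidea | 3463 | 5115 | 6617 | 2022 | 91 |
| Total | [43213] | 48234 | 55002 | | 27 |
| Total excl. Geometridae | [21953] | 25537 | 31033 | | 41 |
| Other Lepidoptera | 99690 | 108627 | [126606] / [140563] | | 27 / 41 |
| All Lepidoptera | [142903] | 156861 | [181608] / [195565] | | 27 / 41 |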
Comparison of accepted species counts for different Lepidoptera taxa between a) the last version of LepIndex imported into COL, b) Global Lepidoptera Index as of 2023-01-18, and c) versions curated in the last few years (year listed indicates date considered current) and considered nearly complete. Growth is shown as a percentage increase in the number of records since the older of LepIndex or GLI.

The COL version of LepIndex is missing names for taxa that had been sourced from other datasets prior to 2019. The total count provided for LepIndex uses GLI counts for these taxa – the total is therefore an overestimate, but the mean growth across these groups is at least 27%. Applying the same rate across all other Lepidoptera groups gives an estimate for the order of 181,608 accepted described species. There is reason to consider Geometridae an outlier since significant NHM work on the family preceded the 2011 version of LepIndex. Excluding Geometridae from the calculation raises the estimated percentage growth to 41%, giving an estimated species count of 195,565.
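The two estimates can be reproduced with a few lines of arithmetic. The sketch below is a reconstruction: the function name is hypothetical, and it assumes that the growth percentage is rounded to a whole number before being applied to the other groups, which is the reading that gives the figures quoted in this post.

```python
def project_total(revised_sample, baseline_sample, other_baseline, curated_total):
    """Return (growth %, projected species in other groups, projected grand total)."""
    growth_pct = round((revised_sample / baseline_sample - 1) * 100)
    other_projected = round(other_baseline * (100 + growth_pct) / 100)
    return growth_pct, other_projected, curated_total + other_projected


# All curated groups: 55,002 revised against a baseline of 43,213
all_groups = project_total(55002, 43213, 99690, 55002)

# Growth rate measured without Geometridae: 31,033 revised against 21,953
excl_geometridae = project_total(31033, 21953, 99690, 55002)

print(all_groups)
print(excl_geometridae)
```

In both cases the projected count for the other groups is added to the 55,002 species in the curated groups, including Geometridae, to give the estimate for the order.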

Revised versions are as follows: Nepticulidae and Opostegidae of the World (Oct 2016), Global Taxonomic Database of Gracillariidae (Jan 2022), GLI Elachistinae (Mar 2023), Catalogue of World Gelechiidae (Feb 2023), GLI Lecithoceridae (Mar 2023), Catalogue of the Alucitoidea of the World (Nov 2022), Catalogue of the Pterophoroidea of the World (Jan 2023), World Catalogue of the Tortricidae (Tortricid.net, Dec 2018), Geometridae (pending update, 2022), Bombycoidea (pending update, 2022). The last two datasets will be added to COL once associated taxonomic catalogues have been published.

The table shows two calculated estimates for the current total number of described Lepidoptera species. I consider it highly likely that most remaining groups will expand at least 41% as gaps in LepIndex are addressed. Given the large amount of ongoing revisionary work in the Noctuoidea (42,941 species in GLI today), it seems reasonable that this popular group may have gaps as significant as those shown here for Bombycoidea, which would inflate the numbers much further. At a minimum, Catalogue of Life today is likely to be missing 40,000 described Lepidoptera species.

I would note too that, for Elachistinae, I found that LepIndex lacked many 19th-century European and British names. Some of these are significant omissions, for example names from Haworth, Hübner and Herrich-Schäffer, including the currently accepted name for the widespread species Elachista freyerella (Hübner, 1825) (with hundreds of records in GBIF). Although the NHM card index was maintained into the 1990s, modern publications begin to disappear even from early in the 1980s.

I feel even more than before the need to make the scale of the challenge much more public and for COL to become more proactive in finding and promoting new ways for content to be edited. A traffic-light system for coverage and quality for each taxon would be a big step forward.

Categories
Araba Bioscan

Araba Bioscan SLAM 3-10 February 2023

Categories
Araba Bioscan Lepidoptera

Lepidoptera Barcoding in Araba Bioscan

Background

Araba Bioscan is a project to improve understanding of arthropod (mostly insect) diversity at a single location, a suburban garden in Aranda, ACT close to open natural (dry sclerophyll) areas, including Black Mountain. This is one of the most intensively studied areas for insects in all of Australia, since the Australian National Insect Collection (ANIC) is located beside the Black Mountain reserve.

DNA barcoding

The project collected weekly samples from 23 October 2020 to 29 October 2022 in a Malaise trap. Further samples have been collected since 22 January 2023 using a SLAM trap. The ongoing effort is focused on identifying and documenting some less well-recorded groups, particularly ichneumonid wasps.

Samples from the first year were sent to the Centre for Biodiversity Genomics (CBG) in Guelph, ON, where 8,838 selected insects and other arthropods were extracted and imaged and their DNA barcodes were sequenced. This selection included:

  • 83 spiders (order Araneae)
  • 43 mites (subclass Acari)
  • 428 springtails (class Collembola)
  • 8 millipedes (class Diplopoda)
  • 32 cockroaches/termites (order Blattodea)
  • 274 beetles (order Coleoptera)
  • 4035 flies (order Diptera)
  • 924 true bugs (order Hemiptera)
  • 1573 wasps/bees/ants (order Hymenoptera)
  • 1138 moths/butterflies (order Lepidoptera)
  • 1 mantis (order Mantodea)
  • 64 lacewings (order Neuroptera)
  • 6 crickets/grasshoppers (order Orthoptera)
  • 183 barkflies (order Psocodea)
  • 1 twisted-wing insect (order Strepsiptera)
  • 22 thrips (order Thysanoptera)
  • 1 caddisfly (order Trichoptera)
  • 2 unspecified arthropods and 20 unspecified arachnids

Sequences from 7362 of these specimens matched a BIN (Barcode Index Number, associated with a cluster of DNA barcodes), indicating a high probability that these are the same species or a very close relative. Many of the specimens in these BINs have been identified, so the DNA barcodes provide a tool to assign scientific names to species found at the location.

This post looks at the results for the Lepidoptera specimens sequenced by Guelph. I had already spent many years light-trapping, photographing and identifying insects, and particularly moths, at this site. This includes 2,840 observations of moths and butterflies representing 599 distinct species on iNaturalist (plus many recognisable moths that may be identifiable to genus but have no scientific name).

A team from CBG made several visits to ANIC to sequence reference specimens of most Australian moth species (including many that are unnamed). This means that the barcode library for Lepidoptera is much more complete than for any other group of Australian insects and most forms are associated with at least genus-level identifications.

This makes the Lepidoptera an ideal test case for evaluating the information gain associated with a DNA-based insect survey. Malaise traps are not the preferred tool for sampling Lepidoptera and will not collect many large species which may be very conspicuous at light. On the other hand, they are efficient at collecting smaller insects and have no bias against species that may be drab, inconspicuous and hard to identify based on their external appearance.

The Araba Bioscan barcode data can be explored on the BOLD Australia site. This site holds a snapshot of all data in the Barcode of Life Data Systems (BOLD) that relates to Australian specimens. It allows these specimens to be grouped and viewed by scientific names (based on the identifications provided by submitters) or by BIN clusters. This makes it easier to explore how well identifications align with (mitochondrial) genetic variation.

If BOLD Australia is accessed using the link supplied above, the taxa and specimens recorded as part of the Araba Bioscan project (specimens with processIDs beginning “GMAEA”) are highlighted in orange.

Of the 1138 Lepidoptera records, 338 are currently identified only as far as the order, 121 are identified only to family, 70 to subfamily, 283 to genus and 326 to species. These specimens span 39 families, 44 subfamilies and 102 genera. Identifications from BOLD include the following 79 species (listed here with the families assigned in BOLD):

  1. Bedellia somnulentella (Bedelliidae, 35 individuals)
  2. Blastobasis tarda (Blastobasidae, 8 individuals)
  3. Tebenna micalis (Choreutidae, 2 individuals)
  4. Cholotis semnostola (Cosmopterigidae, 2 individuals)
  5. Macrobathra ceraunobola (Cosmopterigidae, 1 individual)
  6. Achyra affinitalis (Crambidae, 1 individual)
  7. Uresiphita ornithopteralis (Crambidae, 4 individuals)
  8. Nacoleia rhoeoalis (Crambidae, 18 individuals)
  9. Eutorna tricasis (Depressariidae, 1 individual)
  10. Elachista velox (Elachistidae, 3 individuals)
  11. Anestia semiochrea (Erebidae, 3 individuals)
  12. Asura lydia (Erebidae, 1 individual)
  13. Threnosia heminephes (Erebidae, 1 individual)
  14. Pantydia sparsa (Erebidae, 1 individual)
  15. Pantydia diemeni (Erebidae, 2 individuals)
  16. Sandava xylistis (Erebidae, 1 individual)
  17. Mesophleps crocina (Gelechiidae, 1 individual)
  18. Orthoptila abruptella (Gelechiidae, 1 individual)
  19. Ardozyga stratifera (Gelechiidae, 1 individual)
  20. Ephysteris subdiminutella (Gelechiidae, 1 individual)
  21. Epiphthora thyellias (Gelechiidae, 5 individuals)
  22. Ectropis excursaria (Geometridae, 1 individual)
  23. Lipogya exprimataria (Geometridae, 7 individuals)
  24. Melanodes anthracitaria (Geometridae, 1 individual)
  25. Mnesampela privata (Geometridae, 1 individual)
  26. Phelotis cognata (Geometridae, 27 individuals)
  27. Psilosticha absorpta (Geometridae, 4 individuals)
  28. Zermizinga sinuata (Geometridae, 1 individual)
  29. Poecilasthena thalassias (Geometridae, 5 individuals)
  30. Dialectica scalariella (Gracillariidae, 6 individuals)
  31. Taractrocera dolon (mistaken identification, actually Ocybadistes walkeri, Hesperiidae, 1 individual)
  32. Genduara punctigera (Lasiocampidae, 3 individuals)
  33. Crocanthes prasinopis (Lecithoceridae, 11 individuals)
  34. Crocanthes micradelpha (Lecithoceridae, 2 individuals)
  35. Proteuxoa hypochalchis (Noctuidae, 6 individuals)
  36. Chrysodeixis subsidens (Noctuidae, 1 individual)
  37. Acanthodela protophaes (Oecophoridae, 1 individual)
  38. Acantholena hiemalis (Oecophoridae, 1 individual)
  39. Aeolothapsa malacella (Oecophoridae, 1 individual)
  40. Delexocha ochrocausta (Oecophoridae, 1 individual)
  41. Eulechria eriphila (Oecophoridae, 1 individual)
  42. Eusemocosma pruinosa (Oecophoridae, 1 individual)
  43. Garrha rubella (Oecophoridae, 1 individual)
  44. Garrha leucerythra (Oecophoridae, 8 individuals)
  45. Guestia uniformis (Oecophoridae, 1 individual)
  46. Heterozyga coppatias (Oecophoridae, 2 individuals)
  47. Hoplostega ochroma (Oecophoridae, 54 individuals)
  48. Leistomorpha brontoscopa (Oecophoridae, 2 individuals)
  49. Merocroca automima (misspelled as “automina”, Oecophoridae, 1 individual)
  50. Olbonoma triptycha (Oecophoridae, 9 individuals)
  51. Oxythecta lygrosema (Oecophoridae, 1 individual)
  52. Pachyceraia ochromochla (Oecophoridae, 5 individuals)
  53. Philobota cretacea (Oecophoridae, 2 individuals)
  54. Philobota stella (Oecophoridae, 1 individual)
  55. Philobota xiphostola (Oecophoridae, 2 individuals)
  56. Pseudotheta syrtica (Oecophoridae, 1 individual)
  57. Tachystola stenoptera (Oecophoridae, 1 individual)
  58. Telanepsia coprobora (Oecophoridae, 1 individual)
  59. Telanepsia tidbinbilla (Oecophoridae, 2 individuals)
  60. Oenosandra boisduvalii (Oenosandridae, 1 individual)
  61. Belenois java (Pieridae, 1 individual)
  62. Plutella australiana (Plutellidae, 2 individuals)
  63. Prays autocasis (Praydidae, 1 individual)
  64. Spectrotrota fimbrialis (Pyralidae, 10 individuals)
  65. Salma pyrastis (Pyralidae, 1 individual)
  66. Heteromicta pachytera (Pyralidae, 2 individuals)
  67. Crocydopora cinigerella (Pyralidae, 1 individual)
  68. Endotricha pyrosalis (Pyralidae, 2 individuals)
  69. Crocidosema plebejana (Tortricidae, 1 individual)
  70. Strepsicrates macropetana (Tortricidae, 1 individual)
  71. Asthenoptycha hemicryptana (Tortricidae, 1 individual)
  72. Cnephasia orthias (Tortricidae, 2 individuals)
  73. Epiphyas ashworthana (Tortricidae, 3 individuals)
  74. Meritastis pyrosemana (Tortricidae, 5 individuals)
  75. Meritastis polygraphana (Tortricidae, 10 individuals)
  76. Merophyas divulsana (Tortricidae, 4 individuals)
  77. Scieropepla serina (Xyloryctidae, 1 individual)
  78. Eumenodora encrypta (Xyloryctidae, 6 individuals)
  79. Zelleria cynetica (Yponomeutidae, 1 individual)

Reviewing this list revealed the issues noted for items 31 and 49, which are being corrected in BOLD. The species known to BOLD as Cnephasia orthias Meyrick, 1910 is not a true Cnephasia but belongs to a group for which a new genus needs to be established. Many Australian sources use the name Rupicolana orthias (Meyrick, 1910), apparently because its placement in an informal “Rupicolana GROUP” was at some point mis-parsed as a genus name.

Of these species, 56 are ones that I had already identified and recorded from light-trapping and other previous activities in the garden. Most of these are common and familiar species. The fact that barcoding detected them and matched them to the correct names reflects the completeness and quality of the reference barcode library for Australian moths.

As expected, most insects collected are smaller moths, with a large proportion of Gelechioidea and especially Oecophorinae (reflecting Australian biodiversity patterns).
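One way to check this kind of pattern against the species list is to tally the entries by family. The following is a minimal sketch, assuming the list is available as plain text lines in the form shown above ("Genus species (Family, N individuals)", optionally with a note before the family). The function names are hypothetical and this is not part of the AMT repository; only a few sample entries are included.

```python
import re
from collections import Counter

# Matches "12. Genus species (..., Family, N individual[s])".
# Anything before the last two comma-separated fields is treated as a note.
ENTRY = re.compile(
    r"^\s*(?:\d+\.\s*)?(?P<name>[A-Z][a-z]+ [a-z]+) \((?P<body>.*)\)\s*$"
)


def parse_entry(line):
    """Return (species name, family, number of individuals) for one list line."""
    match = ENTRY.match(line)
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    parts = [part.strip() for part in match.group("body").split(",")]
    family = parts[-2]
    count = int(parts[-1].split()[0])
    return match.group("name"), family, count


def tally_by_family(lines):
    """Count species and individuals per family."""
    species = Counter()
    individuals = Counter()
    for line in lines:
        _, family, count = parse_entry(line)
        species[family] += 1
        individuals[family] += count
    return species, individuals


# A few entries copied from the list above.
sample = [
    "1. Bedellia somnulentella (Bedelliidae, 35 individuals)",
    "31. Taractrocera dolon (mistaken identification, actually Ocybadistes walkeri, Hesperiidae, 1 individual)",
    "47. Hoplostega ochroma (Oecophoridae, 54 individuals)",
    "50. Olbonoma triptycha (Oecophoridae, 9 individuals)",
]
species, individuals = tally_by_family(sample)
for family in sorted(species):
    print(family, species[family], individuals[family])
```

Taking the family and count from the end of the parenthesis keeps the parser tolerant of the annotated entries (items 31 and 49), which carry an extra note before the family name.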

There are 23 species that I had not previously identified using non-DNA methods. As the following thumbnails show, many of these are small species without conspicuous markings. Most or all of them are familiar to me as insects that have been difficult or impossible to identify accurately. Around a third of them are moths that I can confidently identify to genus but cannot take any further, either because it is unclear how to separate some of the named species (often with only late 19th or early 20th century descriptions) or because it is clear that there is massive undescribed or unmapped diversity in the genus.

New species for the site

Nondescript species

I would characterise 12 of these 23 species as ones that lack conspicuous characters: Cholotis semnostola, Mesophleps crocina, Ephysteris subdiminutella, Epiphthora thyellias, Acantholena hiemalis, Guestia uniformis, Pseudotheta syrtica, Telanepsia coprobora, Telanepsia tidbinbilla, Prays autocasis, Scieropepla serina and Eumenodora encrypta. I would not have expected to identify these insects since doing so would require significant specimen preparation and probably dissection, and even then I would not have been confident that identification would be easy. These are all species that must get massively underreported because they are hard to diagnose by classical methods.

Eumenodora encrypta proved to be reasonably common in the garden (six individuals barcoded, even though this is a species with no records currently on iNaturalist). This species is so cryptic that, prior to a 2013 paper by Lauri Kaila, it was only known from the type specimen and its family placement was uncertain. Kaila referenced multiple specimens from Black Mountain and considered it likely that the species is actually common. One of the specimens sequenced in BOLD and now within this BIN (BOLD:AAM4364) is referenced in Kaila 2013, so the identification is assured.

Blastobasis tarda

Most Blastobasis are rather similar in appearance and I have not attempted to identify those in the garden. The BOLD records from this project include eight assigned to Blastobasis tarda and four belonging to a second BIN (BOLD:AAA9854) that lacks any species identification.

Ardozyga stratifera

I believe I would have identified this moth as Ardozyga catarrhacta, which has a very similar appearance and to which I have assigned five of my iNaturalist records. Based on the markings along the costa of the forewing, I believe those other records are indeed catarrhacta, but I have probably missed individuals that could have been assigned to stratifera.

Crocanthes prasinopis

This is one of another group of rather similar species. I have usually identified Crocanthes individuals with this general appearance as Crocanthes glycina (which was not recorded among the 13 barcoded individuals of this genus from my garden). Criteria for separating these species remain unclear to me, but a review of my past identifications is in order.

Eulechria eriphila, Garrha rubella and Garrha leucerythra

These three moths are very similar in outward appearance. I knew that Eulechria (a very diverse genus) includes some insects with this general appearance, but I would normally assign all of these to an unsorted “Garrha” category. I have 71 iNaturalist observations for moths identified to the genus Garrha, but have not generally progressed far with any but the best-marked species. Because this genus includes so many outwardly similar forms, only 25 of the 71 were identified to species. DNA barcoding has clarified some of this unresolved diversity, but I need to spend more time before I would feel confident identifying these species by other methods.

Oxythecta lygrosema

Oxythecta are common and reasonably conspicuous moths here, but the genus includes several rather similar species, and I am now convinced that I have consistently misidentified Oxythecta lygrosema as Oxythecta acceptella. The markings on acceptella seem to be much more crisply defined than on lygrosema.

Pachyceraia ochromochla

This identification seems uncertain. Two distinct BINs are placed under this name, both associated with ANIC specimens, but with rather different appearances. My specimens fall into the larger BIN (BOLD:ABX0360) which seems to hold very nondescript moths.

Philobota xiphostola

Philobota is a massive and frequently abundant genus which seems to shuffle a suite of repeated characters to create new species. Several well-marked local species remain undescribed. Even where individuals have been identified to species, there is unresolved diversity – for example, one of the barcoded specimens in this project falls into BIN BOLD:AAV4780, which is identified from ANIC specimens as Philobota stella. I have identified individuals as belonging to this species. However, the specimens identified with this species name in BOLD fall into three different BINs. Whether these merit separation is unclear and would require further investigation.

In the same way, specimens in BOLD for Philobota xiphostola fall into two BINs. One of these, BOLD:ACF1457, is currently only known from three specimens in ANIC that were collected much closer to the NSW coast, but the one holding my specimen, BOLD:AAV4778, includes many more, mainly from around the ACT. It is pleasing to be able to align local Philobota having this appearance with other matching records.

Plutella australiana

Diamondback moths are abundant throughout Australia and a major global pest. Almost everywhere in the world, these can easily be assigned to Plutella xylostella, but Landry & Hebert 2013 recognised a second Australian species, Plutella australiana, only diagnosable via genitalia or DNA.

Since 2013, Australian diamondback moths have generally been identified only to the complex of the two species. The two specimens barcoded in this project both fell into the BIN for australiana. It is gratifying to be able to place a species name at least on these records.

Asthenoptycha hemicryptana

Miscellaneous brown and blotched tortricids are one of my blind spots – I find them very difficult to identify and have not generally bothered with any but the most clearly marked species. As an indication of the challenge, one of the specimens in BOLD that has been identified as Asthenoptycha hemicryptana falls into BIN BOLD:AAJ9668. This BIN includes specimens associated with five binomials and four placeholder species names, most of them from ANIC.

My specimen falls into BIN BOLD:AAZ9337, with the identification supported by another ANIC specimen.

Based on this confusion, I still feel cautious about identifying this individual any further than Asthenoptycha.
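The many-to-many relationship between species names and BINs described here, and in the Philobota notes above, can be made explicit with a small cross-tabulation. This is a minimal sketch, assuming records are available as (BIN, species name) pairs; the function names are hypothetical, the BINs are those mentioned in the text, and the placeholder name for BOLD:AAJ9668 is invented purely for illustration.

```python
from collections import defaultdict


def cross_tabulate(records):
    """Build name -> BINs and BIN -> names lookups from (bin, name) pairs."""
    bins_by_name = defaultdict(set)
    names_by_bin = defaultdict(set)
    for bin_uri, name in records:
        bins_by_name[name].add(bin_uri)
        names_by_bin[bin_uri].add(name)
    return bins_by_name, names_by_bin


def conflicts(mapping):
    """Keep only keys that map to more than one value."""
    return {key: sorted(values) for key, values in mapping.items() if len(values) > 1}


records = [
    ("BOLD:AAV4778", "Philobota xiphostola"),
    ("BOLD:ACF1457", "Philobota xiphostola"),
    ("BOLD:AAV4780", "Philobota stella"),
    ("BOLD:AAZ9337", "Asthenoptycha hemicryptana"),
    ("BOLD:AAJ9668", "Asthenoptycha hemicryptana"),
    # Invented stand-in for the other names in this BIN (illustration only).
    ("BOLD:AAJ9668", "hypothetical placeholder sp. 1"),
]

bins_by_name, names_by_bin = cross_tabulate(records)
split_names = conflicts(bins_by_name)  # names spread across several BINs
mixed_bins = conflicts(names_by_bin)   # BINs holding several names

print(split_names)
print(mixed_bins)
```

Names that appear in `split_names` and BINs that appear in `mixed_bins` are the cases that would need review before a barcode match is accepted as a species-level identification.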

Moths without species identifications

This section includes comments on some of the BINs that do not yet resolve to species identifications.

Dryadaula

Ten small moths matched BIN BOLD:AAM9461 and are identified as belonging to the genus Dryadaula. Until recently this genus was included in the Tineidae, but recent work has established a new family, Dryadaulidae. Yang & Li 2021 gives an update on this change and shows some species.

Gracillariidae

BIN BOLD:ABX2205 includes 65 moths. 64 of these are from my garden, with the remaining insect collected by the CBG team a few kilometres away near ANIC. Most of the specimens lack any recognisable features, but the species seems to be a mid- to dark-brown gracillariid with a paler head and thorax and ill-defined pale bands across the distal half of the forewings. I suspect that this is a Caloptilia species. Several can be seen in this weekly sample photo from the same week as some of the barcoded individuals.

Perthida

BIN BOLD:AAY1668 includes two individuals that cluster with other specimens of the leafminer genus Perthida (Incurvariidae).

Conclusion

The Australian DNA barcode library still requires massive curation, even for Lepidoptera, which remains the best sampled taxon.

However, even without selecting specimens strategically to barcode a truly representative sample of moths, these Malaise samples demonstrate how DNA-based surveys can complement light-trapping and citizen science observations.

Categories
Araba Bioscan

Araba Bioscan SLAM 27 January to 3 February 2023

Categories
Araba Bioscan

Araba Bioscan SLAM 22-27 January 2023