Trends and Topics of Statistics Training Courses

Mark Andrews, Eirini Koutoumanou, Matt Castle

Introduction

  • The aim of this work is to analyse what is being taught, to whom, how, etc. in statistics training courses (usually CPD, advanced training for research).
  • Why this might be of general relevance for statistics education:
    • Training courses reveal, and help close, the gap between the skills/knowledge that are taught and those needed in the workplace
    • They reflect evolving or emerging methods and tools
    • They are a principal means of teacher-training/upskilling

Databases

In total, we analysed 16,571 training course descriptions:

  • ALLSTAT mailing list: 6,843 posts from 1998 to 2025
  • NCRM training courses and events database: 8,924 entries from 2003 to 2025
  • Elixir TeSS training course database: 804 entries from 2011 to 2025

All of these descriptions were scraped from the three websites using Python.
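The scraping step is not described in detail, so the following is only a minimal sketch of one part of it: turning a downloaded HTML page into plain text that can later be passed to a language model. It uses only the Python standard library. The class name, function name, and sample page are all hypothetical and invented for illustration; the actual scraper may have worked quite differently.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from an HTML page, skipping script and style content."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a <script> or <style> element
        self._chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside skipped elements and not pure whitespace.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical page fragment, invented for illustration only.
sample_html = """
<html><head><style>p {color: red;}</style></head>
<body><h1>Introduction to Multilevel Models</h1>
<p>A two-day online course using R.</p>
<script>console.log("tracking");</script></body></html>
"""
sample_text = html_to_text(sample_html)
print(sample_text)
```

Reducing each page to plain text before extraction keeps markup and tracking scripts out of the model's input.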

Extracted information

For each training course description, we extracted

  • One-line description of the course
  • One-term summary of the topic
  • Topic keywords
  • Intended audience (e.g. academic/research field) and level
  • Software
  • Duration, delivery method
  • Course provider
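The extracted fields listed above could be held in a simple record type. The sketch below is one possible layout, assuming the model returns one JSON object per course; the field names, the choice of which fields are required, and the sample reply are all hypothetical and are not taken from the original analysis.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class CourseRecord:
    """One extracted course description (field names are illustrative)."""
    description: str                               # one-line description
    topic: str                                     # one-term topic summary
    keywords: list = field(default_factory=list)   # topic keywords
    audience: str = ""                             # intended academic/research field
    level: str = ""                                # intended level
    software: list = field(default_factory=list)
    duration: str = ""
    delivery_method: str = ""
    provider: str = ""


def parse_record(raw_json):
    """Build a CourseRecord from a JSON string, ignoring unknown keys."""
    data = json.loads(raw_json)
    missing = {"description", "topic"} - data.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    known = {f.name for f in fields(CourseRecord)}
    return CourseRecord(**{k: v for k, v in data.items() if k in known})


# Hypothetical model reply, invented for illustration only.
sample_reply = json.dumps({
    "description": "A two-day introduction to multilevel models in R.",
    "topic": "multilevel models",
    "keywords": ["random effects", "hierarchical data"],
    "software": ["R"],
    "delivery_method": "online",
    "confidence": 0.9,   # unknown key, dropped by parse_record
})
record = parse_record(sample_reply)
print(record.topic, record.software, record.delivery_method)
```

Fields that the model cannot find are left at their empty defaults, so incomplete descriptions still produce a usable record.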

Extraction method

  • Information was extracted using a locally running large language model (LLM), Llama 3.3.
  • The LLM ran on a Linux workstation with an RTX A6000 GPU (10,752 CUDA cores and 48 GB of VRAM).
  • The script was written in R using the ellmer package.
  • Approximately 2,500 descriptions were processed per day.
  • LLM API services provided by OpenAI could easily have been used instead (a one-line change in the R code), at a cost of approximately £50.
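The original script was written in R with ellmer; the Python sketch below is not that script. It only illustrates two points from the list above: keeping the model call behind a single function argument, so that switching provider is a small change, and estimating the running time from the reported throughput. The prompt wording, the function names, and the stub backend are hypothetical.

```python
import math
from typing import Callable, Iterable

# Hypothetical prompt wording, not the prompt used in the original analysis.
PROMPT_TEMPLATE = (
    "Extract the topic, audience, software, duration, delivery method and "
    "provider from this course description:\n\n{description}"
)


def extract_all(descriptions: Iterable[str], backend: Callable[[str], str]) -> list:
    """Send one prompt per description to whichever backend is supplied."""
    return [backend(PROMPT_TEMPLATE.format(description=d)) for d in descriptions]


def stub_backend(prompt: str) -> str:
    """Stand-in for a real model call; returns a fixed, obviously fake reply."""
    return "stub reply for prompt of %d characters" % len(prompt)


def estimated_days(n_descriptions: int, per_day: int) -> int:
    """Whole days needed at a given throughput, rounded up."""
    return math.ceil(n_descriptions / per_day)


# Swapping providers would mean passing a different function as `backend`.
replies = extract_all(["A one-day course on survival analysis."], stub_backend)

# 16,571 descriptions at roughly 2,500 per day.
days_needed = estimated_days(16_571, 2_500)
print(len(replies), days_needed)
```

At the reported rate, the full set of 16,571 descriptions corresponds to roughly 6.6 days of processing, or seven whole days.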

Topics

Topics (ALLSTAT)

Topics (NCRM)

Topics (most recent quintile)

Topics (least recent quintile)
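The slides do not state how the year quintiles were formed. The sketch below assumes that courses are ranked by year and divided into five groups of (nearly) equal size, with quintile 1 the least recent and quintile 5 the most recent. The function name and the sample years are hypothetical.

```python
def year_quintiles(years):
    """Return a quintile label (1 = least recent, 5 = most recent) per course.

    Assumption: quintiles are equal-sized groups of courses ranked by year,
    so courses from the same year can fall on either side of a boundary.
    """
    n = len(years)
    # Indices of the courses, ordered from least to most recent.
    order = sorted(range(n), key=lambda i: years[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // n + 1
    return labels


# Hypothetical course years, invented for illustration only.
sample_years = [2025, 1998, 2010, 2004, 2019, 2001, 2022, 2007, 2016, 2013]
sample_labels = year_quintiles(sample_years)
print(sample_labels)
```

An alternative would be to split the calendar range into five equal intervals, which gives groups of unequal size when course numbers grow over time.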

Intended academic/research field

Intended academic/research field (ALLSTAT)

Intended academic/research field (NCRM)

Intended academic/research field (TeSS)

Intended level

Software

Software (year quintile 1, least recent)

Software (year quintile 5, most recent)

Delivery method

Delivery method (most recent quintile)

Delivery method (all but most recent quintile)

Duration

Provider