Welcome! 👋

We acknowledge the Traditional Owners of the land on which we gather, the Wurundjeri people of the Kulin Nation.

We pay our respects to Elders past and present, and recognise their continuing connection to Country, culture, and community here in Naarm (Melbourne).

Learning together

Wombat in the Woi Wurrung language is “warin”.

Safety 🛟

Evacuation procedure

🚨 “Woop woop woop - evacuate now” 🚨

Use the stairs beside the lifts to reach the lobby.

Assembly point is the park to the west.

Code of Conduct (CoC)

In short: be kind, respectful, and considerate.

(full code of conduct is available on our website)

Alert the CoC response team of any issues 👋

Getting help

Organisers are here to help!

Identify organisers by the green dots on their name tags.

Email the committee:

Organising Committee

Cynthia Huang

General chair (workshop)

Mitchell O’Hara-Wild

General chair (tutorials)

Hannah Comiskey

Visitor logistics

Janith Wanniarachchi

Webmaster

Elio Campitelli

Social media

Maliny Po

Marketing and graphics

Krisanat Anukarnsakulchularp

Volunteer lead

Social media 📣

Hashtag #WOMBAT2025

Share your experience online with hashtag #WOMBAT2025

Event recording

This event is being recorded, and presentations will be published online on YouTube afterwards.

Photos and videos will also be taken during the event.

Opt out of recording

Not keen on social media? Photo shy? No worries!

Opt out by adding a red dot to your name tag.

Information online

🌐 Workshop website

See the schedule, speakers, and more online.

https://wombat2025.numbat.space/

🛜 Wi-Fi access

Get online with:

  • eduroam (for students / academics)
  • Monash Guest Wi-Fi (for everyone else)

https://www.monash.edu/esolutions/network/guest-wi-fi

🍻 Post-Workshop Social Hour

🕰️ 5:15PM onwards

📍 State of Grace, 27 King St, Melbourne 3000

Open invitation, come and go as you please!

We’ve reserved the rooftop loft area.


Nicola Rennie

Designing for decision-making: How to build effective data visualisations

Data visualisation can be a very efficient method of identifying patterns in data and communicating findings to broad audiences. Good data visualisation requires appreciation and careful consideration of the technical aspects of data presentation. But it also involves a creative element. Authorial choices are made about the “story” we want to tell, and design decisions are driven by the need to convey that story most effectively to our audience. Software systems use default settings for most graphical elements. However, each visualisation has its own story to tell, and so we must actively consider and choose settings for the visualisation we are building. In this talk, Nicola will show why you should visualise data and present some guidelines for making more effective charts, before discussing examples of good and not-so-good charts.
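
As a generic illustration of that last point (not code from the talk, and using ggplot2’s built-in mpg data), compare accepting the defaults with making deliberate choices about ordering, labels, and theme:

```r
library(ggplot2)
library(forcats)

# Defaults accepted: a perfectly valid chart, but the "story" is left to the reader
p_default <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# Deliberate choices: order bars by frequency, label the message, simplify the theme
p_designed <- ggplot(mpg, aes(x = fct_rev(fct_infreq(class)))) +
  geom_bar(fill = "steelblue") +
  coord_flip() +
  labs(x = NULL, y = "Number of cars in the data",
       title = "SUVs are the most common class in the mpg data") +
  theme_minimal()
```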

James Goldie

Closing the scrollytelling gap with Closeread

Scrollytelling is an innovative way to tell data-driven stories, but it often demands sophisticated data visualisation engineering skills and budgets beyond the reach of individual researchers and small newsrooms. Closeread makes scrollytelling accessible and easy for researchers and analysts using their existing Quarto authoring skills.

In this talk, James discusses some of the factors that motivated Closeread’s design, the first scrollytelling contest with Posit, and new features on the horizon, including scrolling video support and better integration with R and Python tools.

Jenny Richmond

Rethinking data science education in the age of genAI

Generative AI tools, like ChatGPT, haven’t necessarily created new challenges in data science education. Rather, the availability and rapidly growing capabilities of these tools have exposed weaknesses in learning design and assessment that educators have been ignoring for a while.

In this talk, I will draw on my experience in teaching psychology students about computational reproducibility and the value of R-based data workflows. I will argue for the need to rethink what learning to learn looks like and talk about how we might reframe assessment as the process of gathering evidence that learning has occurred.

There is no avoiding the fact that students now need to learn how to learn alongside Generative AI. Data science educators can help them by designing learning experiences that build self-regulated learning skills and normalise what real learning feels like.

Michael Lydeamore

When laziness leads to innovation: making things you didn’t know you needed

Nothing gives me greater joy than building little tools to make my life easier — even if they take ten times longer to make than they’ll ever save. In this talk I’ll share a few of these creations and the mindset behind them: the teleport GitHub Action, Comment Viewer Positron extension, Unilur Quarto filter for generating assignment solutions, and a sneak peek at a gamified teaching tool for ggplot2. Along the way, I’ll argue that sometimes it’s not the smartest hire you need, but the laziest one — a sentiment probably attributed to Bill Gates (though I’ve never bothered to check).

Floyd Everest

Tidy analysis of preferential votes with prefio

Preferential datasets usually represent rankings for an individual voter across multiple columns or rows in a rectangular format. This can make working with these datasets cumbersome and unintuitive.
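
To make that rectangular format concrete, here is a small hypothetical example in plain tibble/tidyr code (this is not prefio’s API): each voter’s ranking is spread across one column per option, so even asking “which option was ranked first most often?” requires a reshape.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical ballots: one column per option, cells hold the rank
# each voter gave that option (1 = first preference)
ballots <- tribble(
  ~voter, ~apple, ~banana, ~cherry,
  1,      1,      2,       3,
  2,      2,      1,       3,
  3,      1,      3,       2
)

# Counting first preferences already needs a pivot
ballots |>
  pivot_longer(-voter, names_to = "option", values_to = "rank") |>
  filter(rank == 1) |>
  count(option, sort = TRUE)
```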

This talk will introduce prefio, an R package that lets you work with preferences in a tidy way without spanning multiple rows or columns, while providing convenient operations for manipulating preferential data.

There will be an interactive component to the talk, so get ready to cast some votes!

Harriet Mason

Visualising Uncertainty with ggdibbler

Adding a representation of uncertainty to a data visualisation can help in decision-making. There is a wealth of existing software designed to visualise uncertainty as a distribution or probability. These visualisations are excellent for helping us understand the uncertainty in our data, but they may not be effective at incorporating uncertainty to prevent false conclusions. Successfully preventing false conclusions requires us to communicate the estimate and its error as a single “validity of signal” variable, and doing so proves difficult with current methods. In this talk, we introduce ggdibbler, a ggplot2 extension that makes it easier to visualise uncertainty in plots for the purpose of preventing these “false signals”. We illustrate how ggdibbler can be seamlessly integrated into existing visualisation workflows and highlight the effect of these changes by showing the alternative visualisations ggdibbler produces for a choropleth map.
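
As a rough sketch of the underlying idea only (plain ggplot2 with simulated data, not ggdibbler’s actual functions), one crude way to fold an estimate and its error into a single visual signal is to fade tiles whose standard errors are large:

```r
library(ggplot2)
library(dplyr)

# Simulated regional estimates with standard errors
set.seed(1)
regions <- expand.grid(x = 1:5, y = 1:5) |>
  mutate(estimate = rnorm(25, mean = 10),
         std_error = runif(25, min = 0.1, max = 3))

# Map the estimate to fill, and down-weight uncertain cells via alpha so that
# extreme-looking values with large errors no longer read as strong "signals"
ggplot(regions, aes(x, y, fill = estimate, alpha = 1 / std_error)) +
  geom_tile() +
  scale_alpha_continuous(range = c(0.2, 1), guide = "none") +
  theme_minimal()
```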

In conversation: Research Software

Research software engineering, open source software, reproducible research – what do these terms really mean, and what’s special about creating ‘statistical software’ and R packages? In this conversation with Professor Rob Hyndman and Dr. Nick Tierney, we attempt to demystify these terms and highlight the many ways that software design and open source packages can contribute to the development, dissemination, and accessibility of data-driven analysis.

Cynthia Huang

Host

Rob Hyndman

Speaker

Nicholas Tierney

Speaker

Saras Windecker

Piloting peer code review in a research consortium community of practice

Code-based research offers greater flexibility than point-and-click statistical software but is vulnerable to a range of errors that are often not addressed through traditional peer review, which rarely examines the underlying code. Formal code review poses practical challenges, including the need for time, expertise, and a supportive environment, particularly given the complexity of analyses and the diverse skills required. To address these challenges, the Australia-Aotearoa Consortium for Epidemic Forecasting & Analytics (ACEFA) is piloting a peer code review program among its members, who produce short-term epidemic forecasts for Australia and New Zealand. The aims are to verify that models correctly implement their documented methods, identify areas for improvement, promote good coding practices, and familiarise researchers with code review as a valuable tool for scientific rigour. The program’s effectiveness will be evaluated through participant surveys, with the intent of refining the process and sharing it as a model for others in the research community.

Tyler Reysenbach

Building a new data team – balancing quick wins and long-term investments

Every new data team faces the same dilemma: stakeholders expect immediate results, while the team knows that achieving long-term impact needs investment in its capabilities. Should they build that reporting dashboard everyone’s asking for, or spend three months building a new dataset that could inform future priorities? If they choose the quick win, the team risks being pigeonholed as a reporting team, unable to focus on longer-term projects, with more and more regular products requiring maintenance. Focus on building capacity, and the team might lose organisational support before they can prove their value, limiting their impact before they even get going.

Drawing on experience working in the Central Analytics Hub at the Department of Prime Minister and Cabinet during the pandemic, and helping to address capability gaps to better inform migration policy reform, this talk will explore the practical realities of navigating these tensions. I’ll share stories from both experiences: delivering urgent daily briefings while building the capability to pivot to changing priorities, and establishing credibility while building the infrastructure needed to inform pressing policy priorities.

This talk will offer honest reflections on the messy reality of building data teams in government, providing practical insights for anyone wrestling with similar challenges.

Ben Harrap

Designing data infrastructure where people come first

The Mayi Kuwayu Study is a longitudinal survey and the largest national study of Aboriginal and Torres Strait Islander culture, health and wellbeing. I joined the study team last year and have been redeveloping the data pipeline to be more transparent, reproducible, and easier to maintain. My original plan was to incorporate best practices, such as using targets and renv to ensure reproducibility, and pointblank and testthat for validation and testing, as well as writing nice code.
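
As a rough sketch of what that plan looks like in practice (the file, function, and column names below are hypothetical, not the study’s actual pipeline), a `_targets.R` file with a pointblank validation step might look like this:

```r
# _targets.R
library(targets)
tar_option_set(packages = c("readr", "dplyr", "pointblank"))

# Validate the raw data before anything downstream depends on it
validate_survey <- function(data) {
  create_agent(tbl = data) |>
    col_vals_not_null(columns = vars(participant_id)) |>
    col_vals_between(columns = vars(age), left = 18, right = 120) |>
    interrogate()
}

list(
  tar_target(raw_file, "data/survey_raw.csv", format = "file"),
  tar_target(raw_data, read_csv(raw_file)),
  tar_target(validation, validate_survey(raw_data)),
  tar_target(clean_data, filter(raw_data, !is.na(participant_id)))
)
```

renv sits alongside a pipeline like this, recording package versions in a lockfile via renv::init() and renv::snapshot().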

It was a good plan, but it didn’t fully consider the people in the study team. Being able to run code from start to finish without error might be reproducible in a technical sense, but is someone actually reproducing it if they don’t understand what the code is doing? Code is only reproducible so long as it’s maintainable. Since I’m the only experienced R user (for now), each additional package I use is another package someone else has to understand, and runs the risk of making the code less reproducible.

In this session I’ll be talking about:

  • the Mayi Kuwayu Study and team context
  • development of the data pipeline
  • the role people played in design decisions
  • using the development as an opportunity to upskill my team

Closing Discussion

  • What is one idea, tool or tip you plan to use?
  • What is something surprising you’ve learnt?
  • Have you discovered new ways of integrating design & data science?

Internship project examples

Exploring and interpreting data

  • How does air quality affect blood donations? – Red Cross Lifeblood
  • What factors lead to unhappy customers? – Waterman
  • Analysis of member surveys – Heritage New Zealand Pouhere Taonga

Internship project examples

Modelling / Forecasting

  • Forecasting stamp duty payments – Department of Treasury & Finance
  • Optimising patient allocations – Monash Health
  • Biodiversity impact of 2019/20 bushfires – Vic. National Parks Association

Internship project examples

Automation and tools

  • Quarterly reporting for business partners – AIA (insurance company)
  • Membership engagement dashboards – Educate Plus
  • Tennis racquet recommendation system – Tennis Australia

Want to host a project?

Thank you!

  • 20 presenters & instructors 🗣️
  • 7 tutorials 🧑‍💻
  • 10 invited talks 🎤
  • ~85 attendees over 2 days 👋
  • 7 committee members
  • lots of support from Monash EBS

Final Remarks

#️⃣ Hashtag #WOMBAT2025

Share your experience online with hashtag #WOMBAT2025

🌐 Workshop Recordings

Recordings from today will be published on YouTube.

We’ll share them on our website and social media channels (#WOMBAT2025) when they’re ready to watch!

🍻 Post-Workshop Social Hour

5:15PM onwards at State of Grace, 27 King St, Melbourne 3000