
How to Create a Realistic Mock Dataset in Python for Project Showcases

📌 Introduction

Mock datasets are incredibly useful when real data isn’t available — whether you're building dashboards, testing data pipelines, creating teaching materials, or developing portfolio projects. They allow you to prototype quickly, demonstrate analytical skills, and tell compelling data stories without relying on proprietary or sensitive information.


🏌️ Use case: Simulating a golf course booking system

In this example, I’ll simulate a full year of booking data for a golf course, including booking times, customer types, group sizes, pricing tiers, and revenue — the kind of dataset that’s ideal for business analysis or operational improvement showcases.


2. Define the Scenario

Start by listing the fields the dataset needs, in line with the business rules:

  • Booking Date and Time

  • Customer Type (Local, Senior, Tourist)

  • Booking Channel (Online, Walk-in, Phone)

  • Package Duration (60/120/180 min)

  • Bay Assignment (Bay 1–5)

  • Group Size (max 4)

  • Revenue Logic

  • Customer Status (Membership, Gift Card, Normal)

  • Booking Name (Simulated customer name for the booking)

    • 🧑‍💼 Special Rule: Booking Name (Repeat Simulation)

      To make the dataset more realistic:

      • A name is assigned to every booking using the Faker library.

      • For customers with Membership status, there's a 30% chance the name is reused, simulating repeat customers and allowing you to analyse loyalty patterns or return bookings.

      This business rule helps demonstrate how you can simulate long-term customer behaviour, which is especially valuable in marketing, CRM, and retention analysis.
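
As a rough illustration, the repeat-name rule could be sketched like this. The helper names (`new_name`, `assign_booking_name`) and the small fallback name lists are my own assumptions for the example. Only the 30% reuse chance for Membership bookings and the use of Faker come from the rule above. The fallback exists so the sketch still runs if Faker isn't installed.

```python
import random

try:
    from faker import Faker  # third-party: pip install faker

    _fake = Faker()

    def new_name(rng: random.Random) -> str:
        """Generate a fresh customer name with Faker."""
        return _fake.name()

except ImportError:
    # Hypothetical fallback so the sketch runs without Faker installed.
    FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan", "Casey"]
    LAST_NAMES = ["Lee", "Patel", "Garcia", "Nguyen", "Smith", "Brown"]

    def new_name(rng: random.Random) -> str:
        """Generate a name from small built-in lists."""
        return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


REUSE_PROBABILITY = 0.30  # the 30% repeat chance for Membership bookings


def assign_booking_name(status: str, member_names: list, rng: random.Random) -> str:
    """Return a booking name, sometimes reusing an earlier member's name.

    `member_names` is the running pool of names already given to
    Membership bookings; it is updated in place.
    """
    can_reuse = status == "Membership" and member_names
    if can_reuse and rng.random() < REUSE_PROBABILITY:
        return rng.choice(member_names)  # repeat customer

    name = new_name(rng)
    if status == "Membership":
        member_names.append(name)  # remember members for later reuse
    return name


# Small demo: names depend on the seed and on whether Faker is installed.
rng = random.Random(42)
member_names = []
for status in ["Membership", "Normal", "Membership", "Gift Card", "Membership"]:
    print(status, "->", assign_booking_name(status, member_names, rng))
```

Keeping the member pool as a plain list means a member who has already booked several times is no more likely to be picked again than anyone else. If you want a few "super regulars", a weighted choice would be one way to extend it.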


3. Python Code
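
A minimal sketch of a generator for the fields listed in the scenario might look like the following. Treat it as one possible starting point under stated assumptions, not a definitive implementation. The year (2024), opening hours, bookings per day, column names, hourly rate, and price multipliers are all hypothetical values I picked for illustration. Placeholder names such as `Customer 00001` stand in for Faker so the script runs with the standard library alone.

```python
import csv
import random
from datetime import date, datetime, time, timedelta

rng = random.Random(42)  # fixed seed so the mock data is reproducible

CUSTOMER_TYPES = ["Local", "Senior", "Tourist"]
CHANNELS = ["Online", "Walk-in", "Phone"]
DURATIONS_MIN = [60, 120, 180]
BAYS = [f"Bay {i}" for i in range(1, 6)]
STATUSES = ["Membership", "Gift Card", "Normal"]

# Hypothetical revenue logic (assumption, not from the scenario):
# a flat hourly rate adjusted by customer type and customer status.
HOURLY_RATE = 40.0
TYPE_MULTIPLIER = {"Local": 1.0, "Senior": 0.8, "Tourist": 1.2}
STATUS_MULTIPLIER = {"Membership": 0.9, "Gift Card": 0.0, "Normal": 1.0}


def generate_bookings(year=2024, min_per_day=5, max_per_day=20):
    """Build one year of mock bookings as a list of dicts."""
    rows = []
    member_names = []  # names already used by Membership bookings
    day = date(year, 1, 1)
    while day.year == year:
        for _ in range(rng.randint(min_per_day, max_per_day)):
            # Assumed opening hours: bookings start on the hour, 08:00 to 19:00.
            start = datetime.combine(day, time(hour=rng.randint(8, 19)))
            customer_type = rng.choice(CUSTOMER_TYPES)
            status = rng.choice(STATUSES)
            duration = rng.choice(DURATIONS_MIN)

            # Membership bookings reuse an earlier member name 30% of the time.
            if status == "Membership" and member_names and rng.random() < 0.30:
                name = rng.choice(member_names)
            else:
                name = f"Customer {len(rows) + 1:05d}"  # placeholder name
                if status == "Membership":
                    member_names.append(name)

            revenue = (HOURLY_RATE * duration / 60
                       * TYPE_MULTIPLIER[customer_type]
                       * STATUS_MULTIPLIER[status])
            rows.append({
                "booking_datetime": start.isoformat(sep=" "),
                "customer_type": customer_type,
                "booking_channel": rng.choice(CHANNELS),
                "package_duration_min": duration,
                "bay": rng.choice(BAYS),
                "group_size": rng.randint(1, 4),  # max 4 per booking
                "customer_status": status,
                "booking_name": name,
                "revenue": round(revenue, 2),
            })
        day += timedelta(days=1)
    return rows


def save_csv(rows, path):
    """Write the bookings to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


bookings = generate_bookings()
print(len(bookings), "bookings generated")
```

Calling `save_csv(bookings, "golf_bookings.csv")` writes the result to disk for use in a dashboard or notebook. Two simplifications are worth knowing about: every field is drawn uniformly at random, and nothing stops two bookings from landing on the same bay at the same time. Weighted choices (for example, more Online bookings or busier weekends) and a bay availability check are natural next steps if you want more realistic patterns.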


🧩 Final Thoughts

Creating mock datasets like this not only strengthens your Python skills but also gives you tangible projects to showcase in interviews, portfolios, or data storytelling demos. Whether you're preparing dashboards, testing workflows, or learning data modelling, custom mock data lets you build exactly what you need — no waiting for the perfect dataset. Try adapting this script to other industries like retail, healthcare, or logistics, and let your imagination drive the data. Happy coding! 🐍📊



© 2025 YANITHA. All rights reserved
