The Beach Dataset is part of the Beach Review Guide (BRG) Project and is used in geographical, environmental, and tourism research. This document describes its structure and methodology. The dataset follows principles of transparency, verifiability, tiered confidence classification, and human-in-the-loop quality control.
Each record is checked twice for consistency and accuracy:
Only after successful moderation is a record assigned a Moderation Date and made available for public use or scientific analysis. Unmoderated or rejected records are excluded from all exports and visualisations.
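The export rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: the `Record` class, its `status` attribute, and the status strings are hypothetical names chosen for the example. Only the idea of a Moderation Date comes from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical record wrapper; all attribute names are assumptions."""
    record_id: str
    status: str  # assumed values: "pending", "approved", "rejected"
    moderation_date: Optional[date] = None


def exportable(records: List[Record]) -> List[Record]:
    """Keep only records that passed moderation and carry a Moderation Date."""
    return [
        r for r in records
        if r.status == "approved" and r.moderation_date is not None
    ]


# Made-up sample records for illustration only.
records = [
    Record("a", "approved", date(2025, 3, 1)),
    Record("b", "pending"),
    Record("c", "rejected"),
]
public_records = exportable(records)
```

Applying the filter at export time, rather than deleting rejected rows, would keep the moderation history intact while still hiding those rows from users.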
The dataset contains two record types, Beach Profiles and Visit Reports, collected using two related data collection procedures.
These record types are stored separately in the database and serve different analytical purposes. Each field within either record type is assigned a confidence level (L1–L3) as defined below.
The methodology can be applied to any visual or narrative source that provides verifiable evidence of beach conditions. In the current implementation, the project relies mainly on publicly available user-generated videos from online platforms (primarily YouTube) due to their global coverage and accessibility.
Beach Profiles are constructed using the following protocol:
Visit Reports are derived from individual source videos that satisfy the following inclusion criteria:
Each Visit Report is linked to one Beach Profile via beach_id.
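The link between the two record types can be illustrated as a simple grouping on `beach_id`. The sketch below assumes records are available as plain dictionaries and that a profile may have zero or more reports; the sample rows are made up, and the function name `group_reports_by_beach` is hypothetical.

```python
from collections import defaultdict

# Illustrative rows only; identifiers and values are invented for the example.
beach_profiles = [
    {"beach_id": "example-beach", "beach_type": "Wild"},
    {"beach_id": "another-beach", "beach_type": "Organized"},
]
visit_reports = [
    {"beach_id": "example-beach", "visit_year": 2024, "crowd_level": "Empty"},
    {"beach_id": "example-beach", "visit_year": 2025, "crowd_level": "Few people"},
]


def group_reports_by_beach(profiles, reports):
    """Attach each Visit Report to the single Beach Profile it references."""
    known_ids = {p["beach_id"] for p in profiles}
    grouped = defaultdict(list)
    for report in reports:
        if report["beach_id"] not in known_ids:
            # A report without a matching profile violates the one-profile link.
            raise ValueError(f"orphan report: {report['beach_id']}")
        grouped[report["beach_id"]].append(report)
    # Profiles without reports still appear, with an empty list.
    return {beach_id: grouped.get(beach_id, []) for beach_id in sorted(known_ids)}


linked = group_reports_by_beach(beach_profiles, visit_reports)
```

Rejecting orphan reports up front is one way to keep the relationship consistent; a relational database would typically express the same constraint as a foreign key.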
The detailed visual criteria used to distinguish categorical values are defined in an internal annotation guide available only to project contributors (observers). External users can treat categorical fields as discrete observational classes without needing access to the internal guidelines.
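As an illustration of working with categorical fields as discrete classes, the sketch below uses the `wave_height` labels listed in Table 2. Treating these labels as an ordered scale is an assumption based on the height ranges given there, and the helper name `roughest` is hypothetical.

```python
# wave_height labels as listed in Table 2, from calmest to roughest.
WAVE_HEIGHT_CLASSES = ["Calm", "Light", "Moderate", "Rough", "Very Rough"]
WAVE_RANK = {label: rank for rank, label in enumerate(WAVE_HEIGHT_CLASSES)}


def roughest(observations):
    """Return the roughest observed class, ignoring None (NULL) entries."""
    seen = [o for o in observations if o is not None]
    if not seen:
        return None  # nothing observed, so nothing can be concluded
    unknown = set(seen) - set(WAVE_RANK)
    if unknown:
        raise ValueError(f"not a wave_height class: {sorted(unknown)}")
    return max(seen, key=WAVE_RANK.__getitem__)


worst = roughest(["Calm", None, "Rough", "Light"])
```

Keeping the labels as strings and ranking them through a lookup table avoids implying numeric precision that the visual assessment does not provide.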
The Beach Dataset, while methodologically rigorous within its design constraints, is subject to several limitations that affect its scope, representativeness, and interpretability:
The dataset is most suitable for descriptive comparisons, cross-site characterization, and preliminary hypothesis generation within the spatio-temporal limits of the source material.
Within each record type, every field is assigned one of three confidence levels, informed by common practices in geographic data quality assessment:
Table 1

| Level | Name | Definition | Scientific usability |
|---|---|---|---|
| L1 | Verified Observation | Directly observable in source material or explicitly stated. Binary, categorical, or measured. No interpretation. | High – suitable for statistical and spatial analysis. |
| L2 | Inferred / Contextual | Derived from verified observation plus external authoritative source (e.g., climatological average) or logical inference from complete coverage. | Moderate – usable with documented uncertainty. |
| L3 | Descriptive / Editorial | Interpretive phrasing based on L1 evidence, formatted as recommendation or narrative. Contains no new factual claims. | Low – excluded from scientific analysis; intended for interface only. |
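The usability column suggests a simple field filter for analysis. The following is a sketch under stated assumptions: the `Confidence` enum, the `FIELD_LEVELS` mapping, and the `analysis_fields` helper are hypothetical names, and the mapping contains only a handful of the fields from Table 2.

```python
from enum import Enum


class Confidence(Enum):
    """The three confidence levels from Table 1."""
    L1 = "Verified Observation"
    L2 = "Inferred / Contextual"
    L3 = "Descriptive / Editorial"


# A small subset of the field-to-level assignments listed in Table 2.
FIELD_LEVELS = {
    "sand_type": Confidence.L1,
    "wave_height": Confidence.L1,
    "air_temp_c": Confidence.L2,
    "water_temp_c": Confidence.L2,
    "short_description": Confidence.L3,
    "best_for": Confidence.L3,
}


def analysis_fields(field_levels, include_l2=True):
    """Select fields usable for analysis; L3 is always excluded."""
    allowed = {Confidence.L1}
    if include_l2:
        allowed.add(Confidence.L2)
    return sorted(name for name, level in field_levels.items() if level in allowed)


strict = analysis_fields(FIELD_LEVELS, include_l2=False)
relaxed = analysis_fields(FIELD_LEVELS)
```

Making L2 opt-out rather than silently included is one way to reflect the "usable with documented uncertainty" wording: an analyst has to decide explicitly whether inferred values belong in a given study.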
Table 2 lists fields, their record type affiliation, scientific definition, permissible values, source, and confidence level.
Table 2

| Field | Record Type | Definition | Permissible Values | Source | Level |
|---|---|---|---|---|---|
| beach_id | Beach Profile | Unique beach identifier (slug) | Alphanumeric, lowercase, hyphenated | User input, validated | L1 |
| location | Beach Profile | Administrative region (country, province, island) | Controlled vocabulary | User selection | L1 |
| latitude | Beach Profile | Geographic latitude of beach centroid | Decimal degrees (WGS84) | Satellite map (Google Maps, OSM) | L1 |
| longitude | Beach Profile | Geographic longitude of beach centroid | Decimal degrees (WGS84) | Satellite map (Google Maps, OSM) | L1 |
| length_m | Beach Profile | Beach length along shoreline | Integer (meters); missing values encoded as NULL | Satellite ruler tool | L1 |
| width_m | Beach Profile | Average beach width (mean of min/max) | Integer (meters); missing values encoded as NULL | Satellite ruler tool | L1 |
| beach_type | Beach Profile | Infrastructure presence level | Wild, Semi-organized, Organized | Source video (ground-level view, consistent across videos) | L1 |
| sand_type | Beach Profile | Surface substrate composition | White sand, Golden sand, Dark sand, Gravel, Pebbles, Shells, Rock | Source video (ground-level view, consistent across videos) | L1 |
| water_entry | Beach Profile | Slope gradient at shoreline | Gentle slope, Moderate slope, Steep drop, Reef edge, Rocky shelf | Source video (nearshore view, stable feature) | L1 |
| water_bottom | Beach Profile | Substrate composition at seabed | Sandy, Silty, Pebbles, Coral, Rocky, Seagrass | Source video (only if water clear and bottom visible, stable) | L1 |
| natural_shade | Beach Profile | Canopy cover over beach surface | None, Sparse, Moderate, Full | Source video (daytime, full view, consistent) | L1 |
| toilets_present | Beach Profile | Presence of functional toilets | true, NULL | Source video | L1 |
| showers_present | Beach Profile | Presence of functional showers | true, NULL | Source video | L1 |
| sunbeds_available | Beach Profile | Presence of rentable sunbeds | true, NULL | Source video | L1 |
| food_drink_available | Beach Profile | Presence of on-beach vendors or cafes | true, NULL | Source video | L1 |
| safety_infrastructure | Beach Profile | Presence of lifeguards, flags, or rescue towers | true, NULL | Source video (only if beach fully scanned across videos) | L1 |
| access_difficulty | Beach Profile | Walking effort from nearest road | Roadside, Very Short walk (1-2 min), Short walk (<5 min), Trail (5–15 min), Long hike (>15 min), Boat required | Source video (path shown or described, stable) | L1 |
| recommended_transport | Beach Profile | Transport mode demonstrated or advised | Scooter, Car/Taxi, Boat, Walkable | Source video narration or footage (consistent) | L1 |
| parking | Beach Profile | Proximity and type of parking | None, Street only, Free lot, Paid lot, Guests only | Source video (parking visible, stable) | L1 |
| location_description | Beach Profile | Geographic position relative to landmarks | Free text (≤70 chars) | Map analysis (Google Maps, OSM) | L2 |
| visit_year | Visit Report | Year of on-site visit | Integer (e.g., 2024) | Video title/description/comments or upload date | L1 |
| visit_month | Visit Report | Month of on-site visit | Integer (1–12); NULL if unconfirmed | Explicit mention only | L1 |
| sky_condition | Visit Report | Cloud cover at time of recording | Sunny, Partly cloudy, Cloudy, Rainy, After rain | Source video (sky visible) | L1 |
| wind | Visit Report | Observed wind intensity | Calm, Light, Moderate, Strong | Source video (vegetation, flags, water surface) | L1 |
| wave_height | Visit Report | Visual wave amplitude | Calm (<0.1 m), Light (0.1–0.2 m), Moderate (0.2–0.5 m), Rough (0.5–1.0 m), Very Rough (>1.0 m) | Source video (shoreline view) | L1 |
| water_clarity | Visit Report | Underwater visibility | Crystal clear, Clear, Slightly cloudy, Murky | Source video (only if water shown clearly) | L1 |
| water_cleanliness | Visit Report | Presence of debris or pollutants in water | Clean, Some debris, Algae/seaweed, Polluted/muddy | Source video (water column) | L1 |
| sand_cleanliness | Visit Report | Surface litter or organic residue | Excellent, Good, Some trash, Poor | Source video (beach surface) | L1 |
| crowd_level | Visit Report | Observed density of beach users | Empty, Few people, Moderate, Crowded, Very busy | Source video (beach in frame) | L1 |
| visitor_type | Visit Report | Dominant demographic group | International tourists, Families, Couples, Solo travelers, Local tourists | Source video (language, behavior, attire) | L1 |
| noise_level | Visit Report | Audible disturbance sources | Quiet, Music from bars, Construction, Boat engines | Source video audio track | L1 |
| time_of_day | Visit Report | Approximate recording time | Morning, Day, Evening | Explicit mention or contextual cues | L1 |
| air_temp_c | Visit Report | Mean monthly air temperature | Integer (°C); NULL if month unknown | weatherspark.com (nearest station) | L2 |
| water_temp_c | Visit Report | Mean monthly sea surface temperature | Integer (°C); NULL if month unknown | seatemperature.org (nearest location) | L2 |
| short_description | Beach Profile | Concise factual summary for interface | Free text (≤70 chars, no adjectives) | Editorial synthesis of L1 fields | L3 |
| short_sand_water_char | Beach Profile | Phenomenological description of sand/water | Free text (≤70 chars) | Editorial phrasing based on L1 | L3 |
| best_for | Beach Profile | User group suitability (conditional) | Free text (e.g., “Snorkelers: coral offshore”) | Editorial interpretation of L1 | L3 |
| avoid_if | Beach Profile | User incompatibility statement | Free text (e.g., “You seek peace (boats passing)”) | Editorial contrast based on L1 | L3 |
| special_notes | Visit Report | Contextual metadata about video | Free text (e.g., “Copter video”, “Jan 2025”) | Curator annotation | L3 |
Note: NULL values indicate unobserved or unverifiable conditions, not confirmed absence. Absence assertions (e.g., “no toilets”) are only recorded when ≥90% of the beach is visibly scanned and no such feature is present.
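The practical consequence for analysis is that NULL must not be collapsed into "absent". The sketch below assumes records are plain dictionaries in which NULL is represented as Python `None`, and it handles only the `true` and NULL values that Table 2 lists for the presence fields; the helper name and sample rows are hypothetical.

```python
def presence_summary(profiles, field):
    """Count observed vs. unknown values for a true/NULL presence field."""
    observed = 0
    unknown = 0
    for profile in profiles:
        value = profile.get(field)  # a missing key is treated like NULL
        if value is True:
            observed += 1
        elif value is None:
            unknown += 1  # unobserved or unverifiable, not confirmed absence
        else:
            raise ValueError(f"unexpected value for {field}: {value!r}")
    return {"observed": observed, "unknown": unknown}


# Made-up sample rows for illustration only.
profiles = [
    {"beach_id": "example-beach", "toilets_present": True},
    {"beach_id": "another-beach", "toilets_present": None},
    {"beach_id": "third-beach"},
]
summary = presence_summary(profiles, "toilets_present")
```

Reporting "unknown" as its own count, instead of computing a share of beaches "without toilets", keeps the statistic aligned with what the source videos can actually support.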
The L1 and L2 portions of the Beach Dataset are not publicly downloadable at this time. They may be made available under a CC BY 4.0 license for non-commercial academic research upon request. The L3 layer (editorial content) is excluded from redistribution and is intended solely for user-facing presentation on the Beach Review Guide (BRG) platform.
Commercial use of any part of the dataset (including licensing, embedding in websites or mobile applications, use in commercial products, or resale) requires a separate written agreement with the project author.
To request access to the dataset or inquire about commercial licensing and collaboration, please contact: coconut@beachreviewguide.com.
For academic citation, please use the following format:
Vityasev, Y. M. (2026). Beach Dataset. Beach Review Guide. https://beachreviewguide.com/dataset-description/
The data collection procedures, field definitions, confidence system (L1–L3), and moderation workflow constitute the methodological basis of the BRG project. The custom categorical scales (e.g., wave height, wind intensity, infrastructure, cleanliness) were designed specifically for visual assessment from user-generated videos. The methodology is an original work; using it outside the BRG project requires permission from the author.
Proper attribution is required in any derivative academic work that references this methodology:
“Data collection followed the Beach Review Guide (BRG) methodology (Vityasev, 2026).”