Dataset containing RGB statistics extracted from photographs of fluorescent reference particles stained with Nile red. It covers the most abundantly produced plastic polymers worldwide as well as natural materials that are highly prevalent in the marine environment. The spectral data were used to construct a supervised machine learning model that accurately distinguishes plastic from natural particles in a cost- and time-efficient way.

The dataset was built to train and validate the ‘Plastic Detection Model’ (PDM) in R. It contains red, green and blue (RGB) statistics extracted from Nile red-stained reference particles (50-1200 μm) photographed with a LEICA DM 1000 microscope under three different filters (UV: Filter System A S, BP 340-380 nm; blue: Filter System I3 S, BP 450-490 nm; green: Filter System N2.1 S, BP 515-560 nm). Image analysis to extract all RGB values was performed using a macro in ImageJ. The supervised machine learning model (CART algorithm) trained on and validated with this dataset predicts with high accuracy whether a particle is of plastic or of non-plastic, natural origin, in a cost- and time-efficient way. RGB statistics of the most abundantly produced plastic polymers worldwide as well as natural materials that are highly prevalent in the marine environment were compiled into the dataset. The statistics themselves were calculated per reference particle as the 10th, 50th and 90th percentiles and the mean of each of the three color components, extracted from all pixels lying along the maximum Feret diameter of the photographed particle. The dataset contains RGB statistics calculated through image analysis of 60 plastic and 60 non-plastic particles. Of these 120 particles, 96 (4/5) were randomly selected to serve as training data (worksheet tab ‘training data’), while the remaining 24 (1/5) were kept as independent validation data (worksheet tab ‘validation data’).
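
The per-particle statistics described above can be illustrated with a minimal Python sketch. This is not the ImageJ macro or the R code used to build the dataset. The function names, the feature labels (such as `R_p10`), the example pixel values and the percentile interpolation method are all assumptions made for illustration. The sketch takes the RGB values of the pixels along the maximum Feret diameter of one photographed particle and returns the 10th, 50th and 90th percentiles and the mean for each color component.

```python
from statistics import mean, quantiles


def channel_stats(values):
    """Return the 10th, 50th and 90th percentiles and the mean of one color channel.

    Assumption: linear interpolation between data points ("inclusive" method);
    the method used for the original dataset is not stated in the description.
    Needs at least two pixel values.
    """
    deciles = quantiles(values, n=10, method="inclusive")  # 9 cut points: 10 % ... 90 %
    return {
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "mean": mean(values),
    }


def particle_stats(pixels):
    """Compute the statistics for one particle photographed under one filter.

    pixels: list of (R, G, B) tuples sampled along the maximum Feret diameter.
    Returns a dict with hypothetical labels such as "R_p10" or "B_mean".
    """
    stats = {}
    for idx, channel in enumerate(("R", "G", "B")):
        values = [px[idx] for px in pixels]
        for name, value in channel_stats(values).items():
            stats[f"{channel}_{name}"] = value
    return stats


# Hypothetical pixel values, not taken from the dataset.
pixels = [(i * 10, 255 - i * 10, 100) for i in range(11)]
features = particle_stats(pixels)
print(features)
```

Each call yields twelve values (four statistics for each of the three color components) for a single photograph. Repeating the call for the photographs taken under the UV, blue and green filters would give one such set per filter.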