ForestSeg-T1 contains 1,832 high-resolution UAV images from Vietnam National University of Forestry in Hanoi, captured with DJI Phantom 4 RTK and DJI Air 3 drones at altitudes of 70–211 m. Each image includes 1–17 manually annotated tree crowns and flight parameters. The dataset supports forest monitoring, biomass estimation, and environmental change detection. Its multi-seasonal imagery enables long-term forest analysis. Evaluations across four time intervals (T1–T4) show that environmental conditions affect segmentation performance, with T4 achieving the best results (AP 60.82%, accuracy 71.32%), which underlines the importance of temporal diversity and data quality.
The OvaTUS dataset was collected by our research team in collaboration with the National Hospital of Obstetrics and Gynecology (NHOG) in Hanoi, Vietnam. It includes ultrasound images from women who visited the hospital for ovarian tumor assessments and consented to participate in the study. The dataset contains six labeled categories: Unilocular Cyst, Solid Tumor, Hemorrhagic Tumor, Multilocular Cyst, Solid Multilocular Cyst, and Solid Unilocular Cyst. The dataset has been thoroughly annotated by experts and sonographers, and data collection is still ongoing.
The dataset consists of 150 videos from 50 subjects (11 females and 39 males). Each video is 100 to 140 seconds long, and together the videos contain 2,479 gesture samples divided into 19 classes. The time interval between two gestures is approximately 4 seconds. In total, the 150 videos yield 490,000 RGB frames, corresponding to about 4.5 hours of video.
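As a quick consistency check on the figures above, the short Python sketch below derives the implied frame rate, the mean video length, and the mean number of gestures per video. The frame count and total duration are quoted as approximate, so the derived values are rough estimates only; the function and variable names are illustrative and not part of the dataset.

```python
# Figures quoted in the description; frame count and duration are approximate.
NUM_VIDEOS = 150
NUM_GESTURES = 2_479
TOTAL_FRAMES = 490_000
TOTAL_HOURS = 4.5


def summarize(num_videos, num_gestures, total_frames, total_hours):
    """Derive rough per-video statistics from the dataset-level totals."""
    total_seconds = total_hours * 3600
    return {
        "implied_fps": total_frames / total_seconds,
        "mean_video_seconds": total_seconds / num_videos,
        "mean_gestures_per_video": num_gestures / num_videos,
    }


stats = summarize(NUM_VIDEOS, NUM_GESTURES, TOTAL_FRAMES, TOTAL_HOURS)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The implied frame rate comes out close to 30 fps, and the mean video length falls inside the stated 100 to 140 second range, so the totals are mutually consistent.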
Although several image and video datasets have been collected for student activity recognition, they are mainly annotated at the frame level or the sequence level, which suits object-detection-based and video-classification-based approaches but not continuous recognition. Therefore, to promote research in continuous student activity recognition, we have prepared a new dataset named CStudentAct. The CStudentAct Dataset is an extension of the StudentAct Dataset.
Our Vietnamese forestry dataset (VietForest) is a valuable resource comprising 156 classes of plant species predominantly found in the forests of Phu Tho province. It covers a diverse range of flora spanning 18 orders, 45 families, and 122 genera, with 26,251 images in total. The dataset showcases local plant species, many of which hold significant ecological importance and some of which are listed for conservation. It not only aids in understanding the local habitat but also provides detailed biological information for each species.
The VnBeeTracking Dataset is built for honeybee tracking.
The videos in the dataset were collected at the beehive entrance at the Research Center for Tropical Bees and Beekeeping, Vietnam National University of Agriculture, in April 2022.
We used a data acquisition system consisting of an NVIDIA Jetson Nano development kit and an IMX477 HQ camera with a 6mm CS-Mount lens, both placed in a weatherproof outdoor surveillance camera housing.
The VnPollenbee Dataset is built for detecting pollen-bearing bees.
The dataset contains more than 2,000 images, in which 1,758 pollen-bearing and 59,068 non-pollen-bearing bees are annotated.
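The two bee counts above are heavily imbalanced, which matters when training a detector or classifier on this data. The sketch below quantifies the imbalance and computes inverse-frequency class weights, total / (n_classes * class_count), as one common way to compensate. This weighting is an illustrative assumption and is not described as part of the dataset authors' method; all names in the code are hypothetical.

```python
# Counts quoted in the VnPollenbee description.
POLLEN = 1_758
NON_POLLEN = 59_068


def imbalance_summary(counts):
    """Return the minority share, majority/minority ratio, and class weights."""
    total = sum(counts.values())
    n_classes = len(counts)
    # Inverse-frequency ("balanced") weighting: rarer classes get larger weights.
    weights = {name: total / (n_classes * n) for name, n in counts.items()}
    return {
        "total": total,
        "minority_share": min(counts.values()) / total,
        "imbalance_ratio": max(counts.values()) / min(counts.values()),
        "weights": weights,
    }


summary = imbalance_summary({"pollen": POLLEN, "non_pollen": NON_POLLEN})
print(f"total bees: {summary['total']}")
print(f"pollen-bearing share: {summary['minority_share']:.2%}")
print(f"non-pollen : pollen ratio: {summary['imbalance_ratio']:.1f}")
print(f"class weights: {summary['weights']}")
```

Pollen-bearing bees make up roughly 3% of the annotated bees, so a model that always predicts "non-pollen" would already look accurate; per-class metrics are therefore more informative than overall accuracy on this data.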
The dataset was collected at the bee farm of the Vietnam Agricultural Academy by a data acquisition system consisting of an NVIDIA Jetson Nano development kit and an IMX477 HQ camera with a 6mm CS-Mount lens.
All devices are placed in a weatherproof outdoor surveillance camera housing, and the camera is adjusted to a downward-facing view.
The housing is attached within only one stage of the hive body.
Figure: our data acquisition system.
The StudentAct dataset is meant to aid research in developing, testing, and evaluating algorithms for human activity recognition. The Hanoi University of Science and Technology (HUST) holds the copyright to the collection of activity videos and associated data and serves as the distributor of the StudentAct dataset.
The 3000VnPersonSearch dataset consists of image-description pairs. The images are person bounding boxes extracted from video frames. The videos were captured by both moving and fixed-position cameras with different fields of view, during the day and at night under street lamp lighting. The scenes are mostly crowded streets and outdoor festivals, so occlusion and pose variation are also present.
This dataset contains 15 videos in total, recorded over three days by two static, non-overlapping cameras at HD resolution (1920 × 1080) and 20 frames per second (fps), in both indoor and outdoor conditions. There are 7 to 10 different people in each video. Splitting the videos into frames yields 11,876 images labeled with bounding boxes and corresponding identifiers, for a total of 28,567 bounding boxes. The dataset can be used for human detection, tracking, and re-identification.