Continuous monitoring of wild fish populations and their interaction with farmed fish is important not only for marine biologists but also for the aquaculture sector. Considerable effort is currently devoted to recognizing fish species underwater effectively. Various computer vision and deep learning techniques have been proposed to solve this problem, but only a few benchmark datasets of fish species images for training and testing these algorithms are publicly available. Furthermore, most of these datasets do not include species relevant to aquaculture. Large image datasets are usually created manually, which can be very time-consuming. This paper presents preliminary results of a system developed to create large image datasets of fish species semi-automatically. The system combines simple image processing techniques with state-of-the-art deep neural networks in an iterative process to extract, label and annotate images from video sources. To validate the system, video samples were taken inside and outside the cages of a fish farm in Norway. A set of experiments was conducted for two species: salmon and saithe. As a test case, a small dataset of 200 images per species was successfully created.