Buffy Stickmen Dataset V3 is a set of images of people in which each person's body is annotated with a stick figure that outlines their pose. The head, torso, left and right upper arms, and left and right lower arms make up the annotated outline of the human body.
MPII Human Shape is a series of 3D human body models and tools for working with human contours and shapes. The models were learned from the CAESAR scan database.


BaBao Face is a face dataset with facial keypoint annotations. The keypoints are labeled and fine-tuned within the face region, so each point has an accurate position, which supports recognition of subtle expression changes and facial keypoint detection.
CMU Frontal Face Images is a dataset of grayscale frontal face images with face position annotations. Although it contains no more than 180 face images, the face location annotations are detailed.
The facial keypoint detection competition data set annotates 15 keypoints for the face in each image. Each image is 96×96 pixels, and each keypoint is given as a two-dimensional coordinate that identifies its position.
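The keypoint layout described above (15 points, each a two-dimensional coordinate inside a 96×96 image) can be sketched as follows. This is only an illustration: the flat `[x1, y1, x2, y2, ...]` input order and the helper names `to_keypoints` and `normalize` are assumptions, not something the competition data is known to guarantee.

```python
IMAGE_SIZE = 96      # image side length in pixels (from the text)
NUM_KEYPOINTS = 15   # keypoints per face (from the text)


def to_keypoints(flat_coords):
    """Group a flat [x1, y1, x2, y2, ...] list into 15 (x, y) pairs.

    The flat ordering is an assumption made for this sketch.
    """
    if len(flat_coords) != 2 * NUM_KEYPOINTS:
        raise ValueError(f"expected {2 * NUM_KEYPOINTS} values, got {len(flat_coords)}")
    return [(flat_coords[i], flat_coords[i + 1])
            for i in range(0, len(flat_coords), 2)]


def normalize(keypoints, size=IMAGE_SIZE):
    """Scale pixel coordinates by the image size, mapping 0..size to 0..1."""
    return [(x / size, y / size) for x, y in keypoints]
```

Scaling by the image size is a common preprocessing choice for regression targets, but whether it helps depends on the model being trained.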




17_Category_Flower is an image data set covering 17 kinds of flowers that are common in the UK, with 80 pictures of each kind.
The COIL-20 dataset is a collection of grayscale images of 20 objects photographed from different angles, with one image every 5 degrees, which gives 72 images per object. Every image is uniformly processed to 128×128 pixels.
MNIST is one of the most popular deep learning datasets: a handwritten digit dataset with a training set of 60,000 samples and a test set of 10,000 samples. It is a good database for trying out learning techniques and pattern recognition methods on real data while spending the least amount of time and effort on data preprocessing.
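As a quick sanity check, the sizes implied by the three descriptions above can be worked out directly. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Sizes implied by the descriptions above; every input comes from the text.
flower_images = 17 * 80               # 17 flower categories x 80 pictures each
coil_views_per_object = 360 // 5      # one view every 5 degrees of rotation
coil_images = 20 * coil_views_per_object
mnist_images = 60_000 + 10_000        # training set + test set

print(flower_images, coil_views_per_object, coil_images, mnist_images)
# prints: 1360 72 1440 70000
```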


UCI Breast Cancer numerical data set (breast-cancer-wisconsin)

Natural language processing

The question-and-answer data set from Quora, a US knowledge Q&A site, can be used for duplicate question detection.
Stanford Sentiment Treebank is a corpus annotated at Stanford University in which the semantic tree structures of 9,645 English sentences were labeled by hand.
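To make the duplicate question detection task concrete, here is a minimal token-overlap baseline. It is a sketch under stated assumptions, not the method associated with the data set: the tokenizer, the Jaccard similarity measure, and the 0.5 threshold are all illustrative choices.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(question_a, question_b):
    """Jaccard similarity of the two token sets, from 0.0 to 1.0."""
    tokens_a, tokens_b = tokenize(question_a), tokenize(question_b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty questions are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_duplicate(question_a, question_b, threshold=0.5):
    """Flag a pair as duplicate when token overlap reaches the threshold.

    The threshold is an arbitrary assumption and would need tuning.
    """
    return jaccard(question_a, question_b) >= threshold
```

A baseline like this ignores word order and synonyms, so it mainly serves as a reference point for stronger models trained on the labeled pairs.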

Market and social media

News data (2008-06-08 to 2016-07-01) from the Reddit WorldNews channel, together with Dow Jones Industrial Average (DJIA) stock index data.

Modeling and machine learning

The feature selection competition data from the NIPS 2003 workshop contains 5 large numerical data sets used to judge the performance of feature selection or feature extraction algorithms.
Several large-scale classification data sets in the UCI repository, such as SUSY and HIGGS, are used to test the time and space complexity of classification algorithms.
Multi-class data sets from the UCI machine learning repository are recombined into two-class data sets to test the predictive performance of two-class models.
The classic binary classification data sets in the UCI machine learning repository, including standard two-class test data sets such as Iris, Heart Disease, and German Credit.
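One plausible reading of turning multi-class data into two-class data sets, as mentioned above, is a one-vs-rest relabeling: each class in turn becomes the positive class and all others become negative. The source does not say which scheme was actually used, so the sketch below is an assumption, and `one_vs_rest` is a hypothetical helper name.

```python
def one_vs_rest(labels):
    """Build one binary label list per class: 1 for that class, 0 for the rest.

    Returns a dict mapping each class to its binary labels, aligned with
    the order of the input labels.
    """
    classes = sorted(set(labels))
    return {cls: [1 if label == cls else 0 for label in labels]
            for cls in classes}


# Usage with made-up labels: three classes yield three two-class problems.
binary_tasks = one_vs_rest(["a", "b", "c", "a"])
```

Each resulting label list can be paired with the original features to train and evaluate a separate two-class model.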