We apply a bag-of-visual-words approach based on random point patterns to remote sensing image classification. The image is divided into patches of equal size, and the local maxima of each patch's histogram are determined. For every intensity in the image, a point pattern is formed from the patch centers according to the local histogram maxima. We then determine the type of each generated point pattern and label the corresponding image patches accordingly. A point pattern can be of one of three types: cluster, regular, or random. Image patches are therefore labeled as cluster, regular, random, cluster-regular, cluster-random, regular-random, cluster-regular-random, or none when the patch center does not belong to any point pattern. We form a representation in which the frequencies of the patch types are used to discriminate between image classes. A spatial extension, which captures the relative arrangement of patch types through their co-occurrences, is also used. In addition, a vector of median frequencies of the detected patch intensity maxima is added as a spectral feature description of the image. These features significantly reduce the size of the codebook while delivering a classification accuracy comparable to that of known methods.
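The patch-labeling step summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: histograms are computed per patch, the pattern type is decided with the Clark-Evans nearest-neighbor ratio, and the tolerance `tol`, the minimum pattern size `min_points`, the patch size, the number of histogram bins, and all function names are hypothetical choices made for illustration. The spatial co-occurrence extension and the median-frequency spectral vector are omitted.

```python
import math
from collections import Counter

TYPES = ("cluster", "regular", "random")
# The eight patch labels named in the text: every combination of types plus "none".
LABELS = ("cluster", "regular", "random", "cluster-regular", "cluster-random",
          "regular-random", "cluster-regular-random", "none")


def patch_maxima(image, patch, bins, max_val=256):
    """Map each patch center (row, col) to the histogram bins that are local maxima."""
    rows, cols = len(image), len(image[0])
    result = {}
    for r0 in range(0, rows - patch + 1, patch):
        for c0 in range(0, cols - patch + 1, patch):
            hist = [0] * bins
            for r in range(r0, r0 + patch):
                for c in range(c0, c0 + patch):
                    hist[image[r][c] * bins // max_val] += 1
            padded = [0] + hist + [0]  # zero padding so border bins can be peaks
            peaks = {i for i in range(bins)
                     if padded[i + 1] > padded[i] and padded[i + 1] >= padded[i + 2]}
            result[(r0 + patch / 2, c0 + patch / 2)] = peaks
    return result


def pattern_type(points, area, tol=0.2):
    """Assumed test: Clark-Evans ratio of the observed mean nearest-neighbor
    distance to the value expected for a random pattern of the same density."""
    nn = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
          for i, p in enumerate(points)]
    observed = sum(nn) / len(nn)
    expected = 0.5 / math.sqrt(len(points) / area)
    ratio = observed / expected
    if ratio < 1 - tol:
        return "cluster"
    if ratio > 1 + tol:
        return "regular"
    return "random"


def label_patches(image, patch=4, bins=8, min_points=3):
    """Label every patch with the types of the point patterns its center belongs to."""
    maxima = patch_maxima(image, patch, bins)
    area = len(image) * len(image[0])
    found = {center: set() for center in maxima}
    for b in range(bins):  # one point pattern per intensity bin
        points = [c for c, peaks in maxima.items() if b in peaks]
        if len(points) < min_points:
            continue  # assumption: too few points to be treated as a pattern
        kind = pattern_type(points, area)
        for c in points:
            found[c].add(kind)
    return {c: "-".join(t for t in TYPES if t in kinds) or "none"
            for c, kinds in found.items()}


def label_frequencies(labels):
    """Normalized frequency of each of the eight labels, in the order of LABELS."""
    counts = Counter(labels.values())
    return [counts[name] / len(labels) for name in LABELS]


# Usage on a small synthetic 16x16 grayscale image (values 0..255).
image = [[(r * 37 + c * 91) % 256 for c in range(16)] for r in range(16)]
print(label_frequencies(label_patches(image)))
```

The resulting eight-element vector plays the role of the patch-type frequency representation; because its length is fixed by the number of labels rather than by a learned vocabulary, it hints at why such features can keep the codebook small.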