Image Colorization Algorithm based on Dense Neural Network
School of Data Science and Technology, North University of China, Taiyuan, 030051, China
Corresponding authors:
Accepted: 2018-12-24 Online: 2019-01-01
About authors
Na Zhang received her undergraduate degree from the School of Information and Statistics, Guangxi University of Finance and Economics, Nanning, Guangxi, P.R. China, in 2015. She is currently pursuing a master's degree at NUC; her areas of interest are digital image processing, medical image processing, and computer vision.
Pinle Qin received the PhD degree in computer application technology from Dalian University of Technology (DLUT), Dalian, Liaoning, P.R. China, in 2008. He is currently an associate professor with the School of Data Science and Technology, North University of China (NUC). His current research interests include computer vision, medical image processing and deep learning.
Jianchao Zeng received the PhD degree from Xi’an Jiaotong University, Xi’an, Shanxi, P.R. China, in 1990. He is currently the vice president of North University of China (NUC). His current research interests include computer vision, medical image processing and deep learning.E-mail: zjc@nuc.edu.cn.
Yulong Song is pursuing a master's degree at NUC; his research areas are digital image processing and computer vision.
In most scenes, color images carry richer information than grayscale images. This paper presents a pseudo-coloring method for grayscale images that constructs and trains an end-to-end deep learning model based on a dense neural network, aiming to extract several kinds of information and features (such as classification information and detail features). Feeding a grayscale picture into the trained network generates a full and vivid color picture. By continually training the entire network on a wide variety of data sets, a highly adaptable, high-performance pseudo-coloring network is obtained. Experiments show that the proposed method makes higher use of image features and achieves a satisfactory coloring effect. Compared with current advanced pseudo-coloring methods, it also achieves remarkable improvements and, to a certain extent, alleviates common problems of the coloring process such as color overflow, loss of details, and low contrast.
Keywords:
1. Introduction
Color information is an important part of an image; combined with the semantic and surface texture information of the scene, it can convey a rich spatial hierarchy. Research shows that the human eye is highly sensitive to color intensity and its changes, while its sensitivity to changes in grayscale value is far lower. Psychologically, color images give observers a more pleasant and enjoyable impression, which helps them understand the content of images and obtain more comprehensive and richer information from them, enhancing the practical value of the images.
Grayscale image coloring (i.e., pseudo-color processing) is a technology developed to meet the above needs: it assigns colors to grayscale values [1] according to a specified rule to restore, enhance, or change the color information of an image. Currently, there are mainly two kinds of image coloring methods: traditional methods that rely on user interaction or reference images [1-10], and data-driven methods based on deep learning.
The development of deep learning methods and the advent of high-performance GPUs have opened new directions for data-driven image colorization. This kind of method uses neural networks: by building different network architectures, it extracts and analyses the content and features of the image, looks for the mapping relation between grayscale and color images, trains a corresponding model, and thereby realizes image colorization. Cheng et al. [11] used large-scale data modelling, post-processing based on joint bilateral filtering, and adaptive image clustering to integrate the global information of the image. Deshpande et al. [12] used the LEARCH framework to train a quadratic objective function over chromaticity maps and achieved image colorization by minimizing that objective. The structure of such networks is relatively simple, and the colorization effect is limited. Zhang et al. [13] proposed extracting image features with the VGG convolutional neural network [14] and predicting a color histogram for each pixel to color the image. Later, they put forward a new idea that extracts information with a U-Net [15] combined with user interaction [16]. Iizuka et al. [17] constructed a dual-stream network that extracts the global classification information and the local feature information of the image simultaneously and fuses the two to predict pixel colors. These three methods improved greatly on earlier work; however, because of the down-sampling and up-sampling they apply during image processing, a certain degree of information is lost. Qin et al. [18] used a residual neural network [19] to extract detail features, combined with the guidance of classification information. Their method reduced the information loss to some extent, but problems such as incomplete coloring of details and color overflow remain. In summary, the existing grayscale image colorization techniques mainly have the following problems:
This paper proposes a new method based on DenseNets [20] to solve the problems above. The method builds on the idea of skip-connections and uses information at a high rate; the loss of low-level semantic information and object contour information is also small. We use these characteristics to extract the texture and detail features of the image, and at the same time obtain its classification information through a VGG sub-network. The network combines the texture details and classification information for feature re-extraction and predicts colors from the integrated features to produce the output. We compare the output image with the original color image and calculate the mean squared error; after several rounds of optimization training, we obtain a final colorization model that converts grayscale images into color images. The network does not change the size of the feature maps while extracting detail information. Feature information that is gradually discarded or lost in traditional networks is reused through the dense connections, which effectively reduces the gradient-vanishing problem and enhances the transmission and utilization of features. As a result, the network uses the information of the original image at a higher rate, the resulting colored images are better, and their details are more complete and abundant.
2. Related Work
Figure 1.
Figure 2.
In a convolutional neural network, each layer l corresponds to a nonlinear mapping ${{H}_{l}}\left( \cdot \right)$. Each mapping usually comprises Batch Normalization, ReLU activation, pooling, and convolution. When an image ${{X}_{0}}$ enters the network, the output of the lth layer is ${{X}_{l}}$. A traditional feed-forward network connects the output of the (l-1)th layer directly to the lth layer only, that is, ${{X}_{l}}={{H}_{l}}\left( {{X}_{l-1}} \right)$. ResNet adds a skip-connection, as shown in Equation (1):

${{X}_{l}}={{H}_{l}}\left( {{X}_{l-1}} \right)+{{X}_{l-1}}$ (1)
In DenseNets, every convolutional layer is connected directly to all subsequent layers, i.e., the lth layer receives the feature maps of all preceding convolutional layers as its input, as shown in Equation (2):

${{X}_{l}}={{H}_{l}}\left( \left[ {{X}_{0}},{{X}_{1}},\cdots ,{{X}_{l-1}} \right] \right)$ (2)
where $\left[ {{X}_{0}},{{X}_{1}},\cdots ,{{X}_{l-1}} \right]$ denotes the concatenation of the feature maps produced by layers $0,1,\cdots ,l-1$.
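To make Equation (2) concrete, the following minimal sketch builds a dense block with Keras layers; the BN-ReLU-Conv ordering of ${{H}_{l}}\left( \cdot \right)$ follows the description above, and the growth rate of 12 matches Section 3.1.1 (the function name and the TensorFlow 2 API are illustrative choices, not the paper's released code).

```python
import tensorflow as tf
from tensorflow.keras import layers

def dense_block(x, num_layers, growth_rate=12):
    # Each layer H_l sees the concatenation of the block input and all
    # earlier layer outputs (Equation (2)) and adds growth_rate new maps.
    features = [x]
    for _ in range(num_layers):
        h = layers.Concatenate()(features) if len(features) > 1 else x
        h = layers.BatchNormalization()(h)
        h = layers.ReLU()(h)
        h = layers.Conv2D(growth_rate, 3, padding="same")(h)
        features.append(h)
    return layers.Concatenate()(features)
```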
3. The Image Colorization Algorithm Based on Dense Neural Network
3.1. The Network Structures
Existing deep-learning-based grayscale image colorization networks mainly extract the detailed texture features of the image with convolutional neural networks, and the coloring effect is acceptable. However, without a proper way to learn the global context of the image correctly (such as whether the scene is indoor or outdoor), the network may make obvious errors. Iizuka et al. and Qin et al. incorporated the category information of the images into the network and used this information to co-train the model, so that it plays a guiding role for the entire colorization network.
Drawing on the advantages of the methods of Iizuka et al. and Qin et al., this paper constructs a dual-stream structure based on the dense neural network that learns the classification information and the detailed texture information of the image at the same time. We use the CIE Lab color space: when an image enters the network, the network learns and predicts only the color information of channels a and b, and then combines it with the L channel taken from the grayscale image to achieve the coloring. The network structure is shown in Figure 3.
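The recombination of the predicted chrominance with the input luminance can be sketched as follows, assuming the standard CIE Lab value ranges and using scikit-image for the color-space conversion (the helper name is ours, not from the paper):

```python
import numpy as np
from skimage import color

def assemble_lab(gray_l, pred_ab):
    # gray_l: H x W luminance (L channel, roughly in [0, 100]);
    # pred_ab: H x W x 2 chrominance predicted by the network.
    lab = np.concatenate([gray_l[..., np.newaxis], pred_ab], axis=-1)
    return color.lab2rgb(lab)  # H x W x 3 RGB image in [0, 1]
```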
Figure 3.
The whole colorization network is composed of three parts: the feature extraction part, the classification guidance part, and the fusion and output part, described in Sections 3.1.1 to 3.1.3.
Figure 4.
Images with category labels will be converted from the RGB color space to the CIE Lab color space before entering the network.
3.1.1. Feature Extraction
An image of H×W×1 enters the feature extraction section, which is based on the dense neural network. It passes through one convolution layer before entering four dense blocks one after another. The convolution layers in each dense block are densely connected to all subsequent convolution layers, and every layer in each block outputs 12 feature maps. In view of the denseness of DenseNets, we place a 1×1 convolution before each 3×3 convolution (as shown in the dense-block part of Table 1(a)). This processing decreases the number of input feature maps, reduces the dimensionality, cuts back the required computation to a large extent, and blends the features of the channels.
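A single layer of such a block might look like the sketch below; the paper states only that a 1×1 convolution precedes each 3×3 convolution, so the 1×1 filter count of 4×growth_rate is an assumption borrowed from the standard DenseNet-B bottleneck:

```python
from tensorflow.keras import layers

def bottleneck_layer(x, growth_rate=12):
    # 1x1 convolution compresses the concatenated input and blends
    # channels; the 3x3 convolution then emits growth_rate new maps.
    h = layers.BatchNormalization()(x)
    h = layers.ReLU()(h)
    h = layers.Conv2D(4 * growth_rate, 1, padding="same")(h)  # assumed width
    h = layers.BatchNormalization()(h)
    h = layers.ReLU()(h)
    return layers.Conv2D(growth_rate, 3, padding="same")(h)
```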
Between every two dense blocks, we add a 1×1 convolution (the transition layer). In this way, the number of feature maps output by the previous dense block can be reduced (in this paper, to half), which effectively keeps the network from becoming too large and reduces the computational burden of the next dense block.
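A corresponding transition layer can be sketched as follows; note that, unlike the standard DenseNet transition, no pooling is applied, since the network deliberately keeps the spatial size of the feature maps unchanged:

```python
from tensorflow.keras import layers

def transition_layer(x):
    # 1x1 convolution halving the channel count between dense blocks;
    # no pooling, so the spatial resolution stays H x W throughout.
    channels = x.shape[-1] // 2
    h = layers.BatchNormalization()(x)
    h = layers.ReLU()(h)
    return layers.Conv2D(channels, 1, padding="same")(h)
```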
After the image passes through the above network, a large number of features and much texture information are extracted. Because the convolutions in the dense blocks are connected to all preceding layers, low-level information is reused, which effectively reduces information loss and alleviates the gradient-vanishing problem.
3.1.2. Classification Guidance
When the image enters the classification guidance network, the network extracts the classification information of the image. The fully connected layer fc1 reshapes the extracted features into a 1×4096 feature vector, which is then integrated via fc2 and fc3 into a 1×64 feature vector. This helps the entire network determine the category of the image content.
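The guidance head can be sketched as below; the text specifies the 1×4096 and 1×64 vector sizes, while the width of fc2 and the placement of the softmax classifier used for the classification loss are assumptions:

```python
from tensorflow.keras import layers

def classification_head(features, num_classes):
    # fc1 reshapes the extracted features into a 4096-d vector;
    # fc2 and fc3 integrate it down to the 64-d guidance vector.
    h = layers.Flatten()(features)
    fc1 = layers.Dense(4096, activation="relu")(h)
    fc2 = layers.Dense(512, activation="relu")(fc1)   # width assumed
    fc3 = layers.Dense(64, activation="relu")(fc2)    # guidance vector
    y_out = layers.Dense(num_classes, activation="softmax")(fc3)
    return fc3, y_out                                 # guidance, prediction
```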
3.1.3. Fusion and Output
After the feature extraction network and the classification guidance network both complete their information extraction, the two kinds of information are reconstructed into feature maps of the same dimensions and fused in the fusion layer. After feature re-extraction by Dense-block4, the network finally generates an H×W×2 output through one more convolution operation.
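One plausible reading of the fusion step is to broadcast the 1×64 guidance vector over every spatial position of the detail feature maps, concatenate, and mix with a 1×1 convolution, as sketched below (the broadcast-and-concatenate scheme follows the fusion layer of Iizuka et al.; the mixing width is an assumption):

```python
from tensorflow.keras import layers

def fusion_layer(detail_maps, guidance_vec):
    # Tile the 64-d classification vector across all H x W positions,
    # concatenate with the detail maps, and fuse with a 1x1 convolution.
    h, w = detail_maps.shape[1], detail_maps.shape[2]
    tiled = layers.RepeatVector(h * w)(guidance_vec)     # (B, h*w, 64)
    tiled = layers.Reshape((h, w, 64))(tiled)            # (B, h, w, 64)
    fused = layers.Concatenate()([detail_maps, tiled])
    return layers.Conv2D(128, 1, activation="relu")(fused)  # width assumed
```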
3.2. Loss
As described in Section 3.1, the loss of the network serves as an important reference for adjusting the weights. The loss consists of two parts: the feature extraction loss (L1) from the feature extraction network and the classification loss (L2) from the classification guidance network. The two losses are fed back to the network independently and do not interact with each other.
During training, the network reads n images per batch. After obtaining the output, we compare the color prediction with the original image and use the Mean Squared Error (MSE) to measure the disparity between the network output and the true value, as shown in Equations (3) and (4):

$MSE=\frac{1}{w\times h}\sum\nolimits_{p=1}^{w\times h}{{{\left( {{Y}_{p}}-{{X}_{p}} \right)}^{2}}}$ (3)

${{L}_{1}}=\frac{1}{n}\sum\nolimits_{i=1}^{n}{MS{{E}_{i}}}$ (4)

where w and h are the width and height of the sample, ${{Y}_{p}}$ is the ab-channel color of the original image at pixel p, ${{X}_{p}}$ is the value predicted by the network, and n is the number of images contained in a training batch.
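In code, L1 reduces to a mean of squared ab-channel differences over the pixels and the batch; a minimal sketch:

```python
import tensorflow as tf

def feature_loss(y_true_ab, y_pred_ab):
    # Equations (3) and (4): per-pixel squared error on the ab channels,
    # averaged over the w x h pixels and the n images of the batch.
    return tf.reduce_mean(tf.square(y_true_ab - y_pred_ab))
```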
For the classification guidance part, let the classification information of the input image be the guidance label ${{y}^{\text{label}}}$, while the predicted classification of the network is ${{y}^{\text{out}}}$. Cross-entropy is then used to measure the disparity between the classification prediction of the network and the real classification, as shown in Equation (5):

${{L}_{2}}=-\sum\nolimits_{i}{y_{i}^{\text{label}}\log \left( y_{i}^{\text{out}} \right)}$ (5)
When calculating the logarithm of $y_{i}^{\text{out}}$, a value of 0 would make $\log (y_{i}^{\text{out}})$ diverge and the loss become infinite. So, whenever the value is smaller than $1e-10$, it is set equal to $1e-10$.
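The clipping described above maps directly to code; a minimal sketch of the classification loss:

```python
import tensorflow as tf

def classification_loss(y_label, y_out):
    # Equation (5): cross-entropy between the one-hot guidance label
    # and the predicted distribution, clipped so log(0) never occurs.
    y_out = tf.clip_by_value(y_out, 1e-10, 1.0)
    return -tf.reduce_sum(y_label * tf.math.log(y_out), axis=-1)
```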
3.3. Result Evaluation: Information Loss
As stated in Section 1, the main purpose of coloring grayscale images is to obtain more information from the colored results than from the grayscale originals. Whether the result is clear and contains a sufficient amount of information can therefore be regarded as an important index for measuring the merits of a colorization algorithm.
This paper adopts an evaluation criterion based on information entropy, defined in Equation (6):

$InEn=-\sum\nolimits_{i=0}^{c}{P\left( i \right)\log P\left( i \right)}$ (6)

where $InEn$ is the information entropy value of the picture, $P\left( i \right)$ represents the frequency with which the color value i appears in the whole image, and c represents the range of color values in the Lab color space. We compare the information entropy of the colorization result with that of the original image and calculate the discrepancy between them: a smaller discrepancy means less color information is lost, that is, the colorization network is more effective.
The proposed network estimates only the values of the ab channels of the Lab color space; it does not recompute or process the value of channel L (which carries the grayscale information of the image). Therefore, to improve efficiency, we consider only the information contained in the ab channels when evaluating the colorization effect, and compare the information entropy of the ab channels of the output image against that of the original image, as shown in Equation (7):

$Info\_loss=\left| InE{{n}_{C}}-InE{{n}_{O}} \right|$ (7)
where $Info\_loss$, $InE{{n}_{C}}$, and $InE{{n}_{O}}$ denote the degree of color information loss, the information entropy of the original image, and the information entropy of the output image, respectively. It can be seen from this definition that the smaller the Info_loss value, the better the colorization result.
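A small sketch of the metric follows; the 256-bin histogram quantization and the base of the logarithm are assumptions, since the text does not fix them:

```python
import numpy as np

def channel_entropy(channel, bins=256):
    # Equation (6): Shannon entropy of one color channel, with P(i)
    # estimated as the frequency of each quantized value i.
    hist, _ = np.histogram(channel, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                      # empty bins contribute 0 * log 0 = 0
    return -np.sum(p * np.log2(p))

def info_loss(original_ab, output_ab):
    # Equation (7): discrepancy between the ab-channel entropies of the
    # original image and the colorized output (smaller is better).
    ent = lambda ab: channel_entropy(ab[..., 0]) + channel_entropy(ab[..., 1])
    return abs(ent(original_ab) - ent(output_ab))
```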
4. Experiments and Analysis
4.1. Experimental Data Set and Environment
As a supervised colorization method, the proposed network needs a large number of color images with classification labels as its training data. We therefore chose the MIT Places Database (205 scene categories, more than 2.5 million images) [26] and ImageNet (1000 categories, more than 1.2 million images) [27] to train the network. HDF5 is used to process the data set and generate a single data file of the ".h5" type, so that large numbers of individual pictures no longer have to be read in sequence, which simplifies operation and maintenance. The proposed colorization network requires a great deal of matrix computation; to improve training efficiency, we trained on a GPU (Graphics Processing Unit), an NVIDIA Tesla M40. The method is implemented in the Python programming environment, with TensorFlow [28] chosen to build the network.
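Packing the training set into one ".h5" file might look like the sketch below; the dataset names, image size, and dtypes are illustrative assumptions, not the paper's actual layout:

```python
import h5py

num_images, size = 10000, 224  # illustrative values
with h5py.File("train.h5", "w") as f:
    f.create_dataset("gray",   (num_images, size, size, 1), dtype="f4")
    f.create_dataset("ab",     (num_images, size, size, 2), dtype="f4")
    f.create_dataset("labels", (num_images,),               dtype="i4")
    # the datasets are then filled image by image, e.g. f["gray"][i] = ...
```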
4.2. Coloring Results and Comparison with Advanced Methods
4.2.1. Coloring Effect Comparison with Reference to the Original Images
To verify the effectiveness of the proposed algorithm, we chose some representative images and compared our results with existing excellent algorithms, such as those of Zhang et al., Iizuka et al., and Qin et al., as shown in Figure 5. We compared the coloring effect in the following aspects: color overflow, contrast, and the richness of details.
Figure 5.
4.2.2. Coloring Effect Comparison on Old Photos and Grayscale Images
To verify the universality of our algorithm, we selected some old photos and grayscale images and compared the coloring effects, as shown in Figure 6. From this comparison we can see that our method exhibits less color overflow, better contrast, and more abundant details than the algorithms of Zhang et al., Iizuka et al., and Qin et al. (for example, in group (1) our method gave the tree trunk its proper brown color).
Figure 6.
4.2.3. Algorithm Comparison (with Qin et al.)
As shown above, compared with prior advanced deep-learning-based colorization methods, the residual-network-based method proposed by Qin et al. reduced the information loss to a certain degree and improved the coloring effect. We compared that method with ours by calculating the Info_loss values of both on 5000 randomly chosen pictures and taking the averages. The mean of Qin et al.'s method is 2.31481, and the mean of the method proposed in this paper is 1.92628. Figure 7 shows the comparison: the blue line refers to our method and the red line to the method of Qin et al.; the horizontal axis indicates the Info_loss value and the vertical axis the frequency of images.
Figure 7.
5. Conclusion and Prospect
5.1. Conclusion
In this paper, we proposed a grayscale image colorization algorithm based on a dense neural network. The algorithm comprises a dense-block sub-network and a VGG sub-network, which extract detailed texture information and classification information, respectively. The two kinds of information are fused to generate the network output as the prediction of the color picture. The experimental results show that the proposed method surpasses existing excellent grayscale image colorization algorithms in the richness of detail information and in contrast, and that color overflow is also significantly reduced; applied to old photos and grayscale images, the method likewise performs well.
5.2. Shortcomings and Future Research Directions
Due to the denseness of the network, the performance requirements on the running equipment are high, and training the network takes a long time. At the same time, the coloring effect of our method may not be ideal for images of categories that have not been learned, since the data set does not cover all image categories. In the next phase of research, we plan to enhance the universality and utility of our approach by optimizing the whole network architecture and training on more types of images.
References

[1] A. Levin, D. Lischinski, and Y. Weiss, "Colorization using Optimization," ACM Transactions on Graphics, Vol. 23, No. 3, pp. 689-694, 2004.
[2] "Medical Image Colorization."
[3] "Medical Image Colorization using Optimization Technique."
[4] "Pseudo-Colorization of Medical Images Based on Two-Stage Transfer Model."
[5] T. Welsh, M. Ashikhmin, and K. Mueller, "Transferring Color to Grayscale Images," ACM Transactions on Graphics, Vol. 21, No. 3, pp. 277-280, 2002.
[6] X. Liu, L. Wan, Y. Qu, T.-T. Wong, S. Lin, C.-S. Leung, and P.-A. Heng, "Intrinsic Colorization," ACM Transactions on Graphics, Vol. 27, No. 5, 2008.
[7] "AutoStyle: Automatic Style Transfer from Image Collections to Users' Images," Computer Graphics Forum, Vol. 33, No. 4, 2014. DOI: 10.1111/cgf.12409.
[8] Y. Morimoto, Y. Taguchi, and T. Naemura, "Automatic Colorization of Grayscale Images using Multiple Images on the Web," in ACM SIGGRAPH Posters, 2009. DOI: 10.1145/1599301.1599333.
[9] R. Irony, D. Cohen-Or, and D. Lischinski, "Colorization by Example," in Proceedings of the Eurographics Symposium on Rendering, pp. 201-210, 2005.
[11] Z. Cheng, Q. Yang, and B. Sheng, "Deep Colorization," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 415-423, 2015.
[12] A. Deshpande, J. Rock, and D. Forsyth, "Learning Large-Scale Automatic Image Colorization," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 567-575, 2015. DOI: 10.1109/ICCV.2015.72.
[13] R. Zhang, P. Isola, and A. A. Efros, "Colorful Image Colorization," in Proceedings of the European Conference on Computer Vision (ECCV), pp. 649-666, 2016.
[14] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv:1409.1556, September 2014.
[15] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234-241, 2015.
[16] R. Zhang, J.-Y. Zhu, P. Isola, X. Geng, A. S. Lin, T. Yu, and A. A. Efros, "Real-Time User-Guided Image Colorization with Learned Deep Priors," ACM Transactions on Graphics, Vol. 36, No. 4, 2017. DOI: 10.1145/3072959.3073703.
[17] S. Iizuka, E. Simo-Serra, and H. Ishikawa, "Let There Be Color!: Joint End-to-End Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification," ACM Transactions on Graphics, Vol. 35, No. 4, 2016. DOI: 10.1145/2897824.2925974.
[18] P. Qin et al., "Research on Image Colorization Algorithm Based on Residual Neural Network." DOI: 10.1007/978-981-10-7299-4_51.
[19] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770-778, 2016. DOI: 10.1109/CVPR.2016.90.
[20] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely Connected Convolutional Networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4700-4708, 2017.
[21] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in Advances in Neural Information Processing Systems (NIPS), pp. 1097-1105, 2012.
[22] C. Szegedy et al., "Going Deeper with Convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-9, 2015.
[23] G. Huang, Y. Sun, Z. Liu, D. Sedra, and K. Q. Weinberger, "Deep Networks with Stochastic Depth," in Proceedings of the European Conference on Computer Vision (ECCV), pp. 646-661, 2016.
[24] G. Larsson, M. Maire, and G. Shakhnarovich, "FractalNet: Ultra-Deep Neural Networks without Residuals," arXiv:1605.07648, May 2016.
[26] B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, "Learning Deep Features for Scene Recognition using Places Database," in Advances in Neural Information Processing Systems (NIPS), pp. 487-495, 2014.
[27] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248-255, 2009.
[28] M. Abadi et al., "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems," arXiv:1603.04467, March 2016.