

A Novel Double-Layer Framework for Joint Segmentation and Recognition of Multiple Actions

Volume 14, Number 1, January 2018, pp. 101-110
DOI: 10.23940/ijpe.18.01.p11.101110

Cuiwei Liu (a), Yaguang Lu (b), Xiangbin Shi (a, b), Deyuan Zhang (a), and Fang Liu (a)

(a) Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
(b) School of Information, Liaoning University, Shenyang, 110036, China

(Submitted on October 2, 2017; Revised on November 15, 2017; Accepted on December 10, 2017)


This paper addresses the problem of joint segmentation and recognition of multiple actions in a long-term video. Because features obtained from a single frame cannot describe human motion over a period of time, some existing methods first divide a long-term video into many fixed-length clips and represent the video as a sequence of those clips. However, a fixed-length clip may contain frames from two adjacent actions, which can significantly degrade the performance of action segmentation and recognition. In this paper, we develop a double-layer framework for segmenting and recognizing multiple actions in a long-term video. In the first layer, a novel unsupervised method based on the directions of velocity is proposed to initially divide an input video into a series of variable-length clips. The second layer takes the sequence of video clips as input and employs a joint segmentation and recognition method to group the clips into several segments while simultaneously assigning an action category to each segment. Experiments conducted on the IXMAS action dataset verify the effectiveness of the proposed approach.
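
The abstract does not give the details of the first layer, so the following is only a minimal sketch of the general idea of cutting a video into variable-length clips wherever the direction of motion changes sharply. It assumes each frame has already been reduced to a single 2D velocity vector (for example, a mean optical-flow vector); the function name `split_by_velocity_direction` and the parameters `max_angle_deg` and `min_speed` are hypothetical and are not taken from the paper.

```python
import math


def split_by_velocity_direction(velocities, max_angle_deg=90.0, min_speed=1e-6):
    """Split a frame sequence into variable-length clips.

    velocities: list of (vx, vy) pairs, one per frame (an assumed input format).
    Returns a list of (start, end) frame-index pairs, with end exclusive.
    A new clip starts when the motion direction turns by more than
    max_angle_deg relative to the most recent frame with reliable motion.
    """
    if not velocities:
        return []
    clips = []
    start = 0
    ref = None  # most recent reliable unit direction
    for i, (vx, vy) in enumerate(velocities):
        speed = math.hypot(vx, vy)
        if speed < min_speed:
            continue  # near-zero motion has no reliable direction
        if ref is not None:
            # Cosine of the angle between the current and reference directions.
            cos_angle = (vx * ref[0] + vy * ref[1]) / speed
            cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
            if math.degrees(math.acos(cos_angle)) > max_angle_deg:
                clips.append((start, i))  # close the current clip
                start = i
        ref = (vx / speed, vy / speed)
    clips.append((start, len(velocities)))
    return clips
```

Under these assumptions, each returned clip covers frames whose motion direction is roughly consistent, so a clip is less likely to straddle two adjacent actions than a fixed-length window would be. The resulting clip sequence is the kind of input the second layer would then group into labeled segments.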



