Video Action Recognition by Combining Spatial-Temporal Cues with Graph Convolutional Networks
Li, Tao (1); Xiong, Wenjun (2); Zhang, Zheng (2); Pei, Lishen (3)
2023-08-30
Journal: INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE
ISSN: 0218-0014
Abstract: The accuracy of video action recognition depends heavily on how spatial-temporal cues are combined. One way to address this is to model the interactions among objects within or across videos explicitly, for example with graph neural networks, which have been shown to accurately model and represent complicated spatial-temporal object relations for video action classification. However, the visual objects in a video are diverse, whereas the nodes in the graphs are fixed. This can cause information overload when the visual objects are redundant, or information loss when they are insufficient for graph construction. Segment-level graph convolutional networks (SLGCNs) are proposed as a method for recognizing actions in videos. The SLGCN consists of a segment-level spatial graph and a segment-level temporal graph, both of which can process spatial and temporal information simultaneously. Specifically, the two graphs are constructed from appearance and motion features that 2D and 3D CNNs extract from video segments. Graph convolutions are then applied to obtain informative segment-level spatial-temporal features. The method is evaluated on several challenging video datasets: EPIC-Kitchens, FCVID, HMDB51 and UCF101. Experiments demonstrate that the SLGCN achieves performance comparable to state-of-the-art models in obtaining spatial-temporal features.
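The abstract gives no equations, so the following is only a rough sketch of the general idea it describes: per-segment appearance and motion features form the nodes of two graphs, and one graph convolution is applied to each. Everything concrete here is an assumption rather than the authors' method: the cosine-similarity adjacency with softmax row normalisation, the single ReLU(AXW) layer, the concatenate-and-average fusion, the random stand-in features, and all function names and dimensions.

```python
import numpy as np

def similarity_adjacency(features):
    """Row-normalised adjacency from pairwise cosine similarity (assumed, not from the paper)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-8, None)
    sim = unit @ unit.T                                   # (N, N) cosine similarities
    exp = np.exp(sim - sim.max(axis=1, keepdims=True))    # numerically stable softmax
    return exp / exp.sum(axis=1, keepdims=True)           # each row sums to 1

def graph_conv(features, adjacency, weight):
    """One generic graph convolution layer: ReLU(A X W)."""
    return np.maximum(adjacency @ features @ weight, 0.0)

def segment_level_features(appearance, motion, w_spatial, w_temporal):
    """Run one graph convolution per branch, then fuse (fusion rule is hypothetical)."""
    spatial = graph_conv(appearance, similarity_adjacency(appearance), w_spatial)
    temporal = graph_conv(motion, similarity_adjacency(motion), w_temporal)
    # Concatenate both branches per segment, then average over segments
    return np.concatenate([spatial, temporal], axis=1).mean(axis=0)

rng = np.random.default_rng(0)
num_segments, d_app, d_mot, d_out = 6, 16, 12, 4
appearance = rng.normal(size=(num_segments, d_app))   # stand-in for 2D CNN features
motion = rng.normal(size=(num_segments, d_mot))       # stand-in for 3D CNN features
w_s = rng.normal(size=(d_app, d_out))                 # untrained placeholder weights
w_t = rng.normal(size=(d_mot, d_out))

video_vec = segment_level_features(appearance, motion, w_s, w_t)
print(video_vec.shape)  # (8,)
```

In a trained model the weights would be learned and the resulting video-level vector would feed a classifier; the random values above only exercise the shapes.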
Keywords: Video action recognition; graph convolutional networks; spatial-temporal graphs; feature combination
DOI: 10.1142/S021800142350009X
Indexed in: SCIE
Language: English
WOS Research Area: Computer Science
WOS Category: Computer Science, Artificial Intelligence
WOS Accession Number: WOS:001170344400001
Publisher: WORLD SCIENTIFIC PUBL CO PTE LTD
Original Document Type: Article; Early Access
EISSN: 1793-6381
Citation Statistics
Times Cited [WOS]: 0
Document Type: Journal article
Item Identifier: http://ir.library.ouchn.edu.cn/handle/39V7QQFX/169780
Collection: The Open University of China
Corresponding Author: Li, Tao
Author Affiliations: 1. Open Univ Henan, Dept Informat Engn, Zhengzhou 450046, Peoples R China;
2. Open Univ Henan, Resource Construct & Management Ctr, Zhengzhou 450046, Peoples R China;
3. Henan Univ Econ & Law, Dept Informat Engn, Zhengzhou 450046, Peoples R China
Recommended Citation
GB/T 7714: Li, Tao, Xiong, Wenjun, Zhang, Zheng, et al. Video Action Recognition by Combining Spatial-Temporal Cues with Graph Convolutional Networks[J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023.
APA: Li, Tao, Xiong, Wenjun, Zhang, Zheng, & Pei, Lishen. (2023). Video Action Recognition by Combining Spatial-Temporal Cues with Graph Convolutional Networks. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE.
MLA: Li, Tao, et al. "Video Action Recognition by Combining Spatial-Temporal Cues with Graph Convolutional Networks". INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE (2023).